Troubleshooting tips

Problem description Solution
Poor-quality OCR results

Inaccuracies in the OCR process can have many causes. Before you set up OCR processes, analyze the paper types, scanners, and resolution levels that you plan to use so that you can optimize your OCR results.

The following are some common tips for increasing OCR accuracy.

  1. File format: Color images tend to reduce OCR accuracy. When the process input is a color image, the resulting OCR documents are of lower quality. Review your requirements for color documents and consider scanning at a higher resolution to increase accuracy.
  2. Document quality: Low-quality paper documents are another major cause of lower OCR accuracy, because they generally increase the OCR error rate. When working with such documents, consider the following to increase your OCR accuracy:
    • Look for ways to obtain higher-quality paper documents.
    • Consider a scanner with a different bulb color, which might work better with the paper color of your documents.
    • Test a higher scan resolution.
    • Consider running image processing before OCR to clean up the image.
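As an illustration of the kind of cleanup that can help with color scans and tinted paper, the following is a minimal sketch that converts RGB pixels to grayscale and then to pure black and white. It is not the product's image processing: the function names, the fixed threshold of 128, and the tiny sample image are all assumptions made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in the range 0-255


def to_grayscale(pixel: Pixel) -> int:
    """Convert an RGB pixel to a luminance value using ITU-R BT.601 weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(image: List[List[Pixel]], threshold: int = 128) -> List[List[int]]:
    """Map every pixel to 0 (black) or 255 (white) around a fixed threshold.

    The threshold value is an assumption; real scans usually need tuning.
    """
    return [
        [255 if to_grayscale(p) >= threshold else 0 for p in row]
        for row in image
    ]


# Hypothetical 2x2 "scan": dark text pixels on a pale yellow paper background.
scan = [
    [(30, 30, 40), (250, 240, 200)],
    [(245, 235, 190), (20, 25, 35)],
]
cleaned = binarize(scan)
```

In this sketch the pale paper color maps to white and the dark ink maps to black, which removes the color tint before recognition. A dedicated image-processing tool would normally also handle noise, skew, and uneven lighting.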
When you export a file to HTML format, the images are not displayed in the output file.

This problem can occur when you use a rename schema. When HTML is the output file format, the output consists of an HTML file and the images that the HTML file references. If you rename the images, the internal links break. Therefore, do not use the rename schema when exporting to HTML format.
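To confirm that missing images are caused by broken links, you can check whether every image that the HTML file references still exists next to it. The following is a minimal sketch of such a check. The helper names and the sample file names are hypothetical, and the sketch assumes that the images are referenced with relative paths in `img` tags.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path
from typing import List


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_broken_images(html_text: str, base_dir: Path) -> List[str]:
    """Return the image references that do not exist under base_dir.

    Assumes relative, local references; remote URLs are not handled.
    """
    collector = ImageSrcCollector()
    collector.feed(html_text)
    return [src for src in collector.sources if not (base_dir / src).exists()]


# Hypothetical output folder: one image kept its name, one was renamed away.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    (out_dir / "page_1.png").write_bytes(b"")
    html = '<html><body><img src="page_1.png"><img src="page_2.png"></body></html>'
    broken = find_broken_images(html, out_dir)
```

Any name that ends up in `broken` is an image that the HTML file still points to under its original name but that is no longer present under that name.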
A setting of the output document has a value different from the one that was specified.

Make sure that the setting was specified correctly. If a setting is defined incorrectly or uses an RRT that was replaced with an incorrect value, the component replaces the incorrect value with the default value at run time, if a default value exists.
When using the Zoned OCR Matches wildcard validation setting on a 5-digit zip code, the validation might fail.

Use [#][#][#][#][#] to validate the zip code.
When using the Zoned OCR Matches regular expression validation setting on a 5-digit zip code, the validation fails if you use multipliers {...}.

Use the following to validate the zip code:

(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)
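If you want to check the expanded pattern outside the product before using it, the following is a minimal sketch that builds the same expression and tests it against sample values. It uses Python's `re` module only as a stand-in, so the product's regular expression engine might differ in details such as anchoring. The names `DIGIT`, `ZIP_PATTERN`, and `is_five_digit_zip` are hypothetical.

```python
import re

# One digit written as an explicit alternation, with no multipliers.
DIGIT = "(0|1|2|3|4|5|6|7|8|9)"

# Repeat the group five times by string multiplication so that the
# resulting pattern itself contains no {...} multiplier.
ZIP_PATTERN = DIGIT * 5


def is_five_digit_zip(text: str) -> bool:
    """Return True only if the whole string is exactly five digits."""
    return re.fullmatch(ZIP_PATTERN, text) is not None
```

For example, `is_five_digit_zip("90210")` returns `True`, while `"9021"`, `"902101"`, and `"9021A"` are rejected. The sketch uses `re.fullmatch` so that longer strings containing five digits do not pass; whether the product applies the pattern to the whole zone value is an assumption you should verify.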