The OCR component is an optical character recognition (OCR) system that recognizes text set in practically any font. The component accepts image files as input and either creates searchable data files in one of the supported formats or passes the recognized text as RRTs to the next component in the process. In particular, this component can produce searchable PDF and PDF/A files that conform to the PDF and PDF/A standards.
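The two output paths described above (write a searchable file, or hand the recognized text to the next component) can be pictured with a minimal Python sketch. This is only an illustration of the data flow, not the component's actual interface: the names `OcrResult`, `OcrOutput`, `deliver`, and the `SUPPORTED_FORMATS` set are hypothetical and were made up for this example.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Hypothetical subset of output formats; the real list is product-specific.
SUPPORTED_FORMATS = {"pdf", "pdfa", "txt"}


@dataclass
class OcrResult:
    """Recognized text for one input image (hypothetical structure)."""
    source_image: str
    text: str


@dataclass
class OcrOutput:
    """Exactly one of the two fields is set, depending on the output path."""
    file_name: Optional[str]    # set when a searchable file is produced
    passed_text: Optional[str]  # set when text goes to the next component


def deliver(result: OcrResult, output_format: Optional[str]) -> OcrOutput:
    """Route recognized text to a searchable file or to the next component."""
    if output_format is None:
        # No file format requested: pass the text downstream instead.
        return OcrOutput(file_name=None, passed_text=result.text)

    fmt = output_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")

    # PDF/A files also use the .pdf extension.
    extension = "pdf" if fmt in ("pdf", "pdfa") else fmt
    stem = os.path.splitext(result.source_image)[0]
    return OcrOutput(file_name=f"{stem}.{extension}", passed_text=None)
```

For example, `deliver(OcrResult("scan.tif", "hello"), "PDFA")` yields an output whose `file_name` is `scan.pdf`, while passing `None` as the format returns the text itself for the next component.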
The component's characteristic features are high recognition accuracy and low sensitivity to print defects. These features result from a recognition technology based on the principles of Integral Purposeful Adaptive (IPA) perception, which the OCR component fully implements.
You can use various formatting and detection parameters to optimize OCR for your needs.