OCR Reader

Extracts text from a region near each output instance.

Remarks

Text can be extracted using an OCR Profile, or from the document's existing full page OCR results.

Inherits from: Result Processor

Properties

The following 11 properties are defined.

Property Name	Description
General
OCR Profile	Type: OCR Profile The OCR Profile to use for character extraction. If no OCR Profile is specified, then extraction will be performed from the document's full page OCR results.
Region	Type: Logical Rectangle Specifies a region, relative to each output instance, where OCR should be performed. If no value is specified, then the existing region of each instance will be used.
Relative To	Type: ContentAlignment, Default: TopLeft When a Region is specified, indicates which edge of the output value this region is relative to. Each value names a vertical position (Top, Middle, or Bottom) followed by a horizontal position (Left, Center, or Right). Can be one of the following values: TopLeft, TopCenter, TopRight, MiddleLeft, MiddleCenter, MiddleRight, BottomLeft, BottomCenter, BottomRight.
Auto Snap Distance	Type: Logical Border Specifies the maximum distance for an auto snap operation, which automatically aligns the edges of the zone to lines on the document. An empty or zero value disables auto snap. If this value is set, then lines detected in the image will be used to adjust the position of the zone.
Auto Snap Margin	Type: Logical Border When the auto snap feature is in use, specifies an additional amount to shrink the zone on each edge.
Value Extractor	Type: Embedded Extractor An optional extractor to be executed against the OCR content.
Discard Misses	Type: Boolean, Default: False Determines what happens if the Value Extractor finds no matches.
Value Separator	Type: String Specifies the separator to be used in cases where the Value Extractor matches multiple values in the OCR data. The following special escape sequences may be used: \r - Carriage return, \n - Line feed, \t - Tab, \f - Form feed, \s - Space.
Line Separator	Type: String When capturing multiple lines of text, specifies how line breaks will be represented in the output. The following special escape sequences may be used: \r - Carriage return, \n - Line feed, \t - Tab, \f - Form feed, \s - Space.
Exclude Anchor	Type: Boolean, Default: False If enabled, the text of the label will be automatically excluded from the output.
Output Full Region	Type: Boolean, Default: False Specifies whether the highlight region of each output instance will reflect the full OCR area, or only the area containing text. If true, the full OCR area will be output. If false, the region will reflect the bounding box of all characters found during OCR. Turning this option on is normally most useful when tuning the region, as it displays the actual bounds where OCR is performed.
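
Example: separator escape sequences

The escape sequences listed for Value Separator and Line Separator can be illustrated with a short Python sketch. This is not the product's implementation. The function names decode_separator and join_with_separator are hypothetical. Keeping any backslash sequence outside the documented five unchanged is an assumption, as the documentation does not say how such input is handled.

```python
# Escape sequences documented for Value Separator and Line Separator.
ESCAPES = {
    "r": "\r",  # carriage return
    "n": "\n",  # line feed
    "t": "\t",  # tab
    "f": "\f",  # form feed
    "s": " ",   # space
}


def decode_separator(raw: str) -> str:
    """Replace each documented escape sequence with the character it names."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw) and raw[i + 1] in ESCAPES:
            out.append(ESCAPES[raw[i + 1]])
            i += 2
        else:
            # Assumption: anything that is not a listed escape is kept literally.
            out.append(raw[i])
            i += 1
    return "".join(out)


def join_with_separator(parts, raw_separator: str) -> str:
    """Join extracted values (or captured lines) using a decoded separator."""
    return decode_separator(raw_separator).join(parts)


if __name__ == "__main__":
    # Two matched values joined with a comma followed by a space.
    print(repr(join_with_separator(["INV-001", "INV-002"], r",\s")))
    # Two captured lines joined with a carriage return and a line feed.
    print(repr(join_with_separator(["first line", "second line"], r"\r\n")))
```

Under these assumptions, a Value Separator of ,\s joins multiple matches with a comma and a space, and a Line Separator of \r\n represents each line break as a carriage return followed by a line feed.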

See Also

Embedded Extractor, Logical Border, Logical Rectangle, OCR Profile

Used By