Data Instance

A Data Instance represents a segment of text content within a document.

Remarks

Data Instances can represent anything from a single character to the entire content of a document. Data Instance objects are the input to and the output from all ESP™ extraction operations, and also represent the format in which document metadata is stored by the Extract activity.

Data Instances are created during extraction. When the raw OCR data for a Batch Folder object is loaded, it forms a Document Instance, which represents the entire content of the document. When extraction is performed, this root instance becomes the source against which all data elements at the root of the Data Model perform their extraction, and the extracted results are saved as children of the Document Instance.
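
The parent/child relationship described above can be sketched in a few lines of Python. This is only an illustrative model, not the product's API: the DataInstance class, the load_document and extract helpers, and the regular-expression "extractors" are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataInstance:
    """Illustrative stand-in for a Data Instance; not the product's class."""
    name: str
    value: str
    index: int = 0    # start offset within the parent instance
    length: int = 0   # number of characters covered within the parent instance
    children: List["DataInstance"] = field(default_factory=list)


def load_document(ocr_text: str) -> DataInstance:
    """Build a root instance that covers the whole document text."""
    return DataInstance("Document", ocr_text, 0, len(ocr_text))


def extract(parent: DataInstance, name: str, pattern: str) -> List[DataInstance]:
    """Run one hypothetical extractor on the parent and attach results as children."""
    results = [
        DataInstance(name, m.group(0), m.start(), m.end() - m.start())
        for m in re.finditer(pattern, parent.value)
    ]
    parent.children.extend(results)
    return results


# The root instance is the input; the extracted instances become its children.
root = load_document("Invoice 1001\nTotal: 42.50")
extract(root, "InvoiceNo", r"\b\d{4}\b")
extract(root, "Total", r"\d+\.\d{2}")
for child in root.children:
    print(child.name, child.value, child.index, child.length)
```

In this sketch the same type is used for the input (the root) and the output (the children), which mirrors the statement that Data Instances are both the input to and the output from extraction operations.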

List of Data Instance Types

Name                         Description
Document Instance            Represents the entire content of a document, and serves as the root of the Data Instance hierarchies generated by the Extract activity.
Field Class Instance         Represents a Data Instance captured by a Field Class.
Field Instance               Represents the value of a Data Field object.
Section Instance             Represents the value of a Data Section object.
Section Instance Collection  Represents the value of a Data Section object.
Table Cell Instance          Represents the value of a table cell.
Table Instance               Represents an instance of a Data Table object on a document.
Table Row Instance           Represents a table row in a Table Instance.
Text Line Instance           Represents a line of text.

Properties

The following 9 properties are defined.

Property Name Description
General
Type Display Name (Type: String)

The type display name of the Data Instance.

Value (Type: String)

The value of this data instance.

Name (Type: String)

The name of this data instance.

Confidence (Type: Double, Default: 0%, Range: 0% - 100%)

The confidence level assigned to this instance.

Content Type (Type: Content Type)

The Content Type associated with this Data Instance.

Location (Type: Rectangle, Default: (0,0):(0,0))

The location of this instance on the page.

Source Information
Index (Type: Int32, Default: 0)

The starting index of this instance within the parent instance.

Length (Type: Int32, Default: 0)

The length of this instance, in characters, within the parent instance. Note that this is the span the instance occupies in the parent; it does not necessarily equal the number of characters in the Value property or the number of items in the OcrResults object.

Page No (Type: Int32, Default: 0)

The 1-based page number on which this item appears.
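
To illustrate how Index and Length relate to the parent instance rather than to Value, here is a minimal Python sketch. The SourceSpan class and the source_text helper are hypothetical, and the scenario (a stored value that was normalized after capture) is an assumption chosen to show one way the two lengths could differ; it is not a description of how the product computes them.

```python
from dataclasses import dataclass


@dataclass
class SourceSpan:
    """Hypothetical holder for a value plus its source information."""
    value: str        # Value: assumed here to be normalized after capture
    index: int = 0    # Index: start offset within the parent instance
    length: int = 0   # Length: characters covered within the parent instance
    page_no: int = 0  # Page No: 1-based page number


def source_text(parent_text: str, span: SourceSpan) -> str:
    """Return the raw parent characters that the span covers."""
    return parent_text[span.index:span.index + span.length]


parent_text = "Amount due: $1,250.00 USD"
# Assumed scenario: "$1,250.00" was captured, but the stored value is "1250.00".
amount = SourceSpan(value="1250.00", index=12, length=9, page_no=1)

print(source_text(parent_text, amount))  # raw text covered in the parent
print(len(amount.value), amount.length)  # length of Value vs. Length property
```

Slicing the parent text with Index and Length recovers the original characters, while the length of Value is a separate quantity.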