A Data Type defines extraction logic for a distinct type of data, such as a field value or a table row. Each data type defines one or more extractors, along with settings which control how the extractor results are transformed into a final result set.
At runtime, a Data Type executes its extractors in a fixed order: the internal pattern and the direct children run first, followed by any referenced extractors.
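The runtime behaviour can be pictured with a small sketch. The Python below is illustrative only and is not Grooper's API: `Instance`, `run_pattern`, and `run_data_type` are hypothetical names, and plain regular expressions stand in for extractors. It assumes the internal pattern runs before the direct children (the documentation only states that referenced extractors run after both), and it models the Input Filter property by running every extractor once per instance the filter returns.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One extracted hit: the matched text and its character span in the source."""
    value: str
    start: int
    end: int


def run_pattern(pattern: str, text: str, offset: int = 0) -> list[Instance]:
    """Stand-in for a single extractor: each regex match becomes an Instance.

    `offset` shifts spans so they refer to positions in the full source.
    """
    return [
        Instance(m.group(), m.start() + offset, m.end() + offset)
        for m in re.finditer(pattern, text)
    ]


def run_data_type(source, internal_pattern=None, children=(), referenced=(),
                  input_filter=None):
    """Hypothetical model of a Data Type run (not the product's implementation)."""
    # Without an input filter, the extractors see the whole source as one scope.
    if input_filter is None:
        scopes = [Instance(source, 0, len(source))]
    else:
        scopes = run_pattern(input_filter, source)

    # Assumed order: internal pattern, then direct children, then referenced.
    ordered = ([internal_pattern] if internal_pattern else [])
    ordered += list(children) + list(referenced)

    results = []
    for scope in scopes:
        for pattern in ordered:
            results.extend(run_pattern(pattern, scope.value, scope.start))
    return results


source = "Invoice Date: 01/02/2024\nDue Date: 03/04/2024\nTotal: 99.50"
date = r"\d{2}/\d{2}/\d{4}"
due = run_data_type(source, internal_pattern=date, input_filter=r"Due Date: .*")
print([hit.value for hit in due])  # ['03/04/2024']
```

Limiting scope with the filter means the date pattern does not need to distinguish the invoice date from the due date, which is the simplification the Input Filter property is meant to provide.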
Inherits from: Grooper Node
The following 16 properties are defined.
Property Name | Description |
---|---|
General | |
Value Type | Type: Storage Type, Default: String. Defines the type of data this extractor will capture. Can be any one of the Storage Type values. |
Culture Filter | Type: List of Culture Data. Defines a list of cultures supported by this extractor. If the list is empty, the extractor executes against all documents. Otherwise, it executes only on documents which map to one of the specified cultures. |
Description | Type: String. Generic property allowing an administrator to document the purpose of this Grooper Node. |
Data Extraction | |
Pattern | Type: Data Pattern. Defines an internal Data Pattern which can be used in place of a child Data Format or Data Type. This property is useful for simple extractions where only one format needs to be defined. |
Referenced Extractors | Type: List of Grooper Node. Defines an optional list of external extractors to be executed. At runtime, referenced extractors execute after the internal pattern and the direct children have been executed. |
Input Filter | Type: Embedded Extractor. An optional extractor used to transform input prior to extraction. Input filters select a subset of the source content before the extractors run. In many cases, extraction logic can be simplified if scope is limited to a small portion of the document. When an input filter is specified, it is executed against the source, and the Data Type's extractors are then executed on each instance it returns. |
Exclusion Extractor | Type: Embedded Extractor. An optional extractor used to filter undesirable results from the result set. Any output instance which overlaps an exclusion instance is discarded. |
Subtraction Extractor | Type: Embedded Extractor. An optional extractor used to remove content from output values. If an extractor is specified, it is executed against each final output value, and any content which matches it is removed from that value. If the resulting output value is empty or contains only whitespace characters, the entire output value is discarded. The extractor specified here MUST match a contiguous sequence of characters within the text flow. As such, it cannot use any Collation Provider Methods which combine instances geometrically. |
Output | |
Collation | Type: Collation Provider. Defines how instances from individual extractors are transformed into the final output. Can be any one of the available Collation Provider values. |
Order By | Type: SortOrder, Default: Position. Controls the output order of the result set. Can be any one of the SortOrder values. |
Direction | Type: SortDirection, Default: Ascending. Controls the sort direction of the result set. Can be any one of the SortDirection values. |
Result Filter | Type: Result Filter. Specifies options for filtering output instances. |
Result Options | Type: Result Options. Specifies optional processing for each output instance. |
Post Processing | Type: Result Processor. Specifies an optional post-processing operation to be applied to each output instance. Can be any one of the available Result Processor values. |
Deduplication | |
Deduplicate Locations | Type: Boolean, Default: False. If True, instances with overlapping zones will be de-duplicated, with precedence given to larger data elements. |
Deduplicate Values | Type: Boolean, Default: False. If True, duplicate values will be eliminated, leaving only the first instance of the value. |
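
The result-set properties above (Exclusion Extractor, Subtraction Extractor, Deduplicate Locations, Deduplicate Values, Order By and Direction) can be approximated with a few small functions. This is a rough sketch under stated assumptions, not Grooper's implementation: all names are hypothetical, character spans stand in for zones, a regular expression stands in for the subtraction extractor, and ties between equally sized overlapping instances are resolved in favour of the earlier one. The order in which the product applies these steps is not stated here, so the functions are shown independently instead of being chained.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One output instance: its value and character span (a stand-in for a zone)."""
    value: str
    start: int
    end: int


def overlaps(a: Instance, b: Instance) -> bool:
    """True when the two half-open spans share at least one character."""
    return a.start < b.end and b.start < a.end


def exclude(results, exclusions):
    """Exclusion: discard any result that overlaps an exclusion instance."""
    return [r for r in results if not any(overlaps(r, x) for x in exclusions)]


def subtract(results, pattern):
    """Subtraction: strip matching text; drop values left empty or whitespace-only."""
    kept = []
    for r in results:
        cleaned = re.sub(pattern, "", r.value)
        if cleaned.strip():
            kept.append(Instance(cleaned, r.start, r.end))
    return kept


def dedupe_locations(results):
    """Deduplicate Locations: among overlapping instances, keep the larger one."""
    kept = []
    # sorted() is stable, so equally sized instances keep their original order.
    for r in sorted(results, key=lambda i: i.end - i.start, reverse=True):
        if not any(overlaps(r, k) for k in kept):
            kept.append(r)
    return kept


def dedupe_values(results):
    """Deduplicate Values: keep only the first instance of each value."""
    seen = set()
    kept = []
    for r in results:
        if r.value not in seen:
            seen.add(r.value)
            kept.append(r)
    return kept


def order_by_position(results, descending=False):
    """Order By = Position, with Direction ascending (default) or descending."""
    return sorted(results, key=lambda i: i.start, reverse=descending)
```

For example, given instances `"ACME Corp"` at span 0-9 and `"ACME"` at span 0-4, `dedupe_locations` keeps only `"ACME Corp"`, because the two spans overlap and the first is larger.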
Command Name | Shortcut Keys | Description |
---|---|---|
Add Multiple Items | | Creates multiple items as children of the selected object. |
Clear Children | | Deletes all children of the selected object(s). |
Export to Zip Archive | | Exports a set of Grooper Nodes to a ZIP archive. |
Publish to Grooper Repository | | Publishes one or more Nodes to one or more Target Grooper Repositories. |
Unpublish | | Unpublishes a set of Grooper Nodes from a Target Grooper Repository. |
Tab Name | Description |
---|---|
Data Type - General | Provides a user interface displaying the properties of a Data Type as well as an interface for testing the Data Type using test batch documents. |
Grooper Node - Scripting | Provides script viewing, compilation, management, and basic editing features. |
Grooper Node - Contents | Provides a user interface for viewing and managing the children of a Grooper Node. |
Grooper Node - Advanced | Displays detailed information about Grooper Node objects, and provides administrative functions for managing them. |
Collation Provider, Culture Data, Data Pattern, Embedded Extractor, Grooper Node, Result Filter, Result Options, Result Processor, Storage Type