Pattern-Based

Uses a regular expression to select a sequence of child extractor results.

Remarks

Child extractor results are referenced using @ExtractorName, where ExtractorName is the name of a child (or referenced) extractor. For example, consider the tabular data below, which represents information from a college transcript:

GE140 WORLD CIVILIZATION I 3.00 A 12.00
PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00
GE185 HEALTH CONCEPTS 2.00 C 4.00

Three extractors are created for the Data Type:

Course No - Matches 'GE140', 'PSY212', etc.
Decimal - Matches '3.00', '12.00', etc.
Letter Grade - Matches 'A', 'C', etc.

The following collation expression could then be used to select an entire line (in the @ references, spaces in the extractor names are written as underscores):

@Course_No .*? @Decimal @Letter_Grade @Decimal
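
As a rough illustration of how an expression like this resolves, the sketch below uses Python's re module to substitute each @ reference with a stand-in pattern and then matches the sample lines. The three child patterns and the expand_references helper are assumptions made for this example only. The provider works on child extractor results rather than on substituted text, so this approximates the behavior and is not the actual implementation.

```python
import re

# Assumed stand-in patterns for the three child extractors (hypothetical;
# the real extractors are configured on the Data Type).
CHILD_PATTERNS = {
    "Course_No": r"[A-Z]{2,3}\d{3}",
    "Decimal": r"\d+\.\d{2}",
    "Letter_Grade": r"[A-F]",
}

TRANSCRIPT = (
    "GE140 WORLD CIVILIZATION I 3.00 A 12.00\n"
    "PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00\n"
    "GE185 HEALTH CONCEPTS 2.00 C 4.00\n"
)


def expand_references(expression, children):
    """Replace each @Name reference with the named child's pattern."""
    return re.sub(
        r"@(\w+)",
        lambda m: "(?:" + children[m.group(1)] + ")",
        expression,
    )


expression = "@Course_No .*? @Decimal @Letter_Grade @Decimal"
pattern = re.compile(expand_references(expression, CHILD_PATTERNS))

# Collect the full text of every match; with this sample data each match
# covers one transcript line.
selected = [m.group(0) for m in pattern.finditer(TRANSCRIPT)]
for line in selected:
    print(line)
```

With the sample data, each of the three transcript lines is selected as one match.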

The expression can be further expanded to include group names, mapping values directly to table column names:

(?<Course_No>@Course_No)
(?<Description>[^\r]*?)
(?<Hours>@Decimal)
(?<Grade>@Letter_Grade)
(?<Points>@Decimal)
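
The mapping from group names to column values can be sketched in Python as follows. This is a minimal approximation under stated assumptions: the child patterns are hypothetical stand-ins, the five groups are assumed to be separated by single spaces, and the to_python_regex helper exists only for this example. Python writes named groups as (?P<name>...) rather than (?<name>...), so the helper converts the syntax before compiling.

```python
import re

# Assumed stand-in patterns for the child extractors (hypothetical).
CHILD_PATTERNS = {
    "Course_No": r"[A-Z]{2,3}\d{3}",
    "Decimal": r"\d+\.\d{2}",
    "Letter_Grade": r"[A-F]",
}

# The grouped expression from the text, joined with single spaces
# (an assumption about how the values are separated).
EXPRESSION = " ".join([
    r"(?<Course_No>@Course_No)",
    r"(?<Description>[^\r]*?)",
    r"(?<Hours>@Decimal)",
    r"(?<Grade>@Letter_Grade)",
    r"(?<Points>@Decimal)",
])


def to_python_regex(expression, children):
    """Expand @Name references, then convert (?<name>...) to (?P<name>...)."""
    expanded = re.sub(
        r"@(\w+)",
        lambda m: "(?:" + children[m.group(1)] + ")",
        expression,
    )
    # Lookbehinds such as (?<= and (?<! are left untouched.
    return re.sub(r"\(\?<(?![=!])", "(?P<", expanded)


pattern = re.compile(to_python_regex(EXPRESSION, CHILD_PATTERNS))

# [^\r] keeps the Description group from crossing a carriage return.
text = (
    "GE140 WORLD CIVILIZATION I 3.00 A 12.00\r\n"
    "PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00\r\n"
    "GE185 HEALTH CONCEPTS 2.00 C 4.00\r\n"
)

# One dictionary per matched line, keyed by group (column) name.
rows = [m.groupdict() for m in pattern.finditer(text)]
for row in rows:
    print(row)
```

Each dictionary in rows corresponds to one table row, with keys Course_No, Description, Hours, Grade, and Points.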

Inherits from: Collation Provider

Properties

The following 3 properties are defined.

Property Name	Description
General
Pattern	Type: String. Defines the regular expression to be used for selecting output instances. Type '@' in the regular expression editor to view a list of child extractor names.
Case Sensitive	Type: Boolean. Default: False. Indicates whether the regular expression should be evaluated in a case-sensitive manner.
Preprocessing Options	Type: Text Preprocessor. Specifies options for processing text prior to running the regular expression.
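
The effect of the Case Sensitive default can be illustrated with Python's re.IGNORECASE flag as an analogue. The course number pattern and the find_courses helper are assumptions for this sketch and are not part of the product.

```python
import re

# Assumed stand-in pattern for a course number (hypothetical).
COURSE_NO = r"[A-Z]{2,3}\d{3}"


def find_courses(text, case_sensitive=False):
    """Mirror the Case Sensitive property: False ignores letter case."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.findall(COURSE_NO, text, flags)


sample = "GE140 ge185 Psy212"
insensitive = find_courses(sample)
sensitive = find_courses(sample, case_sensitive=True)
print(insensitive)  # all three course numbers
print(sensitive)    # only the all-uppercase one
```

With the default of False, lowercase and mixed-case course numbers are matched as well; setting the option to True restricts the match to text that agrees with the pattern's letter case.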

Used By

Data Type
