Pattern-Based

Uses a regular expression to select a sequence of child extractor results.

Remarks

Child extractor results are referenced using @ExtractorName, where ExtractorName is the name of a child (or referenced) extractor. For example, consider the tabular data below, which represents information from a college transcript:

GE140 WORLD CIVILIZATION I 3.00 A 12.00
PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00
GE185 HEALTH CONCEPTS 2.00 C 4.00

Three extractors are created for the Data Type: Course_No, Decimal, and Letter_Grade.

The following collation expression could then be used to select the entire line:

@Course_No .*? @Decimal @Letter_Grade @Decimal
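As a rough illustration of how such an expression selects each line, the sketch below approximates the behavior in Python by substituting a regular expression for each @ExtractorName reference. The three child patterns, the `expand` helper, and the substitution approach itself are assumptions made for this example. The product matches against child extractor results rather than rewriting text, and the actual child patterns are not given here.

```python
import re

# Hypothetical child extractor patterns (assumed for illustration only).
CHILD_PATTERNS = {
    "Course_No": r"[A-Z]{2,4}\d{3}",
    "Decimal": r"\d+\.\d{2}",
    "Letter_Grade": r"[A-F]",
}


def expand(expression):
    """Replace each @Name reference with its child pattern in a non-capturing group."""
    def substitute(match):
        return "(?:" + CHILD_PATTERNS[match.group(1)] + ")"
    return re.sub(r"@(\w+)", substitute, expression)


TRANSCRIPT = (
    "GE140 WORLD CIVILIZATION I 3.00 A 12.00\n"
    "PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00\n"
    "GE185 HEALTH CONCEPTS 2.00 C 4.00\n"
)

pattern = re.compile(expand("@Course_No .*? @Decimal @Letter_Grade @Decimal"))

# Each match covers one full transcript line.
rows = [m.group(0) for m in pattern.finditer(TRANSCRIPT)]
for row in rows:
    print(row)
```

The lazy `.*?` absorbs the course description, since it is the only part of the line that no child extractor describes.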

The expression can be further expanded to include group names, mapping values directly to table column names:

(?<Course_No>@Course_No)
(?<Description>[^\r]*?)
(?<Hours>@Decimal)
(?<Grade>@Letter_Grade)
(?<Points>@Decimal)
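To show how named groups could map onto table columns, here is a minimal Python sketch. It makes several assumptions: the child patterns are hypothetical stand-ins for the extractors, named groups use Python's `(?P<name>...)` spelling instead of `(?<name>...)`, and the Description class also excludes `\n` because the sample text uses `\n` line endings.

```python
import re

# Hypothetical inline patterns stand in for @Course_No, @Decimal and @Letter_Grade.
ROW = re.compile(
    r"(?P<Course_No>[A-Z]{2,4}\d{3}) "
    r"(?P<Description>[^\r\n]*?) "
    r"(?P<Hours>\d+\.\d{2}) "
    r"(?P<Grade>[A-F]) "
    r"(?P<Points>\d+\.\d{2})"
)

TRANSCRIPT = (
    "GE140 WORLD CIVILIZATION I 3.00 A 12.00\n"
    "PSY212 GENERAL PSYCHOLOGY 3.00 A 12.00\n"
    "GE185 HEALTH CONCEPTS 2.00 C 4.00\n"
)

# One dictionary per matched line, keyed by group name (the column name).
table = [m.groupdict() for m in ROW.finditer(TRANSCRIPT)]
for record in table:
    print(record)
```

Each group name becomes a key in the resulting record, which mirrors the idea of mapping captured values directly to table columns.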

Inherits from: Collation Provider

Properties

The following 3 properties are defined.

General

Pattern
Type: String
Defines the regular expression to be used for selecting output instances. Type '@' in the regular expression editor to view a list of child extractor names.

Case Sensitive
Type: Boolean, Default: False
Indicates whether the regular expression should be evaluated in a case-sensitive manner.

Preprocessing Options
Type: Text Preprocessor
Specifies options for processing text prior to running the regular expression.
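The interaction of the three properties can be sketched in Python as follows. This is not the product's API: `select_instances`, its parameters, and the `normalize_newlines` preprocessing step are hypothetical names chosen for illustration. The only behavior taken from the descriptions above is that preprocessing runs before the regular expression and that matching is case-insensitive by default.

```python
import re
from typing import Callable, List, Optional


def select_instances(
    pattern: str,
    text: str,
    case_sensitive: bool = False,
    preprocess: Optional[Callable[[str], str]] = None,
) -> List[str]:
    """Hypothetical stand-in for Pattern, Case Sensitive and Preprocessing Options."""
    if preprocess is not None:
        text = preprocess(text)  # preprocessing happens before the regex runs
    flags = 0 if case_sensitive else re.IGNORECASE
    return [m.group(0) for m in re.finditer(pattern, text, flags)]


def normalize_newlines(text: str) -> str:
    """Example preprocessing step (assumed): convert CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


sample = "ge140 world civilization i 3.00 a 12.00\r\n"

# Default (Case Sensitive = False): the upper-case pattern matches lower-case text.
insensitive = select_instances(r"GE\d{3}", sample)

# Case Sensitive = True: the same pattern no longer matches.
sensitive = select_instances(r"GE\d{3}", sample, case_sensitive=True)

# A pattern that expects LF endings matches only after preprocessing.
raw = select_instances(r"12\.00\n", sample)
cleaned = select_instances(r"12\.00\n", sample, preprocess=normalize_newlines)
```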

See Also

Text Preprocessor

Used By

Data Type