Detect Language

Detects the dominant language of each Batch Page in the Batch and marks the page with the corresponding language code.

Inherits from: Unattended Activity

Properties

The following 7 properties are defined.

Property Name Description
General
Vocabulary (Type: Embedded Lexicon)

Specifies a multi-language vocabulary lexicon to be used for language detection.

Feature Extractor (Type: Embedded Extractor)

Defines an extractor to match features on the document.

Minimum Confidence (Type: Double, Default: 25%)

The minimum percentage of words which must match the detected language.
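As a rough illustration of how a word-match threshold like this could behave, the sketch below counts how many words on a page appear in each language's vocabulary and accepts the dominant language only if its share of words meets the minimum. The function name `detect_language`, the `lexicons` dictionary, and the scoring rule are assumptions made for illustration; they are not Grooper's actual implementation or API.

```python
from collections import Counter


def detect_language(words, lexicons, minimum_confidence=0.25):
    """Hypothetical sketch: return (language, confidence).

    language is None when no vocabulary matches or when the share of
    matching words falls below minimum_confidence (0.25 mirrors the
    documented 25% default).
    """
    if not words:
        return None, 0.0

    # Count, per language, how many page words appear in its vocabulary.
    counts = Counter()
    for word in words:
        token = word.lower()
        for language, vocabulary in lexicons.items():
            if token in vocabulary:
                counts[language] += 1

    if not counts:
        return None, 0.0

    # The dominant language is the one with the most matching words.
    language, hits = counts.most_common(1)[0]
    confidence = hits / len(words)
    if confidence < minimum_confidence:
        return None, confidence
    return language, confidence
```

With a vocabulary of `{"en": {"the", "cat", "sat"}}`, the words `["The", "cat", "sat", "xyz"]` would score 3 of 4 matches and be accepted, while a page with only 1 matching word in 5 would fall below the 25% default and be left unmarked in this sketch.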

Detect Locale (Type: Boolean, Default: False)

If enabled, the Regional Culture Data is detected along with the Language. If disabled, only the Language is detected.
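To make the difference concrete, the sketch below assumes codes follow the common language-region pattern (for example `en-US`). The source does not state the exact code format, so both the format and the helper names `split_culture_name` and `reported_code` are hypothetical.

```python
def split_culture_name(code):
    """Split a code such as 'en-US' into (language, region).

    region is None for a language-only code such as 'fr'.
    """
    language, _, region = code.partition("-")
    return language.lower(), (region.upper() or None)


def reported_code(code, detect_locale=False):
    """Hypothetical: return what would be recorded for a page.

    With detect_locale off, only the language part is kept; with it on,
    the regional part is kept when one is available.
    """
    language, region = split_culture_name(code)
    if detect_locale and region:
        return f"{language}-{region}"
    return language
```

Under these assumptions, `reported_code("en-US")` yields the language alone, and `reported_code("en-US", detect_locale=True)` keeps the region as well.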

Processing Options
Error Disposition (Type: IssueDisposition, Default: Flag, Log)

Determines what happens when an error occurs while processing an activity.

Maximum Consecutive Errors (Type: Int32, Default: 0)

The maximum number of consecutive errors, after which a critical stop will be raised. A critical stop will cause services to stop running.
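The consecutive-error rule can be sketched as a counter that resets on every success. Two points here are assumptions rather than documented behavior: that the default of 0 disables the limit, and that the stop is raised when the count reaches the maximum. The names `CriticalStop` and `ConsecutiveErrorTracker` are hypothetical.

```python
class CriticalStop(Exception):
    """Hypothetical signal that processing should halt."""


class ConsecutiveErrorTracker:
    def __init__(self, maximum_consecutive_errors=0):
        # Assumption: 0 means "no limit", matching the documented default.
        self.maximum = maximum_consecutive_errors
        self.consecutive = 0

    def record_success(self):
        # Any success breaks the streak of errors.
        self.consecutive = 0

    def record_error(self):
        self.consecutive += 1
        # Assumption: the stop fires once the streak reaches the maximum.
        if self.maximum > 0 and self.consecutive >= self.maximum:
            raise CriticalStop(
                f"{self.consecutive} consecutive errors reached the limit"
            )
```

In this sketch, a tracker with a limit of 2 tolerates error, success, error, but raises `CriticalStop` on two errors in a row.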

Concurrency Mode (Type: ConcurrencyMode, Default: Multiple)

Specifies the parallel processing mode for this activity. Can be one of the following values:

  • Multiple - Multiple instances can run concurrently.
  • PerMachine - Only a single instance can run per machine.
  • Single - Only a single instance can run per Grooper repository.

This value determines the type of Thread Pool on which the activity can be executed.

See Also

Embedded Extractor, Embedded Lexicon

Used By

Batch Folder - Apply Activity, Batch Process Step