Performs automated document classification using training and/or rules defined in a Content Model.
This activity attempts to assign a Document Type to each Batch Folder object in scope. The set of possible document types and the method of classification
are defined in a Content Model. To configure and use this activity, the following prerequisites must be met:
In cases where a document cannot be classified confidently, the Default Content Type specified on the Content Model will be assigned. If
the Content Model has no Classify Method assigned, then the Default Content Type will be assigned to all documents.
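The fallback behavior described above can be pictured with a short sketch. This is not Grooper code: the `ContentModel` class, the `classify` function, and the `keyword_rule` helper are hypothetical Python stand-ins, and the convention that a classify method returns `None` when it is not confident is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ContentModel:
    """Hypothetical stand-in for a Content Model (not the Grooper class)."""
    default_content_type: str
    # Assumed convention: returns a type name, or None when not confident.
    classify_method: Optional[Callable[[str], Optional[str]]] = None


def classify(document_text: str, model: ContentModel) -> str:
    """Return the assigned type name, falling back to the default type."""
    if model.classify_method is None:
        # No Classify Method assigned: every document gets the default type.
        return model.default_content_type
    result = model.classify_method(document_text)
    # No confident result: fall back to the default type.
    return result if result is not None else model.default_content_type


def keyword_rule(text: str) -> Optional[str]:
    """Toy rule for the example: recognizes invoices only."""
    return "Invoice" if "invoice" in text.lower() else None
```

With `ContentModel("Unknown", keyword_rule)`, a document containing the word "invoice" is assigned `Invoice`, and anything else is assigned `Unknown`. A model created without a classify method assigns `Unknown` to every document.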
Property Name |
Property Type |
Description |
ActivityStats |
Grooper.StatDictionary |
Dictionary of statistics for the batch processing activity. |
ClassifyLevel |
Grooper.Core.ContentType.ClassificationLevel |
The level within the Content Model where the set of allowed content types exists. In most cases, documents should be classified as document types. In some cases, however, it may be convenient to classify
documents into a category rather than as a specific document type. In such cases, a numeric level can be specified, which indicates a
specific number of levels below the Content Model Scope. Can be one of the following values:
- DocType: Classifies to the Document Type level. This setting is only valid when the task is running with Folder scope.
- Level1: Classifies to level 1 of the Content Model.
- Level2: Classifies to level 2 of the Content Model.
- Level3: Classifies to level 3 of the Content Model.
- Level4: Classifies to level 4 of the Content Model.
- Level5: Classifies to level 5 of the Content Model.
- Level6: Classifies to level 6 of the Content Model.
- Level7: Classifies to level 7 of the Content Model.
- Level8: Classifies to level 8 of the Content Model.
- PageType: Classifies to the Page Type level. This setting is only valid when the task is running with Page scope.
- FormType
|
ConcurrencyMode |
Grooper.ConcurrencyModeAttribute.ConcurrencyMode |
Specifies the parallel processing mode for this activity. This value determines the type of Thread Pool on which the activity can be executed. Can be one of the following values:
- Multiple: Multiple instances can run concurrently.
- PerMachine: Only a single instance can run per machine.
- Single: Only a single instance can run per Grooper repository.
|
ContentModelScope |
Grooper.Core.ContentType |
The Content Model or Content Category containing the set of allowed Document Types. |
ErrorDisposition |
Grooper.Core.UnattendedActivity.IssueDisposition |
Determines what happens when an error occurs while processing an activity. A combination of the following flags:
- None: The issue will be ignored, and the item will complete successfully.
- Flag: The associated Batch Folder or Batch Page will be flagged.
- Log: The issue will be logged to the Grooper log. The log can be viewed from the Grooper Root node under the Batch Event Viewer tab.
- Stop: The Batch will stop processing, be set to an error state, and all pending tasks will be deleted.
|
HasReferenceProperties |
System.Boolean |
Returns true if the object has properties which reference Grooper Node objects. |
IsEmpty |
System.Boolean |
Returns true if all properties with a ViewableAttribute are set to their default value. |
IsWriteable |
System.Boolean |
Returns true if the object is writable, or false if it is not. |
MaximumConsecutiveErrors |
System.Int32 |
The maximum number of consecutive errors, after which a critical stop will be raised. A critical stop will cause services to stop running. |
ModelRefreshInterval |
System.Int32 |
The interval (in seconds) at which content model information will be refreshed from the repository. Controls how frequently a service running this activity will check for changes to the content model, such as new document types or additional training.
A value of 0 will disable automatic refresh, and newly-created document types will not be recognized until services are restarted. |
OutputLevel |
Grooper.Core.ContentType.ClassificationLevel |
Sets the level at which classification will be output. Can be one of the following values:
- DocType: Classifies to the Document Type level. This setting is only valid when the task is running with Folder scope.
- Level1: Classifies to level 1 of the Content Model.
- Level2: Classifies to level 2 of the Content Model.
- Level3: Classifies to level 3 of the Content Model.
- Level4: Classifies to level 4 of the Content Model.
- Level5: Classifies to level 5 of the Content Model.
- Level6: Classifies to level 6 of the Content Model.
- Level7: Classifies to level 7 of the Content Model.
- Level8: Classifies to level 8 of the Content Model.
- PageType: Classifies to the Page Type level. This setting is only valid when the task is running with Page scope.
- FormType
|
Root |
Grooper.GrooperRoot |
Returns the root node |
StatNames |
System.Collections.Generic.IEnumerable(Of T) |
Returns all possible statistic names which could be logged for the Activity. Derived classes should override this method to return all stat names which
will be used in calls to AddCustomStatValue(). |
SupressCandidateList |
System.Boolean |
If set to true, disables saving of the candidate list. By default, the classification process saves a list of potential classification candidates for each document. This
list is only useful if a review step using the Classification Viewer control is included in the Batch Process. |
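The ErrorDisposition and MaximumConsecutiveErrors properties above can be illustrated with a small sketch. It is a Python model of the documented behavior, not the Grooper implementation: the numeric flag values, the `describe` helper, and the `ConsecutiveErrorCounter` class are hypothetical, and the rule that a critical stop is due once the count reaches the maximum is an assumption about how the limit is applied.

```python
from enum import Flag
from typing import List


class IssueDisposition(Flag):
    """Illustrative mirror of the documented flags; numeric values are assumed."""
    NONE = 0
    FLAG = 1
    LOG = 2
    STOP = 4


def describe(disposition: IssueDisposition) -> List[str]:
    """List the actions a combined disposition would trigger."""
    actions = []
    if IssueDisposition.FLAG in disposition:
        actions.append("flag the Batch Folder or Batch Page")
    if IssueDisposition.LOG in disposition:
        actions.append("write to the Grooper log")
    if IssueDisposition.STOP in disposition:
        actions.append("stop the Batch")
    # An empty combination corresponds to None: the issue is ignored.
    return actions or ["ignore the issue"]


class ConsecutiveErrorCounter:
    """Tracks consecutive failures against a MaximumConsecutiveErrors-style limit."""

    def __init__(self, maximum: int) -> None:
        self.maximum = maximum
        self.count = 0

    def record(self, succeeded: bool) -> bool:
        """Record one task result; return True when a critical stop is due."""
        # A success resets the streak; a failure extends it.
        self.count = 0 if succeeded else self.count + 1
        return self.count >= self.maximum
```

Combining flags with `IssueDisposition.FLAG | IssueDisposition.LOG` yields both the flagging and the logging actions. The counter resets on any success, so only an unbroken run of failures can reach the limit.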
Method Name |
Description |
AddDiagImage(Name As String, Image As GrooperImage, Annotations As IEnumerable(Of Annotation)) |
Parameters |
Name |
Type: String |
|
|
Image |
Type: GrooperImage |
|
|
Annotations |
Type: IEnumerable(Of Annotation) |
|
|
EnableDiagMode() |
|
GetProperties() As PropertyDescriptorCollection |
|
GetReferences() As List(Of GrooperNode) |
Returns a list of GrooperNode objects referenced in the properties of this object. |
InsertDiagImage(Index As Int32, Name As String, Image As GrooperImage, Annotations As IEnumerable(Of Annotation)) |
Parameters |
Index |
Type: Int32 |
|
|
Name |
Type: String |
|
|
Image |
Type: GrooperImage |
|
|
Annotations |
Type: IEnumerable(Of Annotation) |
|
|
IsPropertyEnabled(PropertyName As String) As Nullable(Of Boolean) |
Defines whether a property is currently enabled.
Parameters |
PropertyName |
Type: String |
The name of the property to determine the enabled state for. |
|
IsPropertyVisible(PropertyName As String) As Nullable(Of Boolean) |
Defines whether a property is currently visible.
Parameters |
PropertyName |
Type: String |
The name of the property to determine the visible state for. |
|
IsType(Type As Type) As Boolean |
Returns true if the object is of the type specified, or if it derives from the type specified.
Parameters |
Type |
Type: Type |
The type to check. |
|
LogStatValue(Name As String, Value As Double) |
Adds a custom stat value to the Batch Processing Activity statistics.
Parameters |
Name |
Type: String |
|
|
Value |
Type: Double |
|
|
ProcessTask(CurNode As BatchObject) |
Mandatory override to implement processing logic.
Parameters |
CurNode |
Type: BatchObject |
The current batch object being processed. |
|
Serialize() As String |
Serializes the object. |
SetDatabase(Database As GrooperDb) |
Sets the database connection of the object.
Parameters |
Database |
Type: GrooperDb |
|
|
ToString() As String |
Returns the display name for this activity type. |
ValidateProperties() As ValidationErrorList |
Validates the properties of the object, returning a list of validation errors. |
Verify() |
|
WriteLogEntry(Message As String, pa() As Object()) |
Adds an entry to the Diagnostic Info Log.
Parameters |
Message |
Type: String |
|
|
pa |
Type: Object() |
|
|
WriteLogEntry(TabLevel As Int32, Message As String, pa() As Object()) |
Adds an entry to the Diagnostic Info Log.
Parameters |
TabLevel |
Type: Int32 |
Level to indent the message within the log. |
|
Message |
Type: String |
|
|
pa |
Type: Object() |
|
|
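As a rough picture of how the TabLevel parameter of WriteLogEntry might shape a log line, the Python sketch below builds an indented entry. Everything here is hypothetical: the `format_log_entry` function is not part of Grooper, the two-spaces-per-level indentation is an assumption, and treating `pa` as positional format arguments is a guess based only on the parameter's type.

```python
def format_log_entry(tab_level: int, message: str, *pa: object) -> str:
    """Build an indented log line.

    Assumed for illustration: two spaces per tab level, and `pa` supplies
    positional arguments for placeholders such as {0} in `message`.
    """
    if tab_level < 0:
        raise ValueError("tab_level must be non-negative")
    # Only apply formatting when arguments were supplied, so that a plain
    # message containing braces is passed through untouched.
    text = message.format(*pa) if pa else message
    return "  " * tab_level + text
```

For example, `format_log_entry(2, "Classified {0} as {1}", "doc.pdf", "Invoice")` produces the message indented by four spaces, while a tab level of 0 leaves the message unindented.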