Workflows
Under Review
This article is currently under review. Some content may be incomplete or inaccurate.
Workflows define the processing pipeline for a project. They determine which activities run and in what order.
Workflow Structure
A workflow is a sequence of activities:
{
  "activities": [
    { "type": "import", "enabled": true },
    { "type": "data_transform", "enabled": true },
    { "type": "split", "enabled": true },
    { "type": "classify", "enabled": true },
    { "type": "extract", "enabled": true }
  ]
}
Activity Types
| Type | Purpose | Input | Output |
|---|---|---|---|
| import | Ingest uploaded files | Raw files | Registered documents |
| data_transform | Convert to processable format | Documents | Page images + OCR |
| split | Detect document boundaries | Pages | Document groupings |
| classify | Categorize documents | Document text | Classification labels |
| extract | Pull structured data | Document text + schema | Field values |
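A workflow definition can be checked against the activity types in the table before it is saved. The following is a minimal sketch, not the product's own validation: the function name `validate_workflow` and the error wording are assumptions made for illustration.

```python
# Activity types taken from the table above.
KNOWN_ACTIVITY_TYPES = ("import", "data_transform", "split", "classify", "extract")


def validate_workflow(workflow: dict) -> list[str]:
    """Return a list of problems; an empty list means the definition looks well-formed.

    Hypothetical helper: the real product may validate differently.
    """
    activities = workflow.get("activities")
    if not isinstance(activities, list):
        return ["'activities' must be a list"]

    problems = []
    for index, activity in enumerate(activities):
        if not isinstance(activity, dict):
            problems.append(f"activity {index}: must be an object")
            continue
        if activity.get("type") not in KNOWN_ACTIVITY_TYPES:
            problems.append(f"activity {index}: unknown type {activity.get('type')!r}")
        if not isinstance(activity.get("enabled"), bool):
            problems.append(f"activity {index}: 'enabled' must be true or false")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a definition at once.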
Configuring Workflows
Enabling/Disabling Activities
Toggle any activity on or off in the project settings. For example, if your documents are already single-page and pre-classified, disable split and classify:
{
  "activities": [
    { "type": "import", "enabled": true },
    { "type": "data_transform", "enabled": true },
    { "type": "split", "enabled": false },
    { "type": "classify", "enabled": false },
    { "type": "extract", "enabled": true }
  ]
}
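To see which activities a definition like the one above would actually run, the enabled entries can be filtered in order. This is a small sketch using only the Python standard library; the helper name `enabled_activity_types` is hypothetical, and treating a missing `enabled` key as disabled is an assumption.

```python
import json

# The workflow definition from the example above, as a JSON string.
WORKFLOW_JSON = """
{
  "activities": [
    { "type": "import", "enabled": true },
    { "type": "data_transform", "enabled": true },
    { "type": "split", "enabled": false },
    { "type": "classify", "enabled": false },
    { "type": "extract", "enabled": true }
  ]
}
"""


def enabled_activity_types(workflow: dict) -> list[str]:
    """Return the types of the enabled activities, preserving their order."""
    # Assumption: an activity without an "enabled" key is treated as disabled.
    return [a["type"] for a in workflow["activities"] if a.get("enabled", False)]


workflow = json.loads(WORKFLOW_JSON)
print(enabled_activity_types(workflow))  # ['import', 'data_transform', 'extract']
```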
Activity Configuration
Each activity can be configured independently:
Classification Config
| Setting | Description | Default |
|---|---|---|
| model | Azure OpenAI model to use | gpt-4.1-mini |
| temperature | Sampling temperature (0-1); lower values are more deterministic | 0.0 |
| confidence_threshold | Minimum confidence score | 0.7 |
Extraction Config
| Setting | Description | Default |
|---|---|---|
| model | Azure OpenAI model to use | gpt-4.1-mini |
| temperature | Sampling temperature (0-1); lower values are more deterministic | 0.0 |
| include_coordinates | Return bounding boxes | true |
| confidence_threshold | Minimum confidence score | 0.7 |
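The settings and defaults in the two tables above can be modelled as typed configuration objects. This is a sketch under stated assumptions: the class names are hypothetical, and the 0 to 1 range check on `confidence_threshold` is an assumption inferred from its default of 0.7 (only the temperature range is given in the tables).

```python
from dataclasses import dataclass


def _check_unit_interval(name: str, value: float) -> None:
    """Raise ValueError unless 0 <= value <= 1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")


@dataclass(frozen=True)
class ModelActivityConfig:
    """Settings shared by both tables (hypothetical base class)."""
    model: str = "gpt-4.1-mini"
    temperature: float = 0.0
    confidence_threshold: float = 0.7

    def __post_init__(self) -> None:
        _check_unit_interval("temperature", self.temperature)
        # Assumption: the confidence threshold is also expressed on a 0-1 scale.
        _check_unit_interval("confidence_threshold", self.confidence_threshold)


@dataclass(frozen=True)
class ClassificationConfig(ModelActivityConfig):
    """Classification settings; no extra fields beyond the shared ones."""


@dataclass(frozen=True)
class ExtractionConfig(ModelActivityConfig):
    """Extraction settings; adds the bounding-box toggle."""
    include_coordinates: bool = True
```

Constructing `ExtractionConfig()` with no arguments yields the documented defaults, while `ClassificationConfig(temperature=1.5)` raises a `ValueError`.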
Settings Inheritance
Configuration follows an inheritance chain, with each level overriding the one before it:
Project Defaults → Workflow Activity Config → Runtime Overrides (per transaction)
This lets you set sensible defaults at the project level while allowing per-transaction customization when needed.
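The inheritance chain can be illustrated as a layered dictionary merge. This is a minimal sketch, assuming that each later level replaces only the keys it sets and leaves the rest inherited; the function name `resolve_settings` and the sample values are hypothetical.

```python
def resolve_settings(
    project_defaults: dict,
    activity_config: dict,
    runtime_overrides: dict,
) -> dict:
    """Merge the three levels; later levels win on conflicting keys.

    Assumption: overrides are shallow (per key), not deep-merged.
    """
    return {**project_defaults, **activity_config, **runtime_overrides}


project_defaults = {"model": "gpt-4.1-mini", "temperature": 0.0, "confidence_threshold": 0.7}
activity_config = {"confidence_threshold": 0.8}   # set on the workflow activity
runtime_overrides = {"temperature": 0.2}          # supplied for one transaction

print(resolve_settings(project_defaults, activity_config, runtime_overrides))
# {'model': 'gpt-4.1-mini', 'temperature': 0.2, 'confidence_threshold': 0.8}
```

Here `model` is inherited from the project, `confidence_threshold` comes from the activity config, and `temperature` comes from the per-transaction override.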
Workflow Execution
The workflow engine:
- Picks up transactions from the Redis queue
- Runs each enabled activity in sequence
- Tracks progress and status per activity
- Handles retries for transient failures
- Reports completion or failure
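The steps above can be sketched as a simple loop. This is an illustration only, not the engine's actual code: a `collections.deque` stands in for the Redis queue, and `TransientError`, the `handlers` mapping, and the retry limit of three attempts are all assumptions.

```python
from collections import deque


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def run_activity_with_retries(handler, transaction, max_attempts=3):
    """Call handler, retrying on TransientError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(transaction)
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure


def run_workflow(transaction, workflow, handlers, max_attempts=3):
    """Run each enabled activity in order and track a status per activity."""
    enabled = [a["type"] for a in workflow["activities"] if a["enabled"]]
    statuses = {name: "pending" for name in enabled}
    for name in enabled:
        statuses[name] = "running"
        try:
            run_activity_with_retries(handlers[name], transaction, max_attempts)
        except Exception as exc:
            statuses[name] = "failed"
            return {"status": "failed", "activities": statuses, "error": str(exc)}
        statuses[name] = "completed"
    return {"status": "completed", "activities": statuses, "error": None}


def drain_queue(queue: deque, workflow, handlers):
    """Process queued transactions one at a time and collect their reports."""
    results = []
    while queue:
        transaction = queue.popleft()  # stand-in for reading from the Redis queue
        results.append(run_workflow(transaction, workflow, handlers))
    return results
```

In this sketch a workflow stops at the first activity that fails after its retries, so later activities remain `pending`; whether the real engine behaves this way is not stated in this article.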
Monitoring
View workflow progress in the transaction detail view. Each activity shows:
- Status: Pending, running, completed, or failed
- Duration: How long the activity took
- Errors: Detailed error messages if something went wrong
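The three fields shown per activity can be captured in a small record. This is a hedged sketch of how such a record might be populated; `ActivityRecord` and `run_and_record` are hypothetical names, not part of the product.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ActivityRecord:
    """Per-activity monitoring data: status, duration, and error (if any)."""
    activity_type: str
    status: str = "pending"          # pending, running, completed, or failed
    duration_seconds: Optional[float] = None
    error: Optional[str] = None


def run_and_record(activity_type: str, handler: Callable[[], None]) -> ActivityRecord:
    """Run handler and return a record of how it went."""
    record = ActivityRecord(activity_type, status="running")
    started = time.perf_counter()
    try:
        handler()
        record.status = "completed"
    except Exception as exc:
        record.status = "failed"
        # Keep the exception type so the message is easier to diagnose.
        record.error = f"{type(exc).__name__}: {exc}"
    finally:
        record.duration_seconds = time.perf_counter() - started
    return record
```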