Webhooks: Get Results Without Polling

Webhooks let your application be notified the moment a transaction's results are ready, instead of polling GET /transactions/{id} in a loop. When the workflow finishes (or fails), DocAI Fabric POSTs a JSON payload to a URL you configure, including direct download links to the export files.

Use a webhook when:

  • You're processing many transactions and don't want to poll each one.
  • Latency matters: you want results pushed as soon as they exist.
  • Your receiver is reachable from the public internet (webhooks cannot be delivered to private hosts; see Security).

If you can't accept inbound HTTP requests, stick with polling.


How It Works

The webhook is delivered by a Notification activity placed in your project's workflow. Add it once in the workflow designer; every transaction that reaches the node fires a POST to your endpoint.

Import → Convert → OCR → Classify → Extract → Export → Notification ▶ POST https://your-app.example.com/hooks/docai

  • New projects created in DocAI Fabric ship with a Notification activity already wired in at the end of the workflow; you only need to fill in the webhook URL to activate it.
  • Place the Notification activity after Export so the payload's export list is non-empty.
  • A single workflow can have multiple Notification nodes (e.g. one after classification for an early "ready for review" ping, another at the end for "results ready").

Two trigger events

| Event | Fires when | Payload |
| --- | --- | --- |
| reached (success) | Workflow execution reaches the Notification node | Includes the export manifest with download URLs |
| failed | The workflow aborts before reaching the node | Includes an error field; exports is empty |

Both are independently toggleable via the activity's events setting. By default, both are enabled.


Configuring the Notification Activity

Open your project in Project Settings → Workflow, click the Notification node (or drag one in from the activity palette and place it after Export), and fill in the configuration:

| Field | Default | Purpose |
| --- | --- | --- |
| webhook_url | (required) | Endpoint that receives the POST. Must resolve to a public host. |
| events | ["reached", "failed"] | Which triggers fire this webhook. |
| event_name | null | Overrides the event field on the success payload. Defaults to transaction.step_reached. |
| http_method | POST | POST or PUT. |
| custom_headers | {} | Extra HTTP headers (e.g. Authorization: Bearer ... for your receiver). |
| hmac_secret | null | If set, the body is signed with HMAC-SHA256 (see Verifying the Signature). |
| timeout_seconds | 15 | Per-attempt request timeout. |
| retry_max_attempts | 3 | Total delivery attempts (1 = no retry). |
| retry_backoff_seconds | 2.0 | Base delay between retries; doubles each attempt. |
| api_base_url | null | Base URL used to build absolute download links. Defaults to the deployment's public URL. |
| fail_workflow_on_error | false | If true, a delivery failure fails the workflow step (so the transaction shows failed). |
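
The retry settings combine into a simple schedule. The sketch below assumes that the delay before the first retry is retry_backoff_seconds and that it doubles for every later retry, which is one reading of "doubles each attempt"; the function names retry_delays and worst_case_seconds are illustrative and not part of DocAI Fabric.

```python
def retry_delays(retry_max_attempts: int = 3,
                 retry_backoff_seconds: float = 2.0) -> list[float]:
    """Delays (in seconds) waited before each retry, assuming pure doubling.

    retry_max_attempts counts total attempts, so there are at most
    retry_max_attempts - 1 retries and therefore that many delays.
    """
    retries = max(retry_max_attempts - 1, 0)
    return [retry_backoff_seconds * (2 ** i) for i in range(retries)]


def worst_case_seconds(retry_max_attempts: int = 3,
                       retry_backoff_seconds: float = 2.0,
                       timeout_seconds: float = 15.0) -> float:
    """Upper bound on time spent on one delivery if every attempt times out."""
    waits = sum(retry_delays(retry_max_attempts, retry_backoff_seconds))
    return retry_max_attempts * timeout_seconds + waits


print(retry_delays())        # [2.0, 4.0]
print(worst_case_seconds())  # 51.0
```

Under these assumptions, the defaults give up after roughly 51 seconds if every attempt times out, which is useful when sizing how long your receiver can be unavailable.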

Testing your receiver

While developing, point webhook_url at a service like webhook.site or requestbin.com to inspect payloads. Switch to your own endpoint once the schema is wired up.


Webhook Payload

The receiver gets a JSON POST with this shape:

Success (transaction.step_reached)

{
  "event": "transaction.step_reached",
  "tenant_id": "your-tenant-id",
  "project_id": "your-project-id",
  "transaction_id": "550e8400-e29b-41d4-a716-446655440000",
  "transaction_name": "Invoice batch - 2026-05-19",
  "occurred_at": "2026-05-19T12:00:00+00:00",
  "exports": {
    "manifest_url": "https://app.docaifabric.com/transactions/550e.../exports",
    "total_files": 2,
    "total_size_bytes": 51200,
    "files": [
      {
        "profile_id": "default-json",
        "profile_name": "Transaction JSON",
        "output_format": "json",
        "filename": "550e..._data.json",
        "size_bytes": 4821,
        "download_url": "https://app.docaifabric.com/transactions/550e.../exports/download/550e..._data.json"
      },
      {
        "profile_id": "invoice-pdf",
        "profile_name": "Invoice PDF",
        "output_format": "pdf",
        "filename": "INV-12345.pdf",
        "size_bytes": 46379,
        "download_url": "https://app.docaifabric.com/transactions/550e.../exports/download/INV-12345.pdf"
      }
    ]
  }
}

Failure (transaction.failed)

{
  "event": "transaction.failed",
  "tenant_id": "your-tenant-id",
  "project_id": "your-project-id",
  "transaction_id": "550e8400-e29b-41d4-a716-446655440000",
  "transaction_name": "Invoice batch - 2026-05-19",
  "occurred_at": "2026-05-19T12:00:03+00:00",
  "exports": {},
  "error": "Extraction step failed: model returned invalid JSON"
}

Field reference

| Field | Description |
| --- | --- |
| event | transaction.step_reached (or your custom event_name) for success, transaction.failed on failure. Use this, not status, to branch. |
| tenant_id, project_id, transaction_id | Identifiers for the transaction. Use transaction_id to correlate with your original submission (or with the correlation_id you set when creating it). |
| transaction_name | Display name from the UI; may be null if you didn't set one. |
| occurred_at | ISO-8601 UTC timestamp when the notification was built. |
| exports.manifest_url | API URL to fetch the full export manifest (same as GET /transactions/{id}/exports). |
| exports.files[] | One entry per generated export file (JSON, PDF, XLSX, CSV, XML). Each has download_url pointing at the file. |
| error | Present only on transaction.failed. Short human-readable description of what failed. |
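
If you prefer typed access over raw dictionaries, the fields above can be mapped onto small dataclasses. This is a minimal sketch: the class and function names (ExportFile, Notification, parse_notification) are illustrative, and it reads only the fields documented in the table. Branching on the failure event rather than the success event keeps the check valid even when a custom event_name is configured.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ExportFile:
    filename: str
    output_format: str
    download_url: str
    size_bytes: int = 0


@dataclass
class Notification:
    event: str
    transaction_id: str
    transaction_name: Optional[str]
    error: Optional[str]
    files: list[ExportFile] = field(default_factory=list)

    @property
    def failed(self) -> bool:
        # Success events may carry a custom name; failure is always this value.
        return self.event == "transaction.failed"


def parse_notification(payload: dict[str, Any]) -> Notification:
    """Build a Notification from the decoded webhook JSON."""
    exports = payload.get("exports") or {}  # {} on failure payloads
    files = [
        ExportFile(
            filename=f["filename"],
            output_format=f["output_format"],
            download_url=f["download_url"],
            size_bytes=f.get("size_bytes", 0),
        )
        for f in exports.get("files", [])
    ]
    return Notification(
        event=payload["event"],
        transaction_id=payload["transaction_id"],
        transaction_name=payload.get("transaction_name"),
        error=payload.get("error"),
        files=files,
    )
```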

Downloading the Results

The download_url and manifest_url values point at the API-key-authenticated endpoints; the webhook payload itself does not carry credentials. Your receiver must already hold a valid project API key and send it as X-API-Key when downloading.

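import logging

import requests

log = logging.getLogger(__name__)
API_KEY = "your-api-key"  # project API key; load it from configuration in practice

def handle_webhook(payload):
    if payload["event"] == "transaction.failed":
        log.error("Transaction %s failed: %s",
                  payload["transaction_id"], payload.get("error"))
        return

    # exports may have no files if the node sits before Export.
    for file in payload["exports"].get("files", []):
        if file["output_format"] != "json":
            continue
        # The payload carries no credentials, so authenticate with the API key.
        resp = requests.get(
            file["download_url"],
            headers={"X-API-Key": API_KEY},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        for doc in data["documents"]:
            for field_id, field in doc["fields"].items():
                print(f"{field['name']}: {field['value']}")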

The downloaded JSON has the same schema as the polling-based flow (see JSON Results Schema).


Verifying the Signature

When hmac_secret is set on the Notification activity, every delivery includes:

X-DocAI-Signature: sha256=<hex digest>

The digest is computed over the exact raw request body with HMAC-SHA256 using your secret. Verify it before trusting the payload: anyone who knows the URL could otherwise forge requests.

The body is serialized compactly (separators=(",", ":")) with sorted keys, so the same bytes are signed and delivered.
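
To exercise your receiver locally, you can build a signed request yourself. The sketch below follows the serialization described above (compact separators, sorted keys); the helper name sign_body is illustrative, and details such as Python's default ASCII escaping of non-ASCII characters are assumptions that may not match the server byte for byte. That does not matter for a local test, because the receiver only checks the bytes it actually receives.

```python
import hashlib
import hmac
import json


def sign_body(payload: dict, secret: bytes) -> tuple[bytes, str]:
    """Serialize a payload compactly with sorted keys and sign the bytes.

    Returns the raw body and the value for the X-DocAI-Signature header.
    """
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, "sha256=" + digest


body, header = sign_body(
    {"event": "transaction.failed", "transaction_id": "test-1", "exports": {}},
    b"your-shared-secret",
)
print(body)    # b'{"event":"transaction.failed","exports":{},"transaction_id":"test-1"}'
print(header)  # sha256=<64 hex characters>

# Send `body` unchanged as the request body (for example with
# requests.post(url, data=body, headers={"X-DocAI-Signature": header})).
```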

Python (Flask)

import hashlib
import hmac

from flask import Flask, abort, request

app = Flask(__name__)
SECRET = b"your-shared-secret"

@app.post("/hooks/docai")
def receive():
    raw = request.get_data()  # important: raw bytes, before any JSON parsing
    sent = request.headers.get("X-DocAI-Signature", "")
    expected = "sha256=" + hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sent):
        abort(401)

    payload = request.get_json()
    # ... handle payload
    return "", 200

Node.js (Express)

const crypto = require("crypto");
const express = require("express");

const app = express();
const SECRET = "your-shared-secret";

// Capture the raw body so the signature can be verified
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; }
}));

app.post("/hooks/docai", (req, res) => {
  const sent = Buffer.from(req.header("X-DocAI-Signature") || "");
  const expected = Buffer.from("sha256=" + crypto
    .createHmac("sha256", SECRET)
    .update(req.rawBody || "")
    .digest("hex"));
  // timingSafeEqual throws on buffers of different length, so check that first.
  if (sent.length !== expected.length || !crypto.timingSafeEqual(expected, sent)) {
    return res.status(401).end();
  }
  // ... handle req.body
  res.status(200).end();
});

Use the raw body

Re-serializing req.body with your own JSON encoder will produce different bytes and the signature check will fail. Always sign-check against the bytes that arrived on the wire.
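
The effect is easy to reproduce. In this small demonstration, the secret and the sample body are placeholders: parsing a compact body and encoding it again with Python's default json.dumps settings inserts spaces after separators, so the signature no longer matches.

```python
import hashlib
import hmac
import json

SECRET = b"test-secret"  # placeholder secret for the demonstration


def signature(body: bytes) -> str:
    return "sha256=" + hmac.new(SECRET, body, hashlib.sha256).hexdigest()


# Bytes as they would arrive on the wire (compact, sorted keys).
wire_body = b'{"event":"transaction.failed","exports":{}}'

# Parsing and re-encoding with default settings changes the bytes.
reserialized = json.dumps(json.loads(wire_body)).encode("utf-8")

print(reserialized)                                     # b'{"event": "transaction.failed", "exports": {}}'
print(signature(wire_body) == signature(reserialized))  # False
```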


Delivery Semantics

  • DocAI Fabric expects a 2xx response within timeout_seconds. Anything else is treated as a failed attempt.
  • 5xx and 429 responses are retried up to retry_max_attempts times with exponential backoff (retry_backoff_seconds, doubling each attempt).
  • Other 4xx responses are terminal: they won't change on retry, so delivery gives up immediately.
  • A failed delivery is logged on the server but does not affect the transaction unless fail_workflow_on_error is enabled.
  • The same transaction may fire the webhook more than once in some failure modes (e.g. a Notification node placed mid-workflow before a step that later fails will fire both reached and failed). Make your receiver idempotent: key off transaction_id + event.

A receiver that follows these rules typically works like this:

  1. Verify the HMAC signature.
  2. Read transaction_id and event.
  3. Look up whether you've already processed this (transaction_id, event) pair; if so, return 200 immediately.
  4. Otherwise, download the export files using your API key, persist the results, then return 200.
  5. If anything fails, return a 5xx so DocAI Fabric retries.
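
The numbered steps above can be sketched as a framework-independent function that returns the HTTP status your endpoint should send. This is a minimal sketch, not a reference implementation: handle_delivery and the in-memory processed set are illustrative, and a real receiver would keep the processed keys in durable storage and pass its own download-and-persist logic as the process callable.

```python
import hashlib
import hmac
import json
from typing import Callable

# (transaction_id, event) pairs already handled; use a database in production.
processed: set[tuple[str, str]] = set()


def handle_delivery(raw: bytes, signature: str, secret: bytes,
                    process: Callable[[dict], None]) -> int:
    """Return the HTTP status code to send back for one webhook delivery."""
    # 1. Verify the HMAC signature over the raw bytes.
    expected = "sha256=" + hmac.new(secret, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return 401

    # 2. Read transaction_id and event.
    payload = json.loads(raw)
    key = (payload["transaction_id"], payload["event"])

    # 3. Already processed: acknowledge without doing the work again.
    if key in processed:
        return 200

    # 4. Download and persist (delegated to the caller-supplied function).
    try:
        process(payload)
    except Exception:
        # 5. Signal a retryable failure so the delivery is attempted again.
        return 500

    processed.add(key)
    return 200
```

Recording the key only after process succeeds means a failed attempt is retried rather than silently skipped.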

Security

Public host requirement (SSRF guard)

Webhook URLs are validated before every delivery. DocAI Fabric rejects:

  • Non-http/https schemes.
  • Hosts that resolve to a private, loopback, link-local, multicast, reserved, or unspecified address.

This blocks accidental (or malicious) requests to internal infrastructure and cloud metadata endpoints (e.g. 169.254.169.254). If you're running on a private network, expose a public ingress (a tunnel like ngrok during development, or a load balancer in production).
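
To check in advance whether a URL is likely to pass this validation, you can approximate the rules with Python's standard library. This is a sketch of the categories listed above, not DocAI Fabric's actual validator; the function names are illustrative, and the real check may differ in detail (for example in how it treats redirects or IPv4-mapped IPv6 addresses).

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip: str) -> bool:
    """True if the address is in none of the rejected categories."""
    addr = ipaddress.ip_address(ip.split("%")[0])  # drop any IPv6 zone id
    return not (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_multicast or addr.is_reserved or addr.is_unspecified)


def check_webhook_url(url: str) -> bool:
    """Approximate pre-check: http(s) scheme and only public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except (socket.gaierror, ValueError):
        return False  # unresolvable host or malformed port
    # Every resolved address must be public.
    return all(is_public_address(info[4][0]) for info in infos)


print(check_webhook_url("http://127.0.0.1/hooks/docai"))       # False (loopback)
print(check_webhook_url("http://169.254.169.254/latest/"))     # False (link-local)
print(check_webhook_url("ftp://example.com/hooks/docai"))      # False (scheme)
```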

Defense-in-depth checklist

  • Always set hmac_secret in production: it's the only way to be sure the request actually came from DocAI Fabric.
  • Use HTTPS for webhook_url so the payload (and signature) can't be observed on the wire.
  • Rotate the secret periodically; update both ends in the same change window.
  • Restrict by IP at your edge if your deployment has a fixed egress IP range.
  • Don't rely on the URL being secret: treat it as a public endpoint and use the signature as the only authentication.
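
One way to make secret rotation less fragile on the receiver side is to accept both the old and the new secret for a short overlap period, so deliveries signed just before the switch are not rejected. This is an assumption about how you might structure your own receiver, not a DocAI Fabric feature; verify_any is an illustrative name.

```python
import hashlib
import hmac


def verify_any(raw: bytes, signature: str, secrets: list[bytes]) -> bool:
    """True if the signature matches the body under any of the given secrets."""
    sent = signature.encode()
    ok = False
    for secret in secrets:
        expected = "sha256=" + hmac.new(secret, raw, hashlib.sha256).hexdigest()
        # Check every secret instead of returning early on the first match.
        if hmac.compare_digest(expected.encode(), sent):
            ok = True
    return ok
```

During the overlap, pass both secrets (for example [NEW_SECRET, OLD_SECRET]); once the activity's hmac_secret has been updated and older deliveries have drained, drop the old one.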

End-to-End Example

# 1. Submit a transaction normally (one-step API).
import base64

import requests

# Placeholders: replace with your deployment's values.
BASE_URL = "https://your-deployment.example.com"
TENANT = "your-tenant-id"
PROJECT = "your-project-id"
API_KEY = "your-api-key"

with open("invoice.pdf", "rb") as f:
    file_data = base64.b64encode(f.read()).decode()

resp = requests.post(
    f"{BASE_URL}/tenants/{TENANT}/projects/{PROJECT}/transactions/process",
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "source_files": [{"filename": "invoice.pdf", "base64_data": file_data}],
        "correlation_id": "order-9981",
    },
    timeout=30,
)
resp.raise_for_status()
# That's it: no polling. Your webhook receiver below will be called.

# 2. Webhook receiver: verifies the signature, downloads the JSON, prints fields.
import hashlib
import hmac

import requests
from flask import Flask, abort, request

app = Flask(__name__)
SECRET = b"your-shared-secret"
API_KEY = "your-api-key"

@app.post("/hooks/docai")
def receive():
    raw = request.get_data()  # raw bytes, before any JSON parsing
    expected = "sha256=" + hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request.headers.get("X-DocAI-Signature", "")):
        abort(401)

    p = request.get_json()
    if p["event"] == "transaction.failed":
        print(f"Transaction {p['transaction_id']} failed: {p.get('error')}")
        return "", 200

    json_files = [f for f in p["exports"].get("files", []) if f["output_format"] == "json"]
    for f in json_files:
        resp = requests.get(
            f["download_url"], headers={"X-API-Key": API_KEY}, timeout=30
        )
        # A failed download raises here, which Flask turns into a 5xx and a retry.
        resp.raise_for_status()
        data = resp.json()
        for doc in data["documents"]:
            print(f"\n{doc.get('document_type', 'Unknown')}:")
            for field_id, field in doc["fields"].items():
                print(f"  {field['name']}: {field['value']}")
    return "", 200

Polling vs. Webhooks: Quick Comparison

| | Polling | Webhooks |
| --- | --- | --- |
| Setup | None, just call GET /transactions/{id} | Add a Notification activity to the workflow |
| Receiver | Anything that can make HTTP calls | Must accept inbound HTTPS on a public host |
| Latency | Up to your poll interval (or ?timeout=30 long-poll) | Immediate when the workflow reaches the node |
| Cost | Wasted requests on transactions still in progress | One delivery per transaction |
| Failure handling | Your code retries the GET | DocAI Fabric retries delivery |
| Best for | Small volumes, scripts, dev/testing | Production, high-volume, async pipelines |

You can use both: webhooks for the happy path, polling as a fallback if your receiver was offline.
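
One way to combine the two is to remember when each transaction was submitted and fall back to polling only for those whose webhook has not arrived after a grace period. The sketch below covers just that bookkeeping; PendingTracker and its default grace period are hypothetical, it assumes you have the transaction ID at submission time, and the actual GET /transactions/{id} call is left to your existing polling code.

```python
import time
from typing import Optional


class PendingTracker:
    """Tracks submitted transactions that have not yet produced a webhook."""

    def __init__(self, grace_seconds: float = 300.0):
        self.grace_seconds = grace_seconds
        self._submitted: dict[str, float] = {}  # transaction_id -> submit time

    def submitted(self, transaction_id: str, now: Optional[float] = None) -> None:
        """Record a newly submitted transaction."""
        self._submitted[transaction_id] = time.time() if now is None else now

    def webhook_received(self, transaction_id: str) -> None:
        """Call from the webhook receiver; the transaction no longer needs polling."""
        self._submitted.pop(transaction_id, None)

    def overdue(self, now: Optional[float] = None) -> list[str]:
        """Transaction IDs whose webhook is late and should be polled instead."""
        now = time.time() if now is None else now
        return [tid for tid, started in self._submitted.items()
                if now - started >= self.grace_seconds]
```

A periodic job can then call overdue() and poll only those IDs, which keeps polling traffic limited to the cases where delivery actually failed.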

