Mobile Capture (Phone Upload)

Mobile Capture lets a person photograph a document with their phone and submit it straight into a project for processing, with no app to install and no login on the phone. You request a QR code from the API, show it to the user (in your own product, on screen or printed), and the user scans it to open a lightweight camera page. The captured pages arrive as a single transaction in your project and flow through the normal pipeline (conversion, OCR, classification, extraction).

This is built for integration into your own application. Your app authenticates the user, requests a QR on their behalf, and tracks the result. The phone never sees an API key or a login screen.

When to use this

Use Mobile Capture when the document is physical and the user has a phone in hand: capturing a delivery note at a warehouse, a receipt in the field, or an ID at a front desk. For server-to-server uploads of files you already hold, use Process Documents via API instead.

How it works

  1. Your backend requests a capture token. The response includes a QR image and a capture URL.
  2. Your app shows the QR to the user.
  3. The user scans it with their phone camera, which opens a neutral capture page.
  4. The user photographs one or more pages and taps Send.
  5. The pages are submitted as one transaction. You track it with the transaction ID returned in step 1.

Each QR is for a single transaction. To capture another document, request a new QR.

Prerequisites

  • An API key with access to your tenant (see Authentication).
  • The API key must carry the Submit transactions grant (transaction.create), the same permission needed to create any transaction.
  • Your tenant ID and project ID. You can find these in the UI: the tenant ID under your user icon, and the project ID in Library under Project Properties.

Step 1: Request a capture QR

Call the capture token endpoint with your API key. This pre-allocates the transaction and returns a ready-to-display QR.

POST /tenants/{tenant_id}/projects/{project_id}/capture-tokens

curl -X POST "https://app.docaifabric.com/tenants/{tenant_id}/projects/{project_id}/capture-tokens" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": "production",
    "page_limit": 100,
    "ttl_seconds": 900
  }'

Response:

{
  "token": "9f1c2e7a-4b8d-4a1e-9c2f-3d6a7b8c9d0e",
  "transaction_id": "550e8400-e29b-41d4-a716-446655440000",
  "capture_url": "https://app.docaifabric.com/m#9f1c2e7a-4b8d-4a1e-9c2f-3d6a7b8c9d0e",
  "qr_svg": "<svg xmlns=\"http://www.w3.org/2000/svg\" ...>...</svg>",
  "expires_at": "2026-06-17T12:30:00Z",
  "max_pages": 100
}

Request fields

| Field | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | No | Where the captured transaction lands: production (Work Queue) or playground. Default production. |
| reviewer_id | string | No | Pre-assign a reviewer to the captured transaction. The user must be able to review in this project. |
| page_limit | integer | No | Maximum pages to process from the captured file(s), starting at page 1. Default: all pages. |
| ttl_seconds | integer | No | Active capture window in seconds, measured from the first scan. Default 900 (15 minutes), maximum 3600. |
| label | string | No | Optional human label for your own reference. Not shown to the person capturing. |
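
The request fields above can be assembled in code as follows. This is a minimal sketch using only the Python standard library, not an official client: the helper name build_capture_token_request is hypothetical, the host is taken from the curl example, and the lower bound of 1 second for ttl_seconds is an assumption (only the 3600 maximum is documented).

```python
import json
import urllib.request

BASE_URL = "https://app.docaifabric.com"  # host taken from the curl example
MAX_TTL_SECONDS = 3600  # documented maximum for ttl_seconds


def build_capture_token_request(tenant_id, project_id, api_key, *,
                                dataset_id="production", ttl_seconds=900,
                                page_limit=None, reviewer_id=None, label=None):
    """Build (but do not send) the POST request for a capture token.

    Hypothetical helper; the lower ttl bound of 1 second is an assumption.
    """
    if dataset_id not in ("production", "playground"):
        raise ValueError("dataset_id must be 'production' or 'playground'")
    if not 0 < ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("ttl_seconds must be between 1 and 3600")

    body = {"dataset_id": dataset_id, "ttl_seconds": ttl_seconds}
    # Leave optional fields out when unset so the server applies its defaults.
    optional = (("page_limit", page_limit),
                ("reviewer_id", reviewer_id),
                ("label", label))
    for key, value in optional:
        if value is not None:
            body[key] = value

    url = f"{BASE_URL}/tenants/{tenant_id}/projects/{project_id}/capture-tokens"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON body of the response. Building the request separately from sending it keeps the validation easy to test without network access.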

Response fields

| Field | Type | Description |
|---|---|---|
| token | string | The opaque capture token. It is already embedded in capture_url. |
| transaction_id | string | The pre-allocated transaction ID. Use it to track the result immediately, before the user has even scanned. |
| capture_url | string | The URL the QR encodes. Open it on a phone to start capturing. |
| qr_svg | string | A self-contained SVG QR code for capture_url. Render it directly, or build your own QR from capture_url. |
| expires_at | string | Wall-clock expiry of the token. The active capture window (ttl_seconds, 15 minutes by default) starts when the user first scans. |
| max_pages | integer | The maximum number of pages this capture will accept. |
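
One way to hold the response on your backend is a small typed record. The sketch below is illustrative only: the CaptureToken class and parse_capture_token function are hypothetical names, and the field list mirrors the sample response shown earlier.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureToken:
    """Hypothetical container for the capture-token response fields."""
    token: str
    transaction_id: str
    capture_url: str
    qr_svg: str
    expires_at: datetime
    max_pages: int

    def is_expired(self, now=None):
        """True once the wall-clock expiry has been reached."""
        now = now or datetime.now(timezone.utc)
        return now >= self.expires_at


def parse_capture_token(payload: str) -> CaptureToken:
    """Parse the JSON response body into a CaptureToken."""
    data = json.loads(payload)
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also accepts the timestamp on Python versions before 3.11.
    expires = datetime.fromisoformat(data["expires_at"].replace("Z", "+00:00"))
    return CaptureToken(
        token=data["token"],
        transaction_id=data["transaction_id"],
        capture_url=data["capture_url"],
        qr_svg=data["qr_svg"],
        expires_at=expires,
        max_pages=int(data["max_pages"]),
    )
```

Storing transaction_id alongside your own record of the user lets you match the eventual result back to the person who requested the QR.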

Step 2: Show the QR

You have two options:

  • Render qr_svg directly. It is a complete SVG, so you can inline it in your page or embed it in an img tag as a data URI.
  • Generate your own QR from capture_url using any QR library, if you prefer to control the styling.

The token travels in the URL fragment (after the #). Browsers do not send the fragment when they request the page, so the token does not appear in server request logs.
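
Both points can be illustrated with the standard library. This is a sketch under assumptions: qr_img_tag and split_capture_url are hypothetical helper names, and base64 is just one valid way to encode an SVG data URI. The first helper wraps qr_svg for use in an img tag; the second separates the part of capture_url that a browser requests from the fragment that stays on the device.

```python
import base64
import html
from urllib.parse import urlsplit


def qr_img_tag(qr_svg: str, alt: str = "Scan to capture") -> str:
    """Wrap the SVG string in an <img> tag using a base64 data URI."""
    encoded = base64.b64encode(qr_svg.encode("utf-8")).decode("ascii")
    return (f'<img alt="{html.escape(alt)}" '
            f'src="data:image/svg+xml;base64,{encoded}">')


def split_capture_url(capture_url: str) -> tuple[str, str]:
    """Return (URL the browser requests, fragment kept client-side)."""
    parts = urlsplit(capture_url)
    requested = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return requested, parts.fragment
```

Applied to the sample capture_url, split_capture_url returns the page address https://app.docaifabric.com/m and the token as the fragment.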

Step 3: The user captures on their phone

Scanning the QR opens a neutral, unbranded capture page. The user photographs one or more pages, can retake or remove any of them, and taps Send. Images are downscaled on the phone to a size that preserves OCR quality before upload, which keeps the transfer fast on mobile connections.

The capture page accepts images and PDFs only, and enforces the max_pages limit.

Step 4: Track the result

Use the transaction_id from Step 1. Because the ID is allocated up front, you can start tracking before the user finishes, or even before they scan.

Until the user submits, only the ID is reserved and the transaction itself does not yet exist, so status calls may return not-found. That is expected; keep polling, or rely on the webhook.
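
A polling loop that tolerates the not-found phase might look like the sketch below. The status endpoint is not described in this section, so the loop takes a caller-supplied fetch_status function (hypothetical) that returns None while the transaction is not found and a status value once it exists. The timeout and interval defaults are illustrative, not recommended values.

```python
import time


def wait_for_transaction(fetch_status, transaction_id, *,
                         timeout=900.0, interval=5.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the transaction exists or the timeout elapses.

    fetch_status(transaction_id) is a caller-supplied function that
    returns None for not-found and any other value once the
    transaction exists. Returns that value, or None on timeout.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(transaction_id)
        if status is not None:
            return status
        # Stop if another wait would run past the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real delays. A timeout slightly longer than the capture window is a natural choice, since the user cannot submit after the token expires.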

Token lifecycle and security

  • One-shot. Each token creates exactly one transaction. After the user submits, the token is consumed and cannot be reused. Request a new QR for each capture.
  • Time-limited. The active window is 15 minutes by default (configurable up to 60) and starts at the first scan. The token also has a wall-clock expiry (expires_at), so a code that is shown but not scanned still expires.
  • Login-free by design. Your application authenticated the user when it requested the QR. The phone acts with exactly the transaction.create permission of the API key that minted the token, scoped to the one tenant and project.
  • Bounded cost. Captured transactions count against the project budget (see Cost and Budgets), and uploads are limited to images and PDFs within the page cap. A leaked link cannot exceed its single transaction or the project budget.

Treat the capture URL as a short-lived secret

Anyone holding the link can submit one capture until it expires or is used. That is the intended behavior, since you decide who to show it to. Keep the active window short and let each code be used once.

Try it in the UI

You do not need to write code to see the flow end to end:

  • API & Agents page, Mobile Capture tab: generate a QR, scan it, and watch the transaction appear. The tab also shows a copy-paste backend snippet.
  • Upload dialog, Capture with phone: any user can finish an upload from their phone by scanning a QR shown in the dialog.

Next steps