
Rate Limiting

DocAI Fabric enforces per-client rate limits to ensure fair usage and protect the platform from excessive traffic.

How It Works

Rate limits are applied per API key (or per tenant for session-authenticated users). Two types of limits are enforced:

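Scope     | Applies To                             | Purpose
----------|----------------------------------------|---------------------------------------------
Sync API  | Every authenticated request            | Protects the API from excessive HTTP traffic
Async API | Endpoints that create processing tasks | Limits task submission rate and queue depth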

Default Limits

Limit                            | Default Value
---------------------------------|--------------
Sync requests per hour           | 18,000
Sync requests per minute (burst) | 600
Async task submissions per hour  | 300
Max queued tasks                 | 1,000

These defaults can be customized per client by an administrator.
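
To make the relationship between the two sync limits concrete, the short calculation below works through the default values. It assumes the hourly and per-minute limits are enforced together on the same client, which the table suggests but does not state outright; the function names are illustrative only.

```python
# Default sync limits taken from the table above.
SYNC_PER_HOUR = 18_000
SYNC_PER_MINUTE_BURST = 600


def sustained_per_minute(per_hour: int) -> float:
    """Average request rate per minute that exactly uses up the hourly limit."""
    return per_hour / 60


def minutes_until_hourly_exhausted(per_hour: int, per_minute: int) -> float:
    """Minutes of continuous full-burst traffic before the hourly limit is spent."""
    return per_hour / per_minute


print(sustained_per_minute(SYNC_PER_HOUR))  # 300.0
print(minutes_until_hourly_exhausted(SYNC_PER_HOUR, SYNC_PER_MINUTE_BURST))  # 30.0
```

Under that assumption, a client can sustain about 300 requests per minute (5 per second) indefinitely, while sending at the full burst rate of 600 per minute would use up the hourly allowance in 30 minutes.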

Response Headers

Every authenticated response includes rate limit headers:

X-RateLimit-Limit: 18000
X-RateLimit-Remaining: 17853
X-RateLimit-Reset: 1741730400
X-RateLimit-Window: 3600

Header                | Description
----------------------|---------------------------------------------
X-RateLimit-Limit     | Maximum requests allowed in the current window
X-RateLimit-Remaining | Requests remaining in the current window
X-RateLimit-Reset     | Unix timestamp when the window resets
X-RateLimit-Window    | Window duration in seconds
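
A client can read these headers into a small structure and use it to decide when to slow down. The following is a minimal sketch using only the Python standard library; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, not part of any DocAI Fabric SDK, and the code assumes all four headers are present and hold integers, as in the sample above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int
    remaining: int
    reset_at: datetime      # timezone-aware UTC time when the window resets
    window_seconds: int

    @property
    def fraction_remaining(self) -> float:
        """Share of the current window's budget that is still unused."""
        return self.remaining / self.limit if self.limit else 0.0


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=datetime.fromtimestamp(
            int(lowered["x-ratelimit-reset"]), tz=timezone.utc
        ),
        window_seconds=int(lowered["x-ratelimit-window"]),
    )


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "18000",
    "X-RateLimit-Remaining": "17853",
    "X-RateLimit-Reset": "1741730400",
    "X-RateLimit-Window": "3600",
})
print(status.remaining, status.reset_at.isoformat())
```

A caller might, for example, start pacing requests once `fraction_remaining` drops below a threshold of its choosing.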

Rate Limit Exceeded (429)

When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header that gives the number of seconds to wait before retrying.

Sync API

HTTP 429 Too Many Requests
Retry-After: 60

"Rate limit exceeded. Retry after 60s"

Async API

HTTP 429 Too Many Requests
Retry-After: 180
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1741730400

{
"error": "rate_limit_exceeded",
"message": "Async API rate limit exceeded for client Acme Corp",
"retry_after": 180,
"limit": 300,
"window": "1 hour",
"reset_at": "2026-03-11T20:00:00Z"
}

Queue Full

HTTP 429 Too Many Requests

{
"error": "queue_full",
"message": "Too many pending tasks. Please wait for current tasks to complete.",
"current_depth": 50,
"max_depth": 50
}
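
Because the three 429 variants above carry different bodies, a client may want to tell them apart before deciding how long to wait. The sketch below is one way to do that, based only on the sample responses shown here. `classify_429` and `DEFAULT_QUEUE_POLL_SECONDS` are hypothetical names. The 30-second fallback is an assumption for responses that carry no wait hint, as in the Queue Full sample. The code also assumes Retry-After is expressed in seconds, as in the examples.

```python
import json
from typing import Mapping, Tuple

# Assumption: how long to wait when the response carries no wait hint at all.
DEFAULT_QUEUE_POLL_SECONDS = 30.0


def classify_429(headers: Mapping[str, str], body: str) -> Tuple[str, float]:
    """Return (error_kind, seconds_to_wait) for a 429 response."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # The sync variant sends a plain message, the async variants send a JSON object.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    details = payload if isinstance(payload, dict) else {}

    kind = details.get("error", "rate_limit_exceeded")

    if "retry-after" in lowered:
        wait = float(lowered["retry-after"])   # the header takes priority
    elif "retry_after" in details:
        wait = float(details["retry_after"])   # then the JSON field
    else:
        wait = DEFAULT_QUEUE_POLL_SECONDS      # assumed fallback
    return kind, wait


print(classify_429({"Retry-After": "60"}, '"Rate limit exceeded. Retry after 60s"'))
print(classify_429({}, '{"error": "queue_full", "current_depth": 50, "max_depth": 50}'))
```

For a `queue_full` result, waiting for in-flight tasks to finish is likely more useful than retrying on a fixed timer, as the response message suggests.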

Best Practices

  • Monitor headers: check X-RateLimit-Remaining to proactively throttle before hitting the limit.
  • Respect Retry-After: wait the indicated number of seconds before retrying.
  • Use exponential backoff: if you receive repeated 429s, increase the delay between retries.
  • Batch wisely: submit documents in reasonably sized batches rather than many individual requests.

Endpoints Not Rate-Limited

  • Health check: GET /health
  • Documentation: /docs/*
  • API Explorer: /api-docs/
  • Admin endpoints (authenticated via X-Admin-Key)