Rate Limiting

DocAI Fabric enforces per-client rate limits to ensure fair usage and protect the platform from excessive traffic.

How It Works

Rate limits are applied per API key (or per tenant for session-authenticated users). Two types of limits are enforced:

| Scope | Applies To | Purpose |
| --- | --- | --- |
| Sync API | Every authenticated request | Protects the API from excessive HTTP traffic |
| Async API | Endpoints that create processing tasks | Limits task submission rate and queue depth |

Default Limits

| Limit | Default Value |
| --- | --- |
| Sync requests per hour | 18,000 |
| Sync requests per minute (burst) | 600 |
| Async task submissions per hour | 300 |
| Max queued tasks | 1,000 |

These defaults can be customized per client by an administrator.
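As a rough illustration of what the default numbers imply for a client, the arithmetic below derives an evenly spaced request interval from each hourly limit, and how long a client could sustain the per-minute burst before reaching the hourly cap. The helper names are hypothetical, and the sketch assumes nothing about whether the server uses fixed or sliding windows.

```python
# Default limits copied from the table above.
# An administrator may have configured different values for your client.
SYNC_PER_HOUR = 18_000
SYNC_PER_MINUTE_BURST = 600
ASYNC_PER_HOUR = 300
SECONDS_PER_HOUR = 3600


def min_interval_seconds(limit: int, window_seconds: int) -> float:
    """Gap between evenly spaced requests that stays within `limit` per window."""
    return window_seconds / limit


def burst_minutes_until_hourly_cap(per_hour: int, per_minute: int) -> float:
    """Minutes of sending at the full per-minute burst before the hourly cap is used up."""
    return per_hour / per_minute


print(min_interval_seconds(SYNC_PER_HOUR, SECONDS_PER_HOUR))    # 0.2
print(min_interval_seconds(ASYNC_PER_HOUR, SECONDS_PER_HOUR))   # 12.0
print(burst_minutes_until_hourly_cap(SYNC_PER_HOUR, SYNC_PER_MINUTE_BURST))  # 30.0
```

In other words, under the defaults a client that paces itself at one sync request every 0.2 seconds, or one async submission every 12 seconds, should stay inside the hourly limits, while bursting at the full 600 requests per minute would use up the hourly allowance in about 30 minutes.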

Response Headers

Every authenticated response includes rate limit headers:

X-RateLimit-Limit: 18000
X-RateLimit-Remaining: 17853
X-RateLimit-Reset: 1741730400
X-RateLimit-Window: 3600

| Header | Description |
| --- | --- |
| `X-RateLimit-Limit` | Maximum requests allowed in the current window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `X-RateLimit-Window` | Window duration in seconds |
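A client can turn these headers into a pacing decision. The following is a minimal sketch, not an official client: `RateLimitStatus`, `parse_rate_limit_headers`, and `suggested_delay` are hypothetical names, and spreading the remaining requests evenly over the rest of the window is just one possible policy. `X-RateLimit-Window` is treated as optional here because the 429 examples on this page do not all include it.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int              # X-RateLimit-Limit
    remaining: int          # X-RateLimit-Remaining
    reset_at: int           # X-RateLimit-Reset (Unix timestamp)
    window: Optional[int]   # X-RateLimit-Window (seconds), if present

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds left in the current window (never negative)."""
        now = time.time() if now is None else now
        return max(0.0, float(self.reset_at) - now)

    def suggested_delay(self, now: Optional[float] = None) -> float:
        """Spread the remaining requests evenly over the rest of the window.

        If nothing remains, wait for the whole window to reset.
        """
        wait = self.seconds_until_reset(now)
        if self.remaining <= 0:
            return wait
        return wait / self.remaining


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Build a RateLimitStatus from response headers (names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    window = lowered.get("x-ratelimit-window")
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
        window=int(window) if window is not None else None,
    )


# Example using the header values shown above.
status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "18000",
    "X-RateLimit-Remaining": "17853",
    "X-RateLimit-Reset": "1741730400",
    "X-RateLimit-Window": "3600",
})
print(status.remaining, status.seconds_until_reset(now=1741730400 - 1800))
```

Passing `now` explicitly keeps the example deterministic; in real use you would omit it and let the method read the current time.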

Rate Limit Exceeded (429)

When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header.

Sync API

HTTP 429 Too Many Requests
Retry-After: 60

"Rate limit exceeded. Retry after 60s"

Async API

HTTP 429
Retry-After: 180
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1741730400

{
  "error": "rate_limit_exceeded",
  "message": "Async API rate limit exceeded for client Acme Corp",
  "retry_after": 180,
  "limit": 300,
  "window": "1 hour",
  "reset_at": "2025-03-11T22:00:00Z"
}

Queue Full

HTTP 429

{
  "error": "queue_full",
  "message": "Too many pending tasks. Please wait for current tasks to complete.",
  "current_depth": 50,
  "max_depth": 50
}
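The three 429 shapes above differ: the Sync API example carries a bare message string, while the Async API and Queue Full examples carry a JSON object with an `error` field, and the Queue Full example shows no `Retry-After`. A client may want to normalize them before deciding how to wait. The sketch below is one way to do that; `Throttle` and `classify_429` are hypothetical names, and treating a non-object body as a plain rate limit is an assumption based only on the examples shown here.

```python
import json
from typing import Mapping, NamedTuple, Optional


class Throttle(NamedTuple):
    kind: str                     # "rate_limit_exceeded", "queue_full", or another error code
    retry_after: Optional[float]  # seconds to wait, or None if the server gave no hint


def classify_429(headers: Mapping[str, str], body: str) -> Throttle:
    """Normalize a 429 response into (kind, retry_after)."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Only the delta-seconds form of Retry-After is handled; HTTP-date values are ignored.
    retry_after: Optional[float] = None
    raw = lowered.get("retry-after", "").strip()
    if raw.isdigit():
        retry_after = float(raw)

    try:
        payload = json.loads(body)
    except ValueError:
        payload = None

    if isinstance(payload, dict):
        kind = str(payload.get("error", "unknown"))
        # Fall back to the body's retry_after field when the header is missing.
        body_hint = payload.get("retry_after")
        if retry_after is None and isinstance(body_hint, (int, float)):
            retry_after = float(body_hint)
        return Throttle(kind, retry_after)

    # Assumption: a body that is not a JSON object is the Sync API's plain message.
    return Throttle("rate_limit_exceeded", retry_after)


print(classify_429({"Retry-After": "60"}, '"Rate limit exceeded. Retry after 60s"'))
print(classify_429({}, '{"error": "queue_full", "current_depth": 50, "max_depth": 50}'))
```

When `retry_after` is `None`, as in the Queue Full example, the caller has to choose its own wait, for instance by polling until some pending tasks have completed.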

Best Practices

Monitor headers: check X-RateLimit-Remaining to proactively throttle before hitting the limit.
Respect Retry-After: wait the indicated number of seconds before retrying.
Use exponential backoff: if you receive repeated 429s, increase the delay between retries.
Batch wisely: submit documents in reasonably sized batches rather than many individual requests.
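The second and third practices can be combined in a single retry loop. The following is a minimal, transport-agnostic sketch rather than a reference implementation: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the base delay, cap, and jitter values are illustrative assumptions, not values published for DocAI Fabric.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def backoff_delay(attempt: int, retry_after: Optional[float],
                  base: float = 1.0, cap: float = 300.0) -> float:
    """Exponential delay (base * 2**attempt, capped), never shorter than Retry-After."""
    exponential = min(cap, base * (2 ** attempt))
    return max(retry_after or 0.0, exponential)


def call_with_retries(send: Callable[[], Response],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      jitter: float = 0.5) -> Response:
    """Call `send` up to `max_attempts` times, backing off after each 429."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        lowered = {name.lower(): value for name, value in headers.items()}
        raw = lowered.get("retry-after", "").strip()
        retry_after = float(raw) if raw.isdigit() else None
        # Jitter is only added on top, so the wait is never below Retry-After.
        sleep(backoff_delay(attempt, retry_after) + random.uniform(0.0, jitter))
        response = send()
    return response  # the last response, which may still be a 429


# Example with a fake transport: two 429s, then success.
replies = iter([(429, {"Retry-After": "3"}, ""), (429, {}, ""), (200, {}, "ok")])
waits = []
result = call_with_retries(lambda: next(replies), sleep=waits.append, jitter=0.0)
print(result[0], waits)  # 200 [3.0, 2.0]
```

Injecting `sleep` makes the loop easy to test without real waiting. The small random jitter is there so that many clients throttled at the same moment do not all retry at the same instant.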

Endpoints Not Rate-Limited

Health check: GET /health
Documentation: /docs/*
API Explorer: /api-docs/
Admin endpoints (authenticated via X-Admin-Key)
