Rate Limiting
DocAI Fabric enforces per-client rate limits to ensure fair usage and protect the platform from excessive traffic.
How It Works
Rate limits are applied per API key (or per tenant for session-authenticated users). Two types of limits are enforced:
| Scope | Applies To | Purpose |
|---|---|---|
| Sync API | Every authenticated request | Protects the API from excessive HTTP traffic |
| Async API | Endpoints that create processing tasks | Limits task submission rate and queue depth |
Default Limits
| Limit | Default Value |
|---|---|
| Sync requests per hour | 18,000 |
| Sync requests per minute (burst) | 600 |
| Async task submissions per hour | 300 |
| Max queued tasks | 1,000 |
These defaults can be customized per client by an administrator.
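A client can stay under these defaults by pacing its own requests. The following is a minimal sketch, not part of DocAI Fabric: `SlidingWindowLimiter` and `acquire` are hypothetical names, and it assumes sliding windows on the client side, which may not match how the server counts its windows. If an administrator has customized your limits, substitute those values.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds until the next call fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self):
        self.calls.append(self.clock())


# Documented sync defaults; your client may be configured differently.
per_minute = SlidingWindowLimiter(limit=600, window=60)
per_hour = SlidingWindowLimiter(limit=18_000, window=3600)


def acquire(limiters=(per_minute, per_hour), sleep=time.sleep):
    """Block until every limiter has room, then record the call in each."""
    while True:
        delay = max(limiter.wait_time() for limiter in limiters)
        if delay <= 0:
            break
        sleep(delay)
    for limiter in limiters:
        limiter.record()
```

Call `acquire()` immediately before each sync request. The server-side headers remain the source of truth; this only reduces the chance of hitting a 429.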
Response Headers
Every authenticated response includes rate limit headers:
X-RateLimit-Limit: 18000
X-RateLimit-Remaining: 17853
X-RateLimit-Reset: 1741730400
X-RateLimit-Window: 3600
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Window | Window duration in seconds |
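The headers above can be read into a small structure for monitoring. This is an illustrative sketch: `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical helpers, and `headers` is assumed to be any mapping of header names to string values, as most HTTP clients provide.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitStatus:
    limit: int
    remaining: int
    reset_at: int                 # Unix timestamp
    window: Optional[int] = None  # seconds; None if the header is absent

    def seconds_until_reset(self, now=None):
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)

    def fraction_remaining(self):
        return self.remaining / self.limit if self.limit else 0.0


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from response headers.

    Names are matched case-insensitively. Returns None if a required
    header is missing or is not an integer.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        window = lowered.get("x-ratelimit-window")
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
            window=int(window) if window is not None else None,
        )
    except (KeyError, ValueError):
        return None
```

A caller might slow down once `fraction_remaining()` drops below a threshold of its choosing, and pause for `seconds_until_reset()` when `remaining` reaches 0.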
Rate Limit Exceeded (429)
When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header.
Sync API
HTTP 429 Too Many Requests
Retry-After: 60
"Rate limit exceeded. Retry after 60s"
Async API
HTTP 429 Too Many Requests
Retry-After: 180
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1741730400
{
  "error": "rate_limit_exceeded",
  "message": "Async API rate limit exceeded for client Acme Corp",
  "retry_after": 180,
  "limit": 300,
  "window": "1 hour",
  "reset_at": "2026-03-11T20:00:00Z"
}
Queue Full
HTTP 429 Too Many Requests
{
  "error": "queue_full",
  "message": "Too many pending tasks. Please wait for current tasks to complete.",
  "current_depth": 50,
  "max_depth": 50
}
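The three 429 shapes shown in this section can be told apart on the client. The sketch below is an assumption-laden illustration rather than an official client: `describe_429` is a hypothetical helper, it relies only on the `error` and `retry_after` fields shown in the examples, and it reports the sync form (a plain string body) as "unknown" because that body carries no error code.

```python
import json


def describe_429(headers, body_text):
    """Return (kind, delay_seconds) for a 429 response.

    kind is the body's "error" value ("rate_limit_exceeded", "queue_full")
    or "unknown"; delay_seconds is None when no retry hint is given.
    """
    lowered = {k.lower(): v for k, v in headers.items()}

    # Prefer the Retry-After header when it is a whole number of seconds.
    delay = None
    if "retry-after" in lowered:
        try:
            delay = int(lowered["retry-after"])
        except ValueError:
            delay = None

    kind = "unknown"
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = None

    if isinstance(payload, dict):
        kind = payload.get("error", "unknown")
        # Fall back to the body's retry_after field if the header was absent.
        if delay is None and isinstance(payload.get("retry_after"), int):
            delay = payload["retry_after"]
    return kind, delay
```

For `queue_full` the example response carries no retry hint, so a caller would typically wait for some of its pending tasks to finish before submitting more.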
Best Practices
- Monitor headers: check X-RateLimit-Remaining to proactively throttle before hitting the limit.
- Respect Retry-After: wait the indicated number of seconds before retrying.
- Use exponential backoff: if you receive repeated 429s, increase the delay between retries.
- Batch wisely: submit documents in reasonably sized batches rather than many individual requests.
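The retry guidance above can be combined into one loop. This is a minimal sketch under stated assumptions: `call_with_retries` and `send` are hypothetical names, `send` is any callable that performs one request and returns `(status_code, headers, body)`, and the delay values and jitter are illustrative choices, not platform recommendations.

```python
import random
import time


def call_with_retries(send, max_attempts=5, base_delay=1.0, max_delay=300.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Waits at least the Retry-After value when present, and otherwise
    doubles the delay on each consecutive 429 (exponential backoff).
    Returns the last (status_code, headers, body) tuple.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the final 429 to the caller

        retry_after = {k.lower(): v for k, v in headers.items()}.get("retry-after")
        backoff = min(max_delay, base_delay * 2 ** attempt)
        try:
            delay = max(float(retry_after), backoff)
        except (TypeError, ValueError):
            delay = backoff  # header missing or not a number of seconds

        # Add up to 10% jitter so many clients do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    return status, headers, body
```

Taking the larger of `Retry-After` and the backoff delay means the client never retries earlier than the server asked, while repeated 429s still stretch the interval.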
Endpoints Not Rate-Limited
- Health check: GET /health
- Documentation: /docs/*
- API Explorer: /api-docs/
- Admin endpoints (authenticated via X-Admin-Key)