429 Too Many Requests response. Understanding how limits work — and building your integration to handle them gracefully — keeps your application running smoothly even under heavy load.
Rate Limit Tiers
Your rate limit depends on your Vook plan. All limits are measured as requests per minute (rpm) per API key.

| Plan | Rate Limit | Burst Allowance |
|---|---|---|
| Free | 100 req/min | Up to 120 req/min briefly |
| Pro | 1,000 req/min | Up to 1,200 req/min briefly |
| Business | 5,000 req/min | Up to 6,000 req/min briefly |
| Enterprise | Custom | Negotiated separately |
Rate limits are applied per API key, not per account. If you use multiple API keys, each key gets its own independent quota. Contact support@vook.ai to discuss Enterprise limits.
Rate Limit Headers
Every API response includes headers that tell you your current rate limit status. Read these proactively to throttle your requests before hitting the limit rather than reacting after the fact.

| Header | Type | Description |
|---|---|---|
| X-RateLimit-Limit | integer | Total requests allowed per minute for your plan |
| X-RateLimit-Remaining | integer | Requests remaining in the current window |
| X-RateLimit-Reset | integer | Unix timestamp (UTC) when the window resets |
X-RateLimit-Reset is a Unix timestamp, so subtracting the current time from it gives the time remaining until the window resets.
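As a rough illustration of working with these headers, the sketch below parses the three values from a response's header mapping and computes how long remains in the current window. The names `RateLimitStatus`, `parse_rate_limit`, and `seconds_until_reset` are hypothetical helpers, not part of any Vook SDK, and the code assumes the reset timestamp is expressed in seconds.

```python
import time
from typing import Mapping, NamedTuple, Optional


class RateLimitStatus(NamedTuple):
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (assumed to be Unix seconds, UTC)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    """Read the three X-RateLimit-* headers from a response's headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )


def seconds_until_reset(status: RateLimitStatus, now: Optional[float] = None) -> float:
    """Seconds left in the current window; never negative."""
    current = time.time() if now is None else now
    return max(0.0, status.reset_at - current)
```

Passing `now` explicitly is only there to make the function easy to test; in normal use you would call `seconds_until_reset(status)` and let it read the system clock.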
Handling 429 Errors
When you exceed your rate limit, Vook returns a 429 Too Many Requests response. The Retry-After header tells you the minimum number of seconds to wait before retrying. For robustness, implement exponential backoff with jitter, which prevents multiple clients from retrying simultaneously and creating a thundering herd.
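One possible way to combine Retry-After with exponential backoff and jitter is sketched below. This is an illustration under stated assumptions rather than an official client: `backoff_delay` and `call_with_retries` are hypothetical names, `send` stands in for whatever function performs your HTTP request and returns `(status_code, headers, body)`, and the base delay, cap, and attempt count are arbitrary defaults.

```python
import random
import time
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    uniform = (rng or random).uniform
    # Exponential ceiling: base, 2*base, 4*base, ... limited to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Retry-After is a minimum, so the random jitter is added on top of it.
    floor = retry_after if retry_after is not None else 0.0
    return floor + uniform(0.0, ceiling)


def call_with_retries(send, max_attempts: int = 5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body),
    where `headers` is a dict keyed by the exact name "Retry-After".
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After")
        sleep(backoff_delay(attempt, float(retry_after) if retry_after else None))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Adding the jitter on top of Retry-After, rather than taking the larger of the two, keeps clients spread out even when they all receive the same Retry-After value.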
Best Practices
Apply these strategies to keep your integration well within its limits and resilient when they’re approached.
Cache responses
Store API responses locally for data that doesn’t change frequently. If you’re displaying the same list to multiple users, fetch it once and serve from cache rather than calling the API for each user.
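A minimal sketch of this idea is a small time-to-live cache placed in front of the API call. `TTLCache` and `get_or_fetch` are hypothetical names, and the right TTL depends on how stale your data is allowed to be.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep fetched values for `ttl_seconds` before calling the API again."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no API request is made
        value = fetch()      # expired or missing: one API request
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a 60-second TTL, any number of users requesting the same list within that minute cost a single API request.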
Use bulk endpoints
Where available, prefer bulk create/update endpoints over looping individual requests. A single call that operates on 50 records uses one request instead of 50.
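To illustrate the saving, the sketch below splits a list of records into batches so that each batch can be sent in one bulk call. `chunked` is a hypothetical helper, and the batch size of 50 is only the figure used in the example above; check each endpoint for its actual maximum.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(records: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `batch_size` records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# 120 records in batches of 50 need 3 bulk requests instead of 120 single ones.
batches = list(chunked(list(range(120)), 50))
```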
Implement exponential backoff
Always retry 429 responses with exponential backoff and jitter, never in a tight loop, as described under Handling 429 Errors above.
Monitor headers proactively
Read X-RateLimit-Remaining on every response. If it drops below 10% of your limit, slow down your request rate before hitting zero rather than reacting to a 429.
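One way to act on this rule is to compute a pause before each request from the header values. The sketch below is an assumption about how you might pace requests, not a prescribed algorithm: `throttle_delay` is a hypothetical name, and spreading the remaining requests evenly across the rest of the window is just one reasonable policy.

```python
def throttle_delay(
    limit: int,
    remaining: int,
    seconds_until_reset: float,
    threshold: float = 0.10,
) -> float:
    """Seconds to pause before the next request, based on rate limit headers."""
    window_left = max(0.0, seconds_until_reset)
    if remaining <= 0:
        # Quota exhausted: wait for the window to reset.
        return window_left
    if remaining >= limit * threshold:
        # Remaining quota is not below the threshold: no need to slow down.
        return 0.0
    # Below the threshold: spread the remaining requests over the time left.
    return window_left / remaining
```

For example, on the Pro plan with 50 requests remaining and 30 seconds left in the window, this spaces requests about 0.6 seconds apart.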