Understanding and Working with Vook API Rate Limits

Vook enforces rate limits on all API endpoints to ensure fair usage and keep the platform stable for everyone. When you exceed your plan’s limit, the API returns a 429 Too Many Requests response. Understanding how limits work — and building your integration to handle them gracefully — keeps your application running smoothly even under heavy load.

Rate Limit Tiers

Your rate limit depends on your Vook plan. All limits are measured as requests per minute (rpm) per API key.

Plan	Rate Limit	Burst Allowance
Free	100 req/min	Up to 120 req/min briefly
Pro	1,000 req/min	Up to 1,200 req/min briefly
Business	5,000 req/min	Up to 6,000 req/min briefly
Enterprise	Custom	Negotiated separately

Rate limits are applied per API key, not per account. If you use multiple API keys, each key gets its own independent quota. Contact support@vook.ai to discuss Enterprise limits.
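
Because each key has a fixed per-minute budget, one simple way to stay under it is to space requests evenly on the client side. The following is a minimal sketch, not part of any Vook SDK: the `RequestPacer` class and its parameters are hypothetical names introduced for illustration, and it deliberately ignores the burst allowance.

```python
import time

class RequestPacer:
    """Spaces calls evenly to stay at or below a requests-per-minute budget.

    Hypothetical helper for illustration; not provided by Vook.
    """

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may be sent

    def wait(self):
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        if self._next_allowed is None or now >= self._next_allowed:
            delay = 0.0
        else:
            delay = self._next_allowed - now
            self._sleep(delay)
        # The request is sent at now + delay; schedule the following slot after it
        self._next_allowed = now + delay + self.interval
        return delay

# Example: a Free-plan key (100 req/min) allows one request every 0.6 seconds.
pacer = RequestPacer(requests_per_minute=100)
# Call pacer.wait() immediately before each API request.
```

Create one pacer per API key, since each key has its own quota.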

Rate Limit Headers

Every API response includes headers that tell you your current rate limit status. Read these proactively to throttle your requests before hitting the limit rather than reacting after the fact.

Header	Type	Description
`X-RateLimit-Limit`	integer	Total requests allowed per minute for your plan
`X-RateLimit-Remaining`	integer	Requests remaining in the current window
`X-RateLimit-Reset`	integer	Unix timestamp (UTC) when the window resets

An example set of rate limit response headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1705316460
Content-Type: application/json

To find out when the window resets in human-readable form, convert the X-RateLimit-Reset Unix timestamp:

import datetime
reset_timestamp = 1705316460
reset_time = datetime.datetime.fromtimestamp(reset_timestamp, tz=datetime.timezone.utc)
print(reset_time)  # 2024-01-15 11:01:00+00:00
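
For throttling decisions, the number of seconds left in the window is often more useful than the wall-clock time. A small sketch of that calculation follows; `seconds_until_reset` is a hypothetical helper name, and the `now` parameter exists only to make the example reproducible.

```python
import time

def seconds_until_reset(reset_timestamp, now=None):
    """Return the non-negative number of seconds until the window resets."""
    if now is None:
        now = time.time()  # current Unix time, comparable to X-RateLimit-Reset
    return max(0.0, float(reset_timestamp) - now)

# 14 seconds before the example reset timestamp used above
print(seconds_until_reset(1705316460, now=1705316446))  # 14.0
```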

Handling 429 Errors

When you exceed your rate limit, Vook returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 14
Content-Type: application/json

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded your rate limit. Please retry after 14 seconds.",
    "retry_after": 14
  }
}

The Retry-After header tells you the minimum number of seconds to wait before retrying. For robustness, implement exponential backoff with jitter — this prevents multiple clients retrying simultaneously and creating a thundering herd.

import requests
import os
import time
import random

api_key = os.environ["VOOK_API_KEY"]
base_url = "https://api.vook.ai/v1"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

def request_with_backoff(method, url, max_retries=5, **kwargs):
    """Make a request, retrying 429 responses with exponential backoff and jitter."""
    for attempt in range(max_retries):
        response = requests.request(method, url, headers=headers, **kwargs)

        # Anything other than 429 is final: raise on other errors, return on success
        if response.status_code != 429:
            response.raise_for_status()
            return response

        # Don't sleep after the final attempt; fall through to the error below
        if attempt == max_retries - 1:
            break

        # Respect the Retry-After header if present (whole seconds)
        retry_after = int(response.headers.get("Retry-After", 0))

        # Exponential backoff (1s, 2s, 4s, 8s), never shorter than Retry-After
        backoff = max(retry_after, 2 ** attempt)
        # Random jitter keeps multiple clients from retrying in lockstep
        jitter = random.uniform(0, 1)
        wait = backoff + jitter

        print(f"Rate limit hit. Attempt {attempt + 1}/{max_retries}. Waiting {wait:.1f}s...")
        time.sleep(wait)

    raise RuntimeError(f"Request failed after {max_retries} attempts due to rate limiting.")

# Use it just like requests.get / requests.post
response = request_with_backoff("GET", f"{base_url}/resources")
print(response.json())
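
The 429 response shown earlier carries the wait time in two places: the Retry-After header and the `retry_after` field of the JSON body. As a sketch of how a client might read either one defensively, the hypothetical helper below prefers the header, falls back to the body, and uses a default if neither can be parsed. The fallback order and the default value are assumptions, not documented Vook behavior.

```python
def retry_delay(headers, body, default=1.0):
    """Pick a wait time in seconds from a 429 response's header or JSON body."""
    body = body or {}
    raw = headers.get("Retry-After")
    if raw is None:
        # Fall back to the "retry_after" field inside the "error" object
        raw = (body.get("error") or {}).get("retry_after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Missing or non-numeric value: use the caller's default
        return default

example_body = {"error": {"code": "rate_limit_exceeded", "retry_after": 14}}
print(retry_delay({"Retry-After": "14"}, example_body))  # 14.0
print(retry_delay({}, example_body))                     # 14.0
```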

Best Practices

Apply these strategies to keep your integration well within its limits and resilient when they’re approached.

Cache responses

Store API responses locally for data that doesn’t change frequently. If you’re displaying the same list to multiple users, fetch it once and serve from cache rather than calling the API for each user.
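
One possible shape for such a cache is a small in-memory store with a time-to-live. This is only a sketch: `TTLCache` and `get_or_fetch` are hypothetical names, the 60-second lifetime is an arbitrary choice, and a multi-process deployment would need a shared cache instead.

```python
import time

class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry_time, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()  # this is the only place an API request would be made
        self._entries[key] = (now + self.ttl, value)
        return value

cache = TTLCache(ttl_seconds=60)
# In a real integration, fetch would wrap the API call for the shared list,
# so every user within the 60-second window is served from memory.
```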

Use bulk endpoints

Where available, prefer bulk create/update endpoints over looping individual requests. A single call that operates on 50 records uses one request instead of 50.
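
The client-side half of this is splitting your records into batches. The sketch below shows only that step, because the bulk endpoint paths and payload formats are not described here. `chunked` is a hypothetical helper, and the batch size of 50 simply mirrors the example above rather than a documented maximum.

```python
def chunked(records, size=50):
    """Yield consecutive slices of records, each at most size items long."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

records = [{"id": i} for i in range(120)]
batches = list(chunked(records, size=50))
print([len(b) for b in batches])  # [50, 50, 20]
# 120 records become 3 bulk requests instead of 120 individual ones.
```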

Implement exponential backoff

Always retry 429 responses with exponential backoff and jitter, never in a tight loop. The request_with_backoff function above provides a ready-to-use implementation.

Monitor headers proactively

Read X-RateLimit-Remaining on every response. If it drops below 10% of your limit, slow down your request rate before hitting zero rather than reacting to a 429.
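
A minimal sketch of that check follows. `throttle_delay` is a hypothetical helper; the 10% threshold comes from the guidance above, while spreading the remaining requests evenly across the rest of the window is one possible policy, not a Vook requirement.

```python
import time

def throttle_delay(headers, threshold=0.10, now=None):
    """Return seconds to pause before the next request; 0.0 if no pause is needed."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset = int(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0  # headers missing or malformed: don't throttle on bad data
    if remaining >= limit * threshold:
        return 0.0  # still above the threshold
    if now is None:
        now = time.time()
    seconds_left = max(0.0, reset - now)
    # Spread the remaining requests evenly over the rest of the window
    return seconds_left / max(remaining, 1)

low = {
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "50",
    "X-RateLimit-Reset": "1705316460",
}
print(throttle_delay(low, now=1705316430))  # 0.6
```

Pass `response.headers` after each call and sleep for the returned number of seconds before sending the next request.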

Retrying immediately after a 429 error without waiting will not succeed — Vook will continue returning 429 until the rate limit window resets. Always wait at least the number of seconds specified in Retry-After before retrying.