
Overview

The Pure API uses a token bucket algorithm to protect the platform from excessive request volume and ensure fair access for all consumers. Every authenticated request counts against your organization’s rate limit budget. Rate limits are applied per API key. Unauthenticated requests are rejected at the authentication layer before rate limiting is evaluated.

Default Limits

Parameter        Value           Description
Burst capacity   120 requests    Maximum number of requests you can make in a single burst
Refill rate      60 tokens/min   Tokens are replenished at this steady-state rate
Refill window    1 minute        The interval over which tokens are refilled

This means you can burst up to 120 requests instantly, and then sustain 60 requests per minute indefinitely. If you exhaust all tokens, you’ll need to wait for the bucket to refill before making additional requests.

How the Token Bucket Works

The token bucket algorithm works like a reservoir:
  1. Your bucket starts full at 120 tokens (the burst capacity)
  2. Each API request consumes 1 token
  3. Tokens are refilled at a rate of 60 per minute, continuously
  4. The bucket never exceeds its maximum capacity of 120 tokens
  5. If the bucket is empty (0 tokens), the request is rejected with 429 Too Many Requests
This design allows short bursts of traffic while enforcing a sustained rate over time.
Tokens
120 |████████████████████████████████    ← Full bucket (burst capacity)
    |████████████████████████            ← After a burst of requests
    |████████████████                    ← Continuing to send...
    |████████                            ← Getting low
    |                                    ← Empty → 429 Too Many Requests
    |████                                ← Refilling (60 tokens/min)
    +------------------------------------→ Time
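
The steps above can be sketched as a minimal token bucket in Python (an illustration of the algorithm only, not the platform's actual implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket sketch: 120-token capacity, 60 tokens/min refill."""

    def __init__(self, capacity=120, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec  # 60 tokens/min == 1 token/sec
        self.tokens = float(capacity)         # the bucket starts full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill continuously, never exceeding the maximum capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False          # empty bucket -> the API would return 429

bucket = TokenBucket()
# A burst of 120 requests succeeds instantly; the 121st is rejected
# because the bucket has not had time to refill.
results = [bucket.allow() for _ in range(121)]
```

Running this shows the burst-then-reject behavior: the first 120 calls return `True`, the 121st returns `False` until roughly one second of refill has elapsed.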

Response Headers

Every rate-limited response includes headers so you can monitor your usage in real time:
Header                  Type      Description
x-ratelimit-limit       integer   Your bucket’s maximum capacity (e.g. 120)
x-ratelimit-remaining   integer   Tokens remaining in your bucket after this request
x-ratelimit-reset       integer   Seconds until the next token is added to your bucket
retry-after             integer   Seconds to wait before retrying (only present on 429 responses)

Example Response Headers

Successful request (tokens available):
HTTP/1.1 200 OK
x-ratelimit-limit: 120
x-ratelimit-remaining: 85
x-ratelimit-reset: 1
Rate limited (bucket empty):
HTTP/1.1 429 Too Many Requests
x-ratelimit-limit: 120
x-ratelimit-remaining: 0
x-ratelimit-reset: 1
retry-after: 1

429 Response Body

When your rate limit is exceeded, the API returns a 429 status with the following JSON body:
{
  "error": "Too Many Requests",
  "code": 429,
  "suggestion": "Please try again later."
}
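
A client can surface these fields directly to the caller; a minimal Python sketch (the body string mirrors the example above):

```python
import json

# Example 429 body as documented above
body = '{"error": "Too Many Requests", "code": 429, "suggestion": "Please try again later."}'

payload = json.loads(body)
if payload["code"] == 429:
    # Combine the error name and suggestion into a user-facing message
    message = f'{payload["error"]}: {payload["suggestion"]}'
```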

Per-Route Overrides

Some API endpoints may have different rate limits than the default. These overrides are applied transparently — the response headers always reflect the effective limits for the endpoint you called. When a route has a custom limit, your bucket for that route is separate from the default bucket. Two routes with different rate limit configurations do not share tokens.
The response headers on each request always reflect the correct limit for that specific endpoint. Use x-ratelimit-remaining to monitor your budget regardless of whether a route uses the default or a custom limit.
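
Because buckets can differ per route, it helps to track x-ratelimit-remaining per endpoint rather than as a single global number. A sketch (the endpoint paths here are hypothetical):

```python
remaining = {}  # endpoint path -> last observed x-ratelimit-remaining

def record(path, headers):
    """Store the latest remaining-token count reported for an endpoint."""
    value = headers.get("x-ratelimit-remaining")
    if value is not None:
        remaining[path] = int(value)

# Each route reports the effective limit for its own bucket:
record("/v1/products", {"x-ratelimit-remaining": "85"})
record("/v1/orders", {"x-ratelimit-remaining": "10"})
```

Routes that share the default configuration will report draws from the same pool, so their counts move together; a route with a custom limit will report independently.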

Best Practices

Monitor Your Usage

Check the x-ratelimit-remaining header on every response. When it drops below a threshold (e.g. 10), slow down your request rate.
const response = await fetch("https://api.collectpure.com/v1/products", {
  headers: { "x-api-key": API_KEY },
});

const remaining = parseInt(response.headers.get("x-ratelimit-remaining"), 10);

if (remaining < 10) {
  console.warn(`Rate limit running low: ${remaining} requests remaining`);
}

Implement Exponential Backoff

When you receive a 429, use the retry-after header to determine when to retry. Combine with exponential backoff for resilience:
JavaScript:

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    // Use the server's retry-after hint, falling back to 1 second
    const retryAfter = parseInt(response.headers.get("retry-after"), 10) || 1;
    const backoff = retryAfter * Math.pow(2, attempt - 1);
    console.log(`Rate limited. Retrying in ${backoff}s (attempt ${attempt}/${maxRetries})`);
    await new Promise(resolve => setTimeout(resolve, backoff * 1000));
  }

  throw new Error("Max retries exceeded");
}
Python:

import time
import requests

def fetch_with_retry(url, headers, max_retries=3):
    for attempt in range(1, max_retries + 1):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            return response

        # Use the server's retry-after hint, falling back to 1 second
        retry_after = int(response.headers.get("retry-after", 1))
        backoff = retry_after * (2 ** (attempt - 1))
        print(f"Rate limited. Retrying in {backoff}s (attempt {attempt}/{max_retries})")
        time.sleep(backoff)

    raise Exception("Max retries exceeded")

Batch Where Possible

Instead of making many individual requests, use batch or list endpoints where available to reduce the number of API calls. For example, use GET /products/get-products/v1 with multiple IDs instead of fetching products one at a time.
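
To see why batching helps, compare token spend: N individual fetches consume N tokens, while one batch call consumes a single token. A sketch (the ids query parameter is an assumption; check the endpoint reference for the actual parameter name):

```python
ids = ["prod_1", "prod_2", "prod_3"]

individual_tokens = len(ids)  # one token per single-product request
batched_tokens = 1            # one token for the whole batch

# Hypothetical batched URL -- the `ids` parameter name is illustrative only
url = "https://api.collectpure.com/products/get-products/v1?ids=" + ",".join(ids)
```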

Cache Responses

For data that doesn’t change frequently (spot prices, product metadata), cache responses on your end to avoid unnecessary API calls. Check the Cache-Control header on responses for caching guidance.
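
A minimal client-side cache keyed on the Cache-Control max-age directive might look like this (a sketch; a real client should also handle directives like no-store and validation headers):

```python
import re
import time

cache = {}  # url -> (expires_at, body)

def cache_store(url, body, cache_control):
    """Cache a response body for the max-age given in its Cache-Control header."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    if match:
        cache[url] = (time.monotonic() + int(match.group(1)), body)

def cache_get(url):
    """Return the cached body if it exists and has not expired, else None."""
    entry = cache.get(url)
    if entry and time.monotonic() < entry[0]:
        return entry[1]
    return None

# Serve repeat reads from the cache instead of spending rate limit tokens
cache_store("/v1/spot-prices", {"gold": 2400}, "public, max-age=60")
```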

Key Scoping

Rate limit buckets are scoped by two dimensions:
  1. Consumer identity — your API key determines which bucket is used. Each API key has its own independent set of buckets.
  2. Rate limit configuration — routes with different rate limit parameters use separate buckets. Two routes sharing the same limits share a single bucket.
This means:
  • Different API keys never share rate limit buckets
  • Two routes with identical limits share a bucket (a request to either counts against the same pool)
  • A route with custom limits has its own isolated bucket
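
The two dimensions above can be expressed as a composite bucket key (a sketch of the scoping rules, not the platform's internals; the key names and custom limits are made up):

```python
DEFAULT = (120, 60)  # (burst capacity, refill per minute)
CUSTOM = (30, 10)    # a hypothetical per-route override

def bucket_key(api_key, limit_config):
    """Requests share a bucket iff both the API key and the limit config match."""
    return (api_key, limit_config)

# Two default-limit routes under one key draw from the same bucket:
assert bucket_key("key_A", DEFAULT) == bucket_key("key_A", DEFAULT)
# A route with custom limits has its own isolated bucket:
assert bucket_key("key_A", CUSTOM) != bucket_key("key_A", DEFAULT)
# Different API keys never share buckets:
assert bucket_key("key_B", DEFAULT) != bucket_key("key_A", DEFAULT)
```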

FAQ

Do rate limits apply to WebSocket connections?
No. Rate limits only apply to REST API requests. WebSocket connections are not rate limited (though excessive connections may be throttled separately during the alpha period).

What happens if the rate limiting infrastructure fails?
The API is configured to fail open — if the rate limiting infrastructure encounters an error, your requests are allowed through rather than being blocked. This ensures availability is not impacted by rate limiter outages.

Can I request a higher rate limit?
Yes. If your use case requires higher throughput, contact us at [email protected] with details about your application and expected request volume.

Are rate limits global or per-endpoint?
Rate limits are applied globally across all endpoints that share the same rate limit configuration. Most endpoints use the default configuration (120 burst / 60 per minute), so requests to any of them draw from the same token bucket. Endpoints with custom limits have their own separate buckets.