
Overview

The Pure API uses a token bucket algorithm to protect the platform from excessive request volume and ensure fair access for all consumers. Every authenticated request counts against your organization’s rate limit budget. Rate limits are applied per API key. Unauthenticated requests are rejected at the authentication layer before rate limiting is evaluated.

Default Limits

Parameter        Value           Description
Burst capacity   120 requests    Maximum number of requests you can make in a single burst
Refill rate      60 tokens/min   Tokens are replenished at this steady-state rate
Refill window    1 minute        The interval over which tokens are refilled

This means you can burst up to 120 requests instantly, and then sustain 60 requests per minute indefinitely. If you exhaust all tokens, you’ll need to wait for the bucket to refill before making additional requests.

How the Token Bucket Works

The token bucket algorithm works like a reservoir:
  1. Your bucket starts full at 120 tokens (the burst capacity)
  2. Each API request consumes 1 token
  3. Tokens are refilled at a rate of 60 per minute, continuously
  4. The bucket never exceeds its maximum capacity of 120 tokens
  5. If the bucket is empty (0 tokens), the request is rejected with 429 Too Many Requests
This design allows short bursts of traffic while enforcing a sustained rate over time.
Tokens
120 |████████████████████████████████    ← Full bucket (burst capacity)
    |████████████████████████            ← After a burst of requests
    |████████████████                    ← Continuing to send...
    |████████                            ← Getting low
    |                                    ← Empty → 429 Too Many Requests
    |████                                ← Refilling (60 tokens/min)
    +------------------------------------→ Time
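
The steps above can be sketched as a minimal token bucket in Python (an illustration of the algorithm only, not the platform's actual implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket sketch: 120-token capacity, 60 tokens/min refill."""

    def __init__(self, capacity=120, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec  # 60 tokens/min == 1 token/sec
        self.tokens = float(capacity)         # the bucket starts full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill continuously, never exceeding the maximum capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False          # empty bucket -> the API would return 429

bucket = TokenBucket()
# A burst of 120 requests succeeds instantly; the 121st is rejected
# because the bucket has not had time to refill.
results = [bucket.allow() for _ in range(121)]
```

Running this shows the burst-then-reject behavior: the first 120 calls return `True`, the 121st returns `False` until roughly one second of refill has elapsed.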

Response Headers

Every rate-limited response includes headers so you can monitor your usage in real time:
Header                  Type      Description
x-ratelimit-limit       integer   Your bucket’s maximum capacity (e.g. 120)
x-ratelimit-remaining   integer   Tokens remaining in your bucket after this request
x-ratelimit-reset       integer   Seconds until the next token is added to your bucket
retry-after             integer   Seconds to wait before retrying (only present on 429 responses)

Example Response Headers

Successful request (tokens available):
HTTP/1.1 200 OK
x-ratelimit-limit: 120
x-ratelimit-remaining: 85
x-ratelimit-reset: 1
Rate limited (bucket empty):
HTTP/1.1 429 Too Many Requests
x-ratelimit-limit: 120
x-ratelimit-remaining: 0
x-ratelimit-reset: 1
retry-after: 1

429 Response Body

When your rate limit is exceeded, the API returns a 429 status with the following JSON body:
{
  "error": "Too Many Requests",
  "code": 429,
  "suggestion": "Please try again later."
}
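
A client can surface these fields directly to the caller; a minimal Python sketch (the body string mirrors the example above):

```python
import json

# Example 429 body as documented above
body = '{"error": "Too Many Requests", "code": 429, "suggestion": "Please try again later."}'

payload = json.loads(body)
if payload["code"] == 429:
    # Combine the error name and suggestion into a user-facing message
    message = f'{payload["error"]}: {payload["suggestion"]}'
```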

Per-Route Overrides

Some API endpoints may have different rate limits than the default. These overrides are applied transparently — the response headers always reflect the effective limits for the endpoint you called. When a route has a custom limit, your bucket for that route is separate from the default bucket. Two routes with different rate limit configurations do not share tokens.
The response headers on each request always reflect the correct limit for that specific endpoint. Use x-ratelimit-remaining to monitor your budget regardless of whether a route uses the default or a custom limit.
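
Because buckets can differ per route, it helps to track x-ratelimit-remaining per endpoint rather than as a single global number. A sketch (the endpoint paths here are hypothetical):

```python
remaining = {}  # endpoint path -> last observed x-ratelimit-remaining

def record(path, headers):
    """Store the latest remaining-token count reported for an endpoint."""
    value = headers.get("x-ratelimit-remaining")
    if value is not None:
        remaining[path] = int(value)

# Each route reports the effective limit for its own bucket:
record("/v1/products", {"x-ratelimit-remaining": "85"})
record("/v1/orders", {"x-ratelimit-remaining": "10"})
```

Routes that share the default configuration will report draws from the same pool, so their counts move together; a route with a custom limit will report independently.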

Best Practices

Monitor Your Usage

Check the x-ratelimit-remaining header on every response. When it drops below a threshold (e.g. 10), slow down your request rate.
const response = await fetch("https://api.collectpure.com/v1/products", {
  headers: { "x-api-key": API_KEY },
});

const remaining = parseInt(response.headers.get("x-ratelimit-remaining"), 10);

if (remaining < 10) {
  console.warn(`Rate limit running low: ${remaining} requests remaining`);
}

Implement Exponential Backoff

When you receive a 429, use the retry-after header to determine when to retry. Combine with exponential backoff for resilience:
JavaScript:

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    // Use the server's retry-after hint, falling back to 1 second
    const retryAfter = parseInt(response.headers.get("retry-after"), 10) || 1;
    const backoff = retryAfter * Math.pow(2, attempt - 1);
    console.log(`Rate limited. Retrying in ${backoff}s (attempt ${attempt}/${maxRetries})`);
    await new Promise(resolve => setTimeout(resolve, backoff * 1000));
  }

  throw new Error("Max retries exceeded");
}
Python:

import time
import requests

def fetch_with_retry(url, headers, max_retries=3):
    for attempt in range(1, max_retries + 1):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            return response

        # Use the server's retry-after hint, falling back to 1 second
        retry_after = int(response.headers.get("retry-after", 1))
        backoff = retry_after * (2 ** (attempt - 1))
        print(f"Rate limited. Retrying in {backoff}s (attempt {attempt}/{max_retries})")
        time.sleep(backoff)

    raise Exception("Max retries exceeded")

Batch Where Possible

Instead of making many individual requests, use batch or list endpoints where available to reduce the number of API calls. For example, use GET /products/get-products/v1 with multiple IDs instead of fetching products one at a time.
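
To see why batching helps, compare token spend: N individual fetches consume N tokens, while one batch call consumes a single token. A sketch (the ids query parameter is an assumption; check the endpoint reference for the actual parameter name):

```python
ids = ["prod_1", "prod_2", "prod_3"]

individual_tokens = len(ids)  # one token per single-product request
batched_tokens = 1            # one token for the whole batch

# Hypothetical batched URL -- the `ids` parameter name is illustrative only
url = "https://api.collectpure.com/products/get-products/v1?ids=" + ",".join(ids)
```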

Cache Responses

For data that doesn’t change frequently (spot prices, product metadata), cache responses on your end to avoid unnecessary API calls. Check the Cache-Control header on responses for caching guidance.
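
A minimal client-side cache keyed on the Cache-Control max-age directive might look like this (a sketch; a real client should also handle directives like no-store and validation headers):

```python
import re
import time

cache = {}  # url -> (expires_at, body)

def cache_store(url, body, cache_control):
    """Cache a response body for the max-age given in its Cache-Control header."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    if match:
        cache[url] = (time.monotonic() + int(match.group(1)), body)

def cache_get(url):
    """Return the cached body if it exists and has not expired, else None."""
    entry = cache.get(url)
    if entry and time.monotonic() < entry[0]:
        return entry[1]
    return None

# Serve repeat reads from the cache instead of spending rate limit tokens
cache_store("/v1/spot-prices", {"gold": 2400}, "public, max-age=60")
```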

Key Scoping

Rate limit buckets are scoped by two dimensions:
  1. Consumer identity — your API key determines which bucket is used. Each API key has its own independent set of buckets.
  2. Rate limit configuration — routes with different rate limit parameters use separate buckets. Two routes sharing the same limits share a single bucket.
This means:
  • Different API keys never share rate limit buckets
  • Two routes with identical limits share a bucket (a request to either counts against the same pool)
  • A route with custom limits has its own isolated bucket
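
The two dimensions above can be expressed as a composite bucket key (a sketch of the scoping rules, not the platform's internals; the key names and custom limits are made up):

```python
DEFAULT = (120, 60)  # (burst capacity, refill per minute)
CUSTOM = (30, 10)    # a hypothetical per-route override

def bucket_key(api_key, limit_config):
    """Requests share a bucket iff both the API key and the limit config match."""
    return (api_key, limit_config)

# Two default-limit routes under one key draw from the same bucket:
assert bucket_key("key_A", DEFAULT) == bucket_key("key_A", DEFAULT)
# A route with custom limits has its own isolated bucket:
assert bucket_key("key_A", CUSTOM) != bucket_key("key_A", DEFAULT)
# Different API keys never share buckets:
assert bucket_key("key_B", DEFAULT) != bucket_key("key_A", DEFAULT)
```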

FAQ

Do rate limits apply to WebSocket connections?
No. Rate limits only apply to REST API requests. WebSocket connections are not rate limited (though excessive connections may be throttled separately during the alpha period).

What happens if the rate limiting infrastructure fails?
The API is configured to fail open — if the rate limiting infrastructure encounters an error, your requests are allowed through rather than being blocked. This ensures availability is not impacted by rate limiter outages.

Can I request a higher rate limit?
Yes. If your use case requires higher throughput, contact us at [email protected] with details about your application and expected request volume.

Are rate limits global or per-endpoint?
Rate limits are applied globally across all endpoints that share the same rate limit configuration. Most endpoints use the default configuration (120 burst / 60 per minute), so requests to any of them draw from the same token bucket. Endpoints with custom limits have their own separate buckets.