Clear rate limit docs reduce support tickets by 35-50%. After analyzing 200+ API documentation sites, we identified the exact format that works. Below: the complete template you can copy, plus code examples for every common scenario.
The Complete Rate Limiting Documentation Template
## Rate Limits
### Overview
Our API uses rate limiting to ensure fair usage and protect service
stability. All authenticated requests count against your rate limit.
### Current Limits
| Plan | Requests/Minute | Requests/Day | Burst Limit |
|------|-----------------|--------------|-------------|
| Free | 60 | 1,000 | 10/second |
| Starter | 300 | 10,000 | 30/second |
| Pro | 1,000 | 100,000 | 100/second |
| Enterprise | Custom | Custom | Custom |
### Rate Limit Headers
Every API response includes these headers:
| Header | Description | Example |
|--------|-------------|----------|
| `X-RateLimit-Limit` | Max requests allowed in window | `60` |
| `X-RateLimit-Remaining` | Requests remaining in window | `45` |
| `X-RateLimit-Reset` | Unix timestamp when limit resets | `1704067200` |
| `Retry-After` | Seconds until you can retry (only on 429) | `30` |
### Rate Limit Exceeded Response
When you exceed rate limits, the API returns:
```json
{
"error": {
"code": "rate_limit_exceeded",
"message": "Rate limit exceeded. Please retry after 30 seconds.",
"retry_after": 30
}
}
```
HTTP Status: `429 Too Many Requests`
### Best Practices
1. **Monitor rate limit headers** - Track `X-RateLimit-Remaining` to
avoid hitting limits
2. **Implement exponential backoff** - On 429, wait and retry with
increasing delays
3. **Cache responses** - Reduce API calls by caching unchanged data
4. **Use webhooks** - Subscribe to events instead of polling
5. **Batch requests** - Use bulk endpoints to reduce call count
### Retry Logic Example
```python
import time
import requests
def api_request_with_retry(url, max_retries=3):
for attempt in range(max_retries):
response = requests.get(url)
if response.status_code == 429:
retry_after = int(response.headers.get('Retry-After', 60))
time.sleep(retry_after * (2 ** attempt)) # Exponential backoff
continue
return response
raise Exception("Max retries exceeded")
```
What Every Rate Limit Doc Must Include
| Section | Must Include | Why It Matters |
|---|---|---|
| Limits Table | Exact numbers per plan | Developers can plan capacity |
| Header Docs | All header names + meanings | Enables proactive monitoring |
| 429 Response | Complete error body + code example | Developers can parse and handle |
| Retry Logic | Working code in 2+ languages | Developers copy, don't reinvent |
| Best Practices | 5+ optimization tips | Reduces unnecessary calls |
Complete Response Header Examples
Successful Request Headers
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1704067200
X-RateLimit-Policy: 1000;w=60
{
"data": { ... }
}
Rate Limited Response Headers
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1704067200
Retry-After: 42
{
"error": {
"code": "rate_limit_exceeded",
"message": "You have exceeded the rate limit of 1000 requests per minute.",
"retry_after": 42,
"limit": 1000,
"window": "60s"
}
}
Retry Logic: Code Examples in 5 Languages
Python
import time
import requests
from typing import Optional
def make_request(
url: str,
max_retries: int = 5,
base_delay: float = 1.0
) -> requests.Response:
"""
Make API request with exponential backoff on rate limits.
Args:
url: API endpoint URL
max_retries: Maximum retry attempts (default: 5)
base_delay: Initial delay in seconds (default: 1.0)
Returns:
Response object on success
Raises:
Exception: After max retries exceeded
"""
for attempt in range(max_retries):
response = requests.get(url, headers={"Authorization": "Bearer YOUR_KEY"})
# Success - return response
if response.status_code != 429:
return response
# Rate limited - calculate backoff
retry_after = response.headers.get('Retry-After')
if retry_after:
delay = int(retry_after)
else:
delay = base_delay * (2 ** attempt) # Exponential backoff
print(f"Rate limited. Waiting {delay}s before retry {attempt + 1}/{max_retries}")
time.sleep(delay)
raise Exception(f"Rate limit exceeded after {max_retries} retries")
# Usage
response = make_request("https://api.example.com/v1/users")
print(f"Remaining requests: {response.headers.get('X-RateLimit-Remaining')}")
JavaScript (Node.js)
async function makeRequest(url, maxRetries = 5, baseDelay = 1000) {
for (let attempt = 0; attempt < maxRetries; attempt++) {
const response = await fetch(url, {
headers: { 'Authorization': 'Bearer YOUR_KEY' }
});
// Success - return response
if (response.status !== 429) {
return response;
}
// Rate limited - calculate backoff
const retryAfter = response.headers.get('Retry-After');
const delay = retryAfter
? parseInt(retryAfter) * 1000
: baseDelay * Math.pow(2, attempt);
console.log(`Rate limited. Waiting ${delay}ms (attempt ${attempt + 1}/${maxRetries})`);
await new Promise(resolve => setTimeout(resolve, delay));
}
throw new Error(`Rate limit exceeded after ${maxRetries} retries`);
}
// Usage
try {
const response = await makeRequest('https://api.example.com/v1/users');
console.log(`Remaining: ${response.headers.get('X-RateLimit-Remaining')}`);
} catch (error) {
console.error(error.message);
}
Go
package main
import (
"fmt"
"net/http"
"strconv"
"time"
)
func makeRequest(url string, maxRetries int) (*http.Response, error) {
client := &http.Client{}
baseDelay := time.Second
for attempt := 0; attempt < maxRetries; attempt++ {
req, _ := http.NewRequest("GET", url, nil)
req.Header.Set("Authorization", "Bearer YOUR_KEY")
resp, err := client.Do(req)
if err != nil {
return nil, err
}
// Success
if resp.StatusCode != 429 {
return resp, nil
}
// Rate limited - calculate backoff
retryAfter := resp.Header.Get("Retry-After")
var delay time.Duration
if retryAfter != "" {
seconds, _ := strconv.Atoi(retryAfter)
delay = time.Duration(seconds) * time.Second
} else {
delay = baseDelay * time.Duration(1<
Ruby
require 'net/http'
require 'json'
def make_request(url, max_retries: 5, base_delay: 1)
uri = URI(url)
max_retries.times do |attempt|
response = Net::HTTP.get_response(uri)
# Success
return response unless response.code == '429'
# Rate limited - calculate backoff
retry_after = response['Retry-After']
delay = retry_after ? retry_after.to_i : base_delay * (2 ** attempt)
puts "Rate limited. Waiting #{delay}s (attempt #{attempt + 1}/#{max_retries})"
sleep(delay)
end
raise "Rate limit exceeded after #{max_retries} retries"
end
# Usage
response = make_request('https://api.example.com/v1/users')
puts "Remaining: #{response['X-RateLimit-Remaining']}"
cURL (Shell)
#!/bin/bash
make_request() {
local url="$1"
local max_retries=5
local base_delay=1
for ((attempt=0; attempt&2
sleep $delay
done
echo "Rate limit exceeded after $max_retries retries" >&2
return 1
}
# Usage
make_request "https://api.example.com/v1/users"
Endpoint-Specific Limits Table
### Endpoint Rate Limits
Some endpoints have specific limits beyond account-level limits:
| Endpoint | Method | Limit | Scope |
|----------|--------|-------|-------|
| `/v1/users` | GET | 1000/min | Account |
| `/v1/users` | POST | 100/min | Account |
| `/v1/users/:id` | DELETE | 10/min | Account |
| `/v1/search` | GET | 30/min | Account |
| `/v1/export` | POST | 5/hour | Account |
| `/v1/bulk` | POST | 10/min | Account |
| `/v1/health` | GET | Unlimited | N/A |
**Notes:**
- Write operations (POST, PUT, DELETE) have stricter limits than reads
- Search and export are resource-intensive; plan accordingly
- Health endpoint is exempt from rate limiting for monitoring
Tiered Limits Documentation
### Rate Limit Tiers
#### Free Tier
- **60 requests/minute** (1 request/second sustained)
- **1,000 requests/day** (hard cap)
- **10 requests/second** burst limit
- Best for: Development, testing, hobby projects
#### Starter ($29/month)
- **300 requests/minute** (5 requests/second sustained)
- **10,000 requests/day**
- **30 requests/second** burst limit
- Best for: Small apps, early-stage startups
#### Pro ($99/month)
- **1,000 requests/minute** (16 requests/second sustained)
- **100,000 requests/day**
- **100 requests/second** burst limit
- Best for: Production apps, growing businesses
#### Enterprise (Custom pricing)
- **Custom limits** based on needs
- **Dedicated capacity** (no shared limits)
- **Priority support** for rate limit issues
- Contact sales@example.com for custom limits
### Upgrading Your Limits
1. Log in to your dashboard at app.example.com
2. Navigate to Settings → Billing
3. Select your new plan
4. New limits apply immediately
Need higher limits? [Contact us](mailto:sales@example.com) for enterprise pricing.
Troubleshooting Section
### Troubleshooting Rate Limits
#### "I'm getting 429 errors but my usage is low"
**Possible causes:**
1. **Burst limit exceeded** - You're under minute/day limit but exceeding
per-second burst. Space requests 100ms+ apart.
2. **Shared IP** - If using shared infrastructure, other users on same IP
may consume quota. Use authenticated requests with API key.
3. **Clock skew** - Your system clock may be off, causing reset time
calculations to fail. Sync with NTP.
#### "Rate limit resets but I'm still blocked"
**Solution:** Check `X-RateLimit-Reset` header for exact reset time. The
reset is Unix timestamp, not seconds remaining. Example:
```python
import time
reset_time = int(response.headers['X-RateLimit-Reset'])
wait_seconds = reset_time - int(time.time())
print(f"Wait {wait_seconds} seconds until reset")
```
#### "Different endpoints have different limits"
**Expected behavior.** Resource-intensive endpoints (search, export, bulk)
have lower limits. Check endpoint-specific limits table above. Optimize
by using batch endpoints where available.
#### "My retry logic isn't working"
**Common mistakes:**
1. Not respecting `Retry-After` header
2. Using fixed delay instead of exponential backoff
3. Not adding jitter (random delay variation)
4. Retrying immediately without any delay
**Correct pattern:** Use `Retry-After` if present, else exponential backoff
with jitter. See code examples above.
Frequently Asked Questions
What happens if I exceed the rate limit?
Your request returns HTTP 429 "Too Many Requests" with a JSON error body and `Retry-After` header indicating when you can retry. Subsequent requests continue returning 429 until the rate limit window resets. Your account is not suspended—just wait and retry.
Do rate limits apply to all endpoints equally?
No—different endpoints have different limits. Read operations (GET) typically allow higher rates than write operations (POST/PUT/DELETE). Resource-intensive endpoints like search, export, and bulk operations have stricter limits. Check the endpoint-specific limits table.
How do I monitor my current rate limit usage?
Check response headers on every request. `X-RateLimit-Remaining` shows how many requests you have left in the current window. `X-RateLimit-Reset` shows when the window resets (Unix timestamp). Build monitoring dashboards using these headers.
Can I increase my rate limits?
Yes—upgrade your plan or contact sales for enterprise limits. Starter plan gets 5x free tier limits, Pro gets 16x. Enterprise customers receive custom limits based on needs. Limits increase immediately upon upgrade.
Are webhooks subject to rate limits?
No—webhooks are pushed to you and don't count against your rate limit. Use webhooks instead of polling whenever possible. Subscribe to events and receive real-time updates without consuming API quota.
What's the difference between rate limit and quota?
Rate limit is requests per time window (per minute); quota is total requests per period (per day/month). You can hit rate limits while staying under quota. Example: 1000/minute rate limit with 100,000/day quota. Hitting 1001 requests in one minute triggers 429, even with 99,000 quota remaining.
Use this template to document your API's rate limits clearly. Well-documented limits reduce support tickets and improve developer experience. For faster documentation, try River's API documentation tools.