Rate Limits
Rate limits are applied per API key. When a limit is exceeded, the API returns HTTP 429. AI-heavy endpoints have separate, stricter limits because each call consumes significant compute and token budget.
Token budget
In addition to rate limits, AI-heavy endpoints consume tokens from your workspace's token budget. If your budget is exhausted, the API returns HTTP 402 with "error": "Insufficient token balance".
Endpoints that consume tokens: conversations (message), sim engine (advance, create), risk evaluator (generate), market analysis.
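A 402 signals an exhausted budget rather than a temporary throttle, so a client will usually want to treat it differently from a 429. The following is a minimal sketch, assuming a hypothetical helper named classifyStatus (it is not part of the API) that maps a status code to an action:

```javascript
// Hypothetical helper, not part of the API: maps an HTTP status to a client action.
function classifyStatus(status) {
  // 429: rate limit hit; waiting and retrying is expected to succeed
  if (status === 429) return 'retry'
  // 402: token budget exhausted; retrying is unlikely to help until the budget is replenished
  if (status === 402) return 'stop'
  // Any 2xx is treated as success; everything else is left to the caller
  return status >= 200 && status < 300 ? 'ok' : 'error'
}
```

Keeping the decision in one place means retry logic only loops on 'retry' and surfaces 'stop' to the user right away.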
Handling 429 responses
When you receive a 429, the response may include a Retry-After header indicating how many seconds to wait before retrying. Retry with exponential backoff, using that value as the base delay when it is present:
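// Calls fn() and retries on HTTP 429, doubling the wait after each attempt.
async function callWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const res = await fn()
    if (res.status !== 429) return res
    // Retry-After is in seconds; fall back to 10 s if it is missing or not a number
    const seconds = parseInt(res.headers.get('Retry-After'), 10)
    const wait = (Number.isNaN(seconds) ? 10 : seconds) * 1000
    // Exponential backoff: 1x, 2x, 4x the base wait
    await new Promise(r => setTimeout(r, wait * 2 ** i))
  }
  throw new Error('Rate limit exceeded after retries')
}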
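The text above describes Retry-After as a number of seconds. The HTTP specification also allows the header to carry a date, so a defensive client may want to accept both forms. The following is a sketch under that assumption; retryAfterMs and its parameters are hypothetical names, and whether this API ever sends the date form is not stated here:

```javascript
// Hypothetical helper: converts a Retry-After header value to a delay in milliseconds.
// `now` is injectable so the date branch can be tested deterministically.
function retryAfterMs(headerValue, fallbackMs = 10000, now = Date.now()) {
  // Header missing or empty: use the fallback delay
  if (headerValue == null || headerValue.trim() === '') return fallbackMs
  const value = headerValue.trim()
  // Delay-seconds form, e.g. "30"
  if (/^\d+$/.test(value)) return Number(value) * 1000
  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const date = Date.parse(value)
  if (Number.isNaN(date)) return fallbackMs
  // A date in the past means no further wait is needed
  return Math.max(0, date - now)
}
```

Inside a retry loop, the result of retryAfterMs(res.headers.get('Retry-After')) could serve as the base delay before backoff scaling is applied.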