Rate Limits

Rate limits are applied per API key. When a limit is exceeded, the API returns HTTP 429. AI-heavy endpoints have separate, stricter limits because each call consumes significant compute and token budget.

Endpoint	Limit	Note
All /v1/* endpoints	120 req / minute per API key	Shared bucket across all endpoints for a given key.
POST /v1/conversations/:id/message	20 req / minute per key	Debate rounds are compute-heavy, so this endpoint is rate-limited separately.
POST /v1/sim-engine/simulations/:id/advance	10 req / minute per key	Each advance triggers a multi-agent debate and world evolution.
POST /v1/risk-evaluator/sessions/:id/generate	10 req / minute per key	Tree generation is the most token-intensive operation.
POST /v1/market-analysis	5 req / minute per key	Each call runs a full market research pipeline (10–30 s).
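
To stay under these limits, a client can throttle itself before sending requests. The following is a minimal sketch of a sliding-window counter, not part of the API or any official SDK. The names LIMITS_PER_MINUTE and createLimiter are hypothetical, and the sketch tracks a single limit at a time; whether AI-heavy calls also count against the shared 120 req / minute bucket is not covered here.

```javascript
// Per-minute limits copied from the table above. The keys are labels chosen
// for this sketch, not identifiers defined by the API.
const LIMITS_PER_MINUTE = {
  allEndpoints: 120,
  conversationMessage: 20,
  simulationAdvance: 10,
  riskGenerate: 10,
  marketAnalysis: 5,
}

// Sliding-window counter: remembers when each request in the current
// window was sent and refuses new ones once the limit is reached.
function createLimiter(limit, windowMs = 60000) {
  let sent = []
  return {
    // Returns true and records the request if it fits in the window,
    // otherwise returns false without recording anything.
    tryAcquire(now = Date.now()) {
      sent = sent.filter(t => now - t < windowMs)
      if (sent.length >= limit) return false
      sent.push(now)
      return true
    },
  }
}

// One limiter per rate-limited endpoint (hypothetical usage).
const marketLimiter = createLimiter(LIMITS_PER_MINUTE.marketAnalysis)
```

Call marketLimiter.tryAcquire() before each POST /v1/market-analysis request and delay the request when it returns false. The server remains the source of truth, so 429 responses still need to be handled.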

Token budget

In addition to rate limits, AI-heavy endpoints consume tokens from your workspace's token budget. If your budget is exhausted, the API returns HTTP 402 with "error": "Insufficient token balance".

Endpoints that consume tokens: conversations (message), sim engine (advance, create), risk evaluator (generate), market analysis.
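
A 402 differs from a 429 in that waiting is unlikely to help while the budget stays empty, so it is worth telling the two apart in client code. The sketch below shows one way to do that. The function name classifyResponse and the shape of its return value are assumptions made for illustration; only the status codes and the "error" field come from the description above.

```javascript
// Classifies a response from a token-consuming endpoint.
// `status` is the HTTP status code, `body` is the parsed JSON body (may be null).
function classifyResponse(status, body) {
  // Rate limited: safe to retry after waiting.
  if (status === 429) return { kind: 'rate_limited', retry: true }
  // Token budget exhausted: assumed not worth retrying automatically.
  if (status === 402) {
    const message = (body && body.error) || 'Insufficient token balance'
    return { kind: 'budget_exhausted', retry: false, message }
  }
  const ok = status >= 200 && status < 300
  return { kind: ok ? 'ok' : 'error', retry: false }
}
```

A caller might surface the budget_exhausted case to the user instead of passing it to a retry loop.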

Handling 429 responses

When you receive a 429, the response may include a Retry-After header indicating how many seconds to wait. Implement exponential backoff for resilience:

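// Calls fn() up to maxRetries times, backing off exponentially on HTTP 429.
async function callWithRetry(fn, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fn()
    if (res.status !== 429) return res
    // No point in waiting after the final attempt.
    if (attempt === maxRetries - 1) break
    // Retry-After is in seconds; fall back to 10 s if it is missing or not a number.
    const retryAfter = parseInt(res.headers.get('Retry-After'), 10)
    const baseMs = (Number.isNaN(retryAfter) ? 10 : retryAfter) * 1000
    // Double the wait on each attempt: 1x, 2x, 4x, ...
    await new Promise(r => setTimeout(r, baseMs * 2 ** attempt))
  }
  throw new Error('Rate limit exceeded after retries')
}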
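
The HTTP specification allows Retry-After to carry either a number of seconds or an HTTP date. This API is described as sending seconds, so the following helper is only a defensive sketch for clients that want to tolerate both forms, for example behind a proxy. The name retryAfterToMs and the 10-second fallback are assumptions, and the helper could stand in for the header parsing inside callWithRetry.

```javascript
// Converts a Retry-After header value into a wait time in milliseconds.
// Accepts delay-seconds ("30") or an HTTP date ("Wed, 21 Oct 2015 07:28:00 GMT").
// Falls back to fallbackMs when the header is missing or unparseable.
function retryAfterToMs(value, now = Date.now(), fallbackMs = 10000) {
  if (value == null) return fallbackMs
  const trimmed = String(value).trim()
  if (trimmed === '') return fallbackMs
  // Digits only: interpret as a delay in seconds.
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000
  // Otherwise try to read it as a date and wait until that moment.
  const until = Date.parse(trimmed)
  if (Number.isNaN(until)) return fallbackMs
  // A date in the past means no wait is needed.
  return Math.max(0, until - now)
}
```

Passing now explicitly keeps the function easy to test; in normal use it can be omitted.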