

Rate limiting protects your project (and the platform) from runaway tool-call traffic — accidental or otherwise. Two layers apply today: a per-project ceiling on Data Tool execution, and a per-IP ceiling on anonymous traffic to public MCP projects.

Default limits

  • Data Tool execution (per project, per minute): 60 calls. Counts every Data Tool handler invocation across both production and draft endpoints, regardless of caller identity, summed across all callers on the project.
  • Public anonymous MCP traffic (per IP, per hour): 60 calls by default, with a burst of 10. Applies to public-visibility MCP projects only; authenticated requests (API key or JWT) are not subject to this limit.
  • Public anonymous MCP traffic (configurable max): up to 600/hour, with a burst of 60, as a per-project override. Set settings.mcp.publicRateLimit.requestsPerHour and .burst. The platform refuses values above the published max to keep “public” from meaning “uncapped.”
Limits apply at the host edge. Once you’re rate-limited, additional calls return 429 Too Many Requests until the window rolls over. The audit log captures rejections for diagnosis.

What gets rate-limited

The per-project Data Tool minute cap counts every Data Tool execution, regardless of where it came from:
  • AI-initiated calls from MCP hosts (Claude Desktop, ChatGPT, and any custom host).
  • Calls from the Assistant SDK in your iOS, Android, or React app.
  • Test-panel runs in MCP App Studio (yes, even your own testing).
  • Programmatic calls via the REST API.
What doesn’t count toward the Data Tool minute cap:
  • tools/list calls — host capability discovery is uncapped under normal usage.
  • Interactive Tool renders — the tool-execution rate limiter is scoped to the Data Tool sandbox.
  • MCP App Studio editor operations.
  • Audit log queries.
The per-IP public limit applies only to anonymous traffic against public-visibility MCP projects. As soon as a request authenticates with an API key or JWT, the per-IP limit is not consulted.
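To make the anonymous-versus-authenticated split concrete, here is a minimal sketch of how a client might attach credentials. The Authorization: Bearer header is an assumption for illustration, not a documented header name; check your project's auth settings for the actual scheme.

```typescript
// Build request headers for an MCP call. Passing a credential (API key or
// JWT) means the per-IP public limit is not consulted; the per-project
// Data Tool cap (60/min) still applies to any tool execution triggered.
function buildHeaders(credential?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (credential) {
    // Assumed header shape, for illustration only.
    headers["Authorization"] = `Bearer ${credential}`;
  }
  return headers;
}
```

Omitting the credential leaves the request anonymous and therefore subject to the per-IP hourly limit on public projects.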

Per-project public rate-limit configuration

For a public-visibility project, override the default in Settings → MCP → Public rate limit or via the API:
{
  "settings": {
    "mcp": {
      "visibility": "public",
      "publicRateLimit": {
        "requestsPerHour": 300,
        "burst": 30
      }
    }
  }
}
The platform clamps requestsPerHour to [1, 600] and burst to [1, 60]. Higher values require a support conversation — the published max is enforced in code, not policy. For private projects, the per-IP public limit is irrelevant — every request is authenticated.
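The documented bounds can be pictured as a simple clamp. This is a sketch of the published [1, 600] and [1, 60] ranges, not the platform's actual implementation:

```typescript
// Clamp a value into an inclusive range.
const clamp = (value: number, lo: number, hi: number): number =>
  Math.min(Math.max(value, lo), hi);

interface PublicRateLimit {
  requestsPerHour: number;
  burst: number;
}

// Normalize a requested override to the documented bounds:
// requestsPerHour in [1, 600], burst in [1, 60].
function clampPublicRateLimit(requested: PublicRateLimit): PublicRateLimit {
  return {
    requestsPerHour: clamp(requested.requestsPerHour, 1, 600),
    burst: clamp(requested.burst, 1, 60),
  };
}
```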

Authenticated traffic

There is no documented per-token or per-user rate limit on authenticated calls today. Authenticated traffic is bounded only by:
  • The per-project Data Tool minute cap (60/min) when a tool execution is involved.
  • Underlying infrastructure protections (timeouts, sandbox concurrency, AWS-side throttling).
If you’re planning a high-throughput deployment that you expect to exceed the per-project minute cap, contact support early — capacity-planning conversations with concrete QPS numbers move faster than runtime escalations.

Rate-limit responses

When a call is rejected, the response is HTTP 429 with a structured body:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded",
    "retry_after_seconds": 12
  }
}
retry_after_seconds is the wait time until the bucket has capacity again. Hosts and clients should respect it — retrying immediately just spins on the rejection.
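A client that honors the hint can read the wait out of the error body before scheduling a retry. A minimal sketch, following the response shape above (the 5-second fallback for a missing field is an assumption, not documented behavior):

```typescript
interface RateLimitError {
  error: { code: string; message: string; retry_after_seconds: number };
}

// Extract the retry delay (in ms) from a 429 body. Falls back to a
// conservative default when the field is missing or malformed.
function retryDelayMs(body: unknown, fallbackSeconds = 5): number {
  const seconds = (body as RateLimitError)?.error?.retry_after_seconds;
  return (typeof seconds === "number" && seconds > 0 ? seconds : fallbackSeconds) * 1000;
}
```

Sleeping for retryDelayMs(body) before the next attempt waits exactly as long as the bucket needs, instead of spinning on repeated 429s.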

What the AI does on rate limit

Most AIs handle 429 responses gracefully:
  • Anthropic’s Claude pauses and retries after the indicated delay.
  • OpenAI’s GPT models log the error and may surface it to the user if retries fail.
  • Custom hosts: implement retry-with-backoff yourself.
The AI sees a structured error, so it can also explain the situation to the user: “I’m being rate-limited; let me wait a moment and try again.” Whether the user sees this depends on the host’s UX.

Long-running operations

A Data Tool that runs near the 60-second sandbox limit consumes one call’s budget while running. It does not keep counting against the minute cap second by second — but it does take an execution slot until it returns. For long-running flows, design with task support so the client polls without burning the rate budget on retries. See Sandboxed execution: task support.
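The polling pattern can be sketched generically. The check callback below stands in for whatever status call the task-support API exposes; the names and statuses here are illustrative, not the platform's actual API, and the premise (that status polls do not count as Data Tool executions) follows from the paragraph above:

```typescript
type TaskStatus = "pending" | "running" | "done" | "failed";

// Poll an in-flight task until it settles or the poll budget runs out.
// `check` is a stand-in for the real status call; polling status does not
// re-execute the Data Tool, so it doesn't burn the 60/min cap the way
// blind re-invocation would.
async function awaitTask(
  check: () => Promise<TaskStatus>,
  intervalMs = 2000,
  maxPolls = 30,
): Promise<TaskStatus> {
  for (let i = 0; i < maxPolls; i++) {
    const status = await check();
    if (status === "done" || status === "failed") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return "failed"; // give up rather than poll forever
}
```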

Client-side throttling

Even with server-side limits, throttle on the client for hosts you control:
  • Assistant SDK. Already queues over-limit calls rather than failing.
  • Custom MCP clients. Implement a token bucket sized to match the server limit — for the Data Tool cap, that’s 60/minute project-wide.
  • Batch when possible. A Data Tool that takes a list of IDs costs one call; one call per ID costs N.
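A token bucket matched to the 60/min project cap could be sketched like this. It is deliberately simple and single-process; a deployment with multiple workers sharing one project's budget would need a shared bucket (for example, in Redis):

```typescript
// Single-process token bucket sized to the per-project Data Tool cap:
// 60 tokens per minute, refilled continuously.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity = 60,             // matches the 60/min project-wide cap
    private refillPerMs = 60 / 60_000, // 60 tokens per minute
    now = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if a call may proceed now; false means throttle
  // client-side instead of burning a request on a guaranteed 429.
  tryAcquire(now = Date.now()): boolean {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Gate every Data Tool call on tryAcquire(); when it returns false, queue the call locally rather than sending it to the server.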

Monitoring

The audit log captures every 429. Key signals:
  • A spike of 429s against one project. Either traffic outgrew the per-project cap, or a misbehaving caller is hammering one tool.
  • Public-anonymous 429s. A public project is hot; either the default is too tight for the use case, or someone’s running a script. Inspect the IP distribution in the audit log before raising.
  • Sustained 429s. Plan capacity with support; the per-project Data Tool cap is the limit you’ll need to discuss raising.
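When triaging public-anonymous 429s, grouping rejections by source IP separates "hot project" from "one noisy script". A sketch, assuming a hypothetical export shape where each audit entry carries status and ip fields (the actual audit log schema may differ):

```typescript
// Hypothetical audit entry shape, for illustration only.
interface AuditEntry {
  status: number; // HTTP status the platform returned
  ip: string;     // caller IP
}

// Count 429 rejections per IP. One IP dominating the histogram points to a
// misbehaving caller; an even spread suggests the project outgrew its cap.
function rejectionsByIp(entries: AuditEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    if (entry.status === 429) {
      counts.set(entry.ip, (counts.get(entry.ip) ?? 0) + 1);
    }
  }
  return counts;
}
```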

Audit logs

Where rate-limit rejections are recorded.

Project visibility

Public projects, the kill-switch, and the per-IP limit.

Sandboxed execution

Concurrency and time limits inside the sandbox.

Schema validation

Server-side gates on tool calls.