Rate Limits

The MCP server enforces rate limits to ensure fair usage across tenants.

Default limits

Parameter	Value
Sustained rate	10 requests/second
Burst capacity	20 requests

These limits apply per tenant. The server identifies the tenant by the tenant_id in your authentication token, so all clients that share a tenant_id share the same allowance.
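A sustained rate combined with a burst capacity is commonly modeled as a token bucket. The sketch below is a client-side limiter built on that assumption; the server's actual algorithm is not documented here, and the class name `TokenBucket` and its methods are hypothetical. It can help a client stay under the default limits rather than discover them through 429 responses.

```python
import time


class TokenBucket:
    """Client-side limiter: refills `rate` tokens per second, up to `capacity`.

    Assumption: the server behaves roughly like a token bucket with
    10 tokens/second and a capacity of 20. This is not confirmed by the docs.
    """

    def __init__(self, rate=10.0, capacity=20.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start with a full burst allowance
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True when a request may be sent."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate
```

A caller would check `try_acquire()` before each request and sleep for `wait_time()` when it returns False. Because the limit is shared per tenant, a limiter in one process only helps if that process is the tenant's only client.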

Rate limit response

When you exceed the limit, the server returns HTTP 429 Too Many Requests:

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after a brief delay.",
  "retry_after_seconds": 1
}
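As a minimal sketch, a client could extract the wait time from this response as follows. The function name `retry_delay` and the one-second fallback are assumptions for illustration; only the status code and the `retry_after_seconds` field come from the response shown above.

```python
import json


def retry_delay(status_code, body_text, default_delay=1.0):
    """Return seconds to wait before retrying, or None if not rate limited.

    `default_delay` is a hypothetical fallback used when the body cannot be
    parsed or the retry_after_seconds field is missing or malformed.
    """
    if status_code != 429:
        return None
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return default_delay
    if not isinstance(payload, dict):
        return default_delay
    value = payload.get("retry_after_seconds", default_delay)
    try:
        # Clamp to zero so a negative value never produces a negative sleep.
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default_delay
```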

Recommendations

Batch when possible: Use get_shipment_details with multiple IDs in a single call instead of one call per shipment.
Use results_per_page: Fetch larger pages (up to 100) to reduce the number of list_shipments_summary calls.
Cache responses: Shipment status doesn't change every second. Cache tool results for 30-60 seconds in your agent's context.
Implement backoff: On a 429 response, wait for the number of seconds given in retry_after_seconds before retrying.
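The caching and backoff recommendations can be sketched together as below. This is one possible shape, not a prescribed implementation: `RateLimited`, `call_with_backoff`, and `TTLCache` are hypothetical names, and the example assumes your client raises an exception carrying retry_after_seconds when it receives a 429.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception your client raises on HTTP 429."""

    def __init__(self, retry_after_seconds):
        super().__init__("rate limited")
        self.retry_after_seconds = retry_after_seconds


def call_with_backoff(fn, max_attempts=5, sleep=time.sleep):
    """Call fn(); on RateLimited, wait retry_after_seconds and try again.

    Re-raises the last RateLimited error once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after_seconds)


class TTLCache:
    """Keeps tool results for a fixed time (30 seconds by default)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}               # key -> (stored_at, value)

    def get_or_call(self, key, fn):
        """Return the cached value for key if still fresh, else call fn()."""
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fn()
        self.entries[key] = (now, value)
        return value
```

Wrapping a tool call as `cache.get_or_call(key, lambda: call_with_backoff(fetch))` applies both: repeated lookups within the TTL make no request, and a request that is rate limited waits for the server-suggested delay before retrying.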
