
Rate Limits

The MCP server enforces rate limits to ensure fair usage across tenants.

Default limits

Parameter         Value
Sustained rate    10 requests/second
Burst capacity    20 requests

These limits apply per tenant. The server identifies the tenant by the tenant_id in your authentication token.
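
A sustained rate paired with a burst capacity is commonly modeled as a token bucket. The server's actual algorithm is not stated here, so the following is only a client-side sketch that assumes token-bucket behavior with the default values above. The class name TokenBucket and its methods are hypothetical and not part of the MCP server.

```python
import time


class TokenBucket:
    """Client-side limiter: refills `rate` tokens per second, holds at most `capacity`.

    Assumption: the server behaves roughly like a token bucket with the
    documented defaults (10 requests/second sustained, burst of 20).
    """

    def __init__(self, rate=10.0, capacity=20.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True when a request may be sent."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready now)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate
```

A client could call try_acquire() before each tool call and sleep for wait_time() when it returns False. Because all clients using the same tenant_id share one limit, a per-process limiter like this can reduce 429 responses but cannot rule them out.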

Rate limit response

When you exceed the limit, the server returns HTTP 429 Too Many Requests:

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after a brief delay.",
  "retry_after_seconds": 1
}
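
One way a client might react to this response is sketched below. It assumes a hypothetical send callable that performs the request and returns a (status_code, body_text) pair; how you obtain those depends on your HTTP or MCP client. The helper name call_with_retry, the attempt limit, and the 1-second fallback delay are illustrative assumptions too.

```python
import json
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status_code, body_text); adapt it to your own client.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        delay = 1.0  # fallback when the body is missing or malformed (an assumption)
        try:
            payload = json.loads(body)
            delay = float(payload.get("retry_after_seconds", delay))
        except (ValueError, TypeError, AttributeError):
            pass  # keep the fallback delay
        sleep(max(delay, 0.0))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Capping the number of attempts keeps an agent from looping indefinitely when the tenant as a whole stays over the limit.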

Recommendations

  • Batch when possible: Use get_shipment_details with multiple IDs in a single call instead of one call per shipment.
  • Use results_per_page: Fetch larger pages (up to 100) to reduce the number of list_shipments_summary calls.
  • Cache responses: Shipment status doesn't change every second. Cache tool results for 30-60 seconds in your agent's context.
  • Implement backoff: On a 429 response, wait for the number of seconds given in retry_after_seconds before retrying.
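
The batching and caching recommendations above can be sketched as two small helpers. Both names (chunked, TTLCache) are hypothetical. The maximum number of IDs that get_shipment_details accepts per call is not stated here, so the chunk size is left for the caller to choose.

```python
import time


def chunked(ids, size):
    """Split a list of IDs into lists of at most `size` items."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


class TTLCache:
    """Keep tool results for `ttl` seconds, keyed by any hashable value."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return a still-fresh cached value, or call `fetch()` and store its result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self.ttl, value)
        return value
```

An agent could key the cache on the tool name plus its arguments, for example ("get_shipment_details", tuple(batch)), so that repeated questions about the same shipments within the TTL do not trigger new calls.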