Rate Limits
The MCP server enforces rate limits to ensure fair usage across tenants.
Default limits
| Parameter | Value |
|---|---|
| Sustained rate | 10 requests/second |
| Burst capacity | 20 requests |
These limits apply per tenant. The tenant is identified by the `tenant_id` in your authentication token.
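A sustained rate combined with a burst capacity is commonly modeled as a token bucket. The server's actual algorithm is not stated here, so the following is only a client-side sketch under that assumption: it throttles outgoing calls so a single tenant stays within 10 requests/second with bursts of up to 20. The class name `TokenBucket` and its methods are illustrative, not part of the MCP server's API.

```python
import time


class TokenBucket:
    """Client-side throttle: `rate` tokens per second, at most `capacity` stored.

    Assumption: the server behaves roughly like a token bucket. The defaults
    mirror the documented limits (10 requests/second, burst of 20).
    """

    def __init__(self, rate=10.0, capacity=20.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True on success, False otherwise."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate
```

Before each tool call, check `try_acquire()`; if it returns `False`, sleep for `wait_time()` seconds and try again. Because limits are enforced per tenant, all clients sharing one token would need to share one bucket for this to be effective.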
Rate limit response
When you exceed the limit, the server returns HTTP 429 Too Many Requests:
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after a brief delay.",
  "retry_after_seconds": 1
}
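One way a client might handle this response is sketched below. It assumes a hypothetical `send` callable that performs the request and returns a `(status_code, body)` pair, where `body` is the parsed JSON. The helper name `call_with_retry` and the fallback delay of 1 second are illustrative choices, not part of the server's API.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, body_dict).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Honor the server's hint; fall back to 1 second if the field is missing.
        delay = body.get("retry_after_seconds", 1)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Capping the number of attempts keeps a persistently throttled client from retrying forever. The `sleep` parameter is injectable so the helper can be tested without real delays.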
Recommendations
- Batch when possible: Use `get_shipment_details` with multiple IDs in a single call instead of one call per shipment.
- Use `results_per_page`: Fetch larger pages (up to 100) to reduce the number of `list_shipments_summary` calls.
- Cache responses: Shipment status doesn't change every second. Cache tool results for 30-60 seconds in your agent's context.
- Implement backoff: On 429, wait the `retry_after_seconds` value before retrying.
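The caching recommendation can be sketched as a small time-to-live cache. This is a minimal illustration, assuming results are keyed by tool name and arguments. `TTLCache` and `get_or_fetch` are hypothetical names, and `fetch` stands in for whatever function actually invokes the tool.

```python
import time


class TTLCache:
    """Keep tool results for a fixed number of seconds (30 by default)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` only if it is missing or expired."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self.ttl, value)
        return value
```

A key such as `("get_shipment_details", ("S1", "S2"))` lets repeated lookups for the same shipments within the TTL window reuse one result instead of spending another request.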