# ADR-0006: Gateway & Tool-Level Rate Limiting
- Status: Accepted
- Date: 2025-02-21
- Deciders: Core Engineering Team
## Context
The MCP Gateway may serve hundreds of concurrent clients accessing multiple tools. Without protection, a single client or misbehaving tool could monopolize resources or overwhelm upstream services.
The configuration includes:

- `TOOL_RATE_LIMIT`: default limit in requests/min per tool/client
- Planned support for Redis-based or database-backed counters
The current implementation is an in-memory token bucket.
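A minimal sketch of such an in-memory token bucket, assuming a per-key bucket that refills at the configured requests-per-minute rate (class and helper names are illustrative, not the gateway's actual API):

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """In-memory token bucket: `rate` tokens refill evenly over `per` seconds."""

    rate: float                 # bucket size, e.g. the TOOL_RATE_LIMIT value
    per: float = 60.0           # refill window in seconds (per-minute by default)
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.rate
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket size.
        self.tokens = min(self.rate, self.tokens + (now - self.updated) * self.rate / self.per)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per composite key (tool + client); see the Decision section for the key shape.
_buckets: dict[str, TokenBucket] = {}


def allow_request(key: str, rate: float) -> bool:
    """Return True if the request identified by `key` is still within its limit."""
    bucket = _buckets.setdefault(key, TokenBucket(rate=rate))
    return bucket.allow()
```

A token bucket smooths bursts rather than enforcing a hard per-window cutoff, which makes it a reasonable default for a single-instance deployment.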
## Decision
Implement a rate limiter at the tool invocation level, keyed by:
- Tool name
- Authenticated user / client identity (JWT or Basic)
- Time window (per-minute by default)
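As a sketch, the composite key might be assembled like this (names are illustrative; the explicit window component is mainly relevant for fixed-window counters such as the planned Redis backend):

```python
import time


def rate_limit_key(tool_name: str, client_id: str, window_seconds: int = 60) -> str:
    """Compose the limiter key from tool name, authenticated identity, and time window."""
    window = int(time.time() // window_seconds)  # integer index of the current minute
    return f"ratelimit:{tool_name}:{client_id}:{window}"
```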
Backend options:
- Memory (default for dev / single instance)
- Redis (planned for clustering / shared limits)
- Database (eventually consistent fallback)
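For the planned Redis backend, a shared fixed-window counter could look roughly like the following sketch (using the `redis-py` client; key shape and expiry handling are assumptions, not the gateway's implementation):

```python
import redis

r = redis.Redis()  # shared connection; all gateway instances see the same counters


def allow_request_redis(key: str, limit: int, window_seconds: int = 60) -> bool:
    """Fixed-window counter: INCR the per-window key and let it expire with the window."""
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, window_seconds)
    count, _ = pipe.execute()
    return count <= limit
```

Because every instance increments the same key, limits stay consistent under horizontal scaling, which the memory backend cannot guarantee.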
## Consequences
- Prevents abuse, controls cost, and provides predictable fairness
- Failed requests return `429 Too Many Requests` with retry headers (see the sketch after this list)
- Memory backend does not scale across instances; Redis required for HA
- Optional override of limits via config/env for testing
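A sketch of the rejection path, assuming a FastAPI-style route handler (the gateway's actual response shape may differ); it reuses the illustrative helpers from the sketches above and answers with `429 Too Many Requests` plus a `Retry-After` hint:

```python
import os

from fastapi import FastAPI, HTTPException, Request

app = FastAPI()
TOOL_RATE_LIMIT = int(os.getenv("TOOL_RATE_LIMIT", "60"))  # requests/min; default is illustrative


@app.post("/tools/{tool_name}/invoke")
async def invoke_tool(tool_name: str, request: Request):
    client_id = request.headers.get("Authorization", "anonymous")  # simplified identity lookup
    key = rate_limit_key(tool_name, client_id)         # key helper sketched under Decision
    if not allow_request(key, rate=TOOL_RATE_LIMIT):   # token-bucket helper sketched under Context
        # Reject over-limit calls with a retry hint so clients back off for the window.
        raise HTTPException(
            status_code=429,
            detail="Too Many Requests",
            headers={"Retry-After": "60"},
        )
    ...  # dispatch the actual tool invocation here
```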
## Alternatives Considered
| Option | Why Not |
|---|---|
| No rate limiting | Leaves gateway and tools vulnerable to overload or accidental DoS. |
| Global rate limit only | Heavy tools can starve lightweight tools; no fine-grained control. |
| Proxy-level throttling (e.g. NGINX, Envoy) | Can't distinguish tools or users inside the payload; lacks granularity. |
## Status
Rate limiting is implemented for tool routes, with `TOOL_RATE_LIMIT` as the default policy.
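For example, the default policy can be overridden through the environment for load tests or local development (the value shown is illustrative):

```bash
# Raise the per-tool, per-client limit for a load-testing run
export TOOL_RATE_LIMIT=1000
```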