ADR-0006: Gateway & Tool-Level Rate Limiting

  • Status: Accepted
  • Date: 2025-02-21
  • Deciders: Core Engineering Team

Context

The MCP Gateway may serve hundreds of concurrent clients accessing multiple tools. Without protection, a single client or misbehaving tool could monopolize resources or overwhelm upstream services.

The configuration includes:

  • TOOL_RATE_LIMIT: default limit in requests/min per tool/client
  • Planned support for Redis-based or database-backed counters

The current implementation is an in-memory token bucket, sketched below.
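As an illustration, a minimal per-key token bucket might look like the following sketch. The class and helper names are hypothetical, and the default limit is assumed to be read from the TOOL_RATE_LIMIT environment variable described above; this is not the gateway's actual code.

```python
# Minimal sketch of an in-memory token bucket (illustrative, not the gateway's actual code).
# One bucket is kept per (tool, client) key; the default limit is assumed to come from TOOL_RATE_LIMIT.
import os
import time


class TokenBucket:
    def __init__(self, rate_per_min: float, capacity: float) -> None:
        self.rate_per_min = rate_per_min      # refill rate in requests per minute
        self.capacity = capacity              # maximum burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self) -> bool:
        """Refill tokens for the elapsed time, then try to spend one."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate_per_min / 60.0)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


DEFAULT_LIMIT = float(os.getenv("TOOL_RATE_LIMIT", "60"))   # requests per minute
_buckets: dict[str, TokenBucket] = {}


def allow_request(key: str) -> bool:
    """Return True if the request identified by `key` is within its limit."""
    bucket = _buckets.setdefault(key, TokenBucket(DEFAULT_LIMIT, DEFAULT_LIMIT))
    return bucket.allow()
```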

Decision

Implement a rate limiter at the tool invocation level, keyed by the following (a key-composition sketch appears after the list):

  • Tool name
  • Authenticated user / client identity (JWT or Basic)
  • Time window (per-minute by default)
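To make the keying concrete, a counter key could be composed roughly as shown below. The function and field names are assumptions for illustration, not the gateway's actual identifiers.

```python
# Illustrative composition of a rate-limit key from tool name, client identity,
# and the current one-minute window. All names here are hypothetical.
import time


def rate_limit_key(tool_name: str, client_id: str, window_seconds: int = 60) -> str:
    window = int(time.time()) // window_seconds      # bucket index for the current window
    return f"ratelimit:{tool_name}:{client_id}:{window}"


# Example: rate_limit_key("get_weather", "alice@example.com")
# might return "ratelimit:get_weather:alice@example.com:29031337"
```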

Backend options:

  • Memory (default for dev / single instance)
  • Redis (planned for clustering / shared limits)
  • Database (eventually consistent fallback)
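For the planned Redis backend, a shared fixed-window counter could be built on INCR and EXPIRE, as in this hedged sketch. The calls are standard redis-py; the key layout, connection settings, and limit value are assumptions, since this backend is not yet implemented.

```python
# Hedged sketch of a Redis-backed fixed-window counter shared across gateway instances.
import time

import redis

r = redis.Redis(host="localhost", port=6379, db=0)   # connection settings are illustrative


def allow_request(tool_name: str, client_id: str, limit: int = 60) -> bool:
    window = int(time.time()) // 60                   # one-minute window index
    key = f"ratelimit:{tool_name}:{client_id}:{window}"
    count = r.incr(key)                               # atomic increment visible to all instances
    if count == 1:
        r.expire(key, 60)                             # first hit in the window sets the TTL
    return count <= limit
```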

Consequences

  • βœ… Prevents abuse, controls cost, and provides predictable fairness
  • πŸ“‰ Requests over the limit return 429 Too Many Requests with retry headers (see the sketch after this list)
  • ❌ Memory backend does not scale across instances; Redis required for HA
  • πŸ”„ Optional override of limits via config/env for testing
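To illustrate the 429 behaviour, the sketch below shows a FastAPI-style handler that rejects over-limit calls with a Retry-After header. The route path, limit value, and in-process counter are assumptions made for the example, not the gateway's actual implementation.

```python
# Hedged sketch: rejecting over-limit tool invocations with 429 + Retry-After.
# Uses a simple in-process fixed-window counter purely for demonstration.
import time
from collections import defaultdict

from fastapi import FastAPI, HTTPException

app = FastAPI()
LIMIT = 60                                              # stand-in for TOOL_RATE_LIMIT (requests/min)
_counters: dict[tuple[str, int], int] = defaultdict(int)


@app.post("/tools/{tool_name}/invoke")                  # route path is illustrative
async def invoke_tool(tool_name: str, client_id: str = "anonymous") -> dict:
    window = int(time.time()) // 60                     # current one-minute window
    key = (f"{tool_name}:{client_id}", window)
    _counters[key] += 1
    if _counters[key] > LIMIT:
        raise HTTPException(
            status_code=429,
            detail="Too Many Requests",
            headers={"Retry-After": str(60 - int(time.time()) % 60)},  # seconds until the window resets
        )
    return {"status": "accepted"}
```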

Alternatives Considered

| Option | Why not |
|--------|---------|
| No rate limiting | Leaves the gateway and tools vulnerable to overload or accidental DoS. |
| Global rate limit only | Heavy tools can starve lightweight tools; no fine-grained control. |
| Proxy-level throttling (e.g. NGINX, Envoy) | Cannot distinguish tools or users inside the payload; lacks granularity. |

Status

Rate limiting is implemented for tool routes, with TOOL_RATE_LIMIT as the default policy.