# ADR-0021: Built-in Proxy Capabilities vs Service Mesh
- Status: Accepted
- Date: 2025-10-27
- Deciders: Core Engineering Team
## Context
Modern distributed applications often use service mesh infrastructure (Envoy, Istio, Linkerd) to handle cross-cutting concerns:

- Load balancing and traffic routing
- mTLS and authentication
- Observability (metrics, tracing, logging)
- Rate limiting and circuit breaking
- Request/response transformation
- Compression and caching
ContextForge must support diverse deployment scenarios:

- **Standalone execution**: Single Python module (`python -m mcpgateway`)
- **Serverless platforms**: AWS Lambda, Google Cloud Run, IBM Cloud Code Engine
- **Container orchestration**: Kubernetes, OpenShift
- **Multi-regional deployments**: Cross-region federation
- **Edge deployments**: Minimal resource footprint
We needed to decide whether to:

1. Require an external service mesh (Envoy/Istio) for all deployments
2. Build proxy capabilities directly into the application
3. Support both approaches with optional composition
## Decision
We will embed proxy and gateway capabilities directly into the ContextForge application with support for optional service mesh composition when needed.
Built-in capabilities:

- **MCP-aware routing** - Protocol-specific routing for tools, resources, prompts, servers
- **Response compression** - Brotli, Zstd, GZip middleware (30-70% bandwidth reduction)
- **Caching** - Pluggable cache backend (memory, Redis, database)
- **Observability** - Embedded OpenTelemetry (Prometheus metrics, Jaeger/Zipkin tracing)
- **Authentication** - JWT, Basic Auth, OAuth 2.0/OIDC
- **Rate limiting** - Per-tool and gateway-level rate limits
- **Health checks** - `/health` (liveness), `/ready` (readiness)
- **Federation** - mDNS auto-discovery, peer gateway federation
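The pluggable cache backend mentioned above can be sketched as a small interface plus an in-memory implementation. This is a minimal illustration, not ContextForge's actual code; the class names (`CacheBackend`, `MemoryCache`) are hypothetical.

```python
from abc import ABC, abstractmethod
from time import monotonic
from typing import Any, Optional


class CacheBackend(ABC):
    """Minimal pluggable cache interface (hypothetical names)."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def set(self, key: str, value: Any, ttl: float = 60.0) -> None: ...


class MemoryCache(CacheBackend):
    """In-process backend: zero infrastructure, suitable for standalone mode."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if monotonic() >= expires_at:  # lazy expiry on read
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float = 60.0) -> None:
        self._store[key] = (value, monotonic() + ttl)


# Swap MemoryCache for a Redis- or database-backed class without touching callers.
cache: CacheBackend = MemoryCache()
cache.set("tool:list", ["git", "fetch"], ttl=30.0)
```

The interface is what makes the backend pluggable: standalone deployments keep the in-memory class, while clustered deployments substitute a shared store.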
Service mesh optional:

- ContextForge works standalone without Envoy/Istio
- Each of the 14 independent modules can integrate with a service mesh when needed
- Example: ContextForge translate utility behind Envoy for mTLS
**Key Architectural Decision**: Application-level intelligence (MCP protocol routing, tool invocation, resource management) is embedded in ContextForge modules, not delegated to infrastructure proxies. Infrastructure concerns (mTLS between all services, canary deployments, complex traffic routing) can optionally be handled by a service mesh.
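"MCP-aware routing" means dispatching on the JSON-RPC method inside the request body, not on the HTTP path or headers a mesh proxy sees. A minimal sketch (handler bodies are illustrative, not ContextForge's internals; the method names are standard MCP operations):

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], Any]

# Route table keyed on MCP method, not URL path.
ROUTES: dict[str, Handler] = {
    "tools/call": lambda p: f"invoking tool {p['name']}",
    "resources/read": lambda p: f"reading resource {p['uri']}",
    "prompts/get": lambda p: f"rendering prompt {p['name']}",
}


def route(request: dict[str, Any]) -> Any:
    """Dispatch a JSON-RPC request on its MCP method."""
    handler = ROUTES.get(request.get("method", ""))
    if handler is None:
        raise ValueError(f"unknown MCP method: {request.get('method')}")
    return handler(request.get("params", {}))


print(route({"method": "tools/call", "params": {"name": "git"}}))
```

An HTTP-level proxy cannot make this distinction without parsing the payload, which is why this logic lives in the application rather than the mesh.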
## Consequences
### Positive
- **Maximum deployment flexibility** - From `python -m mcpgateway` to multi-regional K8s
- **Serverless-native** - Works on Lambda, Cloud Run, Code Engine without infrastructure
- **Zero infrastructure dependency** - Runs with SQLite + memory cache
- **Modular composition** - 14 independent modules, each can integrate with Envoy independently
- **Application-level routing** - MCP-aware, not just HTTP
- **Embedded observability** - OpenTelemetry built-in, no sidecar required
- **Native compression** - No external proxy needed for bandwidth reduction
- **Lower operational cost** - No mandatory service mesh infrastructure
### Negative
- **Feature overlap** - Some capabilities duplicate what a service mesh provides
- **Maintenance burden** - Must maintain compression, caching, and observability code
- **Configuration complexity** - More application-level configuration vs. infrastructure
### Neutral
- **Optional composition** - Can use both ContextForge + service mesh when needed
- **Different abstraction level** - Application (MCP) vs. infrastructure (HTTP/TCP)
## When to Use What
### Use ContextForge Standalone When:
- ✅ **Lightweight deployments** - Development, testing, single-node production
- ✅ **Serverless platforms** - AWS Lambda, Google Cloud Run, IBM Cloud Code Engine
- ✅ **Edge deployments** - Minimal resources, no Kubernetes
- ✅ **Embedded use cases** - Imported as a Python module in other applications
- ✅ **No existing infrastructure** - Starting fresh without a service mesh
### Use Envoy/Istio Service Mesh When:
- ✅ **Enterprise Kubernetes** - Existing service mesh infrastructure
- ✅ **Polyglot microservices** - Need unified traffic management across languages
- ✅ **Advanced traffic routing** - Canary deployments, A/B testing, complex routing rules
- ✅ **Compliance requirements** - mTLS mandated across all services
- ✅ **Centralized policy enforcement** - External proxy for all traffic
### Use Both Together When:
- ✅ **Enterprise Kubernetes with MCP requirements**
    - ContextForge modules handle MCP protocol concerns
    - Envoy/Istio handle infrastructure concerns (mTLS, observability, traffic routing)
    - Example: ContextForge gateway behind Istio ingress with mTLS between services
- ✅ **Hybrid deployment model**
    - Core gateway in Kubernetes with Istio
    - Standalone utilities (translate, wrapper) on edge devices
    - Each module integrates with Envoy independently as needed
## Architecture Comparison
### Service Mesh Approach (Envoy/Istio)
```
┌──────────────────────────────────────────────┐
│ Client Request                               │
└─────────────┬────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────┐
│ Istio Ingress Gateway (Envoy)                │
│ - mTLS termination                           │
│ - Load balancing                             │
│ - HTTP routing                               │
└─────────────┬────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────┐
│ ContextForge Pod                             │
│  ┌─────────────────────────────────────┐     │
│  │ Envoy Sidecar                       │     │
│  │ - mTLS                              │     │
│  │ - Metrics                           │     │
│  │ - Compression                       │     │
│  └─────────────┬───────────────────────┘     │
│                │                             │
│  ┌─────────────▼───────────────────────┐     │
│  │ ContextForge Gateway                │     │
│  │ - MCP routing                       │     │
│  │ - Tool invocation                   │     │
│  │ - Resource management               │     │
│  └─────────────────────────────────────┘     │
└──────────────────────────────────────────────┘
```
Trade-offs:

- ✅ Infrastructure-level mTLS, observability, traffic management
- ❌ Additional network hop (sidecar latency)
- ❌ Resource overhead (Envoy sidecar per pod: ~50-100MB memory)
- ❌ Requires Kubernetes + service mesh infrastructure
- ❌ Doesn't work for serverless, standalone, or embedded deployments
### ContextForge Built-in Approach
```
┌──────────────────────────────────────────────┐
│ Client Request                               │
└─────────────┬────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────┐
│ ContextForge Gateway                         │
│ - MCP-aware routing                          │
│ - Response compression (Brotli/Zstd/GZip)    │
│ - Caching (memory/Redis/database)            │
│ - OpenTelemetry observability                │
│ - Authentication (JWT/Basic/OAuth)           │
│ - Rate limiting                              │
│ - Tool invocation                            │
│ - Resource management                        │
└──────────────────────────────────────────────┘
```
Trade-offs:

- ✅ Zero infrastructure dependency
- ✅ Works standalone, serverless, in containers, and on Kubernetes
- ✅ MCP-aware routing (not just HTTP)
- ✅ Lower latency (no sidecar hop)
- ✅ Lower resource usage (no sidecar overhead)
- ❌ Application must handle cross-cutting concerns
- ❌ No infrastructure-level mTLS (use HTTPS + JWT instead)
### Hybrid Approach (Both Together)
```
┌──────────────────────────────────────────────┐
│ Istio Ingress Gateway                        │
│ - External mTLS                              │
│ - Infrastructure load balancing              │
└─────────────┬────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────┐
│ ContextForge Gateway (no sidecar)            │
│ - MCP routing (application intelligence)     │
│ - Tool invocation                            │
│ - Resource management                        │
│ - Compression + caching (app-level)          │
└─────────────┬────────────────────────────────┘
              │
              ├── PostgreSQL (via Istio mTLS)
              ├── Redis (via Istio mTLS)
              └── MCP Peers (via Istio mTLS)
```
Trade-offs:

- ✅ Best of both worlds: MCP intelligence + infrastructure mTLS
- ✅ ContextForge handles application concerns
- ✅ Istio handles infrastructure concerns
- ⚠️ More complex configuration
- ⚠️ Requires understanding both systems
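In the hybrid model, mesh-level mTLS is typically enforced declaratively rather than in application code. A minimal sketch using Istio's `PeerAuthentication` resource (the namespace name is illustrative):

```yaml
# Sketch: require mTLS between all pods in the namespace, while the
# ContextForge container keeps application-level (MCP) routing untouched.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: mcp-system   # illustrative namespace
spec:
  mtls:
    mode: STRICT
```

With this in place, PostgreSQL, Redis, and peer-gateway traffic shown in the diagram is encrypted by the mesh without any ContextForge configuration changes.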
## Modular Composition Example
Each ContextForge module can integrate with Envoy independently:
```yaml
# Example: ContextForge Translate utility behind Envoy for mTLS
apiVersion: v1
kind: Service
metadata:
  name: mcp-translate
spec:
  selector:
    app: mcp-translate
  ports:
    - port: 80
      targetPort: 9000
---
# Envoy handles external mTLS, rate limiting, load balancing
# ContextForge Translate handles MCP protocol bridging (stdio ↔ SSE ↔ HTTP)
```
The translate utility has zero gateway dependencies and can run:

- **Standalone**: `python -m mcptranslate --stdio "uvx mcp-server-git" --port 9000`
- **Behind Envoy**: Envoy terminates mTLS, forwards to ContextForge translate
- **In Kubernetes**: With or without an Istio sidecar
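The "behind Envoy" variant can be sketched as a standalone Envoy bootstrap that terminates mTLS on port 443 and proxies plaintext HTTP to the translate utility on port 9000. Certificate paths and names are illustrative:

```yaml
# Sketch: Envoy terminates mTLS in front of ContextForge translate.
static_resources:
  listeners:
    - name: mtls_ingress
      address:
        socket_address: { address: 0.0.0.0, port_value: 443 }
      filter_chains:
        - transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              require_client_certificate: true      # mTLS: clients must present certs
              common_tls_context:
                tls_certificates:
                  - certificate_chain: { filename: /etc/certs/server.crt }
                    private_key: { filename: /etc/certs/server.key }
                validation_context:
                  trusted_ca: { filename: /etc/certs/ca.crt }
          filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: translate
                route_config:
                  virtual_hosts:
                    - name: translate
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route: { cluster: mcp_translate }
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
    - name: mcp_translate
      type: STRICT_DNS
      load_assignment:
        cluster_name: mcp_translate
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: mcp-translate, port_value: 9000 }
```

The translate process itself is unchanged: it serves plain HTTP on 9000 and never sees the TLS layer.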
## Why This Decision Matters
**Problem**: Service mesh architectures assume:

- Container infrastructure (no standalone mode)
- Kubernetes control plane (overhead for simple deployments)
- Polyglot microservices (need HTTP-level abstraction)
**Solution**: ContextForge needs to work everywhere:

- **Development**: `python -m mcpgateway` with zero dependencies
- **Serverless**: AWS Lambda without sidecar infrastructure
- **Edge**: Raspberry Pi running standalone utilities
- **Enterprise K8s**: Multi-regional deployment with Istio (optional)
## Implementation Details
Built-in proxy capabilities are implemented in:

- **Response compression**: `mcpgateway/main.py:888-907`
- **Caching**: `mcpgateway/services/cache_service.py`
- **Observability**: `mcpgateway/observability/` (OpenTelemetry)
- **Authentication**: `mcpgateway/auth/` (JWT, Basic, OAuth)
- **Rate limiting**: `mcpgateway/middleware/rate_limit.py`
- **Health checks**: `GET /health`, `GET /ready`
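The per-tool rate limiting listed above is typically built on a token-bucket scheme. A hedged sketch in the spirit of `mcpgateway/middleware/rate_limit.py` (the actual implementation may differ; names here are illustrative):

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """One bucket per tool: `rate` tokens/second refill, `capacity` burst size."""

    rate: float
    capacity: float
    tokens: float = 0.0
    updated: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start full so initial bursts are allowed

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


_buckets: dict[str, TokenBucket] = {}


def check_limit(tool: str, rate: float = 5.0, burst: float = 10.0) -> bool:
    """Per-tool gate a middleware could call before invoking the tool."""
    bucket = _buckets.setdefault(tool, TokenBucket(rate=rate, capacity=burst))
    return bucket.allow()
```

Because the limit is keyed on the MCP tool name rather than a URL, it expresses the same application-level (not infrastructure-level) policy discussed throughout this ADR.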
Service mesh integration points:

- Helm chart supports Istio annotations
- Network policies compatible with service mesh
- Prometheus metrics compatible with Istio telemetry
- Built-in compression can be disabled if Envoy handles it
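As a sketch of what those Helm-level integration points might look like, a values fragment could opt a deployment out of sidecar injection and expose the built-in metrics to Prometheus. The `sidecar.istio.io/inject` and `prometheus.io/*` annotations are standard conventions; the key layout and port value are illustrative, not the chart's actual schema:

```yaml
# Hypothetical Helm values fragment (illustrative keys and port)
podAnnotations:
  sidecar.istio.io/inject: "false"   # skip the sidecar; built-in features suffice
  prometheus.io/scrape: "true"       # scrape the embedded OpenTelemetry metrics
  prometheus.io/port: "4444"         # illustrative application port
```

Flipping `sidecar.istio.io/inject` to `"true"` yields the hybrid model described earlier without changing the application image.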
## Alternatives Considered
| Option | Why Not |
|---|---|
| Require Envoy/Istio for all deployments | Breaks standalone, serverless, edge use cases |
| No proxy capabilities (external only) | Poor developer experience, incompatible with serverless |
| Gateway-only mode (no MCP logic) | Loses MCP-aware routing, tool invocation intelligence |
| Implement full service mesh in Python | Duplicates Envoy/Istio, massive scope, poor performance |
## Status
This decision is implemented. ContextForge provides built-in proxy capabilities and optionally integrates with service mesh infrastructure.
## References
- Architecture overview: `docs/docs/architecture/index.md:163-288`
- Compression middleware: `mcpgateway/main.py:888-907`
- Caching backend: ADR-007 (Pluggable Cache Backend)
- Observability: ADR-010 (Observability via Prometheus)
- Scaling guide: `docs/docs/manage/scale.md`
- Modular architecture: ADR-019 (Modular Architecture Split)