ADR-0021: Built-in Proxy Capabilities vs Service Mesh

  • Status: Accepted
  • Date: 2025-10-27
  • Deciders: Core Engineering Team

Context

Modern distributed applications often use service mesh infrastructure (Envoy, Istio, Linkerd) to handle cross-cutting concerns:

  • Load balancing and traffic routing
  • mTLS and authentication
  • Observability (metrics, tracing, logging)
  • Rate limiting and circuit breaking
  • Request/response transformation
  • Compression and caching

ContextForge must support diverse deployment scenarios:

  • Standalone execution: Single Python module (python -m mcpgateway)
  • Serverless platforms: AWS Lambda, Google Cloud Run, IBM Cloud Code Engine
  • Container orchestration: Kubernetes, OpenShift
  • Multi-regional deployments: Cross-region federation
  • Edge deployments: Minimal resource footprint

We needed to decide whether to:

  1. Require external service mesh (Envoy/Istio) for all deployments
  2. Build proxy capabilities directly into the application
  3. Support both approaches with optional composition

Decision

We will embed proxy and gateway capabilities directly into the ContextForge application with support for optional service mesh composition when needed.

Built-in capabilities:

  • MCP-aware routing - Protocol-specific routing for tools, resources, prompts, servers
  • Response compression - Brotli, Zstd, GZip middleware (30-70% bandwidth reduction)
  • Caching - Pluggable cache backend (memory, Redis, database)
  • Observability - Embedded OpenTelemetry (Prometheus metrics, Jaeger/Zipkin tracing)
  • Authentication - JWT, Basic Auth, OAuth 2.0/OIDC
  • Rate limiting - Per-tool and gateway-level rate limits
  • Health checks - /health (liveness), /ready (readiness)
  • Federation - mDNS auto-discovery, peer gateway federation
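To make the "pluggable cache backend" idea concrete, here is a minimal sketch of how one interface could sit in front of interchangeable backends. It is an illustration only: the names CacheBackend, MemoryCache, and cached_call are hypothetical and are not taken from the ContextForge code base. Only an in-memory backend is shown; a Redis or database backend would implement the same two methods.

```python
import time
from typing import Any, Callable, Dict, Optional, Protocol, Tuple


class CacheBackend(Protocol):
    """Hypothetical interface that a memory, Redis, or database backend would satisfy."""

    def get(self, key: str) -> Optional[Any]: ...

    def set(self, key: str, value: Any, ttl: float) -> None: ...


class MemoryCache:
    """In-process backend with per-entry expiry, suited to standalone runs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily, on the next read.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)


def cached_call(cache: CacheBackend, key: str, ttl: float, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl)
    return value
```

Because callers depend only on the two-method interface, swapping the backend is a configuration choice rather than a code change, which is what lets the same application run with a memory cache standalone and a shared cache in a cluster.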

Service mesh optional:

  • ContextForge works standalone without Envoy/Istio
  • Each of the 14 independent modules can integrate with service mesh when needed
  • Example: ContextForge translate utility behind Envoy for mTLS

Key Architectural Decision: Application-level intelligence (MCP protocol routing, tool invocation, resource management) is embedded in ContextForge modules, not delegated to infrastructure proxies. Infrastructure concerns (mTLS between all services, canary deployments, complex traffic routing) can optionally be handled by service mesh.
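The difference between MCP-aware routing and HTTP routing can be sketched as follows. MCP messages are JSON-RPC 2.0, so an HTTP proxy typically sees only a POST to a single path, while the application can dispatch on the method field inside the body. The method names tools/call, resources/read, and prompts/get come from the MCP specification; the handler functions and the ROUTES table are hypothetical stand-ins, not ContextForge internals.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


# Hypothetical handlers; real ones would invoke tools or load resources.
def call_tool(params: Dict[str, Any]) -> Any:
    return {"invoked": params["name"]}


def read_resource(params: Dict[str, Any]) -> Any:
    return {"uri": params["uri"]}


def get_prompt(params: Dict[str, Any]) -> Any:
    return {"prompt": params["name"]}


# Routing key is the JSON-RPC method, which is invisible to path-based HTTP routing.
ROUTES: Dict[str, Handler] = {
    "tools/call": call_tool,
    "resources/read": read_resource,
    "prompts/get": get_prompt,
}


def route(raw_body: str) -> Dict[str, Any]:
    """Dispatch one JSON-RPC request to the handler registered for its method."""
    message = json.loads(raw_body)
    handler = ROUTES.get(message.get("method", ""))
    if handler is None:
        # JSON-RPC 2.0 reserves -32601 for "Method not found".
        return {
            "jsonrpc": "2.0",
            "id": message.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        }
    result = handler(message.get("params", {}))
    return {"jsonrpc": "2.0", "id": message.get("id"), "result": result}
```

A per-tool policy (rate limit, cache rule, authorization check) can hook into this dispatch step because the tool name is known there; an infrastructure proxy would have to parse the request body to get the same information.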

Consequences

Positive

  • 🎯 Maximum deployment flexibility - From python -m mcpgateway to multi-regional K8s
  • 🚀 Serverless-native - Works on Lambda, Cloud Run, Code Engine without infrastructure
  • 🐍 Zero infrastructure dependency - Runs with SQLite + memory cache
  • 🔌 Modular composition - 14 independent modules, each can integrate with Envoy independently
  • ⚡ Application-level routing - MCP-aware, not just HTTP
  • 📦 Embedded observability - OpenTelemetry built-in, no sidecar required
  • 🗜️ Native compression - No external proxy needed for bandwidth reduction
  • 💰 Lower operational cost - No mandatory service mesh infrastructure

Negative

  • 🔧 Feature overlap - Some capabilities duplicate what service mesh provides
  • 🔄 Maintenance burden - Must maintain compression, caching, observability code
  • 📚 Configuration complexity - More application-level configuration vs. infrastructure

Neutral

  • 🌐 Optional composition - Can use both ContextForge + service mesh when needed
  • 📊 Different abstraction level - Application (MCP) vs. Infrastructure (HTTP/TCP)

When to Use What

Use ContextForge Standalone When:

✅ Lightweight deployments - Development, testing, single-node production
✅ Serverless platforms - AWS Lambda, Google Cloud Run, IBM Cloud Code Engine
✅ Edge deployments - Minimal resources, no Kubernetes
✅ Embedded use cases - Imported as Python module in other applications
✅ No existing infrastructure - Starting fresh without service mesh

Use Envoy/Istio Service Mesh When:

✅ Enterprise Kubernetes - Existing service mesh infrastructure
✅ Polyglot microservices - Need unified traffic management across languages
✅ Advanced traffic routing - Canary deployments, A/B testing, complex routing rules
✅ Compliance requirements - mTLS mandated across all services
✅ Centralized policy enforcement - External proxy for all traffic

Use Both Together When:

✅ Enterprise Kubernetes with MCP requirements

  • ContextForge modules handle MCP protocol concerns
  • Envoy/Istio handle infrastructure concerns (mTLS, observability, traffic routing)
  • Example: ContextForge gateway behind Istio ingress with mTLS between services

✅ Hybrid deployment model

  • Core gateway in Kubernetes with Istio
  • Standalone utilities (translate, wrapper) on edge devices
  • Each module integrates with Envoy independently as needed

Architecture Comparison

Service Mesh Approach (Envoy/Istio)

┌─────────────────────────────────────────────┐
│  Client Request                             │
└─────────────┬───────────────────────────────┘
              │
┌─────────────▼───────────────────────────────┐
│  Istio Ingress Gateway (Envoy)              │
│  - mTLS termination                         │
│  - Load balancing                           │
│  - HTTP routing                             │
└─────────────┬───────────────────────────────┘
              │
┌─────────────▼───────────────────────────────┐
│  ContextForge Pod                           │
│  ┌──────────────────────────────────────┐   │
│  │ Envoy Sidecar                        │   │
│  │ - mTLS                               │   │
│  │ - Metrics                            │   │
│  │ - Compression                        │   │
│  └─────────────┬────────────────────────┘   │
│                │                            │
│  ┌─────────────▼────────────────────────┐   │
│  │ ContextForge Gateway                 │   │
│  │ - MCP routing                        │   │
│  │ - Tool invocation                    │   │
│  │ - Resource management                │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘

Trade-offs:

  • ✅ Infrastructure-level mTLS, observability, traffic management
  • ❌ Additional network hop (sidecar latency)
  • ❌ Resource overhead (Envoy sidecar per pod: ~50-100MB memory)
  • ❌ Requires Kubernetes + service mesh infrastructure
  • ❌ Doesn't work for serverless, standalone, embedded deployments

ContextForge Built-in Approach

┌─────────────────────────────────────────────┐
│  Client Request                             │
└─────────────┬───────────────────────────────┘
              │
┌─────────────▼───────────────────────────────┐
│  ContextForge Gateway                       │
│  - MCP-aware routing                        │
│  - Response compression (Brotli/Zstd/GZip)  │
│  - Caching (memory/Redis/database)          │
│  - OpenTelemetry observability              │
│  - Authentication (JWT/Basic/OAuth)         │
│  - Rate limiting                            │
│  - Tool invocation                          │
│  - Resource management                      │
└─────────────────────────────────────────────┘

Trade-offs:

  • ✅ Zero infrastructure dependency
  • ✅ Works standalone, serverless, containers, Kubernetes
  • ✅ MCP-aware routing (not just HTTP)
  • ✅ Lower latency (no sidecar hop)
  • ✅ Lower resource usage (no sidecar overhead)
  • ❌ Application must handle cross-cutting concerns
  • ❌ No infrastructure-level mTLS (use HTTPS + JWT instead)
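As an illustration of what "application must handle cross-cutting concerns" means for compression, the sketch below negotiates an encoding from the Accept-Encoding header and compresses in-process. It is not the ContextForge middleware: only gzip is implemented because it ships with the Python standard library (Brotli and Zstd generally need third-party packages), and MIN_SIZE, accepted_encodings, compress_response, and the enabled flag are assumed names. The enabled flag mirrors the point that built-in compression can be turned off when Envoy already handles it.

```python
import gzip
from typing import Dict, Set, Tuple

# Hypothetical threshold: very small bodies are not worth compressing.
MIN_SIZE = 500


def accepted_encodings(header: str) -> Set[str]:
    """Parse an Accept-Encoding header into the set of codings with q > 0."""
    accepted: Set[str] = set()
    for part in header.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # Treat a malformed weight as "not acceptable".
        if quality > 0:
            accepted.add(token)
    return accepted


def compress_response(
    body: bytes, accept_encoding: str, enabled: bool = True
) -> Tuple[bytes, Dict[str, str]]:
    """Return (body, extra_headers), gzip-compressing when the client allows it."""
    if not enabled or len(body) < MIN_SIZE:
        return body, {}
    if "gzip" not in accepted_encodings(accept_encoding):
        return body, {}
    compressed = gzip.compress(body)
    # Vary tells downstream caches that the response depends on Accept-Encoding.
    return compressed, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
```

For example, compress_response(payload, "br, gzip;q=0.8") returns a gzip body with a Content-Encoding header, while passing enabled=False returns the payload untouched so that a fronting proxy can compress it instead.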

Hybrid Approach (Both Together)

┌─────────────────────────────────────────────┐
│  Istio Ingress Gateway                      │
│  - External mTLS                            │
│  - Infrastructure load balancing            │
└─────────────┬───────────────────────────────┘
              │
┌─────────────▼───────────────────────────────┐
│  ContextForge Gateway (no sidecar)          │
│  - MCP routing (application intelligence)   │
│  - Tool invocation                          │
│  - Resource management                      │
│  - Compression + caching (app-level)        │
└─────────────┬───────────────────────────────┘
              │
              ├─ PostgreSQL (via Istio mTLS)
              ├─ Redis (via Istio mTLS)
              └─ MCP Peers (via Istio mTLS)

Trade-offs:

  • ✅ Best of both worlds: MCP intelligence + infrastructure mTLS
  • ✅ ContextForge handles application concerns
  • ✅ Istio handles infrastructure concerns
  • ⚠️ More complex configuration
  • ⚠️ Requires understanding both systems

Modular Composition Example

Each ContextForge module can integrate with Envoy independently:

# Example: ContextForge Translate utility behind Envoy for mTLS
apiVersion: v1
kind: Service
metadata:
  name: mcp-translate
spec:
  selector:
    app: mcp-translate
  ports:
    - port: 80
      targetPort: 9000
---
# Envoy handles external mTLS, rate limiting, load balancing
# ContextForge Translate handles MCP protocol bridging (stdio ↔ SSE ↔ HTTP)

The translate utility has zero gateway dependencies and can run:

  • Standalone: python -m mcptranslate --stdio "uvx mcp-server-git" --port 9000
  • Behind Envoy: Envoy terminates mTLS, forwards to ContextForge translate
  • In Kubernetes: With or without Istio sidecar

Why This Decision Matters

Problem: Service mesh architectures assume:

  • Container infrastructure (no standalone mode)
  • Kubernetes control plane (overhead for simple deployments)
  • Polyglot microservices (need HTTP-level abstraction)

Solution: ContextForge needs to work everywhere:

  • Development: python -m mcpgateway with zero dependencies
  • Serverless: AWS Lambda without sidecar infrastructure
  • Edge: Raspberry Pi running standalone utilities
  • Enterprise K8s: Multi-regional deployment with Istio (optional)

Implementation Details

Built-in proxy capabilities are implemented in:

  • Response compression: mcpgateway/main.py:888-907
  • Caching: mcpgateway/services/cache_service.py
  • Observability: mcpgateway/observability/ (OpenTelemetry)
  • Authentication: mcpgateway/auth/ (JWT, Basic, OAuth)
  • Rate limiting: mcpgateway/middleware/rate_limit.py
  • Health checks: GET /health, GET /ready
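For readers unfamiliar with how per-tool and gateway-level limits can combine, here is a minimal token-bucket sketch. It does not reproduce the code in the files listed above; TokenBucket, RateLimiter, and all parameter names are hypothetical, and a token bucket is only one of several algorithms such a middleware could use.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate: float, capacity: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Credit tokens for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class RateLimiter:
    """One gateway-wide bucket plus one bucket per tool; a call must pass both."""

    def __init__(
        self,
        gateway_rate: float,
        gateway_burst: float,
        tool_rate: float,
        tool_burst: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._gateway = TokenBucket(gateway_rate, gateway_burst, clock)
        self._tools: DefaultDict[str, TokenBucket] = defaultdict(
            lambda: TokenBucket(tool_rate, tool_burst, clock)
        )

    def allow(self, tool_name: str) -> bool:
        # The per-tool check runs first, so a throttled tool does not
        # consume tokens from the gateway-wide budget.
        return self._tools[tool_name].allow() and self._gateway.allow()
```

Keying the buckets by tool name is only possible because the limiter runs inside the application, where the tool being invoked is known. This sketch keeps state in process memory, so each replica would enforce its own limits; sharing limits across replicas would need an external store.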

Service mesh integration points:

  • The Helm chart supports Istio annotations
  • Network policies are compatible with service mesh
  • Prometheus metrics are compatible with Istio telemetry
  • Built-in compression can be disabled if Envoy handles it

Alternatives Considered

| Option | Why Not |
|---|---|
| Require Envoy/Istio for all deployments | Breaks standalone, serverless, edge use cases |
| No proxy capabilities (external only) | Poor developer experience, incompatible with serverless |
| Gateway-only mode (no MCP logic) | Loses MCP-aware routing, tool invocation intelligence |
| Implement full service mesh in Python | Duplicates Envoy/Istio, massive scope, poor performance |

Status

This decision is implemented. ContextForge provides built-in proxy capabilities and optionally integrates with service mesh infrastructure.

References

  • Architecture overview: docs/docs/architecture/index.md:163-288
  • Compression middleware: mcpgateway/main.py:888-907
  • Caching backend: ADR-007 (Pluggable Cache Backend)
  • Observability: ADR-010 (Observability via Prometheus)
  • Scaling guide: docs/docs/manage/scale.md
  • Modular architecture: ADR-019 (Modular Architecture Split)