Architecture Overview
The MCP Gateway (ContextForge) is a production-grade gateway, proxy, and registry for Model Context Protocol (MCP) servers and A2A Agents. It acts as a unified entry point for tools, resources, prompts, and servers, federating local and remote nodes into a coherent MCP-compliant interface.
High-Level Architecture Summary
MCP Gateway (ContextForge) is built on modern Python technologies with a performance-first approach:
Core Technology Stack
1. Python FastAPI Application with Modern Stack
- Built with FastAPI (async web framework) for high-performance REST/JSON-RPC endpoints
- Uses Pydantic 2.11+ for runtime validation and Pydantic Settings for environment-based configuration
- Requires Python 3.11-3.13 with full async/await support throughout
- Deployed via Uvicorn (dev) or Gunicorn (production) ASGI servers
2. Multi-Database ORM Layer with SQLAlchemy 2.0
- SQLAlchemy 2.0 ORM with async support for database operations
- Supports PostgreSQL (via psycopg2), SQLite (default, file-based), and MariaDB/MySQL (via pymysql)
- Alembic for schema migrations and version control
- Connection pooling with configurable pool sizes (200 default), overflow (10), and recycling (3600s)
3. Multi-Transport Protocol Gateway
- Native MCP (Model Context Protocol) server implementation supporting protocol version 2025-03-26
- Transport mechanisms: HTTP/JSON-RPC, Server-Sent Events (SSE) with keepalive, WebSocket, stdio (for CLI integration), and streamable-HTTP
- JSON-RPC 2.0 compliant message handling with bidirectional communication
4. Federation & Registry Architecture
- Acts as an MCP Registry that federates multiple peer gateways
- Auto-discovery via mDNS/Zeroconf or manual configuration
- Redis-backed caching and federation for multi-cluster deployments (optional, can use memory or database caching)
- Health checking with configurable intervals (60s default) and failure thresholds
5. Virtual Server Composition System
- Wraps non-MCP REST/gRPC services as virtual MCP servers
- Composes tools, prompts, and resources from multiple backends into unified virtual servers
- Supports REST-to-MCP tool adaptation with automatic JSON Schema extraction
- Tool, resource, and prompt registries with versioning and rollback capabilities
6. Multi-Tenant RBAC & Authentication
- Email-based authentication with Argon2id password hashing (time_cost=3, memory_cost=65536 KiB)
- JWT authentication (HS256/RS256) with configurable expiration and audience verification
- SSO integration: GitHub OAuth, Google OAuth, Microsoft Entra ID, IBM Security Verify, Okta, Keycloak, generic OIDC
- OAuth 2.0 with Dynamic Client Registration (DCR) per RFC 7591 and RFC 8414 discovery
- Teams and RBAC: Personal teams, team invitations, role-based permissions (global/team/personal scopes)
7. Plugin Framework
- Extensible plugin system with pre/post request/response hooks
- Built-in plugins: PII filter, deny filter, regex filter, resource filter
- Plugin configuration via YAML with hot-reload support
- CLI tools for plugin management (`mcpplugins` command)
8. Admin UI & Observability
- HTMX + Alpine.js web UI for real-time management and configuration
- Real-time log viewer with filtering, search, and export (in-memory buffer with 1MB default size)
- OpenTelemetry observability with support for Jaeger, Zipkin, Phoenix, and OTLP backends
- Support bundle generation for troubleshooting (logs, config, system stats - auto-sanitized)
9. Agent-to-Agent (A2A) Integration
- Integrates external AI agents (OpenAI, Anthropic, custom) as tools within virtual servers
- Auto-tool creation for associated A2A agents with invocation routing
- Comprehensive metrics collection for agent interactions
- Configurable timeouts (30s default), retries (3 max), and agent limits (100 max)
10. Security & Rate Limiting
- Configurable authentication schemes: Basic Auth, JWT Bearer, custom headers
- Rate limiting with configurable tool rate limits (100 req/min default) and concurrent limits (10)
- Security headers (HSTS, X-Frame-Options, CSP, X-Content-Type-Options, X-XSS-Protection, Referrer-Policy), CORS with domain whitelisting
- Input validation with JSON Schema, length limits, and dangerous pattern detection
- mTLS support for plugin client-server communication
11. Resource & Content Management
- URI-based resource access with MIME detection and content negotiation
- Resource caching (1000 items, 3600s TTL) with size limits (10MB default)
- Support for text, markdown, HTML, JSON, XML, images (PNG/JPEG/GIF)
- Jinja2 template rendering for prompts with multimodal support
12. Development & Testing Infrastructure
- Comprehensive test suite: unit, integration, e2e, security, fuzz, Playwright UI tests
- VS Code Dev Container support with pre-configured environment
- Hot-reload development mode with debug logging
- Extensive linting: Black, isort, Ruff, Flake8, Bandit, Pylint, mypy (strict mode)
- Coverage tracking with HTML reports and pytest-cov integration
13. Deployment & Scalability
- Docker/Podman container images with rootless support
- Kubernetes-ready with Redis-backed federation for multi-cluster deployments
- IBM Cloud Code Engine deployment automation via Makefile targets
- Environment-based configuration (`.env` files) with 100+ configurable parameters
- Production-ready logging (JSON/text formats) with rotation support
14. Well-Known URI & Standards Compliance
- Implements `.well-known/mcp` endpoint for MCP discovery
- Configurable `robots.txt`, `security.txt`, and custom well-known files
- Standards compliance: RFC 5424 (syslog), RFC 7591 (DCR), RFC 8414 (OAuth discovery), JSON-RPC 2.0
15. MCP Server Catalog & Service Discovery
- MCP Server Catalog feature for centralized server registry
- YAML-based catalog configuration with auto-health checking
- Pagination support (100 items/page default) with caching (3600s TTL)
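To make the JSON-RPC 2.0 message handling above concrete, here is a minimal sketch of building a tool-invocation request envelope. The `tools/call` method and `name`/`arguments` parameter shape follow the MCP convention; the helper function and the `get_time` tool are illustrative, not part of the gateway's API:

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request envelope for an MCP tool invocation."""
    return json.dumps({
        "jsonrpc": "2.0",        # required by the JSON-RPC 2.0 spec
        "id": request_id,        # correlates the eventual response
        "method": "tools/call",  # MCP method for invoking a registered tool
        "params": {"name": tool, "arguments": arguments},
    })

msg = make_tool_call(1, "get_time", {"timezone": "UTC"})
```

The same envelope travels unchanged over any of the transports listed above (HTTP, SSE, WebSocket, stdio); only the framing differs.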
Performance-Optimized Foundation
ContextForge leverages cutting-edge Rust-powered components for maximum throughput and minimal latency:
- `orjson` (Rust-Powered JSON Serialization): High-performance JSON parsing and serialization using Rust internals, delivering 5-6x faster serialization and 1.5-2x faster deserialization compared to Python's standard library, with 7% smaller output size. Enables sub-millisecond JSON-RPC responses even for large payloads.
- Pydantic V2 (Rust-Core Validation): Runtime validation and data serialization powered by Pydantic 2.11+ with its Rust-based pydantic-core engine, providing 5-50x performance improvements over Pydantic V1 for schema validation, type coercion, and model serialization.
- PyO3 Rust Plugins: Extensible plugin framework supporting Rust modules via PyO3, enabling native-speed Python extensions for performance-critical operations. Rust plugins provide near-native performance for compute-intensive tasks while integrating seamlessly with the Python ecosystem.
This performance-first architecture enables ContextForge to handle high-throughput workloads while maintaining low latency for tool invocations, resource access, and federation operations.
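The orjson speedup can be adopted as a drop-in serialization layer. A minimal sketch (the `dumps`/`loads` wrapper names are illustrative; the fallback keeps the code working when orjson is not installed):

```python
import json

# orjson exposes dumps()/loads(); note dumps() returns bytes, not str.
try:
    import orjson

    def dumps(obj) -> bytes:
        return orjson.dumps(obj)

    def loads(data):
        return orjson.loads(data)
except ImportError:  # graceful fallback to the standard library
    def dumps(obj) -> bytes:
        return json.dumps(obj).encode("utf-8")

    def loads(data):
        return json.loads(data)

payload = {"tool": "search", "args": {"query": "status"}, "ids": list(range(5))}
assert loads(dumps(payload)) == payload
```

Because both implementations produce and consume bytes, call sites stay identical whichever backend is active.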
CI/CD Pipeline with GitHub Actions
Automated Quality Assurance & Security
The project maintains production-grade quality through comprehensive GitHub Actions workflows:
Build & Package Workflows:
- Python Package Build (`python-package.yml`): Multi-version builds (Python 3.10-3.12) with wheel/sdist creation, metadata validation (twine), manifest checking, and package quality assessment (pyroma)
- Docker Release (`docker-release.yml`): Automated container image releases to GitHub Container Registry (GHCR) with semantic versioning
Testing & Coverage:
- Tests & Coverage (`pytest.yml`): Comprehensive test suite across Python 3.11-3.12 with pytest, branch coverage measurement (70% threshold), doctest validation (40% threshold), and coverage reporting
- Playwright UI Tests: End-to-end browser automation testing for admin UI workflows
Security Scanning:
- Bandit Security (`bandit.yml`): Python static analysis for security vulnerabilities (MEDIUM+ severity, HIGH confidence), SARIF upload to the GitHub Security tab, weekly scheduled scans
- CodeQL Advanced (`codeql.yml`): Multi-language analysis (JavaScript/TypeScript, Python, GitHub Actions), security vulnerability detection, code quality checks, weekly scheduled scans (Wednesday 21:15 UTC)
- Dependency Review (`dependency-review.yml`): Automated dependency vulnerability scanning on pull requests
Container Security:
- Secure Docker Build (`docker-image.yml`):
  - Dockerfile linting with Hadolint (SARIF reports)
  - Image linting with Dockle (SARIF reports)
  - SBOM generation with Syft (SPDX format)
  - Vulnerability scanning with Trivy and Grype (CRITICAL CVEs)
  - Image signing and attestation with Cosign (keyless OIDC)
  - BuildKit layer caching for faster rebuilds
  - Weekly scheduled scans (Tuesday 18:17 UTC)
Code Quality & Linting:
- Lint & Static Analysis (`lint.yml`): Comprehensive multi-tool linting matrix
  - Syntax & Format: yamllint, JSON validation (jq), TOML validation (tomlcheck)
  - Python Analysis: Flake8, Ruff, Unimport, Vulture (dead code), Pylint (errors-only), Interrogate (100% docstring coverage), Radon (complexity metrics)
- Web Assets: HTML/CSS/JS linting (`lint-web.yml`)
- Each linter runs in isolated matrix jobs for fast-fail visibility
Deployment:
- IBM Cloud Code Engine (`ibm-cloud-code-engine.yml`): Automated deployment to IBM Cloud with environment configuration and health checks
Key CI/CD Features:
- SARIF Integration: All security tools upload findings to GitHub Security tab for centralized vulnerability management
- Artifact Management: Build artifacts (wheels, coverage reports, SBOMs) uploaded and versioned
- Fail-Safe Design: Continue-on-error for non-blocking scans with final quality gates
- Scheduled Scans: Weekly security scans to catch newly disclosed CVEs
- Multi-Version Testing: Matrix builds across Python 3.10-3.13 to ensure compatibility
- Cache Optimization: Pip cache, BuildKit cache, and dependency caching for faster runs
This architecture supports both small single-instance deployments (SQLite + memory cache) and large-scale multi-cluster deployments (PostgreSQL + Redis + federation), making it suitable for development, staging, and production environments.
Deployment Flexibility: Why Not Envoy/Istio?
ContextForge is architected for maximum deployment flexibility, from standalone Python modules to multi-regional container orchestration. This design philosophy fundamentally differs from service mesh architectures like Envoy/Istio.
Modular Standalone Execution
The ContextForge ecosystem consists of independently deployable modules that can run standalone or be composed together:
Core Gateway:
- `mcp-contextforge-gateway-core` - FastAPI gateway with 33 services, 11 routers (~150K lines)
- `mcp-contextforge-gateway-ui` - HTMX + Alpine.js admin interface
Independent Utilities (Zero Gateway Dependencies):
- `mcp-contextforge-translate` - Protocol bridge: stdio ↔ SSE ↔ HTTP ↔ gRPC
- `mcp-contextforge-wrapper` - MCP client wrapper
- `mcp-contextforge-reverse-proxy` - NAT/firewall traversal proxy
Plugin Ecosystem:
- `mcp-contextforge-plugins-python` - 40+ Python plugins
- `mcp-contextforge-plugins-rust` - High-performance PyO3 plugins
MCP Servers (Zero Gateway Dependencies):
Sample servers:
- `mcp-contextforge-mcp-servers-python` - 4 Python servers
- `mcp-contextforge-mcp-servers-go` - 5 Go servers (static binaries, 5-15 MB)
- `mcp-contextforge-mcp-servers-rust` - Rust servers (static binaries, 3-10 MB)
Agent Runtimes:
Sample runtimes:
- `mcp-contextforge-agent-runtimes` - LangChain + future runtimes
Infrastructure:
- `mcp-contextforge-helm` - Kubernetes Helm charts (OCI registry)
- `mcp-contextforge-deployment-scripts` - Terraform, Ansible, Docker Compose
Deployment Spectrum
Standalone Execution:
- Single Python module: `python -m mcpgateway`
- CLI tools: `mcptranslate`, `mcpwrapper`, `mcpreverseproxy`
- No external dependencies (SQLite + memory cache)
- Can be imported and embedded in other Python applications
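The stdio transport behind these CLI tools frames each message as one line of JSON-RPC on stdin/stdout. A toy sketch of that framing (the `ping` method and `handle_line`/`serve` names are illustrative, not the gateway's actual handlers):

```python
import json
import sys

def handle_line(line: str) -> str:
    """Answer one newline-delimited JSON-RPC request; echoes ping params."""
    req = json.loads(line)
    result = {"echo": req.get("params")} if req.get("method") == "ping" else None
    resp = {"jsonrpc": "2.0", "id": req.get("id"), "result": result}
    return json.dumps(resp)

def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read requests line by line and write one response line per request."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_line(line) + "\n")
            stdout.flush()
```

Because the framing is just lines of JSON, a bridge like `mcp-contextforge-translate` can re-emit the same messages over SSE or HTTP without touching their contents.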
Serverless-Native:
- IBM Cloud Code Engine (native deployment automation via Makefile)
- AWS Lambda (event-driven functions)
- Google Cloud Run (containerized serverless)
- Azure Container Apps
Container Orchestration:
- Kubernetes (vanilla, HPA, StatefulSets)
- Red Hat OpenShift (enterprise K8s)
- Docker Compose (local multi-container)
- Podman (rootless containers)
- Multi-arch container images (amd64, arm64, s390x)
Multi-Regional Deployments:
- Federation across geographic regions
- Redis Cluster for distributed caching
- PostgreSQL HA with replication
- Multi-cluster service mesh (optional)
Why This Matters: Envoy/Istio Comparison
Envoy/Istio Service Mesh Requires:
- Container infrastructure (no standalone mode)
- Kubernetes control plane (overhead for simple deployments)
- Service mesh complexity (sidecar injection, mTLS configuration)
- External proxy layer (additional network hop, increased latency)
- Minimum resource overhead (sidecar per pod)
ContextForge Provides:
- Built-in proxy/gateway capabilities - No external proxy needed
- Application-level MCP routing - Protocol-aware, not just HTTP
- Embedded observability - OpenTelemetry, Prometheus built-in
- Native compression and caching - No sidecar required (Brotli/Zstd/GZip)
- Zero-infrastructure dev mode - SQLite + memory cache
- Modular composition - 14 independently deployable modules
- Multi-format packaging - PyPI packages, container images with multi-arch support, Helm charts, and binaries
When to Use What
Use Envoy/Istio When:
- You need advanced service mesh features across ALL services (mutual TLS, canary deployments, complex traffic routing)
- You have existing service mesh infrastructure
- You're running polyglot microservices requiring unified traffic management
- Compliance requires external traffic control
Use ContextForge Standalone When:
- Lightweight MCP gateway without infrastructure overhead
- Development, testing, serverless deployments
- Edge deployments with minimal resources
- Embedded use cases (Python application integration)
- Single-node or small-scale deployments
Use Both Together When:
- Running in enterprise Kubernetes with service mesh requirements
- ContextForge modules handle MCP protocol concerns
- Envoy/Istio handle infrastructure concerns (mTLS, observability, traffic routing)
- Example: ContextForge gateway behind Istio ingress with mTLS between services
Modular Composition with Envoy
Each ContextForge module can integrate with Envoy independently:
```yaml
# Example: ContextForge Translate + Envoy
apiVersion: v1
kind: Service
metadata:
  name: mcp-translate
spec:
  selector:
    app: mcp-translate
---
# Envoy handles external mTLS, rate limiting, load balancing
# ContextForge Translate handles MCP protocol bridging
```
The key architectural decision is application-level intelligence (MCP-aware routing, tool invocation, resource management) embedded in ContextForge modules, not delegated to infrastructure proxies. This enables:
- Protocol intelligence: MCP-specific routing, federation, tool registry
- Deployment flexibility: From `python -m mcpgateway` to multi-regional K8s
- Packaging options: PyPI, containers, binaries, Helm charts
- Optional composition: Works standalone OR with Envoy/Istio when needed
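"Application-level intelligence" here means routing decisions are made on the MCP method inside the request body, not on URL paths the way an infrastructure proxy would. A minimal illustrative dispatcher (the handler functions are hypothetical stand-ins for the gateway's service layer):

```python
import json

# Hypothetical handlers keyed by MCP method name.
def call_tool(params):
    return {"tool": params["name"], "status": "invoked"}

def read_resource(params):
    return {"uri": params["uri"], "contents": "..."}

ROUTES = {
    "tools/call": call_tool,
    "resources/read": read_resource,
}

def dispatch(raw: str) -> str:
    """Route a JSON-RPC request by MCP method; -32601 for unknown methods."""
    req = json.loads(raw)
    handler = ROUTES.get(req.get("method"))
    if handler is None:
        body = {"error": {"code": -32601, "message": "Method not found"}}
    else:
        body = {"result": handler(req.get("params", {}))}
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), **body})
```

An L7 proxy sees only an opaque POST body; this protocol-aware layer is what lets the gateway federate tools, enforce per-tool rate limits, and route to the right backend.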
System Architecture
```mermaid
graph TD
    subgraph Clients
        ui["Admin UI (Browser)"]
        cli["CLI Tools"]
        sdk["SDK / Scripts"]
    end
    subgraph Gateway
        app["FastAPI App"]
        auth["Auth Middleware<br/>(JWT + Basic)"]
        router["Transport Router<br/>(HTTP / WS / SSE / STDIO)"]
        services["Service Layer<br/>(Tool / Resource / Prompt / Server)"]
        db["Async DB<br/>(SQLAlchemy + Alembic)"]
        cache["Cache Backend<br/>(memory / redis / db)"]
        metrics["Metrics Exporter<br/>(/metrics Prometheus)"]
    end
    subgraph Federation
        discovery["Discovery Service<br/>(DNS-SD + Static Peers)"]
        peers["Remote Gateways"]
    end
    ui --> app
    cli --> router
    sdk --> router
    app --> auth --> router
    router --> services
    services --> db
    services --> cache
    services --> metrics
    services --> discovery
    discovery --> peers
```
Each service (ToolService, ResourceService, etc.) operates independently with unified auth/session/context layers.
Additional Architecture Documentation
- Export/Import System Architecture - Technical design of configuration management system
- Plugin Framework Specification - Technical design of the gateway plugin system
ADRs and Design Decisions
We maintain a formal set of Architecture Decision Records documenting all major design tradeoffs and rationale.
See the full ADR Index →