ContextForge Modular Runtime Architecture¶

Status: Proposed target architecture and implementation entry point

This document defines the target-state modular runtime architecture for ContextForge.

It is intended to support:

the existing MCP gateway
the existing A2A gateway
future LLM gateway runtimes
future REST and gRPC gateway runtimes
implementations in different languages, including Python, Rust, and Go

Purpose¶

ContextForge already contains multiple protocol-facing runtime paths inside one Python application. The Rust MCP runtime proves that a protocol runtime can be split out into a separate implementation while the core platform remains the system of record for security and catalog state.

This specification generalizes that idea into a reusable architecture:

a core platform that owns policy, persistence, catalogs, plugins, admin UI, and observability
one or more protocol modules that own protocol wire behavior and transport or runtime semantics

The goal is not to rewrite the product. The goal is to create a stable architecture that can evolve incrementally from the current codebase.

Scope¶

This document is about runtime decomposition, not package layout.

It is complementary to ADR-019: Modular Architecture Split (14 Independent Modules), which is about packaging and repository structure.

This document deliberately distinguishes between:

implemented precedent The current Rust MCP runtime sidecar and existing external plugin runtimes.
target architecture The longer-term modular contract that future MCP, A2A, LLM, and REST or gRPC runtimes should follow.
migration guidance The phased path from the current monolith to that target.

Non-Goals¶

This specification does not:

require a ground-up rewrite
require all protocols to be extracted at once
require all modules to be sidecars immediately
freeze final protobuf package names or generated SDK layout
replace protocol-specific documents such as Rust MCP Runtime

Relationship to Existing Architecture Docs¶

Document	Role
Rust MCP Runtime	Describes the currently implemented MCP sidecar/runtime path and rollout modes
ADR-043	Records the implemented Rust MCP sidecar and mode model
Multitenancy	Defines team scoping and visibility rules that remain core-owned
OAuth Design	Defines auth and credential handling that remain core-owned
Plugin Framework	Defines plugin behavior that remains centrally configured and enforced by the core

How to Use This Specification¶

Use the documents in this order:

this page for the architectural rules and operating model
Core SPI for the module-to-core contract
Module Descriptor and Module Lifecycle for module registration and startup behavior
Error Model and Conformance for compatibility and release requirements
the protocol profile for the module being implemented:
MCP Module Profile
A2A Module Profile
LLM Module Profile
REST/gRPC Module Profile

Key Decisions¶

Decision	Summary	Source
Core owns policy; modules own protocol	Auth, RBAC, catalogs, plugins, persistence, and admin UI stay in the core; wire protocol and transport behavior move to modules	This document
Process boundary first	Sidecars are the default modular boundary; embedded runtimes are an optimization	This document
Default IPC transport	gRPC over Unix Domain Socket is the target-state module-to-core transport; HTTP/JSON remains an explicit fallback	ADR-044
Auth remains in core	Modules consume authenticated context; they do not become independent auth authorities	ADR-045
Shared-nothing between modules	Modules do not import or call each other directly; cross-protocol behavior is mediated by the core	ADR-046
Incremental migration	The architecture is adopted by refactoring the current system in phases, not by rewrite	ADR-047

Current Implemented Precedent¶

ContextForge today is primarily a monolithic Python application, but two existing patterns already prove the modular direction:

Rust MCP runtime sidecar The MCP streamable HTTP public path can run through a Rust sidecar while Python remains authoritative for auth, token scoping, and RBAC.
External plugin runtimes Plugins can already run out of process behind a language-neutral transport.

These are important precedents, but they are not yet the full target modular contract.

In particular, the current Rust MCP runtime is a transition architecture:

it is a real external runtime
it proves sidecar deployment, direct ingress, and mode-based rollout
it still contains performance-oriented implementation details that are more specific than the long-term generic module boundary

This document defines the steadier target boundary that future modules should converge on.

Implementation Status¶

The modular architecture is no longer purely speculative. One protocol module is already implemented and validated.

Protocol family	Module status	Notes
MCP	Implemented	Rust MCP runtime sidecar exists today with mode-based rollout and direct-ingress support
A2A	Not yet extracted	Current A2A runtime remains embedded in Python
LLM	Not yet extracted	Current LLM proxy and chat flows remain embedded in Python
REST/gRPC	Not yet extracted	Current virtualization and service-management flows remain embedded in Python

The important consequence is that this spec is grounded in a working MCP module rather than a hypothetical first extraction.

Current Precedent vs Target State¶

The spec must be explicit about what is implemented today versus what future modules should target.

Topic	Implemented today	Target-state default
First extracted runtime	Rust MCP sidecar	Additional protocol modules, potentially in Rust, Go, or Python
Sidecar transport to core	Narrow internal HTTP over local/private transport, including UDS or loopback depending on path	gRPC over UDS
Fallback transport	HTTP/JSON	HTTP/JSON
Ingress ownership	Both valid today: Python-owned ingress and direct Rust ingress depending on mode	Both valid patterns remain acceptable
Auth authority	Python core	Core platform
Plugin parity	Achieved through a mix of direct core-sensitive handling and selective delegation	Explicit SPI or core-delegation contract
Data-path optimizations	Rust MCP keeps targeted fast paths	Allowed, but must preserve contract and rollback behavior

Architecture Principles¶

Core owns policy; modules own protocol. The core platform owns security, persistence, catalogs, plugins, configuration, observability, and admin UI. Modules own transport, wire format, session/runtime semantics, capability negotiation, and upstream protocol behavior.
Process boundary first. The default modular boundary is a sidecar or sibling process. Embedded in-process runtimes are allowed where justified, but they are not the default design center.
Language-neutral contracts. Contracts between the core and modules must not depend on Python object identity, ORM models, or framework internals.
Shared-nothing between modules. Modules do not import or call one another directly. Cross-protocol behavior flows through the core.
Compatibility and rollback first. Each extraction step must preserve an operational rollback path.
Incremental migration over rewrite. The existing codebase remains the migration source; tests and behavior remain the regression oracle.

Reference Model¶

The following diagram is logical, not strictly physical. A module may sit behind core-managed routing or may own direct public ingress while still using the core for policy and catalog decisions.

flowchart TD
    client[Client]
    ingress[Ingress / Proxy / TLS termination]
    core[Core Platform]
    mcp[MCP Module]
    a2a[A2A Module]
    llm[LLM Module]
    rest[REST or gRPC Module]
    upstream[Upstream service or core-owned catalog action]

    client --> ingress
    ingress --> core
    ingress --> mcp
    core --> mcp
    core --> a2a
    core --> llm
    core --> rest
    mcp --> core
    a2a --> core
    llm --> core
    rest --> core
    mcp --> upstream
    a2a --> upstream
    llm --> upstream
    rest --> upstream

Two ingress patterns are valid:

client -> ingress -> core -> module Use when the core remains the public edge and the module is an internal runtime behind it.
client -> ingress -> module -> core SPI Use when the module owns the public protocol edge directly, as the Rust MCP runtime already does in edge and full mode.

Responsibilities by Plane¶

flowchart LR
    subgraph ControlPlane[Core platform control and policy plane]
        auth[Authentication and RBAC]
        scope[Token scoping and visibility]
        catalog[Catalogs and CRUD]
        plugin[Plugin policy]
        config[Config and secrets]
        admin[Admin UI and observability]
    end

    subgraph RuntimePlane[Protocol runtime plane]
        wire[Wire parsing and serialization]
        transport[Transport ownership]
        session[Session or task runtime]
        caps[Capability negotiation]
        upstream[Upstream protocol behavior]
    end

Core Platform Responsibilities¶

The core platform remains the common control plane and policy plane. The table below is representative, not exhaustive.

Responsibility	Notes
Authentication	JWT verification, SSO integration, token normalization, revocation checks
Authorization	RBAC, team scoping, visibility filtering, deny-path behavior
Persistence	Database models, migrations, consistency, ownership metadata
Catalogs	Tools, resources, prompts, servers, gateways, agents, providers, and other core-owned records
CRUD and admin flows	Registration, update, delete, import/export, admin workflows
Plugin policy and configuration	Central plugin config, hook selection, hook execution policy
Prompt/completion/roots business services	Prompt rendering policy, completion services, roots services, and other catalog-backed non-wire operations
LLM and upstream provider control plane	Provider credentials, model configuration, policy-aware routing metadata
gRPC and REST control surfaces	Core-owned registration, exposure metadata, and governance for virtualized services
Observability	Traces, logs, metrics, audit signals, support bundles
Configuration and secrets	Global config precedence, secret resolution, encryption
Cross-protocol routing	Mediate calls between protocol modules through core-owned catalogs and services
Admin UI	Platform UI remains core-owned even when runtimes are modularized

The core does not own protocol wire parsing, protocol transport semantics, or protocol-specific session state machines once those are extracted into a module.

Protocol Module Responsibilities¶

Each protocol module owns protocol-facing runtime behavior for one protocol family.

Responsibility	Example
Wire parsing and serialization	MCP JSON-RPC, A2A request envelopes, future LLM request formats
Protocol transport	streamable HTTP, SSE, WebSocket, stdio, long-poll, push channels
Runtime/session semantics	MCP session lifecycle, A2A task state handling, LLM chat session flow
Capability negotiation	MCP `initialize`, A2A capability advertisement, future provider capability declarations
Upstream protocol behavior	MCP upstream client pooling, A2A invocation behavior, LLM provider relay logic
Protocol-specific health and stats	Runtime-owned counters, transport stats, protocol-specific readiness

Modules should not become independent sources of truth for security policy, catalog ownership, or long-term persistence rules.

Module Runtime Contract¶

The contract has three parts:

module identity
module lifecycle
core SPI service families

Those concrete documents live under Modular Runtime Specification.

Module Identity¶

Every module should declare a stable descriptor with fields equivalent to:

module id
protocol family
implementation language
module version
supported SPI version(s)
runtime mode
embedded
sidecar
exposed capabilities
health and stats endpoints or RPCs

The exact wire schema is implementation detail. The architectural requirement is that the core can discover what a module is, what contract version it supports, and how to talk to it.

Module Lifecycle¶

Every module should support these lifecycle phases:

register The module is discovered and its descriptor is loaded.
initialize The core provides configuration, scoped dependencies, and any required bootstrap state.
ready The module can accept live traffic.
drain The module stops accepting new work and lets in-flight work complete.
shutdown The module releases resources and exits cleanly.

At minimum, the core must be able to ask a module for:

readiness
liveness
version and capability metadata
runtime stats

Core SPI Service Families¶

The exact API surface will evolve, but the module contract should be organized around stable service families rather than one-off internal endpoints.

Service family	What it provides
Auth and policy	Resolve caller context, validate authenticated identity, check permissions, enforce token-scoped visibility
Catalog read and invoke	List, fetch, and invoke tools, resources, prompts, agents, servers, gateways, providers, and related core-owned records through policy-aware services
Session and event services	Session lookup, ownership checks, replay/event access where the protocol requires shared session or event semantics
Plugin services	Execute or delegate plugin-sensitive pre/post operations under core-owned plugin policy
Observability	Trace context propagation, structured logging, audit events, module metrics publication
Configuration and secrets	Scoped config delivery, secret references, feature flags, core-provided defaults
Admin and health integration	Module stats, health, and optional descriptors the core UI can surface

Two constraints are intentionally fixed:

the architecture does not freeze exact final RPC names yet
the architecture does not require one giant interface; multiple smaller service definitions are preferred

Communication Model¶

Default Transport¶

The target-state default module-to-core transport is:

gRPC over Unix Domain Socket

Why:

language-neutral
streaming support
well understood code generation story for Python, Rust, and Go
suitable for host-local sidecar communication

This is the target-state default, not a claim about every implemented module today. The current Rust MCP runtime is the main precedent and still uses a mix of narrow internal HTTP over local/private transport depending on the path.

Fallback Transport¶

The fallback transport is:

HTTP/JSON over loopback or internal network

This is acceptable when:

a gRPC toolchain is undesirable
a runtime only needs request/response behavior
an operator environment prefers plain HTTP for debugging or policy reasons

Embedded Mode¶

Embedded modules may bypass serialization and call the same conceptual contract directly in-process.

This is an optimization, not a different architecture.

No Direct Module-to-Module Calls¶

Modules do not import or invoke each other directly.

Cross-protocol behavior must be mediated by the core.

Example:

sequenceDiagram
    participant MCP as MCP module
    participant Core as Core platform
    participant A2A as A2A module

    MCP->>Core: Invoke tool by catalog entry
    Core->>Core: Apply auth, visibility, RBAC, plugin policy
    Core->>A2A: Dispatch invoke to owning protocol runtime
    A2A-->>Core: Structured result
    Core-->>MCP: Structured result

That preserves language independence and keeps routing policy in one place.

Security and Trust Model¶

The modular architecture does not distribute trust equally.

Core-Owned Security Responsibilities¶

The core remains the source of truth for:

authentication
token scoping
RBAC
secret storage and decryption
rate limiting policy
audit and security logging

Modules may enforce the outcome of a core decision, but they do not become independent security authorities.

What Modules Receive¶

Modules should receive:

a typed authenticated context
permission decisions or permission-check APIs
scoped resource visibility through catalog calls

Modules should not be expected to:

interpret raw JWT claims as the source of truth
fetch and decrypt stored credentials
invent their own team-scoping semantics

Trust Boundary for Sidecars¶

Sidecars must communicate with the core over a trusted local or private channel. The deployment mechanism may vary, but the architectural requirements are:

the core can authenticate the module channel
arbitrary external clients cannot call privileged core-internal module APIs
channel permissions or network policy are explicit

Cross-Protocol Mediation¶

ContextForge already has cross-protocol behaviors:

A2A agents exposed as MCP tools
LLM chat invoking MCP tools
REST and gRPC services exposed as virtual servers or tools

In the modular architecture, those behaviors stay possible, but the routing belongs to the core.

The core is responsible for:

deciding which catalog entry is being invoked
determining the owning protocol/runtime
applying policy, plugin rules, and observability
dispatching to the appropriate module

The key consequence is that modules remain isolated, while the product keeps a single coherent governance model.

Plugin Model¶

Plugins remain a core-owned concern.

That means:

plugin configuration stays centralized
the core defines which hooks run
modules must preserve plugin parity on plugin-sensitive flows

There are two acceptable implementation patterns:

the module explicitly calls a core plugin SPI around the relevant operation
the module delegates a plugin-sensitive flow back to the core when parity requires it

This keeps plugin behavior consistent even when the fast path moves into a different language.

Deployment Patterns¶

The architecture supports three deployment patterns.

1. Monolithic / Embedded¶

The core and modules run in one process.

Use when:

minimizing operational complexity
migrating incrementally
performance-sensitive in-process execution is justified

2. Hybrid¶

Some protocols remain embedded while others move into sidecars.

This is the current precedent with the Rust MCP runtime:

Python remains the core
MCP may run through a Rust sidecar
A2A and other runtime paths remain embedded in Python

3. Full Sidecar Model¶

Multiple protocol runtimes run as separate processes, possibly in different languages, while the core remains the shared control plane.

This is the long-term extensibility model for future A2A, LLM, and REST/gRPC modules.

Configuration Model¶

Configuration remains layered and core-owned.

The architecture should distinguish:

core-global settings Shared platform settings such as auth, database, Redis, plugin config, and observability.
module-scoped settings Protocol-specific runtime settings such as protocol version behavior, transport tuning, or runtime-specific timeouts.

Module-scoped settings should use explicit namespacing and should be delivered through the module runtime contract rather than by relying on unrestricted global process imports.

Health, Failure, and Fallback¶

Every module should define:

how it reports readiness and liveness
how it reports degraded mode
what happens if the core becomes unavailable
what happens if the module becomes unavailable
whether traffic can fall back to an embedded or legacy path

This is especially important for incremental rollout.

The current Rust MCP runtime already demonstrates this pattern through mode-based rollout and rollback. Future modules should preserve the same operational discipline.

Testing and Release Requirements¶

Every protocol module should be expected to prove:

contract compatibility with the core SPI
protocol conformance for its protocol surface
security deny paths
fallback and rollback behavior
plugin parity for plugin-sensitive flows
performance and degradation characteristics appropriate to the protocol

Where a module introduces deployment-specific behavior, release validation should also cover:

compose or local stack validation
Kubernetes or Helm validation where applicable
upgrade and migration compatibility where applicable

The concrete target-state test matrix is defined in Conformance.

Migration Strategy¶

The migration is intentionally phased.

Phase 0: Extract seams inside the monolith¶

Create clearer boundaries inside the current Python code:

isolate protocol dispatch
isolate policy and catalog boundaries
reduce direct cross-service coupling where practical

Phase 1: Define the core SPI¶

Define the first stable internal service families between core and modules.

At this stage, modules may still be embedded.

Phase 2: Wrap existing runtimes behind module lifecycles¶

Make protocol runtimes conform to a common lifecycle and capability model even before all traffic crosses an IPC boundary.

Phase 3: Move selected runtimes to sidecars¶

Use sidecars where the performance, isolation, or language goals justify it. The current Rust MCP runtime is the first concrete example of this phase.

Phase 4: Add new protocol runtimes¶

Introduce new A2A, LLM, and REST/gRPC runtimes behind the same architectural contract.

Phase 5: Optimize¶

Only after the boundary is stable should the implementation optimize for:

direct hot paths
embedded fast paths
selective caching and event-stream ownership

What Is Decided vs What Is Still Open¶

Decided in principle¶

ContextForge should evolve toward a core-plus-modules runtime model.
The core remains the policy and control plane.
Modules are language-agnostic and process-boundary first.
Shared-nothing between modules is a design rule.
Incremental migration is the preferred path.

Still intentionally open¶

exact final SPI RPC names
exact protobuf package layout
exact module descriptor wire schema
whether all plugin hooks are always explicit SPI calls versus selective core delegation for parity-sensitive flows
how much direct data-path optimization a module may keep before it must be expressed through the generic SPI

Open Questions¶

What is the minimal first stable SPI version that supports both MCP and A2A without overfitting to either?
Which cross-module events deserve an event bus rather than synchronous core-mediated routing?
How should optional protocol surfaces be classified in release gating versus follow-up compatibility work?
What is the right balance between generic SPI purity and targeted fast paths for performance-sensitive runtimes?