# LLM Module Profile
This profile defines how a future LLM proxy or chat module should map onto the current ContextForge LLM surfaces.
## Current Surface
Today there are two main LLM-facing surfaces:
- OpenAI-compatible proxy
    - `POST /chat/completions`
    - `GET /models`
- Session-oriented chat
    - `/llmchat/connect`
    - `/llmchat/chat`
    - `/llmchat/disconnect`
    - `/llmchat/status/{user_id}`
    - `/llmchat/config/{user_id}`
    - `/llmchat/gateway/models`
These surfaces are currently implemented through Python routers and services.
The important point is that ContextForge already has both:
- a direct OpenAI-compatible proxy surface
- a higher-level chat surface that coordinates models, servers, and session state
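To make the two surfaces concrete, here is a minimal sketch of how a client might shape requests against each. Only the paths come from the surfaces listed above; the payload fields (`model`, `messages`, `user_id`, `message`) are assumptions based on common OpenAI-compatible and session-chat conventions, not confirmed ContextForge schemas.

```python
from typing import Any

def proxy_request(model: str, messages: list[dict[str, str]],
                  stream: bool = False) -> dict[str, Any]:
    """Build an OpenAI-compatible request for the proxy surface.

    The path is documented; the body shape follows the OpenAI convention
    and is illustrative only.
    """
    return {
        "method": "POST",
        "path": "/chat/completions",
        "body": {"model": model, "messages": messages, "stream": stream},
    }

def session_chat_request(user_id: str, message: str) -> dict[str, Any]:
    """Build a session-oriented chat request.

    Assumes the caller has already established a session via
    /llmchat/connect; field names here are hypothetical.
    """
    return {
        "method": "POST",
        "path": "/llmchat/chat",
        "body": {"user_id": user_id, "message": message},
    }
```

The difference in shape is the point: the proxy surface is stateless per request, while the session surface assumes connect/disconnect lifecycle state held on the server.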
## What the LLM Module Owns
A future LLM module should own:
- request parsing for OpenAI-compatible and session-style chat surfaces
- streaming transport behavior
- provider relay runtime behavior
- chat-session orchestration and protocol-local session state
- provider-specific retries, deadlines, and streaming normalization
- protocol-local metrics and runtime health
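Of the responsibilities above, provider-specific retries and deadlines are the most mechanical, so here is a minimal sketch of that behavior under stated assumptions: `ProviderError`, the retry counts, and the backoff numbers are all illustrative, not ContextForge APIs.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

class ProviderError(Exception):
    """Illustrative stand-in for a retryable provider relay failure."""

def relay_with_retries(call: Callable[[], T], *, attempts: int = 3,
                       deadline_s: float = 30.0,
                       backoff_s: float = 0.5) -> T:
    """Retry a provider call with exponential backoff under a deadline.

    Retries only retryable provider errors; gives up once the overall
    deadline has passed, even if attempts remain.
    """
    start = time.monotonic()
    last: Optional[Exception] = None
    for i in range(attempts):
        if time.monotonic() - start > deadline_s:
            break
        try:
            return call()
        except ProviderError as exc:
            last = exc
            time.sleep(backoff_s * (2 ** i))
    raise last or TimeoutError("deadline exceeded before first attempt")
```

Keeping this logic in the module (rather than core) lets each provider tune attempts, deadlines, and backoff without touching shared governance code.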
## What Stays in Core
The core should continue to own:
- provider and model registry CRUD
- provider credentials and secret handling
- auth, RBAC, and token-scope policy
- model visibility and governance
- prompt, tool, and resource catalogs
- plugin policy
- admin UI and provider-management workflows
- any shared governance around which virtual servers or model records are exposed to which callers
## Required SPI Usage
A future LLM module will typically require:
- `AuthPolicyService`
- `CatalogService` for model lookup and MCP-facing resource access
- `PluginService` where chat or provider flows become plugin-sensitive
- `ObservabilityService`
- `ConfigSecretsService`
- `SessionEventService` if shared chat-session semantics are extracted
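The shape of this dependency can be sketched with structural interfaces. The service names come from the list above, but every method signature below is an assumption made for illustration; the real SPI contracts may differ.

```python
from typing import Any, Optional, Protocol

class CatalogService(Protocol):
    """Core-owned model/catalog lookup (signature is hypothetical)."""
    def lookup_model(self, model_id: str) -> Optional[dict[str, Any]]: ...

class AuthPolicyService(Protocol):
    """Core-owned auth/RBAC decision point (signature is hypothetical)."""
    def is_allowed(self, caller: str, model_id: str) -> bool: ...

class ConfigSecretsService(Protocol):
    """Core-owned credential access (signature is hypothetical)."""
    def provider_credentials(self, provider: str) -> dict[str, str]: ...

class LLMModuleDeps:
    """The LLM module composes core services rather than owning
    registry, policy, or secret storage itself."""
    def __init__(self, catalog: CatalogService, auth: AuthPolicyService,
                 secrets: ConfigSecretsService) -> None:
        self.catalog = catalog
        self.auth = auth
        self.secrets = secrets
```

The `Protocol` classes keep the module decoupled from core implementations: any object with the right methods satisfies the dependency, which also makes the module easy to test with fakes.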
## Cross-Protocol Constraint
If the LLM module can call MCP tools or prompts, it must not call another module directly. The core still decides:
- what catalog entry is being invoked
- what protocol owns it
- what policy and plugin rules apply
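The constraint can be sketched as a core-owned dispatcher that sits between modules. Everything below is illustrative: `CoreDispatcher`, the registration API, and the catalog-entry shape are assumptions, not ContextForge code. The point is only that the LLM module hands an invocation to core, and core resolves ownership.

```python
from typing import Any, Callable

class CoreDispatcher:
    """Core-owned routing: modules never call each other directly.

    The core decides which protocol owns a catalog entry and which
    handler runs; policy and plugin checks would also hook in here.
    """
    def __init__(self) -> None:
        self._owners: dict[str, Callable[[dict[str, Any]], Any]] = {}

    def register(self, protocol: str,
                 handler: Callable[[dict[str, Any]], Any]) -> None:
        # Each protocol module registers one entry point with core.
        self._owners[protocol] = handler

    def invoke(self, catalog_entry: dict[str, Any],
               args: dict[str, Any]) -> Any:
        # Core, not the calling module, resolves the owning protocol.
        handler = self._owners[catalog_entry["protocol"]]
        return handler(args)
```

With this shape, a chat flow that reaches an MCP-backed tool goes LLM module → core dispatcher → MCP module, never LLM module → MCP module.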
## Conformance Additions
An LLM module should additionally prove:
- non-streaming and streaming parity
- model visibility and deny paths
- provider-auth failure handling without leaking sensitive details
- correct cross-protocol invocation when chat flows reach MCP-backed tools or prompts
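The first item, streaming parity, lends itself to a simple mechanical check: the concatenation of streamed chunks must equal the non-streaming completion for the same request. The helper below is a hypothetical conformance utility, not part of any existing test suite.

```python
from typing import Iterable

def assert_stream_parity(full: str, chunks: Iterable[str]) -> None:
    """Fail if a streamed response, once joined, diverges from the
    non-streaming completion for the same request."""
    joined = "".join(chunks)
    if joined != full:
        raise AssertionError(
            f"stream/non-stream mismatch: {joined!r} != {full!r}"
        )
```

A conformance suite would run the same prompt through both paths (with a deterministic or recorded backend) and feed both results to this check.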