LLM Module Profile

This profile defines how a future LLM proxy or chat module should map onto the current ContextForge LLM surfaces.

Current Surface

Today there are two main LLM-facing surfaces:

  • OpenAI-compatible proxy
      • POST /chat/completions
      • GET /models
  • Session-oriented chat
      • /llmchat/connect
      • /llmchat/chat
      • /llmchat/disconnect
      • /llmchat/status/{user_id}
      • /llmchat/config/{user_id}
      • /llmchat/gateway/models

Those surfaces are currently implemented through Python routers and services.

The important split is that ContextForge already has both:

  • a direct OpenAI-compatible proxy surface
  • a higher-level chat surface that coordinates models, servers, and session state
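
As a rough illustration of that split, the two surfaces can be told apart by route alone. The sketch below is not taken from the ContextForge routers: `classify_surface` and the surface labels are hypothetical names, and because HTTP methods are only listed for the proxy routes, the session-chat routes are matched on path only.

```python
import re

# Proxy routes, with the methods listed in this section.
PROXY_ROUTES = {("POST", "/chat/completions"), ("GET", "/models")}

# Session-chat routes; {user_id} is treated as a single path segment.
SESSION_PATTERNS = [
    r"/llmchat/connect",
    r"/llmchat/chat",
    r"/llmchat/disconnect",
    r"/llmchat/status/[^/]+",
    r"/llmchat/config/[^/]+",
    r"/llmchat/gateway/models",
]


def classify_surface(method: str, path: str) -> str:
    """Return a hypothetical surface label for a request."""
    if (method.upper(), path) in PROXY_ROUTES:
        return "openai-proxy"
    if any(re.fullmatch(pattern, path) for pattern in SESSION_PATTERNS):
        return "session-chat"
    return "unknown"
```

For example, `classify_surface("POST", "/chat/completions")` falls on the proxy surface, while `classify_surface("GET", "/llmchat/status/alice")` falls on the session-oriented surface.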

What the LLM Module Owns

A future LLM module should own:

  • request parsing for OpenAI-compatible and session-style chat surfaces
  • streaming transport behavior
  • provider relay runtime behavior
  • chat-session orchestration and protocol-local session state
  • provider-specific retries, deadlines, and streaming normalization
  • protocol-local metrics and runtime health
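
To make "streaming normalization" concrete, here is a minimal sketch of turning provider text fragments into OpenAI-style server-sent event frames. It assumes the provider has already been reduced to an iterable of strings; `to_openai_sse` is a hypothetical helper, and the chunk body is trimmed to a few fields (real chunks also carry values such as `id` and `created`).

```python
import json
from typing import Iterable, Iterator, Optional


def to_openai_sse(pieces: Iterable[str], model: str) -> Iterator[str]:
    """Yield OpenAI-style SSE frames for a sequence of text fragments."""

    def frame(delta: dict, finish_reason: Optional[str]) -> str:
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {"index": 0, "delta": delta, "finish_reason": finish_reason}
            ],
        }
        return f"data: {json.dumps(chunk)}\n\n"

    for piece in pieces:
        if piece:  # drop empty fragments some providers emit
            yield frame({"content": piece}, None)

    # Closing chunk with an empty delta, then the stream terminator.
    yield frame({}, "stop")
    yield "data: [DONE]\n\n"
```

Keeping this step inside the module means callers see one streaming shape regardless of which provider produced the fragments.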

What Stays in Core

The core should continue to own:

  • provider and model registry CRUD
  • provider credentials and secret handling
  • auth, RBAC, and token-scope policy
  • model visibility and governance
  • prompt, tool, and resource catalogs
  • plugin policy
  • admin UI and provider-management workflows
  • any shared governance around which virtual servers or model records are exposed to which callers

Required SPI Usage

A future LLM module will typically require:

  • AuthPolicyService
  • CatalogService for model lookup and MCP-facing resource access
  • PluginService where chat or provider flows become plugin-sensitive
  • ObservabilityService
  • ConfigSecretsService
  • SessionEventService if shared chat-session semantics are extracted
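
One way to picture this dependency set is as a single bundle handed to the module at start-up. The sketch below is an assumption about shape only: the service names come from the list above, but every method (`is_allowed`, `lookup_model`), the `LlmModuleServices` bundle, and `resolve_model` are hypothetical and do not describe the real SPI.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class AuthPolicyService(Protocol):
    # Hypothetical method; the real SPI signature is not given here.
    def is_allowed(self, caller: str, action: str) -> bool: ...


class CatalogService(Protocol):
    # Hypothetical method; returns None when the model is unknown.
    def lookup_model(self, model_id: str) -> Optional[dict]: ...


@dataclass(frozen=True)
class LlmModuleServices:
    """Hypothetical bundle of the services an LLM module receives."""

    auth_policy: AuthPolicyService
    catalog: CatalogService
    observability: object
    config_secrets: object
    # Conditional dependencies, per the list above.
    plugins: Optional[object] = None
    session_events: Optional[object] = None


def resolve_model(services: LlmModuleServices, caller: str, model_id: str) -> dict:
    """Check policy first, then look the model up through the catalog."""
    if not services.auth_policy.is_allowed(caller, f"model:{model_id}"):
        raise PermissionError("model not available")
    record = services.catalog.lookup_model(model_id)
    if record is None:
        # Same message as the deny path, so callers cannot probe for hidden models.
        raise PermissionError("model not available")
    return record
```

Leaving `plugins` and `session_events` optional mirrors the conditional wording above: a module that never reaches plugin-sensitive flows or shared session semantics does not need them.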

Cross-Protocol Constraint

When the LLM module needs to invoke MCP tools or prompts, it must route the call through the core rather than calling another module directly. The core still decides:

  • what catalog entry is being invoked
  • what protocol owns it
  • what policy and plugin rules apply
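
The constraint can be sketched as a small in-memory core that resolves the catalog entry, finds the owning protocol, applies policy, and only then dispatches. Everything here is illustrative: `Core`, `CatalogEntry`, and their methods are hypothetical names, policy is reduced to an allow-list, and plugin hooks are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    protocol: str  # owning protocol, for example "mcp"


Handler = Callable[[CatalogEntry, dict], object]


class Core:
    """Hypothetical stand-in for the core's invocation path."""

    def __init__(self) -> None:
        self._catalog: Dict[str, CatalogEntry] = {}
        self._handlers: Dict[str, Handler] = {}
        self._allowed: Set[Tuple[str, str]] = set()

    def register_entry(self, entry: CatalogEntry) -> None:
        self._catalog[entry.name] = entry

    def register_handler(self, protocol: str, handler: Handler) -> None:
        self._handlers[protocol] = handler

    def allow(self, caller: str, entry_name: str) -> None:
        self._allowed.add((caller, entry_name))

    def invoke(self, caller: str, entry_name: str, arguments: dict) -> object:
        # 1. Which catalog entry is being invoked?
        entry = self._catalog.get(entry_name)
        # 2. Which policy applies? (allow-list stands in for real policy)
        if entry is None or (caller, entry_name) not in self._allowed:
            raise PermissionError("invocation denied")
        # 3. Which protocol owns it? Dispatch to that module's handler.
        return self._handlers[entry.protocol](entry, arguments)
```

In this shape the LLM module only ever holds a reference to the core and calls `invoke`; it never imports or calls the MCP module's handler itself.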

Conformance Additions

An LLM module should additionally prove:

  • non-streaming and streaming parity
  • model visibility and deny paths
  • provider-auth failure handling without leaking sensitive details
  • correct cross-protocol invocation when chat flows reach MCP-backed tools or prompts
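
As an example of the first item, a parity check can reassemble the streamed deltas and compare them with the non-streaming message. This is a sketch under assumptions: it presumes OpenAI-style SSE frames (`data: {...}` lines ending with `data: [DONE]`) and an OpenAI-style completion body, and `assemble_stream` and `has_parity` are hypothetical helper names rather than part of any existing conformance suite.

```python
import json
from typing import Iterable

_PREFIX = "data: "


def assemble_stream(frames: Iterable[str]) -> str:
    """Concatenate delta content from OpenAI-style SSE frames."""
    parts = []
    for frame in frames:
        line = frame.strip()
        if not line.startswith(_PREFIX):
            continue  # ignore comments and keep-alive lines
        payload = line[len(_PREFIX):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)


def has_parity(frames: Iterable[str], response: dict) -> bool:
    """True when streamed text equals the non-streaming message content."""
    expected = response["choices"][0]["message"]["content"]
    return assemble_stream(frames) == expected
```

A fuller conformance check would presumably also compare finish reasons and usage accounting; this sketch covers text content only.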