# OpenTelemetry Integration
ContextForge integrates OpenTelemetry (OTEL) for distributed tracing, providing comprehensive observability across MCP operations, tool invocations, and plugin execution.
## Overview

The OTEL integration provides:

- **W3C Trace Context Propagation**: Automatic propagation of trace context via `traceparent` headers
- **Request-Root Spans**: Every HTTP request creates a root span in the observability middleware
- **MCP Client Spans**: Detailed tracing of MCP protocol operations (initialize, request, response)
- **Plugin Hook Spans**: Visibility into the plugin execution lifecycle
- **Session Pool Awareness**: Non-pooled sessions propagate trace context; pooled sessions skip injection to prevent context pollution
## Architecture

### Span Hierarchy

```
http.request (root span)
├── mcp.client.call
│   ├── mcp.client.initialize
│   ├── mcp.client.request
│   └── mcp.client.response
├── plugin.hook.prompt_pre_fetch
├── plugin.hook.tool_pre_invoke
└── plugin.hook.tool_post_invoke
```
### Trace Context Flow

1. **Inbound Request**: Extract the `traceparent` header from the incoming HTTP request
2. **Root Span**: Create a request-root span with the extracted trace ID
3. **Child Spans**: All operations inherit trace context automatically
4. **Outbound Requests**: Inject the `traceparent` header into MCP client calls
5. **Upstream Propagation**: Upstream MCP servers can attach their spans to the trace
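This flow hinges on the W3C `traceparent` header, which packs the trace context into four dash-separated fields. A minimal sketch of its structure (the header value is a made-up example):

```python
# W3C traceparent format: version-traceid-parentid-flags
header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"

version, trace_id, parent_span_id, flags = header.split("-")

# trace-id: 16 bytes of hex, shared by every span in the trace
assert len(trace_id) == 32
# parent-id: 8 bytes of hex, the span id of the immediate caller
assert len(parent_span_id) == 16
# flags: "01" means the caller sampled this trace
print(version, trace_id, parent_span_id, flags)
```

Every hop keeps the same `trace-id` but replaces `parent-id` with its own current span before forwarding the header.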
## Configuration

### Environment Variables
```bash
# Enable OTEL tracing
OTEL_ENABLE_OBSERVABILITY=true

# Exporter configuration
OTEL_EXPORTER_TYPE=otlp                    # otlp, jaeger, zipkin, console
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc           # grpc or http

# Service identification
OTEL_SERVICE_NAME=mcp-gateway
OTEL_SERVICE_VERSION=1.0.0

# Resource attributes (comma-separated key=value pairs)
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=production,service.namespace=mcp

# Batch processor tuning
OTEL_BSP_MAX_QUEUE_SIZE=2048
OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512
OTEL_BSP_SCHEDULE_DELAY=5000

# Copy resource attributes to span attributes (for Arize compatibility)
OTEL_COPY_RESOURCE_ATTRS_TO_SPANS=false
```
### Langfuse Integration

For Langfuse observability, use the OTLP endpoint:

```bash
OTEL_EXPORTER_TYPE=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloud.langfuse.com
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer sk-lf-...
```
## W3C Trace Context Propagation

### Inbound Propagation

The observability middleware automatically extracts W3C trace context from incoming requests. The middleware:

1. Parses the `traceparent` header
2. Extracts the trace-id and parent-span-id
3. Creates a new span as a child of the external trace
4. Stores the trace context in request state
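A stdlib-only sketch of steps 1–3, assuming the headers arrive as a plain dict (the real middleware uses the gateway's own helpers):

```python
import secrets

def start_root_span(headers: dict) -> dict:
    """Illustrative sketch: parse an inbound traceparent, keep its
    trace-id, and mint a fresh span-id for the request-root span."""
    traceparent = headers.get("traceparent")
    if traceparent:
        version, trace_id, parent_span_id, flags = traceparent.split("-")
    else:
        # No inbound context: start a brand-new trace
        trace_id, parent_span_id = secrets.token_hex(16), None
    return {
        "trace_id": trace_id,             # inherited from the caller
        "parent_span_id": parent_span_id, # the caller's span
        "span_id": secrets.token_hex(8),  # this request's root span
    }

ctx = start_root_span(
    {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
)
```

The returned context dict stands in for what the middleware stores in request state; child spans reuse its `trace_id`.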
### Outbound Propagation

When making MCP client calls, trace context is automatically injected. This ensures:

- Upstream MCP servers receive the `traceparent` header
- Distributed traces span multiple services
- End-to-end visibility across the call chain
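The gateway's `inject_trace_context_headers()` performs this injection; its exact signature is internal, so the following is a hypothetical stand-in that shows the shape of the operation:

```python
def inject_traceparent(base_headers: dict, trace_id: str, span_id: str,
                       sampled: bool = True) -> dict:
    """Hypothetical sketch of outbound injection: copy the base
    headers and add a traceparent naming the current span."""
    headers = dict(base_headers)  # never mutate shared base headers
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers

out = inject_traceparent(
    {"Accept": "text/event-stream"},
    "4bf92f3577b34da6a3ce929d0e0e4736",
    "00f067aa0ba902b7",
)
```

Copying rather than mutating matters: the same base headers may be shared across concurrent requests.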
## Session Pooling with Tracing

### Design Decision and Trade-off

Current behavior:

```python
# Session pool enabled, but trace headers NOT injected
if settings.mcp_session_pool_enabled:
    # Use base headers without trace context injection
    async with pool.session(url=server_url, headers=headers) as pooled:
        # Pool provides a 10-20x latency improvement,
        # but trace context does NOT propagate upstream
        ...
```
### Why Trace Headers Are Not Injected

The MCP SDK pins headers at transport creation time. If per-request trace headers (`traceparent`, `X-Correlation-ID`) were injected before pooling:

- **Trace Corruption**: The first request's trace context gets pinned to the transport
- **Context Leakage**: Later, unrelated requests reuse the same trace ID
- **Broken Distributed Tracing**: Upstream servers see wrong parent spans
- **Correlation ID Leakage**: Different requests appear correlated when they are not
### The Trade-off

| Aspect | Pooled Sessions | Non-Pooled Sessions |
|---|---|---|
| Latency | 10-20x faster (reuses connection) | Slower (new connection each time) |
| Trace Propagation | ❌ No upstream propagation | ✅ Full W3C trace context |
| Correlation IDs | ❌ Not sent to upstream | ✅ Sent per-request |
| Use Case | High-throughput, internal tracing | Distributed tracing across services |
### When to Use Each

**Use session pooling (default)** when:

- Request volume to the same MCP servers is high
- Internal observability is sufficient
- The 10-20x latency improvement is critical
- Upstream servers don't need trace context

**Disable session pooling (for distributed tracing)** when:

- You need end-to-end distributed tracing
- Upstream MCP servers participate in traces
- Correlation IDs must reach upstream
- The added latency is an acceptable trade-off

### Implementation Details

The session pool:

- Reuses transports with pinned headers (base headers only)
- Does NOT inject per-request trace headers
- Provides a 10-20x latency improvement
- Maintains internal trace context within the gateway
- Does not propagate trace context to upstream servers
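The branching described above can be condensed into one sketch; the function and parameter names here are illustrative, not the gateway's actual API:

```python
def headers_for_session(base_headers: dict, pool_enabled: bool,
                        traceparent: str) -> dict:
    """Pooled transports keep only the pinned base headers; a
    per-request traceparent would be reused by unrelated requests
    once the transport is cached in the pool."""
    if pool_enabled:
        return dict(base_headers)  # no per-request trace context
    headers = dict(base_headers)
    headers["traceparent"] = traceparent  # safe: transport is single-use
    return headers

tp = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
pooled = headers_for_session({"Accept": "application/json"}, True, tp)
direct = headers_for_session({"Accept": "application/json"}, False, tp)
```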
## Security Considerations

### Sanitization

All sensitive data is sanitized before being added to OTEL spans:

```python
# Query string sanitization
"url.query": sanitize_trace_text(str(request.url.query))

# Exception message sanitization
sanitized_error = sanitize_for_log(sanitize_trace_text(str(e)))
"exception.message": sanitized_error
```

This prevents:

- Leaking credentials in query parameters
- Exposing sensitive error details
- Bypassing existing sanitization flows
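`sanitize_trace_text` is a gateway-internal helper. As an illustration of the idea only, a minimal redactor for sensitive query parameters might look like:

```python
from urllib.parse import parse_qsl, urlencode

# Illustrative key list; the gateway's real rules differ
SENSITIVE_KEYS = {"token", "api_key", "password", "secret", "authorization"}

def redact_query(query: str) -> str:
    """Sketch only: replace values of sensitive query parameters
    before attaching the string to a span attribute."""
    pairs = [
        (key, "REDACTED" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in parse_qsl(query, keep_blank_values=True)
    ]
    return urlencode(pairs)

print(redact_query("page=2&token=sk-abc123"))  # → page=2&token=REDACTED
```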
### Data Minimization

Only essential attributes are exported:

- HTTP method, path, and status code
- Tool names and IDs (not arguments)
- Timing information
- Error types (not full stack traces in production)
## Span Naming Conventions

All spans follow the `<domain>.<operation>` pattern:

| Domain | Operations | Example |
|---|---|---|
| `http` | request | `http.request` |
| `mcp.client` | call, initialize, request, response | `mcp.client.call` |
| `tool` | invoke, list | `tool.invoke` |
| `prompt` | render, list | `prompt.render` |
| `resource` | invoke, list | `resource.invoke` |
| `plugin.hook` | prompt_pre_fetch, tool_pre_invoke, etc. | `plugin.hook.tool_pre_invoke` |
## Semantic Attributes

### Standard Attributes

Following OpenTelemetry semantic conventions:

```json
{
  "http.method": "POST",
  "http.route": "/tools/invoke",
  "http.status_code": 200,
  "network.protocol.name": "mcp",
  "server.address": "localhost",
  "server.port": 8000,
  "url.path": "/mcp/sse",
  "url.full": "http://localhost:8000/mcp/sse"
}
```
### Custom Attributes

ContextForge-specific attributes use the `contextforge.` prefix:

```json
{
  "contextforge.tool.id": "tool-123",
  "contextforge.gateway_id": "gateway-456",
  "contextforge.runtime": "python",
  "contextforge.transport": "sse",
  "contextforge.user.email": "user@example.com",
  "contextforge.team.id": "team-789"
}
```
## Plugin Server Tracing

External plugin servers can enable OTEL tracing:

```bash
# In the plugin server environment
OTEL_ENABLE_OBSERVABILITY=true
OTEL_SERVICE_NAME=my-plugin-server
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
```

**Important**: `OTEL_SERVICE_NAME` must be set before importing `mcpgateway.observability`, as the tracer is initialized at import time.
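Because the tracer is created at import time, the environment must be configured first when setting it from Python. A sketch of the required ordering (the import itself is commented out, since it needs the gateway package installed):

```python
import os

# 1. Configure identity BEFORE the observability module is imported;
#    env vars set after import are ignored by the already-built tracer.
os.environ["OTEL_SERVICE_NAME"] = "my-plugin-server"
os.environ["OTEL_ENABLE_OBSERVABILITY"] = "true"
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4317"

# 2. Only now import the module that initializes the tracer
# import mcpgateway.observability  # requires mcpgateway to be installed
```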
## Performance Impact

### Overhead

- **Minimal**: ~1-2 ms per request for span creation
- **Batch Export**: Spans are batched and exported asynchronously
- **Configurable**: Adjust batch size and delay via environment variables
### Optimization

```bash
# Increase batch size for high-throughput scenarios
OTEL_BSP_MAX_EXPORT_BATCH_SIZE=1024
OTEL_BSP_SCHEDULE_DELAY=10000  # 10 seconds

# Increase queue size to prevent drops
OTEL_BSP_MAX_QUEUE_SIZE=4096
```
## Troubleshooting

### No Traces Appearing

- **Check OTEL is enabled**: `OTEL_ENABLE_OBSERVABILITY=true`
- **Verify the exporter endpoint**: Test connectivity to the OTLP endpoint
- **Check the service name**: Ensure `OTEL_SERVICE_NAME` is set correctly
- **Review logs**: Look for the "OpenTelemetry initialized" message

### Broken Trace Context

- **Verify header injection**: Check that `inject_trace_context_headers()` is called
- **Session pool headers**: Ensure headers are injected before `pool.session()`
- **Upstream support**: Verify the upstream MCP server supports W3C trace context
### Performance Issues

- **Reduce batch delay**: Lower `OTEL_BSP_SCHEDULE_DELAY` for faster export
- **Increase batch size**: Raise `OTEL_BSP_MAX_EXPORT_BATCH_SIZE` to reduce export frequency
- **Check the exporter**: Ensure the OTLP endpoint is responsive
## Examples

### Basic Tracing

```python
from mcpgateway.observability import create_span, set_span_attribute

with create_span("custom.operation", {"custom.attr": "value"}):
    # Your code here
    set_span_attribute("result.count", 42)
```
### Distributed Tracing

```python
# Service A (ContextForge)
headers = inject_trace_context_headers(base_headers)
response = await httpx_client.post(upstream_url, headers=headers)

# Service B (upstream MCP server)
# automatically extracts traceparent and attaches to the trace
```
### Plugin Hook Tracing

```python
# Automatic tracing in the plugin framework
async def tool_pre_invoke(self, payload, context):
    # This hook execution is automatically traced
    # Span name: plugin.hook.tool_pre_invoke
    return PluginResult(continue_processing=True)
```