OpenTelemetry Integration¶

ContextForge integrates OpenTelemetry (OTEL) for distributed tracing, providing comprehensive observability across MCP operations, tool invocations, and plugin execution.

Overview¶

The OTEL integration provides:

W3C Trace Context Propagation: Automatic propagation of trace context via traceparent headers
W3C Baggage Support: Optional extraction of allowlisted HTTP headers into OTEL baggage
Request-Root Spans: Every HTTP request creates a root span in the observability middleware
MCP Client Spans: Detailed tracing of MCP protocol operations (initialize, request, response)
Plugin Hook Spans: Visibility into plugin execution lifecycle
Session Pool Awareness: Non-pooled sessions propagate trace context and baggage; pooled sessions skip per-request propagation to prevent context pollution

Architecture¶

Span Hierarchy¶

http.request (root span)
├── mcp.client.call
│   ├── mcp.client.initialize
│   ├── mcp.client.request
│   └── mcp.client.response
├── plugin.hook.prompt_pre_fetch
├── plugin.hook.tool_pre_invoke
└── plugin.hook.tool_post_invoke

Span attributes may also include request baggage dimensions, for example:
- `baggage.tenant.id`
- `baggage.user.id`
- `baggage.request.id`

Trace Context Flow¶

Inbound Request: Extract traceparent header from incoming HTTP request
Optional Baggage Extraction: Convert configured request headers into OTEL baggage
Root Span: Create request-root span with extracted trace ID and baggage attributes
Child Spans: All operations inherit trace context automatically
Outbound Requests: Inject traceparent and optional baggage headers into MCP client calls
Upstream Propagation: Upstream MCP servers can attach their spans to the trace

Configuration¶

Environment Variables¶

# Enable OTEL tracing
OTEL_ENABLE_OBSERVABILITY=true

# Exporter configuration
OTEL_EXPORTER_TYPE=otlp                           # otlp, jaeger, zipkin, console
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc                  # grpc or http

# Service identification
OTEL_SERVICE_NAME=mcp-gateway
OTEL_SERVICE_VERSION=1.0.0

# Resource attributes (comma-separated key=value pairs)
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=production,service.namespace=mcp

# Batch processor tuning
OTEL_BSP_MAX_QUEUE_SIZE=2048
OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512
OTEL_BSP_SCHEDULE_DELAY=5000

# Copy resource attributes to span attributes (for Arize compatibility)
OTEL_COPY_RESOURCE_ATTRS_TO_SPANS=false

# Optional baggage extraction from inbound HTTP headers
OTEL_BAGGAGE_ENABLED=false
OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Tenant-ID", "baggage_key": "tenant.id"},
  {"header_name": "X-User-ID", "baggage_key": "user.id"}
]'
OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=false
OTEL_BAGGAGE_MAX_ITEMS=32
OTEL_BAGGAGE_MAX_SIZE_BYTES=8192
OTEL_BAGGAGE_LOG_REJECTED=true
OTEL_BAGGAGE_LOG_SANITIZATION=true
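Two of these variables carry structured values: `OTEL_RESOURCE_ATTRIBUTES` is a comma-separated list of key=value pairs, and `OTEL_BAGGAGE_HEADER_MAPPINGS` is a JSON array. The sketch below shows one way such values could be parsed. The helper names `parse_resource_attributes` and `parse_header_mappings` are hypothetical and are not taken from the gateway code, which may parse these settings differently.

```python
import json


def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Split 'k1=v1,k2=v2' into a dict, skipping empty or malformed pairs."""
    attrs: dict[str, str] = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def parse_header_mappings(raw: str) -> dict[str, str]:
    """Turn the JSON array into {lowercased header name: baggage key}."""
    mappings = json.loads(raw or "[]")
    return {m["header_name"].lower(): m["baggage_key"] for m in mappings}
```

Lowercasing the header name at load time is a design choice in this sketch: it lets later lookups be case-insensitive without repeated normalization.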

Langfuse Integration¶

For Langfuse observability, use the OTLP endpoint:

OTEL_EXPORTER_TYPE=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloud.langfuse.com
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer sk-lf-...

W3C Trace Context Propagation¶

Inbound Propagation¶

The observability middleware automatically extracts W3C trace context from incoming requests:

GET /mcp/sse HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

The middleware:

1. Parses the traceparent header
2. Extracts trace-id and parent-span-id
3. Creates a new span as a child of the external trace
4. Stores trace context in request state
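The parsing step can be illustrated with a small standalone function. This is a minimal sketch of W3C `traceparent` parsing for version `00` only, not the middleware's actual implementation; `TraceParent` and `parse_traceparent` are hypothetical names, and headers from future versions with extra fields are simply rejected here.

```python
import re
from typing import NamedTuple, Optional

# version-traceid-parentid-flags, all lowercase hex, as defined by W3C Trace Context
_TRACEPARENT_RE = re.compile(
    r"(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed or invalid."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None or match["version"] == "ff":
        return None
    # All-zero IDs are explicitly invalid in the specification
    if match["trace_id"] == "0" * 32 or match["parent_id"] == "0" * 16:
        return None
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent(match["version"], match["trace_id"], match["parent_id"], sampled)
```

Returning `None` for invalid input lets the caller fall back to starting a fresh trace instead of failing the request.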

Outbound Propagation¶

When making MCP client calls, trace context is automatically injected:

# Automatic injection in tool_service.py
pooled_headers = inject_trace_context_headers(headers)

This ensures:

- Upstream MCP servers receive the traceparent header
- Distributed traces span multiple services
- End-to-end visibility across the call chain
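To make the injected value concrete, the following sketch builds a `traceparent` header from a trace ID and the current span ID. It is an illustration only: `inject_traceparent` is a hypothetical helper and is not the gateway's `inject_trace_context_headers()`, whose signature and behavior are not shown in this document.

```python
def inject_traceparent(headers: dict[str, str], trace_id: str, span_id: str,
                       sampled: bool = True) -> dict[str, str]:
    """Return a copy of headers with a version-00 traceparent added."""
    if len(trace_id) != 32 or len(span_id) != 16:
        raise ValueError("trace_id must be 32 hex chars and span_id 16 hex chars")
    flags = "01" if sampled else "00"
    outbound = dict(headers)  # copy so the caller's base headers stay untouched
    outbound["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return outbound
```

Copying the headers instead of mutating them keeps per-request trace context out of any shared base header dictionary.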

W3C Baggage Support¶

Purpose¶

W3C baggage carries low-cardinality request context alongside tracing information. In ContextForge this is intended for metadata such as tenant, user, or request IDs that improve trace filtering and cross-service diagnosis.

Inbound Header Extraction¶

When baggage support is enabled, the gateway can map specific inbound HTTP headers to baggage keys. Only headers listed in OTEL_BAGGAGE_HEADER_MAPPINGS are considered.

Security-Enhanced Baggage Processing¶

Dual Processing Model¶

ContextForge implements a fail-closed security model for baggage processing with two distinct input channels:

1. Header Extraction (Gatekeeper)¶

Inbound HTTP headers are converted to baggage using an explicit allowlist:

OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Tenant-ID", "baggage_key": "tenant.id"},
  {"header_name": "X-User-ID", "baggage_key": "user.id"}
]'

Security Properties:

- Only explicitly configured headers are processed
- Case-insensitive header matching prevents bypass
- Values are sanitized (control characters removed)
- Size limits prevent resource exhaustion
- Undefined headers are logged and rejected
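The allowlist lookup described above can be sketched as follows. `extract_baggage_from_headers` is a hypothetical helper, and the sketch assumes the mappings have already been loaded into a dictionary keyed by lowercased header name. Sanitization, limits, and logging are left out to keep the example short.

```python
from typing import Mapping


def extract_baggage_from_headers(headers: Mapping[str, str],
                                 mappings: Mapping[str, str]) -> dict[str, str]:
    """Copy allowlisted headers into baggage; everything else is ignored."""
    baggage: dict[str, str] = {}
    for name, value in headers.items():
        key = mappings.get(name.lower())  # case-insensitive match
        if key is None:
            continue  # header is not in the allowlist
        baggage[key] = value
    return baggage
```

For example, with mappings for `X-Tenant-ID` and `X-User-ID`, an unrelated header such as `X-Debug` never reaches the baggage.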

2. Inbound Baggage Header (Security Enhancement)¶

The W3C baggage header from upstream callers is also processed, but with strict filtering:

GET /mcp/sse HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
baggage: tenant.id=tenant-123,user.id=user-456,malicious.key=attack

Security Properties:

- Only baggage keys matching configured baggage_key values are accepted
- Unauthorized keys (e.g., malicious.key) are filtered out
- Values undergo same sanitization as header-extracted baggage
- Same size and item limits apply
- Fail-closed: unknown keys are rejected, not propagated
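The filtering of an inbound `baggage` header can be sketched in a few lines. This is an assumption-laden illustration, not the gateway's parser: `filter_inbound_baggage` is a hypothetical name, optional `;property` metadata is discarded, and sanitization and limits are omitted.

```python
from urllib.parse import unquote


def filter_inbound_baggage(header: str, allowed_keys: set[str]) -> dict[str, str]:
    """Keep only list members whose key is in the allowlist (fail-closed)."""
    accepted: dict[str, str] = {}
    for member in header.split(","):
        pair = member.split(";", 1)[0]  # drop optional ";property" metadata
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or key not in allowed_keys:
            continue  # malformed member or unknown key: reject
        accepted[key] = unquote(value.strip())  # baggage values are percent-encoded
    return accepted
```

Applied to the request above with `tenant.id` and `user.id` allowed, `malicious.key` is dropped and the two configured keys are kept.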

Configuration Approach¶

Production-Ready Example¶

# Enable baggage with security controls
OTEL_BAGGAGE_ENABLED=true

# Define allowlist: only these headers → baggage keys
OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Tenant-ID", "baggage_key": "tenant.id"},
  {"header_name": "X-User-ID", "baggage_key": "user.id"},
  {"header_name": "X-Request-ID", "baggage_key": "request.id"},
  {"header_name": "X-Correlation-ID", "baggage_key": "correlation.id"}
]'

# Security: disable downstream propagation by default
OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=false

# Resource limits
OTEL_BAGGAGE_MAX_ITEMS=32
OTEL_BAGGAGE_MAX_SIZE_BYTES=8192

# Audit logging
OTEL_BAGGAGE_LOG_REJECTED=true
OTEL_BAGGAGE_LOG_SANITIZATION=true

Multi-Tenant Example¶

# Multi-tenant SaaS with user context
OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Tenant-ID", "baggage_key": "tenant.id"},
  {"header_name": "X-Organization-ID", "baggage_key": "org.id"},
  {"header_name": "X-User-ID", "baggage_key": "user.id"},
  {"header_name": "X-User-Email", "baggage_key": "user.email"},
  {"header_name": "X-Session-ID", "baggage_key": "session.id"}
]'

Distributed Tracing Example¶

# Enable downstream propagation for distributed tracing
OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=true

# Minimal context for cross-service correlation
OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Request-ID", "baggage_key": "request.id"},
  {"header_name": "X-Trace-ID", "baggage_key": "trace.id"}
]'

Security Implications of `PROPAGATE_TO_EXTERNAL`¶

When Disabled (Default - Recommended)¶

OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=false

Behavior:

- Baggage is recorded on internal spans only
- Downstream MCP servers do NOT receive baggage header
- Trace context (traceparent) still propagates
- Prevents leaking tenant/user metadata to external services

Use When:

- Downstream services are untrusted or third-party
- Baggage contains sensitive tenant/user identifiers
- You want observability without metadata exposure
- Compliance requires data minimization

When Enabled (Opt-In)¶

OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=true

Behavior:

- Baggage is sent to downstream MCP servers via baggage header
- Enables end-to-end correlation across service boundaries
- Downstream services can attach baggage to their spans

Use When:

- All downstream services are trusted and internal
- Cross-service correlation is required
- Downstream services need tenant/user context
- You control the entire service mesh

Security Considerations:

- Review what metadata is being propagated
- Ensure downstream services sanitize baggage
- Consider data residency and compliance requirements
- Monitor for baggage size explosion

Validation and Constraints¶

Header Name Validation¶

# Valid header names (RFC 7230)
X-Tenant-ID      ✅
X-User-ID        ✅
X-Request-ID     ✅

# Invalid header names
X-Tenant@ID      ❌ (special characters)
1-Tenant-ID      ❌ (starts with number)
X Tenant ID      ❌ (contains spaces)

Baggage Key Validation¶

# Valid baggage keys (W3C spec)
tenant.id        ✅
user.id          ✅
request-id       ✅
user_email       ✅

# Invalid baggage keys
tenant@id        ❌ (special characters)
1tenant.id       ❌ (starts with number)
tenant id        ❌ (contains spaces)
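The examples above are consistent with simple pattern checks. The two regular expressions below are assumptions inferred from those examples (start with a letter, then letters, digits, and a small set of separators); the gateway's real validation rules may be stricter or looser.

```python
import re

# Assumed patterns, inferred from the valid/invalid examples in this section
HEADER_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
BAGGAGE_KEY_RE = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_valid_header_name(name: str) -> bool:
    """True if the whole name matches the assumed header-name pattern."""
    return HEADER_NAME_RE.fullmatch(name) is not None


def is_valid_baggage_key(key: str) -> bool:
    """True if the whole key matches the assumed baggage-key pattern."""
    return BAGGAGE_KEY_RE.fullmatch(key) is not None
```

Using `fullmatch` rather than `match` matters here: it rejects names such as `X-Tenant@ID` that only begin with a valid prefix.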

Size Limits¶

Limit	Default	Purpose
Max Items	32	Prevent cardinality explosion
Max Size	8192 bytes	Prevent resource exhaustion
Max Key Length	256 chars	W3C spec compliance
Max Value Length	4096 chars	Prevent header bloat
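One possible way to apply these limits is shown below. The sketch makes several assumptions that this document does not confirm: total size is counted as the UTF-8 byte length of keys plus values, items are considered in insertion order, and an item that would break a limit is rejected individually. `enforce_limits` is a hypothetical helper.

```python
def enforce_limits(items: dict[str, str], max_items: int = 32,
                   max_size_bytes: int = 8192, max_key_len: int = 256,
                   max_value_len: int = 4096) -> tuple[dict[str, str], list[str]]:
    """Return (accepted items, rejected keys) under the configured limits."""
    accepted: dict[str, str] = {}
    rejected: list[str] = []
    total = 0
    for key, value in items.items():
        size = len(key.encode("utf-8")) + len(value.encode("utf-8"))
        if (
            len(key) > max_key_len
            or len(value) > max_value_len
            or len(accepted) >= max_items
            or total + size > max_size_bytes
        ):
            rejected.append(key)  # fail closed: drop the item, keep processing
            continue
        accepted[key] = value
        total += size
    return accepted, rejected
```

Returning the rejected keys separately makes it straightforward to log them when `OTEL_BAGGAGE_LOG_REJECTED` is enabled.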

Sanitization Process¶

All baggage values undergo sanitization:

# Control characters removed
"value\x00\x01\x02" → "value"

# Whitespace normalized
"value   with   spaces" → "value with spaces"

# Empty after sanitization → rejected
"\x00\x01\x02" → (rejected)

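The three rules above can be expressed as a short function. This is a minimal sketch rather than the gateway's sanitizer: `sanitize_baggage_value` is a hypothetical name, and the sketch treats tabs and newlines as control characters to be removed, which is an assumption.

```python
import re
from typing import Optional

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")  # ASCII control characters, incl. tab/CR/LF
_WHITESPACE_RUNS = re.compile(r"\s+")


def sanitize_baggage_value(value: str) -> Optional[str]:
    """Strip control characters, collapse whitespace, reject empty results."""
    cleaned = _CONTROL_CHARS.sub("", value)
    cleaned = _WHITESPACE_RUNS.sub(" ", cleaned).strip()
    return cleaned or None  # None signals that the value was rejected
```

Removing carriage returns and line feeds before a value is ever written to an outbound header also guards against header injection.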
Monitoring and Auditing¶

Enable logging to track security events:

OTEL_BAGGAGE_LOG_REJECTED=true      # Log rejected headers/keys
OTEL_BAGGAGE_LOG_SANITIZATION=true  # Log sanitized values

Logged Events:

- Rejected undefined headers (not in allowlist)
- Rejected unauthorized baggage keys (not in allowlist)
- Values sanitized (control characters removed)
- Size limit violations
- Item limit violations

Best Practices¶

Minimize Baggage Keys: Only include essential correlation metadata
Disable External Propagation: Keep PROPAGATE_TO_EXTERNAL=false unless required
Use Low-Cardinality Values: Avoid high-cardinality data (e.g., timestamps, UUIDs in values)
Enable Audit Logging: Monitor rejected headers and sanitization events
Review Regularly: Audit configured mappings and remove unused entries
Test Limits: Verify size and item limits match your use case
Document Mappings: Maintain documentation of header → baggage key mappings

As an example of inbound header extraction, consider a request that carries two mapped headers:

GET /mcp/sse HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
X-Tenant-ID: tenant-123
X-User-ID: user-456

With this configuration:

OTEL_BAGGAGE_ENABLED=true
OTEL_BAGGAGE_HEADER_MAPPINGS='[
  {"header_name": "X-Tenant-ID", "baggage_key": "tenant.id"},
  {"header_name": "X-User-ID", "baggage_key": "user.id"}
]'

the request context will carry baggage similar to:

tenant.id=tenant-123,user.id=user-456

The baggage middleware runs before the request-root tracing middleware, so the root request span and child spans can all see the same baggage values.

Outbound Propagation¶

Outbound propagation of baggage is opt-in:

OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL=true

When enabled, outbound MCP client requests include a W3C baggage header in addition to traceparent. When disabled, baggage remains internal to the gateway and is recorded only on spans.
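The opt-in behavior can be sketched as a header builder that only emits `baggage` when propagation is switched on. `build_outbound_headers` is a hypothetical helper; the gateway's actual code path is not shown in this document.

```python
from urllib.parse import quote


def build_outbound_headers(base_headers: dict[str, str], traceparent: str,
                           baggage: dict[str, str],
                           propagate_baggage: bool = False) -> dict[str, str]:
    """Always add traceparent; add a W3C baggage header only when opted in."""
    headers = dict(base_headers)
    headers["traceparent"] = traceparent
    if propagate_baggage and baggage:
        # Values are percent-encoded as required by the W3C Baggage format
        headers["baggage"] = ",".join(
            f"{key}={quote(value, safe='')}" for key, value in baggage.items()
        )
    return headers
```

Defaulting `propagate_baggage` to `False` mirrors the default of `OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL`, so tenant and user metadata stay internal unless explicitly released.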

Existing Inbound `baggage` Header¶

The gateway also parses inbound W3C baggage headers from upstream callers, but it does not trust them blindly. Incoming baggage is filtered to the configured baggage-key allowlist and is subject to sanitization and size limits before it is merged into the active request context.

Baggage Configuration¶

Variable	Default	Purpose
`OTEL_BAGGAGE_ENABLED`	`false`	Enables baggage extraction and span enrichment
`OTEL_BAGGAGE_HEADER_MAPPINGS`	`[]`	JSON array of `header_name` to `baggage_key` mappings
`OTEL_BAGGAGE_PROPAGATE_TO_EXTERNAL`	`false`	Sends baggage to downstream services when enabled
`OTEL_BAGGAGE_MAX_ITEMS`	`32`	Maximum accepted baggage items per request
`OTEL_BAGGAGE_MAX_SIZE_BYTES`	`8192`	Maximum combined baggage size from untrusted request input
`OTEL_BAGGAGE_LOG_REJECTED`	`true`	Logs rejected headers and baggage keys
`OTEL_BAGGAGE_LOG_SANITIZATION`	`true`	Logs when values are altered by sanitization

Session Pooling with Tracing¶

Design Decision and Trade-off¶

Current Behavior:

# Session pool enabled only when tracing context is inactive
if settings.mcp_session_pool_enabled and not otel_context_active():
    # Use base headers without trace context injection
    async with pool.session(url=server_url, headers=headers) as pooled:
        # Pool provides 10-20x latency improvement
        # But per-request trace context and baggage do NOT propagate upstream

Why Trace Headers Are Not Injected¶

The MCP SDK pins headers at transport creation time. If we inject per-request trace headers (traceparent, X-Correlation-ID) or baggage before pooling:

Trace Corruption: The first request's trace context gets pinned to the transport
Context Leakage: Later unrelated requests reuse the same trace ID
Broken Distributed Tracing: Upstream servers see wrong parent spans
Correlation ID Leakage: Different requests appear correlated when they're not
Baggage Leakage: Tenant or request metadata from one request bleeds into another
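The failure modes listed above follow directly from headers being fixed when a transport is created. The toy model below reproduces the effect; `PinnedTransport` and `ToyPool` are invented for illustration and do not correspond to classes in the MCP SDK or the gateway.

```python
class PinnedTransport:
    """Toy transport whose headers are fixed at creation time."""

    def __init__(self, headers: dict[str, str]) -> None:
        self.headers = dict(headers)

    def send(self, payload: str) -> dict:
        return {"payload": payload, "headers": self.headers}


class ToyPool:
    """Toy pool that reuses one transport per URL."""

    def __init__(self) -> None:
        self._transports: dict[str, PinnedTransport] = {}

    def session(self, url: str, headers: dict[str, str]) -> PinnedTransport:
        # Headers only take effect the first time a URL is seen
        if url not in self._transports:
            self._transports[url] = PinnedTransport(headers)
        return self._transports[url]


pool = ToyPool()
trace_a = "00-" + "a" * 32 + "-" + "1" * 16 + "-01"
trace_b = "00-" + "b" * 32 + "-" + "2" * 16 + "-01"
first = pool.session("http://upstream", {"traceparent": trace_a})
second = pool.session("http://upstream", {"traceparent": trace_b})
sent = second.send("request-2")
leaked = sent["headers"]["traceparent"] == trace_a  # request 2 carries request 1's trace
```

In this model the second request is sent with the first request's `traceparent`, which is exactly the corruption that skipping per-request injection for pooled sessions avoids.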

The Trade-off¶

Aspect	Pooled Sessions	Non-Pooled Sessions
Latency	10-20x faster (reuse connection)	Slower (new connection each time)
Trace Propagation	❌ No upstream propagation	✅ Full W3C trace context
Baggage Propagation	❌ No upstream propagation	✅ Optional W3C baggage propagation
Correlation IDs	❌ Not sent to upstream	✅ Sent per-request
Use Case	High-throughput, internal tracing	Distributed tracing across services

When to Use Each¶

Use Session Pooling (default):

- High request volume to same MCP servers
- Internal observability is sufficient
- 10-20x latency improvement is critical
- Upstream servers don't need trace context or baggage

Disable Session Pooling (for distributed tracing):

MCP_SESSION_POOL_ENABLED=false

- Need end-to-end distributed tracing
- Upstream MCP servers participate in traces
- Need downstream baggage propagation
- Correlation IDs must reach upstream
- Latency is acceptable trade-off

Implementation Details¶

The session pool:

- Reuses transports with pinned headers (base headers only)
- Does NOT inject per-request trace headers or baggage
- Provides 10-20x latency improvement
- Maintains internal trace context within gateway
- Upstream servers do not receive trace or baggage propagation

Security Considerations¶

Sanitization¶

All sensitive data is sanitized before it is added to OTEL spans:

# Query string sanitization
"url.query": sanitize_trace_text(str(request.url.query))

# Exception message sanitization
sanitized_error = sanitize_for_log(sanitize_trace_text(str(e)))
"exception.message": sanitized_error

This prevents:

- Leaking credentials in query parameters
- Exposing sensitive error details
- Bypassing existing sanitization flows

Baggage Hardening¶

Header-to-baggage conversion uses a fail-closed model:

Only explicitly configured headers are processed
Only configured baggage keys are accepted from inbound baggage headers
Values are sanitized before use and before downstream propagation
Untrusted request input is constrained by item-count and size limits
Downstream propagation is disabled by default

This reduces the risk of:

- High-cardinality baggage exploding span dimensions
- Propagating attacker-controlled metadata to downstream services
- Cross-request leakage through pooled transports
- Using observability channels to exfiltrate sensitive request content

Data Minimization¶

Only essential attributes are exported:

- HTTP method, path, status code
- Tool names and IDs (not arguments)
- Timing information
- Error types (not full stack traces in production)

Span Naming Conventions¶

All spans follow the <domain>.<operation> pattern:

Domain	Operations	Example
`http`	`request`	`http.request`
`mcp.client`	`call`, `initialize`, `request`, `response`	`mcp.client.call`
`tool`	`invoke`, `list`	`tool.invoke`
`prompt`	`render`, `list`	`prompt.render`
`resource`	`invoke`, `list`	`resource.invoke`
`plugin.hook`	`prompt_pre_fetch`, `tool_pre_invoke`, etc.	`plugin.hook.tool_pre_invoke`
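Because some domains contain dots themselves (for example `mcp.client` and `plugin.hook`), the operation is always the last dot-separated segment. The sketch below splits a span name on that basis; `split_span_name` is a hypothetical helper and is not part of the gateway.

```python
def split_span_name(name: str) -> tuple[str, str]:
    """Split '<domain>.<operation>' at the last dot."""
    domain, sep, operation = name.rpartition(".")
    if not sep or not domain or not operation:
        raise ValueError(f"span name does not follow <domain>.<operation>: {name!r}")
    return domain, operation
```

Splitting at the last dot rather than the first keeps multi-part domains intact, so `plugin.hook.tool_pre_invoke` yields the domain `plugin.hook`.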

Semantic Attributes¶

Standard Attributes¶

Following OpenTelemetry semantic conventions:

{
    "http.method": "POST",
    "http.route": "/tools/invoke",
    "http.status_code": 200,
    "network.protocol.name": "mcp",
    "server.address": "localhost",
    "server.port": 8000,
    "url.path": "/mcp/sse",
    "url.full": "http://localhost:8000/mcp/sse",
}

Custom Attributes¶

ContextForge-specific attributes use the contextforge. prefix; attributes derived from request baggage use the baggage. prefix:

{
    "contextforge.tool.id": "tool-123",
    "contextforge.gateway_id": "gateway-456",
    "contextforge.runtime": "python",
    "contextforge.transport": "sse",
    "contextforge.user.email": "user@example.com",
    "contextforge.team.id": "team-789",
    "baggage.tenant.id": "tenant-123",
    "baggage.user.id": "user-456",
}

Plugin Server Tracing¶

External plugin servers can enable OTEL tracing:

# In plugin server environment
OTEL_ENABLE_OBSERVABILITY=true
OTEL_SERVICE_NAME=my-plugin-server
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

Important: OTEL_SERVICE_NAME must be set before importing mcpgateway.observability, because the tracer is initialized at import time.

Performance Impact¶

Overhead¶

Minimal: ~1-2ms per request for span creation
Batch Export: Spans are batched and exported asynchronously
Configurable: Adjust batch size and delay via environment variables

Optimization¶

# Increase batch size for high-throughput scenarios
OTEL_BSP_MAX_EXPORT_BATCH_SIZE=1024
OTEL_BSP_SCHEDULE_DELAY=10000  # 10 seconds

# Increase queue size to prevent drops
OTEL_BSP_MAX_QUEUE_SIZE=4096

Troubleshooting¶

No Traces Appearing¶

Check OTEL is enabled: OTEL_ENABLE_OBSERVABILITY=true
Verify exporter endpoint: Test connectivity to OTLP endpoint
Check service name: Ensure OTEL_SERVICE_NAME is set correctly
Review logs: Look for "OpenTelemetry initialized" message

Broken Trace Context¶

Verify header injection: Check that inject_trace_context_headers() is called
Session pooling: Pooled sessions do not carry per-request trace headers; set MCP_SESSION_POOL_ENABLED=false if upstream servers must receive traceparent
Upstream support: Verify upstream MCP server supports W3C trace context

Performance Issues¶

Reduce batch delay: Lower OTEL_BSP_SCHEDULE_DELAY for faster export
Increase batch size: Raise OTEL_BSP_MAX_EXPORT_BATCH_SIZE to reduce export frequency
Check exporter: Ensure OTLP endpoint is responsive

Examples¶

Basic Tracing¶

from mcpgateway.observability import create_span, set_span_attribute

with create_span("custom.operation", {"custom.attr": "value"}):
    # Your code here
    set_span_attribute("result.count", 42)

Distributed Tracing¶

# Service A (ContextForge)
headers = inject_trace_context_headers(base_headers)
response = await httpx_client.post(upstream_url, headers=headers)

# Service B (Upstream MCP Server)
# Automatically extracts traceparent and attaches to trace

Plugin Hook Tracing¶

# Automatic tracing in plugin framework
async def tool_pre_invoke(self, payload, context):
    # This hook execution is automatically traced
    # Span name: plugin.hook.tool_pre_invoke
    return PluginResult(continue_processing=True)

Related Documentation¶

User Guides¶

Observability Overview - Choosing the right observability approach
OpenTelemetry Integration - User-facing OTEL setup guide
Internal Observability - Built-in database-backed tracing
Prometheus Metrics - Time-series monitoring
Langfuse Integration - LLM observability platform
Phoenix Integration - AI/LLM-focused observability

Technical References¶

OTEL Span Attributes Reference - Complete list of span attributes used in ContextForge
OpenTelemetry Specification
W3C Trace Context
W3C Baggage
Semantic Conventions
Langfuse OTEL Integration
