Observability¶
ContextForge includes production-grade OpenTelemetry instrumentation for distributed tracing, enabling you to monitor performance, debug issues, and understand request flows across your gateway instances.
Overview¶
The observability implementation is vendor-agnostic at the transport/export layer and works with any OTLP-compatible backend. ContextForge also supports optional Langfuse-oriented span enrichment when you point OTLP at a Langfuse ingestion endpoint, or when you explicitly enable that schema with OTEL_EMIT_LANGFUSE_ATTRIBUTES=true.
Recommended Backend Options¶
Langfuse - LLM Observability and Analytics¶
Best for: LLM trace visualization, prompt management, evaluations, and cost tracking
Langfuse is an open-source LLM observability platform that receives traces via standard OTLP/HTTP. It provides a comprehensive suite of tools for monitoring, evaluating, and improving LLM applications.
Key Features:
- Trace Visualization: End-to-end request traces with latency breakdown and error analysis
- Prompt Management: Version, test, and deploy prompts with A/B testing
- Evaluations: Score traces with custom or built-in evaluators
- Cost Tracking: Token usage and cost analytics per model and user
- Datasets: Create test datasets from production traces for regression testing
- OpenTelemetry Native: Standard OTLP/HTTP ingestion (v3.22.0+)
Ideal Use Cases:
- Monitoring LLM-powered tool invocations and prompt rendering
- Tracking costs and token usage across gateway operations
- Building evaluation pipelines for response quality
- Managing and versioning prompt templates
Transport: Uses OTLP over HTTP (gRPC not supported)
See the Langfuse Integration Guide for setup instructions.
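A minimal Langfuse export configuration might look like the following sketch, built from the variables in the Configuration Reference below; the endpoint is the Langfuse Cloud ingestion path and the keys are placeholders for your own project values:

```shell
# Enable tracing and route OTLP over HTTP to Langfuse
# (Langfuse does not accept gRPC)
export OTEL_ENABLE_OBSERVABILITY=true
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

# Langfuse ingestion endpoint and project keys (placeholders)
export LANGFUSE_OTEL_ENDPOINT=https://cloud.langfuse.com/api/public/otel/v1/traces
export LANGFUSE_PUBLIC_KEY=pk-lf-your-key
export LANGFUSE_SECRET_KEY=sk-lf-your-key

# Opt in to Langfuse-specific span attributes
export OTEL_EMIT_LANGFUSE_ATTRIBUTES=true
```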
Arize Phoenix - AI/LLM-Focused Observability¶
Best for: AI applications, LLM debugging, and prompt optimization
Arize Phoenix is an open-source AI observability platform specifically designed for LLM applications and AI systems. Built on OpenTelemetry, it provides specialized features for understanding AI application behavior.
Key Features:
- LLM Tracing: Purpose-built for tracking LLM calls, token usage, and model performance
- Evaluation Tools: Built-in evaluators for response quality, retrieval accuracy, and hallucination detection
- Prompt Playground: Interactive environment for optimizing prompts with version control and experimentation
- Framework Support: Native integrations with LlamaIndex, LangChain, Haystack, DSPy, and major LLM providers
- Cost Tracking: Monitor token usage and API costs across different providers
- Zero Lock-in: Fully open source and self-hostable with no feature gates
Ideal Use Cases:
- Debugging RAG (Retrieval-Augmented Generation) pipelines
- Optimizing prompt templates and model parameters
- Tracking LLM performance metrics and costs
- Evaluating response quality and accuracy
Transport: Uses OTLP (OpenTelemetry Protocol) via gRPC
Jaeger - Production-Grade Distributed Tracing¶
Best for: Microservices architectures, production deployments, and general-purpose tracing
Jaeger is a mature, battle-tested distributed tracing platform originally created by Uber and now a CNCF graduated project. It excels at monitoring complex distributed systems.
Key Features:
- Proven Scale: Handles high-volume production environments with battle-tested reliability
- Full Observability: Comprehensive request flow visualization across microservices
- Multiple Storage Backends: Supports Elasticsearch, Cassandra, and other scalable storage solutions
- Kubernetes Native: Deep integration with cloud-native environments
- Advanced Sampling: Configurable sampling strategies (constant, probabilistic, rate-limiting)
- OpenTelemetry Integration: Full OTLP support in v2 with native OpenTelemetry data model
Ideal Use Cases:
- Large-scale microservices deployments
- Performance bottleneck identification
- Service dependency mapping
- Root cause analysis of distributed failures
Transport: Supports both native Jaeger protocol and OTLP
Grafana Tempo - Cost-Efficient High-Scale Tracing¶
Best for: High-volume environments, cost-conscious deployments, and Grafana ecosystem users
Grafana Tempo is a high-scale, low-cost distributed tracing backend that eliminates expensive indexing by leveraging object storage (S3, GCS, Azure Blob).
Key Features:
- Extreme Cost Efficiency: Uses cheap object storage instead of expensive databases (Elasticsearch/Cassandra)
- Massive Scale: Designed to handle 100% trace capture without prohibitive costs
- No Heavy Indexing: Lightweight bloom filters instead of full-text indexes reduce storage costs dramatically
- TraceQL Query Language: Powerful query language inspired by PromQL and LogQL
- Grafana Ecosystem Integration: Deep integration with Grafana, Prometheus, Loki, and Mimir
- Multi-Protocol Support: Compatible with Jaeger, Zipkin, OpenCensus, and OpenTelemetry
Ideal Use Cases:
- High-volume microservices (170k+ spans/second proven in production)
- Cost-sensitive environments requiring long trace retention
- Organizations already using Grafana stack
- Teams wanting 100% trace capture without sampling
Transport: Uses OTLP (OpenTelemetry Protocol) via gRPC
Other Compatible Backends¶
The OTLP exporter also works with commercial APM platforms and other tracing backends:
- Datadog APM - Full-featured commercial observability platform
- New Relic - Cloud-based application performance monitoring
- Honeycomb - Observability for production systems
- Zipkin - Lightweight distributed tracing (legacy but widely supported)
What Gets Traced¶
- Tool invocations - Full lifecycle with arguments, results, and timing
- Prompt rendering - Template processing and message generation
- Resource fetching - URI resolution, caching, and content retrieval
- Gateway federation - Cross-gateway requests and health checks
- Plugin execution - Pre/post hooks if plugins are enabled
- Errors and exceptions - Full stack traces and error context
Quick Start¶
1. Install Dependencies¶
The observability packages are included in the Docker containers by default. For local development:
```shell
# Install base gateway with core observability support
# (quotes prevent shell glob expansion of the extras bracket)
pip install "mcp-contextforge-gateway[observability]"
```
Important: Only one exporter backend should be installed at a time - they are mutually exclusive. Install the specific backend package based on your chosen observability platform (see backend-specific instructions in the "Start Your Backend" section below).
2. Configure Environment¶
Set these environment variables (or add to .env):
```shell
# Enable observability (default: false)
export OTEL_ENABLE_OBSERVABILITY=true

# Service identification
export OTEL_SERVICE_NAME=mcp-gateway
export OTEL_SERVICE_VERSION=1.0.0-RC-3
export OTEL_DEPLOYMENT_ENVIRONMENT=development

# Choose your backend (otlp, jaeger, zipkin, console, none)
export OTEL_TRACES_EXPORTER=otlp

# OTLP configuration (for Phoenix, Tempo, etc.)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_INSECURE=true
```
3. Start Your Backend¶
Choose your preferred observability backend:
Phoenix (AI/LLM Focus)¶
```shell
# Install the OTLP gRPC exporter backend
pip install opentelemetry-exporter-otlp-proto-grpc

# Start Phoenix
docker run -d \
  --name phoenix \
  -p 6006:6006 \
  -p 4317:4317 \
  arizephoenix/phoenix:latest

# Configure environment
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=mcp-gateway

# View the UI at http://localhost:6006
```
Jaeger¶
```shell
# Install the Jaeger exporter backend
pip install opentelemetry-exporter-jaeger

# Start Jaeger
docker run -d \
  --name jaeger \
  -p 16686:16686 \
  -p 14268:14268 \
  jaegertracing/all-in-one

# Configure environment
export OTEL_TRACES_EXPORTER=jaeger
export OTEL_EXPORTER_JAEGER_ENDPOINT=http://localhost:14268/api/traces
export OTEL_SERVICE_NAME=mcp-gateway

# View the UI at http://localhost:16686
```
Zipkin¶
```shell
# Install the Zipkin exporter backend
pip install opentelemetry-exporter-zipkin

# Start Zipkin
docker run -d \
  --name zipkin \
  -p 9411:9411 \
  openzipkin/zipkin

# Configure environment
export OTEL_TRACES_EXPORTER=zipkin
export OTEL_EXPORTER_ZIPKIN_ENDPOINT=http://localhost:9411/api/v2/spans
export OTEL_SERVICE_NAME=mcp-gateway

# View the UI at http://localhost:9411
```
Grafana Tempo¶
```shell
# Install the OTLP gRPC exporter backend (Tempo uses OTLP)
pip install opentelemetry-exporter-otlp-proto-grpc

# Start Tempo
docker run -d \
  --name tempo \
  -p 4317:4317 \
  -p 3200:3200 \
  grafana/tempo:latest

# Configure environment (uses OTLP)
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=mcp-gateway
```
Console (Development)¶
```shell
# For debugging - prints traces to stdout
export OTEL_TRACES_EXPORTER=console
export OTEL_SERVICE_NAME=mcp-gateway
```
4. Run ContextForge¶
```shell
# Start the gateway (observability is disabled by default)
mcpgateway

# Or with Docker
docker run -e OTEL_ENABLE_OBSERVABILITY=true \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4317 \
  ghcr.io/ibm/mcp-context-forge:1.0.0-RC-3
```
Configuration Reference¶
Core Settings¶
| Variable | Description | Default | Options |
|---|---|---|---|
| `OTEL_ENABLE_OBSERVABILITY` | Master switch | `false` | `true`, `false` |
| `OTEL_SERVICE_NAME` | Service identifier | `mcp-gateway` | Any string |
| `OTEL_SERVICE_VERSION` | Service version | `1.0.0-RC-3` | Any string |
| `DEPLOYMENT_ENV` / `ENVIRONMENT` | Environment tag | `development` | `development`, `staging`, `production` |
| `OTEL_TRACES_EXPORTER` | Export backend | `otlp` | `otlp`, `jaeger`, `zipkin`, `console`, `none` |
| `OTEL_RESOURCE_ATTRIBUTES` | Custom attributes | - | `key=value,key2=value2` |
| `OTEL_COPY_RESOURCE_ATTRS_TO_SPANS` | Copy selected resource attrs onto spans | `false` | `true`, `false` |
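Resource attributes are plain comma-separated `key=value` pairs; a short sketch (the specific keys here are illustrative, not required by ContextForge):

```shell
# Attach custom resource attributes to every exported trace
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=staging,team=platform"

# Optionally copy selected resource attributes onto each span
export OTEL_COPY_RESOURCE_ATTRS_TO_SPANS=true
```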
OTLP Configuration¶
| Variable | Description | Default | Example |
|---|---|---|---|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Collector endpoint | - | `http://localhost:4317` |
| `OTEL_EXPORTER_OTLP_PROTOCOL` | Protocol | `grpc` | `grpc`, `http/protobuf` |
| `OTEL_EXPORTER_OTLP_HEADERS` | Auth headers | - | `api-key=secret,x-auth=token` |
| `LANGFUSE_OTEL_ENDPOINT` | Optional Langfuse OTLP/HTTP endpoint override | - | `https://cloud.langfuse.com/api/public/otel/v1/traces` |
| `LANGFUSE_PUBLIC_KEY` | Langfuse project public key for derived OTLP auth | - | `pk-lf-...` |
| `LANGFUSE_SECRET_KEY` | Langfuse project secret key for derived OTLP auth | - | `sk-lf-...` |
| `LANGFUSE_OTEL_AUTH` | Optional base64-encoded `pk:sk` auth override | - | base64 string |
| `OTEL_EXPORTER_OTLP_INSECURE` | Skip TLS verification | `true` | `true`, `false` |
| `OTEL_EMIT_LANGFUSE_ATTRIBUTES` | Force-enable or disable Langfuse-specific span attributes | `auto` | `true`, `false` |
| `OTEL_CAPTURE_IDENTITY_ATTRIBUTES` | Force-enable or disable user/team identity enrichment | `auto` | `true`, `false` |
Alternative Backends¶
| Variable | Description | Default |
|---|---|---|
| `OTEL_EXPORTER_JAEGER_ENDPOINT` | Jaeger collector | `http://localhost:14268/api/traces` |
| `OTEL_EXPORTER_JAEGER_USER` | Jaeger collector username | - |
| `OTEL_EXPORTER_JAEGER_PASSWORD` | Jaeger collector password | - |
| `OTEL_EXPORTER_ZIPKIN_ENDPOINT` | Zipkin collector | `http://localhost:9411/api/v2/spans` |
Payload Capture and Redaction¶
| Variable | Description | Default |
|---|---|---|
| `OTEL_REDACT_FIELDS` | Comma-separated field names redacted from structured trace payloads and free-text error messages | `password,secret,token,...` |
| `OTEL_MAX_TRACE_PAYLOAD_SIZE` | Maximum serialized trace payload size in characters | `32768` |
| `OTEL_CAPTURE_INPUT_SPANS` | Comma-separated allowlist of span names that may capture observation input payloads | empty |
| `OTEL_CAPTURE_OUTPUT_SPANS` | Comma-separated allowlist of span names that may capture observation output payloads | empty |
Input/output capture is allowlist-based: ContextForge does not capture these payloads unless the relevant span names are listed. Structured payloads are redacted by field name, and exported string values are additionally scrubbed for sensitive URLs, `key=value` secret patterns, and embedded Bearer/Basic credentials. The local Langfuse compose overlay sets a dev-friendly input allowlist of `tool.invoke,prompt.render,llm.proxy,a2a.invoke`; production deployments should choose that list deliberately.
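ContextForge's actual redaction code lives in the gateway; as an illustration of the field-name pass, here is a minimal sketch (the `redact` helper and `[REDACTED]` marker are hypothetical, and the field set mirrors the start of the `OTEL_REDACT_FIELDS` default):

```python
from typing import Any

# Field names to redact, mirroring the start of the OTEL_REDACT_FIELDS default
REDACT_FIELDS = {"password", "secret", "token"}

def redact(payload: Any) -> Any:
    """Recursively replace values of sensitive keys with a placeholder."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if key.lower() in REDACT_FIELDS else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload

# Example: nested secrets are scrubbed before export
print(redact({"user": "alice", "token": "abc123", "config": {"password": "hunter2"}}))
# → {'user': 'alice', 'token': '[REDACTED]', 'config': {'password': '[REDACTED]'}}
```

The real implementation also scrubs free-text strings; this sketch covers only the structured field-name case.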
Performance Tuning¶
| Variable | Description | Default |
|---|---|---|
| `OTEL_TRACES_SAMPLER` | Sampling strategy | `parentbased_traceidratio` |
| `OTEL_TRACES_SAMPLER_ARG` | Sample rate (0.0-1.0) | `0.1` (10%) |
| `OTEL_BSP_MAX_QUEUE_SIZE` | Max queued spans | `2048` |
| `OTEL_BSP_MAX_EXPORT_BATCH_SIZE` | Batch size | `512` |
| `OTEL_BSP_SCHEDULE_DELAY` | Export interval (ms) | `5000` |
Understanding Traces¶
Span Attributes¶
Each span includes standard attributes:
- Operation name - e.g.,
tool.invoke,prompt.render,resource.read - Service info - Service name, version, environment
- User context - User ID, tenant ID, request ID
- Timing - Start time, duration, end time
- Status - Success/error status with error details
Tool Invocation Spans¶
```json
{
  "name": "tool.invoke",
  "attributes": {
    "tool.name": "github_search",
    "tool.id": "550e8400-e29b-41d4-a716",
    "tool.integration_type": "REST",
    "arguments_count": 3,
    "success": true,
    "duration.ms": 234.5,
    "http.status_code": 200
  }
}
```
Error Tracking¶
Failed operations include:
- `error: true`
- `error.type`: Exception class name
- `error.message`: Error description
- Sanitized exception event metadata (`exception.type`, `exception.message`)
Production Deployment¶
Docker Compose¶
Use the provided compose files:
```shell
# Start ContextForge with Phoenix observability
docker-compose -f docker-compose.yml \
  -f docker-compose.with-phoenix.yml up -d
```
Kubernetes¶
Add environment variables to your deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-gateway
spec:
  template:
    spec:
      containers:
        - name: gateway
          image: ghcr.io/ibm/mcp-context-forge:1.0.0-RC-3
          env:
            - name: OTEL_ENABLE_OBSERVABILITY
              value: "true"
            - name: OTEL_TRACES_EXPORTER
              value: "otlp"
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://otel-collector:4317"
            - name: OTEL_SERVICE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['app.kubernetes.io/name']
```
Sampling Strategies¶
For production, adjust sampling to balance visibility and performance:
```shell
# Sample 1% of traces
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.01

# Always sample errors (coming in a future update)
# export OTEL_TRACES_SAMPLER=parentbased_always_on_errors
```
Testing Your Setup¶
Generate Test Traces¶
Use the trace generator helper to verify your observability backend is working:
```shell
# Activate the virtual environment if needed
. .venv/bin/activate

# Run the trace generator
python tests/integration/helpers/trace_generator.py
```
This will send sample traces for:
- Tool invocations
- Prompt rendering
- Resource fetching
- Gateway federation
- Complex workflows with nested spans
Troubleshooting¶
No Traces Appearing¶
- Check that observability is enabled (`OTEL_ENABLE_OBSERVABILITY=true`)
- Verify the endpoint in `OTEL_EXPORTER_OTLP_ENDPOINT` is reachable from the gateway
- Switch to the console exporter (`OTEL_TRACES_EXPORTER=console`) to rule out backend issues
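These checks can be scripted; a sketch assuming a local collector listening on gRPC port 4317:

```shell
# 1. Confirm the master switch is on
echo "OTEL_ENABLE_OBSERVABILITY=${OTEL_ENABLE_OBSERVABILITY:-unset}"

# 2. Probe the collector port (fails fast if nothing is listening)
curl -sf --max-time 2 http://localhost:4317 || echo "collector not reachable on 4317"

# 3. Fall back to the console exporter to rule out backend problems
export OTEL_TRACES_EXPORTER=console
```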
High Memory Usage¶
Reduce the batch size and queue limits (`OTEL_BSP_MAX_EXPORT_BATCH_SIZE`, `OTEL_BSP_MAX_QUEUE_SIZE`).
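For example, halving both limits from their defaults caps exporter memory at the cost of more frequent exports:

```shell
# Halve the span queue and batch size to cap exporter memory
export OTEL_BSP_MAX_QUEUE_SIZE=1024        # default 2048
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=256  # default 512
```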
Missing Spans¶
Check the sampling rate: with the default `parentbased_traceidratio` sampler at `0.1`, only 10% of traces are exported.
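To rule out sampling, temporarily sample everything (standard OpenTelemetry sampler names):

```shell
# Temporarily sample every trace to confirm spans are being produced
export OTEL_TRACES_SAMPLER=parentbased_always_on

# Or keep the ratio sampler and raise it from the 10% default
# export OTEL_TRACES_SAMPLER=parentbased_traceidratio
# export OTEL_TRACES_SAMPLER_ARG=1.0
```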
Performance Impact¶
- When disabled: Zero overhead (no-op context managers)
- When enabled: ~0.1-0.5ms per span
- Memory: ~50MB for typical workload
- Network: Batched exports every 5 seconds
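The zero-overhead-when-disabled claim typically comes from handing out no-op context managers instead of real spans. A minimal sketch of that pattern (illustrative only, not ContextForge's actual code; `trace_span` and `_real_span` are hypothetical names):

```python
from contextlib import contextmanager, nullcontext

OBSERVABILITY_ENABLED = False  # e.g. read from OTEL_ENABLE_OBSERVABILITY

@contextmanager
def _real_span(name: str):
    # Stand-in for real span creation/export logic
    print(f"span start: {name}")
    try:
        yield
    finally:
        print(f"span end: {name}")

def trace_span(name: str):
    """Return a real span when enabled, a free no-op otherwise."""
    return _real_span(name) if OBSERVABILITY_ENABLED else nullcontext()

with trace_span("tool.invoke"):
    pass  # traced work; no span bookkeeping when observability is off
```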
Next Steps¶
- Phoenix Integration - AI/LLM-focused observability
- Langfuse Integration - LLM observability and prompt management
- Internal Observability - Built-in database-backed tracing
- Prometheus Metrics - Time-series monitoring
Related Documentation¶
- OTEL Span Attributes - Complete span attributes reference
- OpenTelemetry Architecture - Technical implementation details
- OpenTelemetry Best Practices - Official OTEL documentation