Phoenix Integration Guide

Arize Phoenix provides AI/LLM-focused observability for ContextForge, offering specialized features for monitoring AI-powered applications.

Why Phoenix?

Phoenix is optimized for AI/LLM workloads with features like:

  • Token usage tracking - Monitor prompt and completion tokens
  • Cost analysis - Track API costs across models
  • Evaluation metrics - Measure response quality
  • Drift detection - Identify model behavior changes
  • Conversation analysis - Understand multi-turn interactions

Quick Start

Option 1: Docker Compose

# Clone the repository
git clone https://github.com/IBM/mcp-context-forge
cd mcp-context-forge

# Start Phoenix with ContextForge
docker-compose -f docker-compose.yml \
               -f docker-compose.with-phoenix.yml up -d

# View Phoenix UI
open http://localhost:6006

# View traces flowing in
curl http://localhost:4444/health  # Generate a trace

Option 2: Standalone Phoenix

# Start Phoenix
docker run -d \
  --name phoenix \
  -p 6006:6006 \
  -p 4317:4317 \
  -v phoenix-data:/phoenix/data \
  arizephoenix/phoenix:latest

# Configure ContextForge
export OTEL_ENABLE_OBSERVABILITY=true
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=mcp-gateway

# Start ContextForge
mcpgateway

Option 3: Phoenix Cloud

For production deployments, use Phoenix Cloud:

# Get your API key from Phoenix Cloud
export PHOENIX_API_KEY=your-api-key

# Configure ContextForge for Phoenix Cloud
export OTEL_EXPORTER_OTLP_ENDPOINT=https://app.phoenix.arize.com
export OTEL_EXPORTER_OTLP_HEADERS="api-key=$PHOENIX_API_KEY"
export OTEL_EXPORTER_OTLP_INSECURE=false

Docker Compose Configuration

The provided docker-compose.with-phoenix.yml includes:

services:
  phoenix:
    image: arizephoenix/phoenix:latest
    ports:
      - "6006:6006"  # Phoenix UI
      - "4317:4317"  # OTLP gRPC endpoint
    environment:
      - PHOENIX_GRPC_PORT=4317
      - PHOENIX_PORT=6006
      - PHOENIX_HOST=0.0.0.0
    volumes:
      - phoenix-data:/phoenix/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:6006/health"]
      interval: 10s
      timeout: 5s
      retries: 5

  mcpgateway:
    environment:
      - OTEL_ENABLE_OBSERVABILITY=true
      - OTEL_TRACES_EXPORTER=otlp
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://phoenix:4317
      - OTEL_SERVICE_NAME=mcp-gateway
    depends_on:
      phoenix:
        condition: service_healthy

Using Phoenix UI

Viewing Traces

  1. Navigate to http://localhost:6006
  2. Click on "Traces" in the left sidebar
  3. You'll see:

     • Timeline view of all operations
     • Span details with attributes
     • Error rates and latencies
     • Service dependency graph

Analyzing Tool Invocations

Phoenix provides specialized views for tool calls:

  1. Tool Performance

     • Average latency per tool
     • Success/failure rates
     • Usage frequency

  2. Cost Analysis (when token tracking is implemented)

     • Token usage per tool
     • Estimated costs by model
     • Cost trends over time
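
Once token counts are available, the cost analysis above reduces to simple arithmetic. A minimal sketch of the calculation; the per-1K-token prices are illustrative placeholders, not real provider rates:

```python
# Hypothetical sketch: estimate API cost from token counts per tool call.
# Prices are illustrative placeholders -- substitute your provider's rates.
PRICE_PER_1K = {
    "gpt-4": {"prompt": 0.03, "completion": 0.06},
    "gpt-3.5-turbo": {"prompt": 0.0005, "completion": 0.0015},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one LLM call."""
    rates = PRICE_PER_1K[model]
    return (prompt_tokens / 1000) * rates["prompt"] + \
           (completion_tokens / 1000) * rates["completion"]

# Aggregate cost across a batch of tool invocations
calls = [
    {"model": "gpt-4", "prompt_tokens": 150, "completion_tokens": 50},
    {"model": "gpt-4", "prompt_tokens": 300, "completion_tokens": 120},
]
total = sum(estimate_cost(c["model"], c["prompt_tokens"], c["completion_tokens"])
            for c in calls)
print(f"estimated cost: ${total:.4f}")
```

The same per-call figure could be attached as a span attribute so cost trends show up alongside latency in the Phoenix UI.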

Setting Up Evaluations

Phoenix can evaluate response quality:

# Example: evaluate ContextForge traces with Phoenix evals (Python).
# NOTE: the evals API varies across Phoenix versions; this sketch targets
# the llm_classify interface and is a starting point, not a drop-in recipe.
import phoenix as px
from phoenix.evals import (
    OpenAIModel,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    llm_classify,
)

# Pull the spans recorded from ContextForge into a dataframe
spans_df = px.Client().get_spans_dataframe()

# Ask an LLM judge to classify each trace for relevance
eval_df = llm_classify(
    dataframe=spans_df,
    model=OpenAIModel(model="gpt-4"),
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    rails=["relevant", "unrelated"],
)

Production Deployment

With PostgreSQL Backend

For production, use PostgreSQL for Phoenix storage:

services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: phoenix
      POSTGRES_USER: phoenix
      POSTGRES_PASSWORD: phoenix_secret
    volumes:
      - postgres-data:/var/lib/postgresql/data

  phoenix:
    image: arizephoenix/phoenix:latest
    environment:
      - DATABASE_URL=postgresql://phoenix:phoenix_secret@postgres:5432/phoenix
      - PHOENIX_GRPC_PORT=4317
      - PHOENIX_PORT=6006
    depends_on:
      - postgres

Kubernetes Deployment

Deploy Phoenix on Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: phoenix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: phoenix
  template:
    metadata:
      labels:
        app: phoenix
    spec:
      containers:
      - name: phoenix
        image: arizephoenix/phoenix:latest
        ports:
        - containerPort: 6006
          name: ui
        - containerPort: 4317
          name: otlp
        env:
        - name: PHOENIX_GRPC_PORT
          value: "4317"
        - name: PHOENIX_PORT
          value: "6006"
        volumeMounts:
        - name: data
          mountPath: /phoenix/data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: phoenix-data
---
apiVersion: v1
kind: Service
metadata:
  name: phoenix
spec:
  selector:
    app: phoenix
  ports:
  - port: 6006
    name: ui
  - port: 4317
    name: otlp

Advanced Features

Custom Span Attributes

Add Phoenix-specific attributes in your code:

from mcpgateway.observability import create_span

# Add LLM-specific attributes
with create_span("tool.invoke", {
    "llm.model": "gpt-4",
    "llm.prompt_tokens": 150,
    "llm.completion_tokens": 50,
    "llm.temperature": 0.7,
    "llm.top_p": 0.9
}) as span:
    # Tool execution
    pass

Integrating with Phoenix SDK

For advanced analysis, use the Phoenix SDK:

import phoenix as px

# Connect to the running Phoenix instance
client = px.Client()

# Pull recorded spans into a pandas DataFrame
traces_df = client.get_spans_dataframe()
print(traces_df.describe())

# Export for further analysis
traces_df.to_csv("mcp_gateway_traces.csv")

Monitoring Best Practices

Key Metrics to Track

  1. Response Times

     • P50, P95, P99 latencies
     • Slowest operations
     • Timeout rates

  2. Error Rates

     • Error percentage by tool
     • Error types distribution
     • Error trends

  3. Usage Patterns

     • Most used tools
     • Peak usage times
     • User distribution
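
The percentile targets above can be computed from raw span durations with the standard library alone. A minimal sketch with hardcoded sample values; in practice the durations would come from Phoenix's spans dataframe:

```python
# Sketch: compute P50/P95/P99 latencies from span durations (milliseconds).
# The sample values below are placeholders for real span durations.
from statistics import quantiles

durations_ms = [12, 15, 18, 22, 25, 30, 41, 55, 80, 120, 250, 900]

# quantiles() with n=100 yields the 1st..99th percentile cut points;
# the "inclusive" method keeps results within the observed range.
cuts = quantiles(durations_ms, n=100, method="inclusive")
p50, p95, p99 = cuts[49], cuts[94], cuts[98]

print(f"P50={p50:.1f}ms  P95={p95:.1f}ms  P99={p99:.1f}ms")
```

With skewed latency data like this, the gap between P50 and P99 is exactly what makes tail percentiles worth alerting on.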

Setting Up Alerts

Configure alerts in Phoenix Cloud:

  1. Go to Settings → Alerts
  2. Create rules for:

     • High error rates (> 5%)
     • Slow responses (P95 > 2s)
     • Unusual token usage
     • Cost thresholds

Troubleshooting

Phoenix Not Receiving Traces

  1. Check Phoenix is running:

    docker ps | grep phoenix
    curl http://localhost:6006/health
    

  2. Verify OTLP endpoint:

    telnet localhost 4317
    

  3. Check ContextForge logs:

    docker logs mcpgateway | grep -i phoenix
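
The endpoint check in step 2 can also be scripted with the standard library alone (telnet is often absent from slim container images). A minimal sketch, assuming the standalone-Phoenix host/port from earlier:

```python
# Sketch: verify the OTLP gRPC port is reachable before debugging further.
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    ok = port_open("localhost", 4317)
    print("OTLP endpoint reachable" if ok else "OTLP endpoint unreachable")
```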
    

High Memory Usage

Phoenix stores traces in memory by default. For production:

  1. Use PostgreSQL backend
  2. Configure retention policies
  3. Set sampling rates appropriately

Performance Optimization

  1. Reduce trace volume (the sampling ratio only takes effect when a ratio-based sampler is selected):

    export OTEL_TRACES_SAMPLER=parentbased_traceidratio
    export OTEL_TRACES_SAMPLER_ARG=0.01  # Sample 1% of traces
    

  2. Filter unnecessary spans:

    # In observability.py, add filtering before creating spans
    from contextlib import nullcontext

    if span_name in ("health_check", "metrics"):
        return nullcontext()
    

Next Steps