Skip to content

Phoenix Integration Guide

Arize Phoenix provides AI/LLM-focused observability for MCP Gateway, offering specialized features for monitoring AI-powered applications.

Why Phoenix?

Phoenix is optimized for AI/LLM workloads with features like:

  • Token usage tracking - Monitor prompt and completion tokens
  • Cost analysis - Track API costs across models
  • Evaluation metrics - Measure response quality
  • Drift detection - Identify model behavior changes
  • Conversation analysis - Understand multi-turn interactions

Quick Start

# Clone the repository
git clone https://github.com/IBM/mcp-context-forge
cd mcp-context-forge

# Start Phoenix with MCP Gateway
docker-compose -f docker-compose.yml \
               -f docker-compose.with-phoenix.yml up -d

# View Phoenix UI
open http://localhost:6006

# View traces flowing in
curl http://localhost:4444/health  # Generate a trace

Option 2: Standalone Phoenix

# Start Phoenix
docker run -d \
  --name phoenix \
  -p 6006:6006 \
  -p 4317:4317 \
  -v phoenix-data:/phoenix/data \
  arizephoenix/phoenix:latest

# Configure MCP Gateway
export OTEL_ENABLE_OBSERVABILITY=true
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=mcp-gateway

# Start MCP Gateway
mcpgateway

Option 3: Phoenix Cloud

For production deployments, use Phoenix Cloud:

# Get your API key from Phoenix Cloud
export PHOENIX_API_KEY=your-api-key

# Configure MCP Gateway for Phoenix Cloud
export OTEL_EXPORTER_OTLP_ENDPOINT=https://app.phoenix.arize.com
export OTEL_EXPORTER_OTLP_HEADERS="api-key=$PHOENIX_API_KEY"
export OTEL_EXPORTER_OTLP_INSECURE=false

Docker Compose Configuration

The provided docker-compose.with-phoenix.yml includes:

services:
  phoenix:
    image: arizephoenix/phoenix:latest
    ports:
      - "6006:6006"  # Phoenix UI
      - "4317:4317"  # OTLP gRPC endpoint
    environment:
      - PHOENIX_GRPC_PORT=4317
      - PHOENIX_PORT=6006
      - PHOENIX_HOST=0.0.0.0
    volumes:
      - phoenix-data:/phoenix/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:6006/health"]
      interval: 10s
      timeout: 5s
      retries: 5

  mcpgateway:
    environment:
      - OTEL_ENABLE_OBSERVABILITY=true
      - OTEL_TRACES_EXPORTER=otlp
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://phoenix:4317
      - OTEL_SERVICE_NAME=mcp-gateway
    depends_on:
      phoenix:
        condition: service_healthy

Using Phoenix UI

Viewing Traces

  1. Navigate to http://localhost:6006
  2. Click on "Traces" in the left sidebar
  3. You'll see:
  4. Timeline view of all operations
  5. Span details with attributes
  6. Error rates and latencies
  7. Service dependency graph

Analyzing Tool Invocations

Phoenix provides specialized views for tool calls:

  1. Tool Performance
  2. Average latency per tool
  3. Success/failure rates
  4. Usage frequency

  5. Cost Analysis (when token tracking is implemented)

  6. Token usage per tool
  7. Estimated costs by model
  8. Cost trends over time

Setting Up Evaluations

Phoenix can evaluate response quality:

# Example: Set up Phoenix evaluations (Python)
from phoenix.evals import llm_eval
from phoenix.trace import trace

# Configure evaluations
evaluator = llm_eval.LLMEvaluator(
    model="gpt-4",
    eval_type="relevance"
)

# Traces from MCP Gateway will be evaluated
evaluator.evaluate(
    trace_dataset=phoenix.get_traces(),
    eval_name="response_quality"
)

Production Deployment

With PostgreSQL Backend

For production, use PostgreSQL for Phoenix storage:

services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: phoenix
      POSTGRES_USER: phoenix
      POSTGRES_PASSWORD: phoenix_secret
    volumes:
      - postgres-data:/var/lib/postgresql/data

  phoenix:
    image: arizephoenix/phoenix:latest
    environment:
      - DATABASE_URL=postgresql://phoenix:phoenix_secret@postgres:5432/phoenix
      - PHOENIX_GRPC_PORT=4317
      - PHOENIX_PORT=6006
    depends_on:
      - postgres

Kubernetes Deployment

Deploy Phoenix on Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: phoenix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: phoenix
  template:
    metadata:
      labels:
        app: phoenix
    spec:
      containers:
      - name: phoenix
        image: arizephoenix/phoenix:latest
        ports:
        - containerPort: 6006
          name: ui
        - containerPort: 4317
          name: otlp
        env:
        - name: PHOENIX_GRPC_PORT
          value: "4317"
        - name: PHOENIX_PORT
          value: "6006"
        volumeMounts:
        - name: data
          mountPath: /phoenix/data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: phoenix-data
---
apiVersion: v1
kind: Service
metadata:
  name: phoenix
spec:
  selector:
    app: phoenix
  ports:
  - port: 6006
    name: ui
  - port: 4317
    name: otlp

Advanced Features

Custom Span Attributes

Add Phoenix-specific attributes in your code:

from mcpgateway.observability import create_span

# Add LLM-specific attributes
with create_span("tool.invoke", {
    "llm.model": "gpt-4",
    "llm.prompt_tokens": 150,
    "llm.completion_tokens": 50,
    "llm.temperature": 0.7,
    "llm.top_p": 0.9
}) as span:
    # Tool execution
    pass

Integrating with Phoenix SDK

For advanced analysis, use the Phoenix SDK:

import phoenix as px

# Connect to Phoenix
px.launch_app(trace_dataset=px.Client().get_traces())

# Analyze traces
traces_df = px.Client().get_traces_dataframe()
print(traces_df.describe())

# Export for further analysis
traces_df.to_csv("mcp_gateway_traces.csv")

Monitoring Best Practices

Key Metrics to Track

  1. Response Times
  2. P50, P95, P99 latencies
  3. Slowest operations
  4. Timeout rates

  5. Error Rates

  6. Error percentage by tool
  7. Error types distribution
  8. Error trends

  9. Usage Patterns

  10. Most used tools
  11. Peak usage times
  12. User distribution

Setting Up Alerts

Configure alerts in Phoenix Cloud:

  1. Go to Settings β†’ Alerts
  2. Create rules for:
  3. High error rates (> 5%)
  4. Slow responses (P95 > 2s)
  5. Unusual token usage
  6. Cost thresholds

Troubleshooting

Phoenix Not Receiving Traces

  1. Check Phoenix is running:

    docker ps | grep phoenix
    curl http://localhost:6006/health
    

  2. Verify OTLP endpoint:

    telnet localhost 4317
    

  3. Check MCP Gateway logs:

    docker logs mcpgateway | grep -i phoenix
    

High Memory Usage

Phoenix stores traces in memory by default. For production:

  1. Use PostgreSQL backend
  2. Configure retention policies
  3. Set sampling rates appropriately

Performance Optimization

  1. Reduce trace volume:

    export OTEL_TRACES_SAMPLER_ARG=0.01  # Sample 1%
    

  2. Filter unnecessary spans:

    # In observability.py, add filtering
    if span_name in ["health_check", "metrics"]:
        return nullcontext()
    

Next Steps