# Prometheus Metrics
ContextForge exposes Prometheus metrics for monitoring gateway performance, request rates, error rates, and latency distributions.
## Overview
The Prometheus metrics endpoint provides:
- Request metrics - HTTP request counts, rates, and status codes
- Latency metrics - Request duration histograms with percentiles
- Error tracking - Error rates and types
- Custom labels - Static labels for environment identification
- Gzip compression - Reduced network usage for large metric sets
## Quick Start

### 1. Enable Metrics

```bash
# Enable Prometheus metrics endpoint
export ENABLE_METRICS=true

# Optional: Add custom labels (low-cardinality only)
export METRICS_CUSTOM_LABELS="env=production,region=us-east-1"

# Optional: Exclude high-frequency paths
export METRICS_EXCLUDED_HANDLERS="/servers/.*/sse,/static/.*"
```
### 2. Generate Scrape Token

Create a non-expiring JWT token for Prometheus to authenticate:

```bash
export METRICS_TOKEN=$(python -m mcpgateway.utils.create_jwt_token \
    --username prometheus@monitoring \
    --exp 0 \
    --secret $JWT_SECRET_KEY \
    --algo HS256)

# Save to file for Prometheus
echo -n "$METRICS_TOKEN" > /path/to/metrics-token.jwt
```
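To make the `--exp 0` flag concrete: a non-expiring token is simply a JWT whose payload carries no `exp` claim. The sketch below builds such an HS256 token with only the standard library; the claim set (`sub` and nothing else) is an assumption for illustration, not the exact claims `create_jwt_token` emits.

```python
# Minimal sketch of a non-expiring HS256 JWT (assumed claim set, for illustration).
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_token(username: str, secret: str) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # No "exp" claim means the token never expires.
    payload = b64url(json.dumps({"sub": username}).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"


token = make_token("prometheus@monitoring", "test-secret")
print(token)  # three dot-separated base64url segments
```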
### 3. Start ContextForge

The metrics endpoint will be available at `http://localhost:4444/metrics/prometheus`.
### 4. Verify Metrics

```bash
# Test the endpoint
curl -sS -H "Authorization: Bearer $METRICS_TOKEN" \
    http://localhost:4444/metrics/prometheus | head -n 20
```
## Prometheus Configuration

### Scrape Job
Add this job to your `prometheus.yml`:
```yaml
scrape_configs:
  - job_name: 'mcp-gateway'
    metrics_path: /metrics/prometheus
    authorization:
      type: Bearer
      credentials_file: /path/to/metrics-token.jwt
    static_configs:
      - targets: ['localhost:4444']
```
### Docker Compose
If Prometheus runs in Docker, adjust the target:
```yaml
scrape_configs:
  - job_name: 'mcp-gateway'
    metrics_path: /metrics/prometheus
    authorization:
      type: Bearer
      credentials_file: /etc/prometheus/metrics-token.jwt
    static_configs:
      - targets: ['gateway:4444']  # Use service name
```
Mount the token file:
```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./metrics-token.jwt:/etc/prometheus/metrics-token.jwt:ro
```
## Configuration Reference

### Environment Variables
| Variable | Description | Default | Options |
|---|---|---|---|
| `ENABLE_METRICS` | Enable Prometheus endpoint | `false` | `true`, `false` |
| `METRICS_EXCLUDED_HANDLERS` | Regex patterns to exclude | (empty) | comma-separated |
| `METRICS_NAMESPACE` | Metrics namespace prefix | `default` | string |
| `METRICS_SUBSYSTEM` | Metrics subsystem prefix | (empty) | string |
| `METRICS_CUSTOM_LABELS` | Static labels for `app_info` | (empty) | `key=value,...` |
### Excluded Handlers
Exclude high-frequency or high-cardinality paths:
```bash
# Exclude SSE streams and static assets
METRICS_EXCLUDED_HANDLERS="/servers/.*/sse,/static/.*,/health.*"
```
Patterns are compiled as regular expressions and matched against request paths.
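A sketch of that matching behavior, assuming each comma-separated pattern is compiled and anchored at the start of the request path (`re.match` semantics); the exact anchoring the gateway uses is an assumption here:

```python
# Sketch: compile comma-separated exclusion patterns and match request paths.
import re

raw = "/servers/.*/sse,/static/.*,/health.*"
patterns = [re.compile(p) for p in raw.split(",")]


def is_excluded(path: str) -> bool:
    # re.match anchors at the start of the string, so "/static/.*"
    # excludes "/static/app.js" but not "/api/static/app.js".
    return any(p.match(path) for p in patterns)


print(is_excluded("/servers/42/sse"))  # True
print(is_excluded("/tools/invoke"))    # False
```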
### Custom Labels
Add static labels to the app_info gauge:
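For example, reusing the labels from the quick start:

```shell
# Static labels attached to the app_info gauge (low-cardinality only)
export METRICS_CUSTOM_LABELS="env=production,region=us-east-1"
```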
Warning: Never use high-cardinality values (user IDs, request IDs, timestamps) as labels.
## Available Metrics

### Request Metrics
```promql
# Total HTTP requests
http_requests_total{method="POST",handler="/tools/invoke",status="200"}

# Request rate (requests per second)
rate(http_requests_total[1m])
```
### Latency Metrics
```promql
# Request duration histogram
http_request_duration_seconds_bucket{le="0.1"}
http_request_duration_seconds_bucket{le="0.5"}
http_request_duration_seconds_bucket{le="1.0"}

# P50 latency
histogram_quantile(0.50, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# P95 latency
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# P99 latency
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
```
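For intuition, `histogram_quantile` finds the bucket containing the target rank and interpolates linearly inside it. A simplified sketch (the real PromQL function also handles the `+Inf` bucket and operates on rated counters, not raw counts):

```python
# Simplified histogram_quantile: buckets are (upper_bound, cumulative_count)
# pairs sorted by bound, as in *_bucket{le="..."} series.
def histogram_quantile(q: float, buckets: list[tuple[float, int]]) -> float:
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # Linear interpolation within the bucket containing the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]


# 800 requests under 100ms, 950 under 500ms, 1000 under 1s
buckets = [(0.1, 800), (0.5, 950), (1.0, 1000)]
print(histogram_quantile(0.95, buckets))  # 0.5 (rank 950 lands at the 0.5 bound)
print(histogram_quantile(0.50, buckets))  # 0.0625 (interpolated inside the first bucket)
```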
### Error Metrics
```promql
# Error rate (5xx responses)
rate(http_requests_total{status=~"5.."}[5m])

# Error percentage
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
```
### Application Info
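The endpoint also exposes an `app_info` gauge carrying the static custom labels. Assuming the labels configured above, a sample exposition line would look like:

```
app_info{env="production",region="us-east-1"} 1.0
```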
## Grafana Dashboards
### Example Queries
- Request Rate
- Error Rate
- P99 Latency
- Success Rate
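A possible set of PromQL expressions for the queries above, reusing the metrics documented earlier; the success-rate expression is an assumption, mirroring the error-percentage formula:

```promql
# Request Rate
sum(rate(http_requests_total[1m]))

# Error Rate
sum(rate(http_requests_total{status=~"5.."}[5m]))

# P99 Latency
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# Success Rate
sum(rate(http_requests_total{status!~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
```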
### Dashboard Panels
Create panels for:
- Request Rate - Line graph of requests per second
- Error Rate - Line graph of 5xx errors per second
- Latency Percentiles - Multi-line graph (P50, P95, P99)
- Status Code Distribution - Pie chart or bar graph
- Top Endpoints - Table sorted by request count
### Import Dashboards
Use community dashboards for common components:
- Kubernetes: Dashboard ID 315
- PostgreSQL: Dashboard ID 9628
- Redis: Dashboard ID 11835
## Production Deployment

### Kubernetes
Deploy Prometheus with the gateway:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-token
type: Opaque
stringData:
  token: <jwt-token>  # stringData takes the raw token; Kubernetes base64-encodes it
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    scrape_configs:
      - job_name: 'mcp-gateway'
        metrics_path: /metrics/prometheus
        authorization:
          type: Bearer
          credentials: <token-from-secret>  # pragma: allowlist secret
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - mcp-gateway
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_label_app]
            action: keep
            regex: mcp-gateway
```
### High Availability
For HA deployments:
- Multiple Prometheus instances - Scrape all gateway replicas
- Federation - Aggregate metrics from multiple Prometheus servers
- Remote Write - Send metrics to long-term storage (Thanos, Cortex)
### Retention
Configure Prometheus retention:
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
```

```bash
# Command-line flags
--storage.tsdb.retention.time=30d
--storage.tsdb.retention.size=50GB
```
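A back-of-the-envelope disk estimate can guide these values. The 1-2 bytes per sample figure is the ballpark Prometheus's storage documentation cites for its compressed TSDB; the 10,000 active series is an assumed workload, not a measured one:

```python
# Rough TSDB sizing: series x samples/day x bytes/sample x retention days.
series = 10_000            # assumed active time series
scrape_interval_s = 15     # matches the scrape_interval above
bytes_per_sample = 2       # upper end of Prometheus's ~1-2 bytes/sample
retention_days = 30        # matches --storage.tsdb.retention.time=30d

samples = series * (86_400 / scrape_interval_s) * retention_days
gb = samples * bytes_per_sample / 1e9
print(f"~{gb:.1f} GB over {retention_days} days")
```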
## Security Best Practices

### Token Management
- Non-expiring tokens - Use `--exp 0` for service accounts
- Rotate regularly - Update tokens quarterly
- Restrict permissions - Token only needs read access
- Secure storage - Store tokens in secrets management
### Network Security
- Internal network - Keep Prometheus on private network
- Firewall rules - Restrict access to metrics endpoint
- TLS - Use HTTPS for production deployments
- Authentication - Always require JWT authentication
## Performance Considerations

### High-Cardinality Labels
Never use high-cardinality values as labels:
```promql
# BAD - Explodes time series
http_requests_total{user_id="12345",request_id="abc-123"}

# GOOD - Low cardinality
http_requests_total{method="POST",status="200"}
```
High-cardinality labels can:

- Crash Prometheus with OOM errors
- Slow down queries significantly
- Increase storage requirements exponentially
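The explosion is multiplicative: in the worst case, the number of time series is the product of each label's distinct values, so one high-cardinality label multiplies everything else. A quick check with assumed cardinalities:

```python
# Worst-case series count = product of distinct values per label.
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    return prod(label_cardinalities.values())


good = series_count({"method": 5, "handler": 50, "status": 10})
bad = series_count({"method": 5, "handler": 50, "status": 10, "user_id": 100_000})
print(good, bad)  # 2500 vs 250000000 -- one label, a 100,000x blow-up
```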
### Compression
The metrics endpoint supports gzip compression:
```bash
# Prometheus automatically uses compression
curl -H "Accept-Encoding: gzip" \
    -H "Authorization: Bearer $TOKEN" \
    http://localhost:4444/metrics/prometheus
```
Trade-off: Compression reduces network usage but increases CPU on scrape.
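The exposition format compresses well because metric names and label keys repeat on every line. A quick illustration on synthetic exposition text (the ratio on real output will vary):

```python
# Gzip a synthetic Prometheus exposition body to show the compression win.
import gzip

lines = [
    f'http_requests_total{{method="GET",handler="/tools/invoke",status="{code}"}} 1\n'
    for code in (200, 201, 400, 404, 500)
] * 200
body = "".join(lines).encode()
compressed = gzip.compress(body)
print(f"{len(body)} -> {len(compressed)} bytes")
```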
### Scrape Interval
Balance freshness vs. load:
```yaml
# High frequency (more load)
scrape_interval: 5s

# Standard (recommended)
scrape_interval: 15s

# Low frequency (less load)
scrape_interval: 30s
```
### Excluded Handlers
Reduce metric cardinality by excluding paths:
```bash
# Exclude high-frequency endpoints
METRICS_EXCLUDED_HANDLERS="/health,/healthz,/ready,/metrics,/static/.*"
```
## Troubleshooting

### No Metrics Appearing
1. Check metrics are enabled
2. Verify the endpoint is accessible
3. Test with the token
4. Check gateway logs
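A sketch of the first checks as shell commands; the port and variable names follow the quick start. A `000` status from curl means the gateway was unreachable, `401` a bad or missing token, `503` that metrics are disabled:

```shell
# 1. Confirm metrics are enabled in the gateway's environment
echo "ENABLE_METRICS=${ENABLE_METRICS:-unset}"

# 2-3. Probe the endpoint with the token; print only the status code
code=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $METRICS_TOKEN" \
    http://localhost:4444/metrics/prometheus || true)
echo "HTTP status: $code"

# 4. If the probe fails, inspect the gateway logs for metrics-related errors
```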
### Metrics Disabled Response
If metrics are disabled, the endpoint returns `503 Service Unavailable`.
### Authentication Errors
Solutions:

1. Verify the token is valid: `python -m mcpgateway.utils.verify_jwt_token --token $METRICS_TOKEN`
2. Check the token hasn't expired
3. Ensure `JWT_SECRET_KEY` matches between token generation and the gateway
### High Memory Usage
If Prometheus uses excessive memory:
- Reduce retention: `--storage.tsdb.retention.time=7d`
- Increase scrape interval: `scrape_interval: 30s`
- Exclude high-cardinality paths: `METRICS_EXCLUDED_HANDLERS`
- Review custom labels: Remove high-cardinality labels
### Duplicate Collectors
Error: "Collector already registered"
Cause: Instrumentation registered multiple times (tests, reloads)
Solution: Restart the gateway process or clear the registry in test fixtures
## Integration with Other Systems

### Datadog
Forward Prometheus metrics to Datadog:
```yaml
# datadog-agent.yaml
prometheus_scrape:
  enabled: true
  configs:
    - configurations:
        - url: http://gateway:4444/metrics/prometheus
          headers:
            Authorization: Bearer <token>
```
### New Relic
Use the Prometheus OpenMetrics integration:
```yaml
# newrelic-infrastructure.yml
integrations:
  - name: nri-prometheus
    config:
      urls:
        - http://gateway:4444/metrics/prometheus
      bearer_token: <token>
```
### Splunk
Use the Splunk OpenTelemetry Collector:
```yaml
# otel-collector-config.yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'mcp-gateway'
          authorization:
            type: Bearer
            credentials: <token>  # pragma: allowlist secret
          static_configs:
            - targets: ['gateway:4444']
```
## Next Steps
- Internal Observability - Built-in database-backed tracing
- OpenTelemetry - Distributed tracing with OTLP
- Grafana Setup - Dashboard configuration
## Related Documentation
- Configuration Reference - All metrics settings
- Scaling Guide - Production deployment patterns
- Security Features - Authentication and authorization