Well-Known URI Configuration

Well-known URIs are standardized endpoints that web services expose for automated discovery and security contact purposes, as defined by RFC 8615. MCP Gateway provides configurable support for standard well-known URIs with security-first defaults.

Overview

The well-known URI handler provides:

  • robots.txt - Search engine crawler management
  • security.txt - Security contact information (RFC 9116)
  • Custom files - Organization-specific policies (ai.txt, dnt-policy.txt, etc.)
  • Security-first defaults - Appropriate for private API gateway deployment
  • Admin monitoring - Configuration status via /admin/well-known

Quick Start

Default Configuration (Private API)

# Enable well-known URIs (enabled by default)
WELL_KNOWN_ENABLED=true

# Default robots.txt blocks all crawlers (appropriate for private APIs)
# No additional configuration needed - uses built-in secure defaults

Access your well-known files:

  • GET /.well-known/robots.txt - Always available
  • GET /.well-known/security.txt - Available when configured
  • GET /admin/well-known - Configuration status (requires auth)

Configuration

Basic Settings

# Enable/disable well-known URI endpoints
WELL_KNOWN_ENABLED=true

# Cache control for well-known files (seconds)
WELL_KNOWN_CACHE_MAX_AGE=3600  # 1 hour default

robots.txt Configuration

# Default: blocks all crawlers (security-first for private APIs)
WELL_KNOWN_ROBOTS_TXT="User-agent: *\nDisallow: /\n\n# MCP Gateway is a private API gateway\n# Public crawling is disabled by default"

# Public API example: allow health checks and docs, block admin and tools
WELL_KNOWN_ROBOTS_TXT="User-agent: *\nAllow: /health\nAllow: /docs\nDisallow: /admin\nDisallow: /tools\nDisallow: /"

# Allow specific bots only
WELL_KNOWN_ROBOTS_TXT="User-agent: monitoring-bot\nAllow: /health\nAllow: /metrics\n\nUser-agent: *\nDisallow: /"
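Before deploying a policy, it can help to check how a parser reads it. The sketch below uses Python's standard-library urllib.robotparser to evaluate two of the values above. It assumes the gateway turns the literal \n sequences in the variable into line breaks, which this page does not confirm. load_policy and the example-bot user agent are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser


def load_policy(env_value: str) -> RobotFileParser:
    # Assumption: the two-character sequence backslash + n stands for a line break.
    parser = RobotFileParser()
    parser.parse(env_value.replace("\\n", "\n").splitlines())
    return parser


# The "public API" value from above (raw strings keep the literal \n sequences).
public_api = load_policy(
    r"User-agent: *\nAllow: /health\nAllow: /docs\n"
    r"Disallow: /admin\nDisallow: /tools\nDisallow: /"
)

# The "specific bots only" value from above.
bots_only = load_policy(
    r"User-agent: monitoring-bot\nAllow: /health\nAllow: /metrics\n\n"
    r"User-agent: *\nDisallow: /"
)

# example-bot is a made-up crawler name that falls under "User-agent: *".
for path in ("/health", "/docs", "/admin", "/tools", "/"):
    print("public_api", path, public_api.can_fetch("example-bot", path))

for agent in ("monitoring-bot", "example-bot"):
    print("bots_only", agent, "/health", bots_only.can_fetch(agent, "/health"))
```

Python's parser applies the first matching rule, while RFC 9309 specifies the longest match; for these two values both approaches give the same answers. Note that in the second policy the monitoring-bot group contains only Allow lines, so a crawler matching that group is not blocked from other paths. Adding Disallow: / to that group would restrict it to /health and /metrics.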

security.txt Configuration

Configure security contact information per RFC 9116:

# Basic security contact
WELL_KNOWN_SECURITY_TXT="Contact: mailto:security@example.com\nExpires: 2025-12-31T23:59:59Z\nPreferred-Languages: en"

# Comprehensive security.txt
WELL_KNOWN_SECURITY_TXT="Contact: mailto:security@example.com\nContact: https://example.com/security\nEncryption: https://example.com/pgp-key.txt\nAcknowledgments: https://example.com/security/thanks\nPreferred-Languages: en, es\nCanonical: https://api.example.com/.well-known/security.txt\nHiring: https://example.com/careers"

Note: The system automatically:

  • Adds an Expires field if missing (6 months from generation)
  • Adds header comments with a generation timestamp
  • Validates RFC 9116 format requirements
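The automatic Expires handling can be pictured roughly as follows. This is a minimal sketch, not the gateway's implementation: field_names and ensure_expires are hypothetical helpers, and "6 months" is approximated as 180 days.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def field_names(content: str) -> set:
    """Return the lower-cased field names found in a security.txt body."""
    names = set()
    for line in content.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not "Name: value".
        if not line or line.startswith("#") or ":" not in line:
            continue
        names.add(line.split(":", 1)[0].strip().lower())
    return names


def ensure_expires(content: str, now: Optional[datetime] = None) -> str:
    """Append an Expires field about six months out if none is present."""
    names = field_names(content)
    if "contact" not in names:
        # RFC 9116 requires at least one Contact field.
        raise ValueError("security.txt needs at least one Contact field")
    if "expires" in names:
        return content
    now = now or datetime.now(timezone.utc)
    # Assumption: 180 days stands in for "6 months from generation".
    expires = (now + timedelta(days=180)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return content.rstrip("\n") + "\nExpires: " + expires + "\n"


sample = "Contact: mailto:security@example.com\nPreferred-Languages: en"
print(ensure_expires(sample, now=datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

Passing now explicitly keeps the result reproducible. Setting Expires yourself, as in the examples above, avoids depending on the generated date.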

Custom Well-Known Files

Add organization-specific well-known files via JSON configuration:

# AI usage policy
WELL_KNOWN_CUSTOM_FILES='{"ai.txt": "# AI Usage Policy\n\nThis MCP Gateway uses AI for:\n- Tool orchestration\n- Response generation\n- Error handling\n\nWe do not use AI for:\n- User data analysis\n- Behavioral tracking\n- Decision making without human oversight"}'

# Multiple custom files
WELL_KNOWN_CUSTOM_FILES='{"ai.txt": "AI Policy: Responsible use only", "dnt-policy.txt": "# Do Not Track Policy\n\nWe respect the DNT header.\nNo tracking cookies are used.\nOnly essential session data is stored.", "change-password": "https://mycompany.com/account/password"}'
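Escaping multi-line content by hand is error-prone, so one option is to generate the value. The sketch below builds the JSON with Python's json module and quotes it for a POSIX shell with shlex.quote. The file contents are shortened placeholders, and how a given .env loader treats the quoting is an assumption to verify in your environment.

```python
import json
import shlex

# Placeholder contents; replace with your organization's real policies.
custom_files = {
    "ai.txt": "# AI Usage Policy\n\nThis MCP Gateway uses AI for:\n- Tool orchestration\n",
    "dnt-policy.txt": "# Do Not Track Policy\n\nWe respect the DNT header.\n",
}

# json.dumps escapes line breaks and double quotes, so the value stays on one line.
value = json.dumps(custom_files)

# shlex.quote wraps the value so a POSIX shell treats it as a single word.
print("WELL_KNOWN_CUSTOM_FILES=" + shlex.quote(value))
```

Because the value round-trips through json.loads, the same script can double as a validity check before deployment.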

API Access

Public Endpoints (No Authentication)

Well-known URIs are public by design (RFC 8615):

# Always available (when enabled)
curl https://api.example.com/.well-known/robots.txt

# Available when configured
curl https://api.example.com/.well-known/security.txt

# Custom files (when configured)
curl https://api.example.com/.well-known/ai.txt

Admin Monitoring (Authentication Required)

# Check configuration status
curl -H "Authorization: Bearer $TOKEN" \
  https://api.example.com/admin/well-known

# Response example:
{
  "enabled": true,
  "configured_files": [
    {
      "path": "/.well-known/robots.txt",
      "enabled": true,
      "description": "Robot exclusion standard",
      "cache_max_age": 3600
    },
    {
      "path": "/.well-known/security.txt",
      "enabled": true,
      "description": "Security contact information",
      "cache_max_age": 3600
    }
  ],
  "supported_files": ["robots.txt", "security.txt", "ai.txt", "dnt-policy.txt", "change-password"],
  "cache_max_age": 3600
}
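A status response like the one above is straightforward to consume from a script. The following sketch lists the paths that are currently served; enabled_paths is a hypothetical helper, and the response body is a trimmed copy of the example rather than output captured from a live gateway.

```python
import json


def enabled_paths(status: dict) -> list:
    """Return the paths of configured files that are marked as enabled."""
    # If the feature is switched off globally, nothing is served.
    if not status.get("enabled", False):
        return []
    return [
        entry["path"]
        for entry in status.get("configured_files", [])
        if entry.get("enabled")
    ]


# Trimmed copy of the example response above.
response_body = """
{
  "enabled": true,
  "configured_files": [
    {"path": "/.well-known/robots.txt", "enabled": true, "cache_max_age": 3600},
    {"path": "/.well-known/security.txt", "enabled": true, "cache_max_age": 3600}
  ],
  "cache_max_age": 3600
}
"""

status = json.loads(response_body)
print(enabled_paths(status))
```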

Security Considerations

Private API Deployment (Default)

For private API gateways, the default configuration:

  • Blocks all crawlers via robots.txt
  • Minimizes information disclosure
  • Serves no security.txt unless explicitly configured
  • Sets cache headers for performance, not for long-term public caching

Public API Deployment

For public-facing APIs, consider:

  • Selective crawler access - allow health checks, block admin endpoints
  • Security contact information - enable security.txt for responsible disclosure
  • Custom policies - AI usage policy, privacy policy links
  • Monitoring - track well-known endpoint access in logs

Information Disclosure

Well-known URIs intentionally disclose information:

  • Service type/purpose through robots.txt comments
  • Security contact through security.txt
  • Organizational policies through custom files

Review all content before deployment.

Deployment Examples

Docker Compose

services:
  gateway:
    environment:
      WELL_KNOWN_ENABLED: "true"
      WELL_KNOWN_ROBOTS_TXT: |
        User-agent: monitoring-bot
        Allow: /health

        User-agent: *
        Disallow: /
      WELL_KNOWN_SECURITY_TXT: |
        Contact: mailto:security@example.com
        Encryption: https://example.com/pgp
        Expires: 2025-12-31T23:59:59Z
      WELL_KNOWN_CUSTOM_FILES: '{"ai.txt": "AI is used for tool orchestration"}'
      WELL_KNOWN_CACHE_MAX_AGE: "7200"

Kubernetes ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-gateway-wellknown
data:
  WELL_KNOWN_ENABLED: "true"
  WELL_KNOWN_ROBOTS_TXT: |
    User-agent: *
    Disallow: /

    # Private API - No public crawling
  WELL_KNOWN_SECURITY_TXT: |
    Contact: mailto:security@example.com
    Contact: https://example.com/security
    Expires: 2025-12-31T23:59:59Z
    Preferred-Languages: en
  WELL_KNOWN_CUSTOM_FILES: |
    {
      "ai.txt": "This service uses AI for tool orchestration only.",
      "dnt-policy.txt": "We honor Do Not Track headers."
    }

Helm Chart Values

config:
  wellKnown:
    enabled: true
    cacheMaxAge: 3600
    robotsTxt: |
      User-agent: internal-monitor
      Allow: /health
      Allow: /metrics

      User-agent: *
      Disallow: /
    securityTxt: |
      Contact: mailto:security@example.com
      Encryption: https://example.com/pgp-key.txt
      Acknowledgments: https://example.com/security/hall-of-fame
    customFiles:
      ai.txt: "AI Usage: Tool orchestration and response generation only"
      dnt-policy.txt: "We respect Do Not Track headers and implement minimal tracking"

Troubleshooting

Common Issues

Problem: Well-known endpoints return 404

# Check if feature is enabled
curl -H "Authorization: Bearer $TOKEN" \
  https://api.example.com/admin/well-known

Problem: security.txt not available

# security.txt is only enabled when content is provided
WELL_KNOWN_SECURITY_TXT="Contact: mailto:security@example.com"

Problem: Custom files not working

# Ensure valid JSON format
WELL_KNOWN_CUSTOM_FILES='{"ai.txt": "AI Policy content here"}'

# Check JSON validity
echo '{"ai.txt": "content"}' | python3 -m json.tool

Problem: Cache headers not updating

# Clear browser cache or check cache-control header
curl -I https://api.example.com/.well-known/robots.txt
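To compare what the server sends with the configured WELL_KNOWN_CACHE_MAX_AGE, the max-age directive can be pulled out of the Cache-Control header value. This is a small sketch: max_age is a hypothetical helper, and the header strings are illustrative, since the exact directives the gateway emits are not listed on this page.

```python
import re
from typing import Optional


def max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age value in seconds from a Cache-Control header value."""
    # Directive names are case-insensitive, so match without regard to case.
    match = re.search(r"\bmax-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


# Illustrative header values, not captured from a real gateway.
for header in ("public, max-age=3600", "max-age=7200", "no-store"):
    print(header, "->", max_age(header))
```

If the value matches the configuration but clients still see old content, an intermediate cache or the browser is the more likely cause.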

Validation

Check well-known URI configuration:

# Test robots.txt
curl -I https://api.example.com/.well-known/robots.txt

# Test security.txt (if configured)
curl https://api.example.com/.well-known/security.txt

# Check admin status
curl -H "Authorization: Bearer $TOKEN" \
  https://api.example.com/admin/well-known | jq .

Monitoring

Monitor well-known URI access in logs:

# Search for well-known requests in logs
grep "/.well-known/" /var/log/mcpgateway.log

# Monitor for unexpected access patterns
# (assumes the combined log format: $1 is the client, $7 the request path)
grep -E "(/\.well-known/|robots|security\.txt)" /var/log/access.log | \
  awk '{print $1, $7}' | sort | uniq -c
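The same per-client count can be produced in Python when the result needs further processing. The sketch below assumes the common/combined access log layout, where the first whitespace-separated field is the client and the seventh is the request path. count_well_known and the sample lines are made up for illustration, and unlike the grep above it only counts paths under /.well-known/.

```python
from collections import Counter


def count_well_known(lines) -> Counter:
    """Count requests per (client, path) for paths under /.well-known/."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip lines that do not have at least seven fields.
        if len(fields) < 7:
            continue
        client, path = fields[0], fields[6]
        if path.startswith("/.well-known/"):
            counts[(client, path)] += 1
    return counts


# Made-up sample lines in combined log format.
sample = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /.well-known/robots.txt HTTP/1.1" 200 26',
    '203.0.113.5 - - [01/Jan/2025:00:00:09 +0000] "GET /.well-known/robots.txt HTTP/1.1" 200 26',
    '198.51.100.7 - - [01/Jan/2025:00:01:00 +0000] "GET /.well-known/security.txt HTTP/1.1" 200 90',
    '198.51.100.7 - - [01/Jan/2025:00:01:05 +0000] "GET /health HTTP/1.1" 200 2',
]

for (client, path), hits in count_well_known(sample).most_common():
    print(hits, client, path)
```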

Standards Compliance

RFC 8615 - Well-Known URIs

  • ✅ Serves content at /.well-known/ path
  • ✅ Uses appropriate content types
  • ✅ Implements proper caching headers
  • ✅ Provides helpful error messages

RFC 9116 - security.txt

  • ✅ Validates required fields
  • ✅ Auto-generates Expires field if missing
  • ✅ Serves with correct content-type
  • ✅ Supports all standard fields (Contact, Expires, Encryption, etc.)

RFC 9309 - Robots Exclusion Protocol

  • ✅ Standard robots.txt format
  • ✅ User-agent directive support
  • ✅ Allow/Disallow directive support
  • ✅ Comment support for documentation