
🧪 Testing ContextForge

This section covers the testing strategy and tools for ContextForge.


Testing Pyramid

| Layer | Tool | Location | Status |
| --- | --- | --- | --- |
| Unit tests | pytest | tests/unit/ | Implemented |
| Integration tests | pytest | tests/integration/ | Implemented |
| End-to-end tests | pytest | tests/e2e/, tests/e2e_rust/ | Implemented |
| UI automation | Playwright | tests/playwright/ | Implemented |
| Security / DAST | Playwright + OWASP ZAP | tests/playwright/security/ | Implemented |
| Load testing | Locust | tests/loadtest/ | Implemented |
| Concurrency tests | Manual (asyncio) | tests/manual/concurrency/ | Implemented |
| JS unit tests | - | - | Not yet implemented |

🔹 Basic Smoke Test

Use the Basic Smoke Test to verify:

  • JWT token generation and authentication
  • Gateway registration
  • Tool registration
  • Server creation and event streaming
  • Tool invocation via JSON-RPC

This test is ideal for validating local development environments or freshly deployed test instances.
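
As a rough illustration of the last item in that list, the request body for a tool invocation can be built as shown here. This is a minimal sketch, not the smoke test's actual code: it assumes the standard JSON-RPC 2.0 envelope and the MCP `tools/call` method, and the tool name `get_time` and its arguments are hypothetical placeholders. The endpoint URL and the bearer-token header are left out because they depend on the deployment.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request dict that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "get_time" and its arguments are hypothetical, for illustration only.
payload = build_tool_call("get_time", {"timezone": "UTC"})

# Serialized body, ready to send with an HTTP client of your choice.
body = json.dumps(payload)
print(body)
```

A matching response is expected to carry the same `id`, which is how a client pairs replies with requests.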


🐍 Python Testing (pytest)

Run the full test suite or specific categories:

make test                      # full suite
pytest -k "<name>" tests/unit/ # only tests matching <name>
pytest tests/unit/             # unit tests only
pytest tests/integration/      # integration tests
pytest tests/e2e/              # end-to-end scenarios

Coverage reporting:

make coverage                  # run with coverage
make coverage-html             # generate HTML report

🎭 UI Automation (Playwright)

Playwright tests validate the Admin UI interactions:

# Install Playwright browsers (one-time)
playwright install

# Run UI tests
pytest tests/playwright/

# Run specific admin tests
pytest tests/playwright/ -k admin

Tests cover login flows, CRUD operations, and UI state management.


🦗 Load Testing (Locust)

Locust is used for performance and load testing:

# Containerized load testing (recommended for docker-compose users)
make testing-up
# Locust UI: http://localhost:8089 (targets http://nginx:80 by default)

# Start Locust web UI
locust -f tests/loadtest/locustfile.py --host=http://localhost:8080

# Headless load test
locust -f tests/loadtest/locustfile.py --host=http://localhost:8080 \
  --headless -u 100 -r 10 -t 60s

Access the Locust dashboard at http://localhost:8089 when running with the web UI.
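
To read the headless flags: `-u` is the peak number of simulated users, `-r` is the spawn rate in users per second, and `-t` is the total run time. The sketch below works through the arithmetic for the command above. It assumes the run-time clock starts together with the ramp-up, so treat the result as an estimate; `load_profile` is a hypothetical helper name, not part of Locust.

```python
def load_profile(users, spawn_rate, run_time_s):
    """Estimate ramp-up and steady-state durations for a headless run.

    Assumes users are spawned at a constant rate and that the run-time
    limit is counted from the start of the ramp-up.
    """
    ramp_s = users / spawn_rate
    steady_s = max(0.0, run_time_s - ramp_s)
    return ramp_s, steady_s


# Values taken from: --headless -u 100 -r 10 -t 60s
ramp_s, steady_s = load_profile(users=100, spawn_rate=10, run_time_s=60)
print(f"ramp-up: {ramp_s:.0f}s, steady state: {steady_s:.0f}s")
# prints "ramp-up: 10s, steady state: 50s"
```

Under these assumptions only about 50 of the 60 seconds run at the full 100 users, which is worth keeping in mind when comparing throughput numbers between runs with different spawn rates.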


🦀 Rust MCP Runtime Validation

For the Rust MCP runtime path, the most important stack-backed checks are:

make testing-rebuild-rust-full
make test-mcp-protocol-e2e
make test-mcp-rbac
make test-mcp-access-matrix
make test-mcp-session-isolation
make test-mcp-session-isolation-load MCP_ISOLATION_LOAD_RUN_TIME=30s
cargo test --release --manifest-path crates/mcp_runtime/Cargo.toml

For live plugin parity, use the test-specific plugin config and run the same E2E against both Python mode and Rust full mode:

PLUGINS_CONFIG_FILE=plugins/plugin_parity_config.yaml make testing-up
MCP_PLUGIN_PARITY_EXPECTED_RUNTIME=python make test-mcp-plugin-parity

PLUGINS_CONFIG_FILE=plugins/plugin_parity_config.yaml make testing-rebuild-rust-full
MCP_PLUGIN_PARITY_EXPECTED_RUNTIME=rust make test-mcp-plugin-parity

This parity gate currently proves live plugin behavior on:

  • resources/read
  • tools/call
  • prompts/get

For revocation and membership/role-drift validation, shorten the reuse TTL so the bounded-TTL contract completes quickly:

MCP_RUST_SESSION_AUTH_REUSE_TTL_SECONDS=2 MCP_RUST_SESSION_AUTH_REUSE_GRACE_SECONDS=1 make testing-rebuild-rust-full
make test-mcp-access-matrix
make test-mcp-session-isolation
make test-mcp-session-isolation-load MCP_ISOLATION_LOAD_RUN_TIME=30s

Use these mode-specific rebuild targets when validating rollout behavior:

make testing-rebuild-rust-shadow
make testing-rebuild-rust
make testing-rebuild-rust-full

These validate, respectively:

  • shadow: Rust sidecar present while public /mcp stays on Python
  • edge: direct Rust public ingress without the full Rust session/runtime cores
  • full: direct Rust public ingress plus Rust session/event/resume/live-stream and affinity cores
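
When scripting a rollout check across all three modes, it can help to keep the mode-to-target mapping in one place. The sketch below is only an illustration: the mapping follows the "respectively" ordering of the targets and modes listed above, and `rebuild_command` is a hypothetical helper, not something the repository provides.

```python
# Mode name -> Makefile target, in the order given in this section.
REBUILD_TARGETS = {
    "shadow": "testing-rebuild-rust-shadow",
    "edge": "testing-rebuild-rust",
    "full": "testing-rebuild-rust-full",
}


def rebuild_command(mode):
    """Return the make invocation (as an argv list) for a rollout mode."""
    if mode not in REBUILD_TARGETS:
        known = ", ".join(sorted(REBUILD_TARGETS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {known}")
    return ["make", REBUILD_TARGETS[mode]]


# Print the commands instead of running them, so the sketch has no side effects.
for mode in REBUILD_TARGETS:
    print(mode, "->", " ".join(rebuild_command(mode)))
```

To actually run a rebuild, the returned list could be passed to `subprocess.run(..., check=True)` from the repository root.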

For throughput benchmarks and Locust wrappers, see Performance Testing.


🌐 Frontend JavaScript Testing

Frontend JavaScript unit tests are not yet implemented. The codebase uses plain JavaScript (not TypeScript) with:

  • ESLint + Prettier for linting/formatting
  • No test framework (Jest/Vitest/Mocha) currently configured

Linting is available:

make eslint        # lint JavaScript
make lint-web      # ESLint + HTMLHint + Stylelint
make format-web    # Prettier formatting

🔒 Security Testing (OWASP & DAST)

Two-layer coverage for OWASP A01:2021 – Broken Access Control:

make test-owasp   # Layer 1: direct Playwright access-control tests (no ZAP needed)
make test-zap     # Layer 2: ZAP DAST scan (requires make testing-zap-up)

See Security Testing for the full guide including environment variables, authentication setup, ZAP target URL configuration, and report locations.


🔀 Concurrency Testing

Manual concurrency tests validate data consistency under concurrent access. These require a live ContextForge instance backed by PostgreSQL and Redis; they are not part of automated CI.

| Test ID | Makefile Target | What it validates |
| --- | --- | --- |
| CONC-02 | make conc-02-gateways | No 5xx errors, no malformed payloads, and valid final read when concurrent readers and writers hit GET/PUT /gateways/{id} |

Quick start:

# Prerequisites: PostgreSQL + Redis + gateway + translator running
# (see tests/manual/README.md for full infrastructure setup)

# Generate token and run
export CONC_TOKEN="$(python3 -m mcpgateway.utils.create_jwt_token \
  --username admin@example.com --exp 120 --secret my-test-key-but-now-longer-than-32-bytes)"
make conc-02-gateways

# Custom parameters
CONC_RW_DURATION_SEC=30 CONC_RW_READERS=10 CONC_RW_WRITERS=2 make conc-02-gateways

Full runbook, environment variable reference, and results template: tests/manual/concurrency/conc_02_gateways_results.md.
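
The shape of a reader/writer check like CONC-02 can be sketched with asyncio against an in-memory stand-in. This is not the real test, which issues HTTP requests against a live gateway and also watches for 5xx responses. `FakeGatewayStore`, the record fields, and the helper names are all hypothetical; the reader and writer counts mirror the `CONC_RW_READERS=10 CONC_RW_WRITERS=2` example above.

```python
import asyncio


class FakeGatewayStore:
    """In-memory stand-in for GET/PUT /gateways/{id} (hypothetical)."""

    def __init__(self):
        self._record = {"id": "gw-1", "name": "initial", "version": 0}

    async def get(self):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return dict(self._record)

    async def put(self, name):
        await asyncio.sleep(0)
        # No await between read and write, so the update is atomic here.
        self._record = {
            "id": "gw-1",
            "name": name,
            "version": self._record["version"] + 1,
        }


def is_well_formed(payload):
    """A payload is valid if it has exactly the expected fields."""
    return set(payload) == {"id", "name", "version"} and isinstance(
        payload["version"], int
    )


async def reader(store, rounds, malformed):
    for _ in range(rounds):
        payload = await store.get()
        if not is_well_formed(payload):
            malformed.append(payload)


async def writer(store, writer_id, rounds):
    for i in range(rounds):
        await store.put(f"writer-{writer_id}-update-{i}")


async def run_check(readers=10, writers=2, rounds=50):
    store = FakeGatewayStore()
    malformed = []
    tasks = [reader(store, rounds, malformed) for _ in range(readers)]
    tasks += [writer(store, w, rounds) for w in range(writers)]
    await asyncio.gather(*tasks)  # readers and writers interleave
    final = await store.get()     # the "valid final read"
    return malformed, final


malformed, final = asyncio.run(run_check())
print("malformed payloads:", len(malformed))
print("final read well-formed:", is_well_formed(final))
```

The two assertions a run cares about are that `malformed` stays empty and that the final read is well-formed after all writers finish.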


πŸ” Additional TestingΒΆ

For database performance testing, see Database Performance.

🔹 Microsoft Entra ID E2E Tests

Use the Entra ID E2E Testing Guide to validate:

  • SSO integration with Microsoft Entra ID (Azure AD)
  • Group-based platform_admin role assignment
  • Dynamic user and group management via Microsoft Graph API

These tests are fully automated and self-contained, creating and cleaning up Azure resources automatically.


For additional scenarios (e.g., completion APIs, multi-hop toolchains), expand the test suite as needed.