ADR-034: Centralized Notification Service
- Status: Accepted
- Date: 2026-01-13
- Deciders: Platform Team
Context
The Model Context Protocol (MCP) defines notifications, most of them sent from server to client, that enable proactive communication and real-time updates. Key notification types include:
| Notification Method | Direction | Description | Typical Use Case |
|---|---|---|---|
| notifications/initialized | Client → Server | Signals that the client has completed initialization. | After handshake, client ready to operate. |
| notifications/tools/list_changed | Server → Client | Notifies the client that available tools on the server have changed. | Client should refresh the tool list. |
| notifications/resources/list_changed | Server → Client | Informs the client that available resources on the server have changed. | Client should refresh resource list. |
| notifications/resources/updated | Server → Client | Resource the client subscribed to has changed. | Client may re-read the resource. |
| notifications/prompts/list_changed | Server → Client | Server's prompts list changed. | Client may refresh prompt templates list. |
| notifications/roots/list_changed | Client → Server | Client informs server that its root set changed. | Server might request updated roots from client. |
| notifications/message (logging) | Server → Client | Sends a log message (e.g., informational output from the server). | Logging or debugging stream. |
| notifications/progress | Client ↔ Server | Reports progress on a long-running operation, identified by a progress token. | Streaming progress for long-running tasks. |
| notifications/cancelled | Client ↔ Server | Indicates an in-progress request has been cancelled. | Cancelling a long-running operation. |
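On the wire, each of these is a JSON-RPC 2.0 notification: a message with a `method` and optional `params` but no `id`, so the sender expects no response. The sketch below illustrates that shape only; `make_notification` and `is_notification` are hypothetical helpers, not part of the MCP SDK or the gateway.

```python
import json
from typing import Any, Optional


def make_notification(method: str, params: Optional[dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 notification (no "id", so no reply is expected)."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def is_notification(raw: str) -> bool:
    """Return True when the payload names a method and carries no request id."""
    message = json.loads(raw)
    return "method" in message and "id" not in message


# Example payload: a subscribed resource changed (the URI is illustrative).
wire = make_notification("notifications/resources/updated", {"uri": "file:///example.txt"})
```

The absence of an `id` is what distinguishes a notification from a request, which is why the gateway cannot rely on request/response correlation to attribute these messages.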
Prior to this decision, the gateway lacked a unified mechanism to handle these notifications. Specifically:
- Fragmented Handling: Notification logic would otherwise be scattered across individual service methods or session implementations.
- Session Pooling: With the introduction of `MCPSessionPool`, sessions are long-lived and shared. A dedicated mechanism is needed to route notifications from these pooled connections back to the appropriate application context (e.g., triggering a refresh).
- Refresh Storms: Aggressive servers sending frequent `list_changed` notifications could trigger excessive database writes and upstream calls if not debounced.
- Context Propagation: Notifications arriving on a pooled session need to be correctly attributed to the originating `gateway_id` to perform actions like refreshing the correct gateway's schema.
Decision
We have implemented a Centralized Notification Service (NotificationService) to act as the single hub for managing all MCP server notifications.
This service is designed to:
- Centralize Routing: Act as the destination for `message_handler` callbacks from `mcp-python-sdk`'s `ClientSession`.
- Debounce Events: Implement a configurable buffering strategy (default 5s) to coalesce rapid `list_changed` events into single refresh actions, protecting the gateway and database.
- Manage Capabilities: Track per-gateway capabilities (e.g., `tools.listChanged: true`) to know which gateways support event-driven updates.
- Integrate with Pooling: Seamlessly hook into `MCPSessionPool` to provide automatic notification handling for all pooled sessions without requiring changes to client code.
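The debouncing goal can be illustrated with a fixed-window coalescer: the first event for a key starts a timer, further events inside the window are absorbed, and one refresh fires when the window closes. This is a minimal sketch under that assumption, not the strategy `NotificationService` actually uses; `ListChangedDebouncer`, its `refresh` callback, and the key format are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


class ListChangedDebouncer:
    """Coalesce bursts of events per key into one refresh per window."""

    def __init__(self, refresh: Callable[[str], Awaitable[None]], window_seconds: float = 5.0) -> None:
        self._refresh = refresh
        self._window = window_seconds
        self._pending: dict[str, asyncio.Task] = {}

    def notify(self, key: str) -> None:
        """Record an event; schedule a refresh only if none is already pending."""
        if key not in self._pending:
            self._pending[key] = asyncio.create_task(self._flush_later(key))

    async def _flush_later(self, key: str) -> None:
        await asyncio.sleep(self._window)
        # Clear the pending marker first so events that arrive during the
        # refresh schedule a follow-up instead of being dropped.
        del self._pending[key]
        await self._refresh(key)


async def demo() -> list[str]:
    refreshed: list[str] = []

    async def refresh(key: str) -> None:
        refreshed.append(key)

    debouncer = ListChangedDebouncer(refresh, window_seconds=0.05)
    for _ in range(10):  # a burst from one chatty gateway
        debouncer.notify("gateway-1:tools")
    debouncer.notify("gateway-2:tools")
    await asyncio.sleep(0.2)  # let both windows close
    return refreshed
```

Running `asyncio.run(demo())` triggers one refresh per key even though eleven events were delivered. A variant that restarts the timer on every event would wait for the server to go quiet, at the cost of unbounded delay under a constant stream.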
Handling Other Notification Types
This centralized architecture is critical for supporting future notification types beyond list_changed:
- Progress (`notifications/progress`):
  - The centralized service can route progress tokens from backend tools to the client's request context.
  - It allows for aggregating progress from multiple parallel tool executions if needed.
- Logging (`notifications/message`):
  - Server logs can be intercepted centrally and piped into the gateway's observability stack (e.g., structured JSON logging, Prometheus counters).
  - Prevents log spam from reaching individual clients unless explicitly subscribed.
- Cancellation (`notifications/cancelled`):
  - Provides a central point to handle cancellation signals, allowing the gateway to terminate associated backend processes or clean up resources even if the original request handler has detached.
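One way a single hub can absorb these types is a dispatch table keyed by notification method, so adding a type means registering one handler. The sketch below assumes that design; the handler functions, `ROUTES`, and `route` are hypothetical names, and the returned strings stand in for real side effects. The `params` fields follow the MCP notification payloads (`progressToken`, `level`, `requestId`).

```python
from typing import Any, Callable

Handler = Callable[[str, dict[str, Any]], str]


def on_progress(gateway_id: str, params: dict[str, Any]) -> str:
    # A real handler would forward this to the originating request context.
    return f"{gateway_id}: progress token={params.get('progressToken')} at {params.get('progress')}"


def on_log_message(gateway_id: str, params: dict[str, Any]) -> str:
    # A real handler would emit to the gateway's logging or metrics stack.
    return f"{gateway_id}: log level={params.get('level')}"


def on_cancelled(gateway_id: str, params: dict[str, Any]) -> str:
    # A real handler would stop backend work tied to the request id.
    return f"{gateway_id}: cancel request={params.get('requestId')}"


ROUTES: dict[str, Handler] = {
    "notifications/progress": on_progress,
    "notifications/message": on_log_message,
    "notifications/cancelled": on_cancelled,
}


def route(gateway_id: str, method: str, params: dict[str, Any]) -> str:
    """Dispatch one notification; unknown methods are ignored, not errors."""
    handler = ROUTES.get(method)
    if handler is None:
        return f"{gateway_id}: ignored {method}"
    return handler(gateway_id, params)
```

For example, `route("gw-1", "notifications/cancelled", {"requestId": 7})` reaches `on_cancelled`, while an unregistered method falls through to the ignore branch.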
Changes Made
- New Service: `mcpgateway/services/notification_service.py`
  - Implements the `NotificationService` class with `asyncio.Queue` for processing.
  - Provides `create_message_handler(gateway_id)` factory for easy integration with SDK sessions.
  - Handles `tools/list_changed`, `resources/list_changed`, and `prompts/list_changed` with smart debouncing.
- Session Pool Integration (`MCPSessionPool`)
  - Automatically initializes the notification service on startup.
  - Injects the `message_handler` into every new `ClientSession` created by the pool.
  - Ensures `gateway_id` context is passed during session creation.
- Gateway Service Updates (`GatewayService`)
  - Registers gateway capabilities with the notification service upon initialization.
  - Propagates `gateway_id` in health checks to ensure connectivity context is maintained.
- Code Fixes
  - Updated `tool_service.py`, `resource_service.py`, and `gateway_service.py` to explicitly pass `gateway_id` to `pool.session()` calls, ensuring notifications can be traced back to their source.
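The factory-plus-queue arrangement described above can be sketched as follows. This is an illustration of the pattern, not the contents of `notification_service.py`: `NotificationServiceSketch`, `run_worker`, `drain`, and the `processed` list are hypothetical, and plain dicts stand in for the message objects the SDK passes to a `message_handler`.

```python
import asyncio
from typing import Any, Awaitable, Callable


class NotificationServiceSketch:
    """Queue-backed hub: handlers enqueue, one background worker processes."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self.processed: list[tuple[str, Any]] = []

    def create_message_handler(self, gateway_id: str) -> Callable[[Any], Awaitable[None]]:
        """Return a callback that tags every message with its gateway_id."""

        async def handler(message: Any) -> None:
            await self._queue.put((gateway_id, message))

        return handler

    async def run_worker(self) -> None:
        """Consume queued notifications until the task is cancelled."""
        while True:
            item = await self._queue.get()
            try:
                self.processed.append(item)  # stand-in for debounce + refresh
            finally:
                self._queue.task_done()

    async def drain(self) -> None:
        """Wait until every queued notification has been processed."""
        await self._queue.join()


async def demo() -> list[tuple[str, Any]]:
    service = NotificationServiceSketch()
    worker = asyncio.create_task(service.run_worker())
    handler_a = service.create_message_handler("gateway-a")
    handler_b = service.create_message_handler("gateway-b")
    await handler_a({"method": "notifications/tools/list_changed"})
    await handler_b({"method": "notifications/prompts/list_changed"})
    await service.drain()
    worker.cancel()  # explicit shutdown avoids leaking the background task
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return service.processed
```

Because the closure captures `gateway_id`, a session only needs to call its handler; attribution happens without the session knowing which gateway it belongs to.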
Architecture Diagram
flowchart TD
MCP[MCP Server] -->|notifications/...| Session[ClientSession/Pool]
Session -->|list tools/resources/prompts| MCP
Session -->|callback| Handler[Message Handler]
Handler -->|enqueue| Queue[Async Queue]
subgraph NotificationService
Queue --> Worker[Background Worker]
Worker -->|Debounce Logic| Action{Trigger Action?}
end
Action -->|Yes| GatewayService[Gateway Service]
GatewayService -->|Refresh lists| Session
GatewayService -->|Refresh| DB[(Database)]
Benefits
- Scalability: Debouncing prevents system overload from chatty servers.
- Maintainability: Single location for all notification logic. Adding support for `notifications/progress` or `notifications/message` will only require changes in this one service.
- Consistency: Ensures all pooled sessions behave identically regarding notifications.
- Observability: Centralized logging and metrics for all notification events (received, debounced, processed, failed).
Consequences
Positive
- Automatic schema synchronization for supported servers.
- Reduced database load compared to aggressive polling.
- Foundation laid for implementing real-time progress bars and log streaming in the future.
Negative
- Adds stateful complexity (background worker task).
- Requires careful lifecycle management (startup/shutdown) to prevent resource leaks.
- Debugging asynchronous, queue-driven events can be harder than debugging synchronous flows (mitigated by extensive logging).
- Slightly reduced session reuse when gateway_id differs (sessions are now keyed by gateway_id for correct notification attribution).
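The session-reuse trade-off can be seen in a toy pool whose cache key includes `gateway_id`. This is only a sketch of the keying idea: `PoolSketch` and its `acquire` method are hypothetical, the URL is a placeholder, and the real `MCPSessionPool` key may include other fields.

```python
from dataclasses import dataclass, field


@dataclass
class PoolSketch:
    """Toy pool that creates one session per (url, gateway_id) pair."""

    sessions: dict[tuple[str, str], str] = field(default_factory=dict)

    def acquire(self, url: str, gateway_id: str) -> str:
        key = (url, gateway_id)
        if key not in self.sessions:
            # A new key means a new upstream session, even for a known URL.
            self.sessions[key] = f"session-{len(self.sessions) + 1}"
        return self.sessions[key]


pool = PoolSketch()
first = pool.acquire("https://mcp.example.com/sse", "gateway-a")
second = pool.acquire("https://mcp.example.com/sse", "gateway-b")
again = pool.acquire("https://mcp.example.com/sse", "gateway-a")
```

Two gateways pointing at the same URL now hold separate sessions (`first` and `second` differ), while repeat calls for the same gateway still reuse one (`again` equals `first`).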
Alternatives Considered
- Per-Session Handling: Implementing logic directly in the session callback.
  - Rejected: Hard to debounce globally for a gateway if multiple sessions exist (though currently pooled by URL). Hard to unit test complex logic embedded in callbacks. Violates Single Responsibility Principle.
- Polling Only:
  - Rejected: Inefficient. High latency for discovering new tools, or high load if polling frequency is increased.
References
- MCP Protocol - Notifications
- Issue #1924: Pooled Session Implementation and #1984 Event driven list/spec refresh
- PR #2071 Centralized Notification System designed to work with session pooling