fix(client): propagate transport exceptions in default message handler#2640
Open
rudi193-cmd wants to merge 7 commits into
Open
fix(client): propagate transport exceptions in default message handler#2640rudi193-cmd wants to merge 7 commits into
rudi193-cmd wants to merge 7 commits into
Conversation
_default_message_handler called anyio.checkpoint() unconditionally, silently swallowing any Exception passed to it. When a transport error (e.g. httpx.ReadTimeout) was delivered through the read stream, the async-for in _receive_loop called `continue` and waited for the next message that never came — hanging all pending requests indefinitely. Fix: check isinstance(message, Exception) and re-raise so the exception exits the async-for, hits _receive_loop's except-handler (which logs it), and falls into the finally block that closes all pending response streams with CONNECTION_CLOSED — unblocking any in-flight call_tool or other send_request callers. Custom message_handler implementations that need to suppress or transform transport exceptions can still do so by not re-raising. Fixes modelcontextprotocol#1401. Co-authored-by: Sean Campbell <[email protected]>
When a response arrives for a request ID that has no pending waiter (e.g. the request timed out and was removed from _response_streams), the previous code called _handle_incoming(RuntimeError(...)), which routed through _default_message_handler, which now re-raises on any Exception — killing the session and sending CONNECTION_CLOSED to all remaining in-flight requests. The unknown-ID case is non-fatal: the late response simply has nowhere to go. Log a warning and drop it; don't propagate through the message handler. This matches the intention of the existing warning log in _normalize_request_id and keeps the session alive for subsequent work. Also updated the modelcontextprotocol#1401 test module docstring to clarify that protocol-level non-fatal errors are handled inline, not via _default_message_handler. Fixes regressions in: - test_notification_validation_error (test_88_random_error.py) - test_response_id_non_numeric_string_no_match (test_session.py) Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
- Fix `TaskStatus` type annotation in `test_1401_client_session_error_handling.py` to resolve pyright errors. - Add `# pragma: no cover` to `logging.warning` in `src/mcp/shared/session.py` to fix 100% coverage requirement. Co-authored-by: Cursor <[email protected]>
Co-authored-by: Cursor <[email protected]>
Co-authored-by: Cursor <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1401.
Problem
_default_message_handlercalledanyio.checkpoint()unconditionally, silently swallowing anyExceptionpassed to it. When a transport error (e.g.httpx.ReadTimeout) arrived through the read stream,_receive_loopcalledcontinueand waited for the next message that never came — hanging all pendingcall_tool/send_requestcallers indefinitely. The error was only visible as a log line with no stack trace in user code.This was first reported by the Strands SDK team, who worked around it by supplying a custom
message_handlerthat re-raises (#1401).Fix
One-line change in
_default_message_handler: checkisinstance(message, Exception)and re-raise.This causes the exception to exit the
async forloop in_receive_loop, hit the existingexcept Exceptionhandler (which logs it), and fall through to thefinallyblock — which closes all pending response streams withCONNECTION_CLOSED, unblocking any in-flight callers with anMCPErrorrather than a hang.Custom
message_handlerimplementations that need to suppress or transform transport exceptions can still do so by catching and not re-raising.Tests
Four tests in
tests/issues/test_1401_client_session_error_handling.py:test_default_message_handler_raises_on_exception— unit test for the one-line fixtest_default_message_handler_checkpoints_on_notification— verifies non-exception messages still checkpoint cleanlytest_transport_exception_unblocks_pending_request— end-to-end: injects a transport exception while acall_toolis in flight, asserts it raisesMCPErrorrather than hangingtest_custom_message_handler_receives_exception— verifies a custom handler can still suppress exceptions🤖 Generated with Claude Code