feat!: Proposal for bidirectional streaming over gRPC. #1120
mikeas1 wants to merge 2 commits into a2aproject:main
This adds a new `SendLiveMessage` RPC to the gRPC A2A service definition. The purpose of this RPC is to address cases where a client would like to maintain an active request stream with an agent, rather than requiring separate follow-up requests any time an agent needs further input. This bidirectional communication is ONLY supported via gRPC.

I have tried to lay out the expected semantics as a comment in the gRPC specification.

A2A is generally a turn-based protocol, where clients and servers take turns sending each other messages. This is not specifically encoded in our specification, however our SDK implementations do not directly support the idea of receiving a stream of messages from a client. Instead, each new message is treated as a new invocation of the agent implementation. The assumption here is that a client will only send messages to tasks that are in an interrupted state (i.e. input-required), and that agent executions will exit when they reach an interrupted state. The current behavior of SDKs for when a message is received for a task that is actively being processed could be categorized as "undefined".

A bidirectional endpoint opens the possibility of full-duplex communication between client and server, however it doesn't require it: bidirectionality is still useful even in a turn-based protocol. The benefit is that an ongoing connection can be maintained for sending responses to agents that enter interrupted states, which enables several convenient properties:

- Agent implementations can more easily "await" responses inline, rather than needing to save all necessary state, exit, then reconstitute state when a response is received.
- Clients can perform less state tracking, particularly around specifying task and context IDs. These can be implicit to the connection, as a connection is only valid for a single task.
- Distributed agent implementations don't need to implement clever load-balancing/routing to achieve task processing locality. Since an active connection is maintained, all responses are received by the same server. This is another view of the first point in this list, but from the networking layer.
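To make the "await inline" property concrete, here is a minimal sketch that simulates the two halves of a bidirectional stream with in-memory queues. It is not the A2A SDK or a real gRPC call: the `agent` and `client` functions, the dictionary message shape, and the example values are all hypothetical and only illustrate how an agent could keep its local state while waiting for follow-up input on the same connection.

```python
import asyncio

# Hypothetical stand-ins for the two halves of a bidirectional stream.
# None of these names come from the A2A specification or its SDKs.


async def agent(requests: asyncio.Queue, responses: asyncio.Queue) -> None:
    """Handle one task over one connection, awaiting follow-up input inline."""
    destination = await requests.get()
    await responses.put({"state": "working", "text": f"Planning trip to {destination}"})
    # Interrupted state: ask for more input, then await the next message on
    # the same stream. Local variables such as `destination` stay in scope,
    # so nothing has to be saved and reconstituted.
    await responses.put({"state": "input-required", "text": "Which date?"})
    date = await requests.get()
    await responses.put({"state": "completed", "text": f"Booked {destination} on {date}"})


async def client() -> list[dict]:
    requests: asyncio.Queue = asyncio.Queue()
    responses: asyncio.Queue = asyncio.Queue()
    agent_task = asyncio.create_task(agent(requests, responses))
    seen = []
    await requests.put("Lisbon")
    while True:
        update = await responses.get()
        seen.append(update)
        if update["state"] == "input-required":
            # No task or context ID is attached: the connection implies the task.
            await requests.put("2025-03-01")
        elif update["state"] == "completed":
            break
    await agent_task
    return seen


updates = asyncio.run(client())
print([u["state"] for u in updates])
```

In a turn-based, non-streaming flow the agent would instead have to persist `destination`, exit at the input-required state, and reload it when the follow-up request arrives, possibly on a different server.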
Summary of Changes

Hello @mikeas1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the A2A service by adding support for bidirectional streaming over gRPC.
|
Code Review
This pull request introduces a new bidirectional streaming RPC, SendLiveMessage, to the gRPC A2A service. This is a significant feature that enables more interactive, stateful communication between clients and agents. The changes include updating the protobuf definition, the JSON schema, and the TypeScript types to reflect this new capability.
My review focuses on the correctness and clarity of the new RPC definition. Overall, the proposal is well-documented and thought out. I have a few suggestions to improve the API design and code style:
- In `specification/grpc/a2a.proto`, I've recommended using a more appropriate gRPC status code (`FAILED_PRECONDITION` instead of `UNIMPLEMENTED`) for a specific error case to align better with gRPC best practices.
- I've also suggested clarifying the behavior for omitted `SendMessageConfiguration` in subsequent requests to remove ambiguity.
- Finally, I've pointed out a minor style issue in the protobuf definition regarding field ordering to improve readability.
The other changes in the JSON schema and TypeScript types are consistent with the protobuf definition and look good. This is a great addition to the protocol.
    // If the agent can send push notifications to the clients webhook
    bool push_notifications = 2;
    // If the agent supports bidirectional streaming.
    bool bidi_streaming = 4;
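As a rough illustration of how a client might consume the `bidi_streaming` flag, the sketch below picks between the live RPC and ordinary follow-up requests. The `AgentCapabilities` dataclass and `choose_send_strategy` helper are hypothetical; only the two field names and the gRPC-only restriction come from this PR, and the returned labels are illustrative strings rather than spec-defined values.

```python
from dataclasses import dataclass


@dataclass
class AgentCapabilities:
    # Field names mirror the proto snippet quoted above; the class is hypothetical.
    push_notifications: bool = False
    bidi_streaming: bool = False


def choose_send_strategy(caps: AgentCapabilities, transport: str) -> str:
    """Return an illustrative label for how the client should send messages."""
    # The PR states bidirectional communication is only supported via gRPC,
    # so both the transport and the advertised capability must match.
    if transport == "grpc" and caps.bidi_streaming:
        return "SendLiveMessage"
    # Otherwise fall back to separate follow-up requests per turn.
    return "follow-up-requests"


print(choose_send_strategy(AgentCapabilities(bidi_streaming=True), "grpc"))
print(choose_send_strategy(AgentCapabilities(bidi_streaming=True), "jsonrpc"))
print(choose_send_strategy(AgentCapabilities(), "grpc"))
```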
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
So how does this interact with Resubscribe or other Send{xyz} methods? Also, what is the behavior if the client calls this multiple times? Is that disallowed? Allowed with responses multiplexed among both response streams? What happens if the client calls this without specifying task_id but the agent infers the request is relevant to an existing task which is already running against an instance of this? |
Resubscribe should continue to work as-is. It's just a read-only view on the stream of updates, so it would receive the same set of messages as the bidi connected receiver does. Send* methods don't have a well-defined specification for how they should react when multiple messages are received simultaneously. I'd suggest we designate a status code that an agent developer MAY use to reject requests if there's an active agent execution for a task and the agent doesn't support multiple input streams. This situation applies in both the bidi case and non-bidi case (i.e. there's an active bidi stream and another bidi stream or non-bidi request is received, or there's an active server-side stream and another request is received). My suggestion is
I call this out a little in the comments, but let me try to clarify. I'd propose the following rules:
Fall through to the above and treat the request as though it specified that task_id. It's probably good to explain this situation specifically, but I see this as a chain of two rules that cause this result naturally: the agent gets ultimate authority on what task a request is interpreted against + how do agents handle multiple bidi connections for a single task. |
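One way to picture the chain of two rules described here is the sketch below. It assumes an agent that does not support multiple input streams and chooses to reject a second connection for a task that already has one. `ActiveTaskRegistry` and `TaskBusyError` are hypothetical names, and the exception is only a placeholder for whichever status code the spec ends up designating; nothing here is taken from the proposed proto.

```python
import uuid
from typing import Optional


class TaskBusyError(Exception):
    """Placeholder for whichever status code the spec designates for rejection."""


class ActiveTaskRegistry:
    """Tracks which tasks currently have a live input stream (hypothetical helper)."""

    def __init__(self, allow_multiple_inputs: bool = False) -> None:
        self._active: set[str] = set()
        self._allow_multiple_inputs = allow_multiple_inputs

    def attach(self, requested_task_id: Optional[str] = None,
               inferred_task_id: Optional[str] = None) -> str:
        # Rule 1: the agent has ultimate authority over which task a request
        # is interpreted against, so an inferred ID wins over the client's.
        task_id = inferred_task_id or requested_task_id or str(uuid.uuid4())
        # Rule 2: the agent MAY reject a second stream for an active task.
        if task_id in self._active and not self._allow_multiple_inputs:
            raise TaskBusyError(task_id)
        self._active.add(task_id)
        return task_id

    def detach(self, task_id: str) -> None:
        self._active.discard(task_id)


registry = ActiveTaskRegistry()
first = registry.attach(requested_task_id="task-1")
try:
    # No task_id supplied, but the agent infers the request targets task-1.
    registry.attach(inferred_task_id="task-1")
    rejected = False
except TaskBusyError:
    rejected = True
registry.detach(first)
reattached = registry.attach(requested_task_id="task-1")
print(rejected, reattached)
```

An agent that does support multiple input streams would construct the registry with `allow_multiple_inputs=True` and accept both connections instead.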
|
Hi @mikeas1, thanks for the initiative. Our team is building agents on A2A, and we are looking forward to following this PR. Question: with bidirectional streaming (SendLiveMessage), the stream stays open after an interruption. This might create new possibilities but also ambiguities; it would be better if the spec explicitly included
|
|
This change is really useful for us. I hope it gets merged soon |
|
This PR is small but has been stale for a while. cc @holtskinner |
|
We strongly support merging this proposal. This change fills a real gap in the current gRPC binding: handling follow-up input while a task is processing is currently undefined. Bidirectional streaming provides a clean transport-level solution without changing A2A semantics. Importantly, this remains fully backward compatible:
It unlocks natural interactive A2A use cases (live input, multimodal, voice). This has been blocking our implementation since October 2025. Merging it upstream would allow us to proceed without maintaining a fork, and while keeping the ecosystem aligned. |
|
I like this proposal as a starting point. My primary concern, however, is whether this encourages multi-turn interaction to stick to a specific Task, which isn't the recommended behaviour in the protocol. Ideally, interrupted states should only be used for human-in-the-loop interactions and not multi-turn: BAD -> vs Recommended -> I wonder if it's possible to approach BiDi streaming from the second perspective, where a BiDi stream actually sticks to a "context" rather than a Task. Maybe this is a different set of operations. |
|
Just to clarify: Bidirectional streaming does not change the Task model; it enables live interaction with an in-flight Task (e.g., missing parameters, confirmation, interrupts, client-end tools) in a transport-efficient way. Multi-turn conversations that represent new units of work should still be modeled as new Tasks with context reuse. |
Reframe bidirectional streaming documentation to emphasize persistent connections and simplified state management rather than human-in-the-loop interactions, aligning with the original PR a2aproject#1120 framing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>