
feat!: Proposal for bidirectional streaming over gRPC. #1120

Open

mikeas1 wants to merge 2 commits into a2aproject:main from mikeas1:live-rpc

Conversation

@mikeas1
Contributor

@mikeas1 mikeas1 commented Oct 1, 2025

This adds a new SendLiveMessage RPC to the gRPC A2A service definition. The purpose of this RPC is to address cases where a client would like to maintain an active request stream with an agent, rather than requiring separate follow-up requests any time an agent needs further input. This bidirectional communication is ONLY supported via gRPC.

I have tried to lay out the expected semantics as a comment in the gRPC specification.

A2A is generally a turn-based protocol, where clients and servers take turns sending each other messages. This is not specifically encoded in our specification; however, our SDK implementations do not directly support the idea of receiving a stream of messages from a client. Instead, each new message is treated as a new invocation of the agent implementation. The assumption here is that a client will only send messages to tasks that are in an interrupted state (i.e. input-required), and that agent executions will exit when they reach an interrupted state. The current behavior of SDKs when a message is received for a task that is actively being processed could be categorized as "undefined".

A bidirectional endpoint opens the possibility of full-duplex communication between client and server, however it doesn't require it: bidirectionality is still useful even in a turn-based protocol. The benefit is that an ongoing connection can be maintained for sending responses to agents that enter interrupted states, which enables several convenient properties:

  • Agent implementations can more easily "await" responses inline, rather than needing to save all necessary state, exit, then reconstitute state when a response is received.
  • Clients can perform less state tracking, particularly around specifying task and context IDs. These can be implicit to the connection, as a connection is only valid for a single task.
  • Distributed agent implementations don't need to implement clever load-balancing/routing to achieve task processing locality. Since an active connection is maintained, all responses are received by the same server. This is another view of the first point in this list, but from the networking layer.
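The first benefit above can be pictured with a toy Python model (no gRPC involved; the agent logic and message strings are hypothetical): the agent is a generator that "awaits" each reply inline on the live stream, resuming in place rather than exiting and reconstituting state on a fresh request.

```python
# Toy model (not SDK code) of an agent awaiting input inline over a live
# stream: the generator yields an interruption and resumes where it left
# off when the client's reply arrives on the same connection.
def booking_agent():
    budget = yield "input-required: what is your budget?"
    date = yield "input-required: which date?"
    yield f"completed: booked for {date} under {budget}"

agent = booking_agent()
first = next(agent)              # agent interrupts, asking for input
reply1 = agent.send("$500")      # client answers on the same stream
reply2 = agent.send("2026-03-01")
```

With a turn-based transport, each `send` would instead be a new request, and the agent would have to persist and reload its intermediate state between turns.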

@mikeas1 mikeas1 requested a review from a team as a code owner October 1, 2025 21:27
@gemini-code-assist
Contributor

Summary of Changes

Hello @mikeas1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the A2A service by adding support for bidirectional streaming over gRPC. The core change is the new SendLiveMessage RPC, which moves beyond the current turn-based communication model to allow for persistent, full-duplex connections between clients and agents. This change is designed to streamline agent implementations, reduce the complexity of client-side state management, and improve the handling of distributed agent tasks by enabling agents to more naturally await and process responses within an ongoing session.

Highlights

  • New Bidirectional gRPC RPC: Introduced a new SendLiveMessage RPC to the A2A gRPC service, enabling clients to establish and maintain a continuous, bidirectional communication stream with an agent. This allows for more dynamic interactions where agents can await responses inline.
  • Enhanced Agent Capabilities: Added a bidi_streaming flag to the AgentCapabilities message in the gRPC specification, allowing agents to explicitly declare support for this new bidirectional streaming functionality.
  • Specification Updates: Updated the JSON schema (a2a.json) and TypeScript types (types.ts) to include the new bidiStreaming property within AgentCapabilities, ensuring consistency across different protocol definitions and client-side interfaces.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This pull request introduces a new bidirectional streaming RPC, SendLiveMessage, to the gRPC A2A service. This is a significant feature that enables more interactive, stateful communication between clients and agents. The changes include updating the protobuf definition, the JSON schema, and the TypeScript types to reflect this new capability.

My review focuses on the correctness and clarity of the new RPC definition. Overall, the proposal is well-documented and thought out. I have a few suggestions to improve the API design and code style:

  • In specification/grpc/a2a.proto, I've recommended using a more appropriate gRPC status code (FAILED_PRECONDITION instead of UNIMPLEMENTED) for a specific error case to align better with gRPC best practices.
  • I've also suggested clarifying the behavior for omitted SendMessageConfiguration in subsequent requests to remove ambiguity.
  • Finally, I've pointed out a minor style issue in the protobuf definition regarding field ordering to improve readability.

The other changes in the JSON schema and TypeScript types are consistent with the protobuf definition and look good. This is a great addition to the protocol.

// If the agent can send push notifications to the client's webhook
bool push_notifications = 2;
// If the agent supports bidirectional streaming.
bool bidi_streaming = 4;
Contributor

low

For better readability and to follow common Protobuf style, it's recommended to order fields by their number in the .proto file. bidi_streaming has field number 4 but is placed before extensions which has field number 3. Please consider moving this field definition after the extensions field.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@amye amye added the TSC Review To be reviewed by the Technical Steering Committee label Oct 6, 2025
@ToddSegal
Contributor

ToddSegal commented Oct 7, 2025

So how does this interact with Resubscribe or other Send{xyz} methods? Also, what is the behavior if the client calls this multiple times? Is that disallowed? Allowed with responses multiplexed among both response streams? What happens if the client calls this without specifying task_id but the agent infers the request is relevant to an existing task which is already running against an instance of this?

@mikeas1
Contributor Author

mikeas1 commented Oct 7, 2025

So how does this interact with Resubscribe or other Send{xyz} methods?

Resubscribe should continue to work as-is. It's just a read-only view on the stream of updates, so it would receive the same set of messages as the bidi connected receiver does.

Send* methods don't have a well-defined specification for how they should react when multiple messages are received simultaneously. I'd suggest we designate a status code that an agent developer MAY use to reject requests if there's an active agent execution for a task and the agent doesn't support multiple input streams. This situation applies in both the bidi case and non-bidi case (i.e. there's an active bidi stream and another bidi stream or non-bidi request is received, or there's an active server-side stream and another request is received). My suggestion is FAILED_PRECONDITION for gRPC. 409 Conflict would make sense for HTTP, but this proposal is gRPC only.

Also, what is the behavior if the client calls this multiple times? Is that disallowed? Allowed with responses multiplexed among both response streams?

I call this out a little in the comments, but let me try to clarify.

I'd propose the following rules:

  1. If the agent doesn't support multiple active client connections, error with FAILED_PRECONDITION.
  2. If the agent establishes (meaning: the first response from the agent on the stream is a Task) the same task for multiple client connections, the response stream MUST be treated as though both were connected via resubscribe, i.e. all streams get all responses. All input streams are essentially multiplexed together with the agent seeing one unified stream.
  3. As always, an agent may ignore the requested task and return a new task. In that case, a client attempting to connect to a task multiple times may end up with connections to different tasks.
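Rule 2 above can be sketched as a tiny fan-out hub (a toy model, not SDK code): all input streams are merged into one unified stream seen by the agent, and every connected response stream receives every response, exactly as if each extra connection were a resubscribe.

```python
# Toy model of rule 2: multiple client connections attached to the same
# task share one merged input stream, and every connection's response
# stream receives every response.
class LiveTaskHub:
    def __init__(self) -> None:
        self.inputs: list[str] = []        # unified input stream for the agent
        self.subscribers: list[list[str]] = []

    def connect(self) -> list[str]:
        """Attach a connection; returns its response buffer."""
        buf: list[str] = []
        self.subscribers.append(buf)
        return buf

    def send(self, message: str) -> None:
        """Client input from any connection lands on the one merged stream."""
        self.inputs.append(message)

    def respond(self, response: str) -> None:
        """Agent responses fan out to every connected stream."""
        for buf in self.subscribers:
            buf.append(response)
```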

What happens if the client calls this without specifying task_id but the agent infers the request is relevant to an existing task which is already running against an instance of this?

Fall through to the above and treat the request as though it specified that task_id. It's probably good to explain this situation specifically, but I see this as a chain of two rules that cause this result naturally: the agent gets ultimate authority on what task a request is interpreted against + how do agents handle multiple bidi connections for a single task.

@HungYangChang

Hi @mikeas1

Thanks for the initiative. Our team is building an agent on A2A, and we are looking forward to following up on this PR.

Question: with bidirectional streaming (SendLiveMessage), the stream stays open after an interruption. This creates new possibilities but also ambiguities; it would be better if the spec explicitly covered:

  • Interrupted (Waiting for client input, can be resumed) vs Failed (Error occurred, cannot be resumed (terminal state))
  • State Transition Rules: When a Task reaches 'interrupted' state on a bidirectional stream, what should Agent and Client do?
    • client might send:
    1. A new Message (with the same task_id) to provide additional info
    2. A CancelTask request equivalent (or just close the stream)
    • Upon receiving additional messages, the agent MUST:
    1. Transition task to 'working' state
    2. Continue processing
    3. Eventually reach terminal state (completed/failed/cancelled)
  • Timeout rule: What if the client never responds? Auto-cancel after timeout
  • Multiple interruptions (book flight: need budget -> need date confirmation -> need final approval)
  • Clear and proper error code definitions
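The state distinction and transition rules asked for above can be sketched as a small table-driven check (the state names mirror A2A task states, but this is an illustration of the proposed rules, not spec text):

```python
# Hypothetical sketch of the transition rules requested above: failed is
# terminal and cannot be resumed, while input-required (interrupted) can
# transition back to working when the client sends additional input.
TERMINAL = {"completed", "failed", "cancelled"}

ALLOWED = {
    ("input-required", "working"),    # client sent a follow-up message
    ("input-required", "cancelled"),  # client cancels or closes the stream
    ("working", "input-required"),    # agent interrupts again (multi-step)
    ("working", "completed"),
    ("working", "failed"),
    ("working", "cancelled"),
}

def transition(state: str, new_state: str) -> str:
    if state in TERMINAL:
        raise ValueError(f"{state} is terminal; task cannot be resumed")
    if (state, new_state) not in ALLOWED:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```

The multiple-interruption example (budget, then date, then approval) is just the `working` ↔ `input-required` cycle repeated until a terminal state is reached.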

@lumeche

lumeche commented Jan 8, 2026

This change is really useful for us. I hope it gets merged soon

@HungYangChang

Hi @mikeas1 @ToddSegal

This PR is small but has been stale for a while.
It would be great to review it in the upcoming TSC Review.
Our team is happy to resolve the comments and submit a new PR if needed

cc @holtskinner

@hkmo99

hkmo99 commented Feb 24, 2026

We strongly support merging this proposal.

This change fills a real gap in the current gRPC binding: handling follow-up input while a task is processing is currently undefined. Bidirectional streaming provides a clean transport-level solution without changing A2A semantics.

Importantly, this remains fully backward compatible:

  • Adds a new RPC only
  • Explicitly capability-gated
  • No changes to existing messages or flows

It unlocks natural interactive A2A use cases (live input, multimodal, voice).

This has been blocking our implementation since October 2025. Merging it upstream would allow us to proceed without maintaining a fork, while keeping the ecosystem aligned.

@Tehsmash
Contributor

I like this proposal as a starting point; my primary concern, however, is whether this encourages multi-turn interaction to stick to a specific Task, which isn't the recommended behaviour in the protocol. Ideally, interrupted states should only be used for human-in-the-loop interactions and not multi-turn:

BAD -> SendMessage -> Task -> Input-required -> SendMessage -> Input-required -> SendMessage -> Complete

vs

Recommended -> SendMessage -> Task -> Complete -> SendMessage (reuse context) -> Task -> Complete

I wonder if it's possible to approach BiDi streaming from the second perspective, with a BiDi stream that sticks to a "context" rather than a Task. Maybe this is a different set of operations.

@hkmo99

hkmo99 commented Feb 26, 2026

Just to clarify: input-required in A2A is not inherently human-in-the-loop. It simply means the Task requires additional input before progressing; that input can come from another agent, a tool, an orchestrator, or a human.

Bidirectional streaming does not change the Task model; it enables live interaction with an in-flight Task (e.g., missing parameters, confirmation, interrupts, client-end tools) in a transport-efficient way. Multi-turn conversations that represent new units of work should still be modeled as new Tasks with context reuse.

HungYangChang pushed a commit to HungYangChang/A2A that referenced this pull request Feb 27, 2026
Reframe bidirectional streaming documentation to emphasize persistent
connections and simplified state management rather than human-in-the-loop
interactions, aligning with the original PR a2aproject#1120 framing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Labels

TSC Review To be reviewed by the Technical Steering Committee


7 participants