Overview
LangChain’s streaming system lets you surface live feedback from agent runs to your application. What’s possible with LangChain streaming:

- Stream agent progress — get state updates after each agent step.
- Stream LLM tokens — stream language model tokens as they’re generated.
- Stream thinking / reasoning tokens — surface model reasoning as it’s generated.
- Stream custom updates — emit user-defined signals (e.g., "Fetched 10/100 records").
- Stream multiple modes — choose from `updates` (agent progress), `messages` (LLM tokens + metadata), or `custom` (arbitrary user data).
Supported stream modes
Pass one or more of the following stream modes as a list to the stream method:
| Mode | Description |
|---|---|
| `updates` | Streams state updates after each agent step. If multiple updates are made in the same step (e.g., multiple nodes are run), those updates are streamed separately. |
| `messages` | Streams tuples of `(token, metadata)` from any graph nodes where an LLM is invoked. |
| `custom` | Streams custom data from inside your graph nodes using the stream writer. |
Agent progress
To stream agent progress, use the stream method with streamMode: "updates". This emits an event after every agent step.
For example, if you have an agent that calls a tool once, you should see the following updates:

- LLM node: AIMessage with tool call requests
- Tool node: ToolMessage with execution result
- LLM node: Final AI response
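The sequence above can be sketched as follows. This is a minimal, self-contained mock of the chunk shape `streamMode: "updates"` produces (one object per step, keyed by node name), not a real agent run; the message objects are plain stand-ins for AIMessage/ToolMessage.

```typescript
// Mocked `streamMode: "updates"` chunks: each yielded object maps a node
// name to that node's partial state update.
type Update = Record<string, { messages: { type: string; content: string }[] }>;

function* mockUpdateStream(): Generator<Update> {
  // LLM node: AIMessage with a tool call request
  yield { agent: { messages: [{ type: "ai", content: "calling get_weather" }] } };
  // Tool node: ToolMessage with the execution result
  yield { tools: { messages: [{ type: "tool", content: "72°F and sunny" }] } };
  // LLM node: final AI response
  yield { agent: { messages: [{ type: "ai", content: "It is 72°F and sunny." }] } };
}

const steps: string[] = [];
for (const update of mockUpdateStream()) {
  for (const node of Object.keys(update)) {
    steps.push(node); // one update per completed node, streamed separately
  }
}
```

Iterating a real agent's stream with `streamMode: "updates"` follows the same consumption pattern, with full message objects in place of the stand-ins.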
LLM tokens
To stream tokens as they are produced by the LLM, use streamMode: "messages".
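A minimal sketch of consuming these chunks, using a mocked token stream in place of a real LLM call. The chunks are tuples of (token, metadata); the `langgraph_node` metadata key used for filtering here is an assumption borrowed from LangGraph's metadata conventions.

```typescript
// Mocked `streamMode: "messages"` chunks: [token, metadata] tuples.
type MessageChunk = [string, { langgraph_node: string }];

function* mockTokenStream(): Generator<MessageChunk> {
  for (const token of ["The", " weather", " is", " sunny", "."]) {
    yield [token, { langgraph_node: "agent" }];
  }
}

let answer = "";
for (const [token, metadata] of mockTokenStream()) {
  // Accumulate only tokens emitted by the node we care about.
  if (metadata.langgraph_node === "agent") answer += token;
}
```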
Custom updates
To stream updates from tools as they are executed, you can use the writer parameter from the configuration.
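A minimal sketch of the writer pattern, with the tool and stream plumbing mocked. In a real run the writer callback is supplied by the graph's execution context; here it simply collects the emitted chunks.

```typescript
// A writer accepts arbitrary chunks to surface as custom stream updates.
type Writer = (chunk: unknown) => void;

// Hypothetical tool body: emits progress updates mid-execution.
function fetchRecords(count: number, writer: Writer): string[] {
  const records: string[] = [];
  for (let i = 1; i <= count; i++) {
    records.push(`record-${i}`);
    if (i % 5 === 0) writer(`Fetched ${i}/${count} records`); // custom update
  }
  return records;
}

const customChunks: unknown[] = [];
fetchRecords(10, (chunk) => customChunks.push(chunk));
// customChunks now holds the progress messages emitted during execution.
```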
If you add the writer parameter to your tool, you won’t be able to invoke the tool outside of a LangGraph execution context without providing a writer function.

Stream multiple modes
You can specify multiple streaming modes by passing streamMode as an array: streamMode: ["updates", "messages", "custom"].
The streamed outputs will be tuples of [mode, chunk] where mode is the name of the stream mode and chunk is the data streamed by that mode.
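Consuming a multi-mode stream then looks like demultiplexing tagged tuples. The sketch below mocks the stream itself; only the [mode, chunk] tuple shape is taken from the text above.

```typescript
// Mocked multi-mode stream: each chunk is tagged with its stream mode.
type ModeChunk = ["updates" | "messages" | "custom", unknown];

function* mockMultiModeStream(): Generator<ModeChunk> {
  yield ["updates", { agent: { messages: [] } }];
  yield ["messages", ["Hello", { langgraph_node: "agent" }]];
  yield ["custom", "Fetched 10/100 records"];
}

const seenModes: string[] = [];
for (const [mode, chunk] of mockMultiModeStream()) {
  seenModes.push(mode); // route each chunk by its mode tag
  void chunk;           // e.g., render tokens, log updates, show progress
}
```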
Common patterns
Below are examples showing common use cases for streaming.

Streaming thinking / reasoning tokens
Some models perform internal reasoning before producing a final answer. You can stream these thinking / reasoning tokens as they’re generated by filtering standard content blocks for the type "reasoning".
Reasoning output must be enabled on the model. See the reasoning section and your provider’s integration page for configuration details. To quickly check a model’s reasoning support, see models.dev.
Use streamMode: "messages" and filter for reasoning content blocks. Use a model instance (e.g., ChatAnthropic) with extended thinking enabled when the model supports it.
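The filtering step can be sketched as follows. The chunks here are mocked; the blocks follow the standard content block format's `type` discriminator, but the other field names (`reasoning`, `text`) are illustrative assumptions.

```typescript
// Mocked content blocks as they might arrive across streamed chunks.
type ContentBlock =
  | { type: "reasoning"; reasoning: string }
  | { type: "text"; text: string };

const streamedBlocks: ContentBlock[][] = [
  [{ type: "reasoning", reasoning: "The user asked about the weather. " }],
  [{ type: "reasoning", reasoning: "I should answer briefly." }],
  [{ type: "text", text: "It is sunny today." }],
];

let thinking = "";
let finalText = "";
for (const blocks of streamedBlocks) {
  for (const block of blocks) {
    // Split reasoning output from the final answer by block type.
    if (block.type === "reasoning") thinking += block.reasoning;
    else finalText += block.text;
  }
}
```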
LangChain normalizes provider-specific reasoning output (thinking blocks, OpenAI reasoning summaries, etc.) into a standard "reasoning" content block type via the content_blocks property.
To stream reasoning tokens directly from a chat model (without an agent), see streaming with chat models.
Disable streaming
In some applications you might need to disable streaming of individual tokens for a given model. This is useful when:

- Working with multi-agent systems to control which agents stream their output
- Mixing models that support streaming with those that do not
- Deploying to LangSmith and wanting to prevent certain model outputs from being streamed to the client
To disable streaming, set streaming: false when initializing the model.
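A sketch of the relevant option shapes, assuming a typical chat model constructor; the model name and exact option placement are illustrative, so check your provider’s integration page.

```typescript
// Disabling token streaming at model initialization (assumed option shape).
// With `streaming: false`, the model returns its full response at once,
// even inside a streaming agent run.
const modelOptions = { model: "gpt-4o", streaming: false };

// Fallback for integrations without a `streaming` option: the base-class
// `disableStreaming` flag, available on all chat models.
const fallbackOptions = { model: "gpt-4o", disableStreaming: true };
```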
Not all chat model integrations support the streaming parameter. If your model doesn’t support it, use disableStreaming: true instead. This parameter is available on all chat models via the base class.

Related
- Frontend streaming — Build React UIs with useStream for real-time agent interactions
- Streaming with chat models — Stream tokens directly from a chat model without using an agent or graph
- Reasoning with chat models — Configure and access reasoning output from chat models
- Standard content blocks — Understand the normalized content block format used for reasoning, text, and other content types
- Streaming with human-in-the-loop — Stream agent progress while handling interrupts for human review
- LangGraph streaming — Advanced streaming options including values, debug modes, and subgraph streaming

