AI Agents with LangGraph in 2026 — A Practical Tutorial

By 2026 the agent landscape has stabilized around a few sane patterns. LangGraph has become the default way to build them in Python — not because it’s fashionable, but because it solves the hard problems: state, branching, retries, human-in-the-loop, and observability.

This post builds a useful agent end-to-end. Not a def hello() toy — an agent that can search, call tools, decide between paths, and persist its conversation across requests.

Why graphs (not chains)

The original LangChain abstraction — a chain — is a straight line: prompt → llm → output_parser. That breaks the moment your agent needs to:

Decide whether to call a tool or answer directly.
Loop until a condition is met.
Branch based on tool output.
Hand control to a human and resume later.

LangGraph models your agent as a state machine: nodes that read the state and return updates to it, edges that route between nodes, and conditional edges that branch on the current state. It’s just a graph, and that turns out to be the right abstraction.
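To make the node/edge vocabulary concrete, here is a toy state machine in plain Python. It is not LangGraph and borrows none of its API; the node names (draft, review) and the "END" marker are made up for illustration. It only shows how nodes return updates and how a conditional edge picks the next node.

```python
from typing import Callable

State = dict  # a plain dict stands in for the agent state


def draft(state: State) -> State:
    # Node: returns an update, here one more attempt
    return {"attempts": state["attempts"] + 1}


def review(state: State) -> State:
    # Node: approves once three attempts have been made
    return {"approved": state["attempts"] >= 3}


def after_review(state: State) -> str:
    # Conditional edge: choose the next node from the current state
    return "END" if state["approved"] else "draft"


NODES: dict[str, Callable[[State], State]] = {"draft": draft, "review": review}
EDGES: dict[str, Callable[[State], str]] = {
    "draft": lambda s: "review",  # fixed edge
    "review": after_review,       # conditional edge (this is the loop)
}


def run(state: State, entry: str = "draft", max_steps: int = 20) -> State:
    current = entry
    for _ in range(max_steps):
        if current == "END":
            return state
        state = {**state, **NODES[current](state)}  # merge the node's update
        current = EDGES[current](state)
    raise RuntimeError("step limit reached")


final = run({"attempts": 0, "approved": False})
print(final)
```

The loop (draft, review, back to draft) is exactly the shape a straight-line chain cannot express.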

The agent we’ll build

A research assistant that:

Takes a question.
Decides whether it needs to search the web.
If yes, calls a search tool, then summarizes results.
If no, answers directly.
Persists conversation history across calls.

Small enough to fit in a post, real enough to extract patterns from.

Setup

uv init agent && cd agent
uv add langgraph langchain-anthropic langchain-core httpx

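I’m using Claude Sonnet 4.6 here because it’s a strong model for agentic work, but nothing below is Anthropic-specific: install langchain-openai or langchain-google-genai, swap the import and the chat model class (ChatOpenAI or ChatGoogleGenerativeAI instead of ChatAnthropic), change the model name, and the rest stays the same.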

Step 1 — Define the state

# agent/state.py
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage


class AgentState(TypedDict):
    """Shared state passed between nodes.

    `add_messages` is a reducer — when a node returns {"messages": [m]},
    LangGraph appends rather than overwrites. It's the canonical way to
    accumulate chat history.
    """
    messages: Annotated[list[BaseMessage], add_messages]

Reducers are LangGraph’s killer feature. You declare how state merges, not when. No more “did I forget to extend the list?” bugs.
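The reducer idea can be sketched without LangGraph at all. The snippet below is a plain-Python illustration, not LangGraph's implementation: apply_update is a hypothetical helper that reads the reducer from the Annotated metadata and falls back to overwriting when there is none.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class State(TypedDict):
    messages: Annotated[list[str], operator.add]  # reducer: concatenate lists
    step: int                                     # no reducer: last write wins


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state, one key at a time."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        merged[key] = reducer(state[key], value) if reducer else value
    return merged


state = {"messages": ["hi"], "step": 0}
state = apply_update(state, {"messages": ["hello!"], "step": 1})
print(state)
```

The node only says what it produced; the type annotation decides whether that value is appended or replaces the old one.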

Step 2 — Define the tools

# agent/tools.py
import os

import httpx
from langchain_core.tools import tool


@tool
def web_search(query: str) -> str:
    """Search the web for recent information. Use this for facts that may
    have changed or that you don't already know."""
    # Use any search API: Tavily, Brave, SerpAPI, your own scraper.
    # Tavily's search endpoint takes a POST with a JSON body.
    resp = httpx.post(
        "https://api.tavily.com/search",
        json={"query": query, "max_results": 5},
        headers={"Authorization": f"Bearer {os.environ['TAVILY_KEY']}"},
        timeout=15.0,
    )
    resp.raise_for_status()
    results = resp.json()["results"]
    return "\n\n".join(
        f"{r['title']}\n{r['url']}\n{r['content']}" for r in results
    )


TOOLS = [web_search]

Three things to notice:

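@tool wraps the function in a tool object with a name, a description, and an argument schema derived from the signature. Nothing reaches the LLM until you bind the tool to a model.
The docstring becomes the tool description the LLM sees. Write it carefully.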
The function is plain Python — no magic. Test it like any other function.
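One way to act on the last point is to pull the pure part of the tool into its own function and test that without any network access. The helper name format_results is an assumption for illustration; it mirrors the join at the end of web_search.

```python
def format_results(results: list[dict]) -> str:
    """Hypothetical helper: the same formatting web_search does, minus the HTTP call."""
    return "\n\n".join(
        f"{r['title']}\n{r['url']}\n{r['content']}" for r in results
    )


def test_format_results() -> None:
    fake = [
        {"title": "A", "url": "https://example.com/a", "content": "first"},
        {"title": "B", "url": "https://example.com/b", "content": "second"},
    ]
    out = format_results(fake)
    # One blank line between results, three lines per result
    assert out == "A\nhttps://example.com/a\nfirst\n\nB\nhttps://example.com/b\nsecond"
    # No results means an empty string, not an error
    assert format_results([]) == ""


test_format_results()
```

The HTTP call itself can then be covered separately with whatever mocking approach your test suite already uses.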

Step 3 — Define the nodes

# agent/nodes.py
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import ToolNode

from .tools import TOOLS
from .state import AgentState

llm = ChatAnthropic(model="claude-sonnet-4-6", temperature=0).bind_tools(TOOLS)


def call_model(state: AgentState) -> dict:
    """The brain. Decides what to do next based on state."""
    response = llm.invoke(state["messages"])
    return {"messages": [response]}


# Pre-built node that executes any tool calls in the latest AIMessage
tool_node = ToolNode(TOOLS)

bind_tools is doing a lot: it tells the LLM what tools exist, formats the spec for the provider’s tool-calling API, and parses the response back into ToolCall objects. You don’t have to hand-roll JSON parsing.

Step 4 — Wire the graph

# agent/graph.py
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import tools_condition

from .state import AgentState
from .nodes import call_model, tool_node


def build_graph():
    workflow = StateGraph(AgentState)

    workflow.add_node("agent", call_model)
    workflow.add_node("tools", tool_node)

    workflow.set_entry_point("agent")

    # Conditional edge: if the agent's last message has tool calls, go to tools.
    # Otherwise, end.
    workflow.add_conditional_edges(
        "agent",
        tools_condition,            # pre-built helper; returns "tools" or END
        {"tools": "tools", END: END},
    )

    # After tools run, always go back to the agent so it can respond.
    workflow.add_edge("tools", "agent")

    return workflow.compile()

Read the graph: agent → (maybe tools → agent) → end. Classic ReAct loop, modeled explicitly.
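The routing decision in that loop is small enough to sketch by hand. The function below is a hypothetical stand-in, not the prebuilt helper: it returns the plain string "end" where the real graph uses the END constant, and it uses SimpleNamespace objects in place of real messages.

```python
from types import SimpleNamespace


def route_after_agent(state: dict) -> str:
    """Return 'tools' if the last message asks for tool calls, else 'end'."""
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else "end"


# Fake messages: one requesting a tool call, one answering directly
wants_tool = SimpleNamespace(
    content="",
    tool_calls=[{"name": "web_search", "args": {"query": "LangGraph release notes"}}],
)
direct_answer = SimpleNamespace(content="42", tool_calls=[])

print(route_after_agent({"messages": [wants_tool]}))
print(route_after_agent({"messages": [wants_tool, direct_answer]}))
```

Writing your own router like this becomes useful once you need more than two outcomes, for example a separate path for risky tools.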

Step 5 — Run it

# agent/main.py
from langchain_core.messages import HumanMessage
from .graph import build_graph

graph = build_graph()

state = graph.invoke(
    {"messages": [HumanMessage(content="What did Anthropic announce this week?")]}
)
print(state["messages"][-1].content)

That’s a working agent. ~50 lines of real code. The LLM will look at the question, decide it needs fresh information, call web_search, get results, and synthesize an answer.

Step 6 — Persistence (the production unlock)

The above is stateless: every invoke starts fresh. To make it conversational across HTTP requests, add a checkpointer:

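from langgraph.checkpoint.postgres import PostgresSaver  # uv add langgraph-checkpoint-postgres

# from_conn_string is a context manager that yields the saver.
with PostgresSaver.from_conn_string(DATABASE_URL) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables; needed on first run

    # build_graph() already returns a compiled graph, which has no .compile(),
    # so recompile its underlying builder with the checkpointer attached.
    # (Giving build_graph a checkpointer parameter is the tidier long-term fix.)
    graph = build_graph().builder.compile(checkpointer=checkpointer)

    # Now invoke with a thread_id: state persists per thread.
    config = {"configurable": {"thread_id": "user-42"}}
    graph.invoke({"messages": [HumanMessage("Hi")]}, config=config)
    state = graph.invoke({"messages": [HumanMessage("What did I just say?")]}, config=config)
    print(state["messages"][-1].content)  # something like "You said 'Hi'."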

The checkpointer saves a snapshot of the graph’s state after every step, so every conversation becomes a resumable, debuggable history. You can also do time travel: fork from any prior checkpoint to try a different path.
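As a mental model for threads, snapshots, and forking, here is a toy in-memory store. It is a sketch only: the class and method names are invented for illustration and do not match LangGraph's checkpointer interface.

```python
import copy
from collections import defaultdict


class InMemoryCheckpoints:
    """Toy checkpoint store: one list of state snapshots per thread_id."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict]] = defaultdict(list)

    def save(self, thread_id: str, state: dict) -> int:
        # Deep-copy so later mutations cannot rewrite history
        self._threads[thread_id].append(copy.deepcopy(state))
        return len(self._threads[thread_id]) - 1  # index of this checkpoint

    def latest(self, thread_id: str) -> dict:
        return copy.deepcopy(self._threads[thread_id][-1])

    def fork(self, thread_id: str, index: int, new_thread_id: str) -> dict:
        """Start a new thread from an earlier snapshot ("time travel")."""
        snapshot = copy.deepcopy(self._threads[thread_id][index])
        self._threads[new_thread_id] = [snapshot]
        return copy.deepcopy(snapshot)


store = InMemoryCheckpoints()
store.save("user-42", {"messages": ["Hi"]})
store.save("user-42", {"messages": ["Hi", "Hello! How can I help?"]})

# Branch off from the first checkpoint without touching the original thread
forked = store.fork("user-42", 0, "user-42-retry")
print(forked)
```

The original thread keeps both snapshots, while the fork starts from the first one and can take a different path from there.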

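Postgres is the recommended production backend. SQLite for dev, Postgres for prod, no surprises. Each backend ships as its own package (langgraph-checkpoint-sqlite, langgraph-checkpoint-postgres), so add the one you need.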

Step 7 — Streaming

Don’t make users wait for the full response. Stream:

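import asyncio

async def stream_answer() -> None:
    # Async runs generally pair with an async checkpointer, such as
    # AsyncPostgresSaver from langgraph.checkpoint.postgres.aio.
    async for chunk in graph.astream(
        {"messages": [HumanMessage("...")]},
        config=config,
        stream_mode="messages",
    ):
        msg, meta = chunk  # (message chunk, metadata) tuple
        print(msg.content, end="", flush=True)

asyncio.run(stream_answer())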

stream_mode="messages" yields token-by-token. stream_mode="updates" yields per-node deltas (useful for progress UIs). stream_mode="values" yields the full state after each step (useful for debugging).
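The difference between per-node deltas and full-state snapshots is easy to see in a toy generator. This is a plain-Python sketch with made-up node names, not LangGraph's streaming code; it only mimics the shape of what the two modes hand you.

```python
from typing import Callable, Iterator

Node = Callable[[dict], dict]


def run_steps(state: dict, steps: list[tuple[str, Node]], mode: str) -> Iterator[dict]:
    """Run nodes in order, yielding either each node's delta or the full state."""
    if mode not in ("updates", "values"):
        raise ValueError(f"unknown mode: {mode}")
    for name, node in steps:
        update = node(state)
        state = {**state, **update}
        if mode == "updates":
            yield {name: update}  # only what this node changed
        else:
            yield dict(state)     # everything so far


steps = [
    ("plan", lambda s: {"plan": "search"}),
    ("answer", lambda s: {"answer": "done"}),
]

updates = list(run_steps({"question": "q"}, steps, "updates"))
values = list(run_steps({"question": "q"}, steps, "values"))
print(updates)
print(values)
```

Deltas are what a progress UI wants ("the plan node just finished"), while full snapshots are easier to read when you are debugging.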

Step 8 — Human-in-the-loop

Some actions are too risky to let the agent do unsupervised. LangGraph’s interrupt_before pauses execution before a node:

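# build_graph() returns a compiled graph, which has no .compile(),
# so recompile its underlying builder with the extra options.
graph = build_graph().builder.compile(
    checkpointer=checkpointer,
    interrupt_before=["tools"],     # pause before any tool call
)

# First call: gets to the tools node and pauses.
# `initial` is the usual {"messages": [...]} input.
state = graph.invoke(initial, config=config)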

# Inspect what's about to happen
last = state["messages"][-1]
print(last.tool_calls)              # show the user

# After human approves:
graph.invoke(None, config=config)   # resume from checkpoint

Pattern: render the pending tool call in your UI, wait for approval, resume. This is how you ship agents that touch databases, send emails, or move money.
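The approval step itself can be a small pure function. The sketch below assumes tool calls are dicts with "name" and "args" keys, and the risky tool names (send_email, run_sql) are hypothetical; it splits pending calls into those that can run unattended and those that need a human, and renders a one-line description for the UI.

```python
RISKY_TOOLS = {"send_email", "run_sql"}  # hypothetical tool names


def split_by_risk(tool_calls: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (auto_approved, needs_human) for a list of pending tool calls."""
    auto = [c for c in tool_calls if c["name"] not in RISKY_TOOLS]
    manual = [c for c in tool_calls if c["name"] in RISKY_TOOLS]
    return auto, manual


def describe(call: dict) -> str:
    """Render a tool call as text a reviewer can read at a glance."""
    args = ", ".join(f"{k}={v!r}" for k, v in call["args"].items())
    return f"{call['name']}({args})"


pending = [
    {"name": "web_search", "args": {"query": "weather"}},
    {"name": "send_email", "args": {"to": "a@example.com"}},
]
auto, manual = split_by_risk(pending)
for call in manual:
    print("Needs approval:", describe(call))
```

Gating only the risky tools keeps the approval queue short, which matters once reviewers start ignoring prompts they see too often.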

Patterns I’d reach for next

Multi-agent

Build smaller specialist graphs (researcher, writer, critic) and orchestrate them with a supervisor node that routes work. This is the LangGraph equivalent of microservices for agents.
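The supervisor pattern reduces to a routing function over shared state. Here is a plain-Python sketch with stub workers; the worker bodies and state keys are invented for illustration, and in a real system each worker would be its own graph.

```python
from typing import Callable

State = dict


def researcher(state: State) -> State:
    return {"notes": f"notes on {state['topic']}"}


def writer(state: State) -> State:
    return {"draft": f"draft from {state['notes']}"}


def critic(state: State) -> State:
    return {"approved": True}  # a real critic could send the draft back


WORKERS: dict[str, Callable[[State], State]] = {
    "researcher": researcher,
    "writer": writer,
    "critic": critic,
}


def supervisor(state: State) -> str:
    """Pick whichever specialist is needed next, or 'done'."""
    if "notes" not in state:
        return "researcher"
    if "draft" not in state:
        return "writer"
    if not state.get("approved"):
        return "critic"
    return "done"


def run(state: State, max_steps: int = 10) -> tuple[State, list[str]]:
    trace: list[str] = []
    for _ in range(max_steps):
        nxt = supervisor(state)
        if nxt == "done":
            return state, trace
        trace.append(nxt)
        state = {**state, **WORKERS[nxt](state)}
    raise RuntimeError("supervisor did not finish within max_steps")


final, trace = run({"topic": "checkpointers"})
print(trace)
```

The step cap matters: once a critic can reject drafts, a supervisor without one can bounce between writer and critic indefinitely.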

Structured output

When you want JSON, use with_structured_output(MySchema) instead of asking the model to “please return JSON”:

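from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage
from pydantic import BaseModel


class SearchPlan(BaseModel):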
    queries: list[str]
    rationale: str

planner = ChatAnthropic(model="claude-sonnet-4-6").with_structured_output(SearchPlan)
plan = planner.invoke([HumanMessage("Plan a search for...")])
plan.queries  # ['...', '...']

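LangChain turns the schema into a tool or structured-output spec for the provider, parses the reply, and validates it against the Pydantic schema. It does not retry on validation failures by default, so chain .with_retry() or catch the error if you need that. Either way, stop hand-rolling JSON parsing.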

Observability

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "..."

LangSmith traces every node, every tool call, every LLM call. When your agent loops 14 times instead of 2, the trace tells you why.

When not to use an agent

A simple chain (or a plain function call) wins over an agent when:

The path is fixed: “embed → retrieve → answer.” That’s RAG, not agency.
Latency matters and the model would only call one tool anyway.
The “agent” is being used because it sounds smart in a deck, not because the workflow has decisions.

The agent tax is real: more tokens, more latency, more failure modes. Only pay it when the problem actually requires decisions.
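The token part of that tax is simple arithmetic: in a tool loop each model call resends the whole history, so input tokens grow with every step. The numbers below are illustrative assumptions, not measurements.

```python
def total_input_tokens(prompt_tokens: int, tokens_added_per_step: int, model_calls: int) -> int:
    """Input tokens across a loop where every call resends the full history."""
    return sum(prompt_tokens + i * tokens_added_per_step for i in range(model_calls))


# Assumed sizes: a 500-token prompt, 1500 tokens of tool output added per step
single_call = total_input_tokens(500, 1500, 1)
four_calls = total_input_tokens(500, 1500, 4)  # 500 + 2000 + 3500 + 5000

print(single_call, four_calls)
```

Under these assumptions, four model calls cost 22 times the input tokens of one, which is why a fixed pipeline wins whenever the path is known in advance.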

Wrapping up

LangGraph isn’t the only way to build agents in 2026 — pydantic-ai, Agno, OpenAI’s Agents SDK, and Anthropic’s Agent SDK are all valid. But LangGraph is the one I reach for when the problem has more than two states and a non-trivial control flow.

Once you’ve got nodes, edges, conditional routing, a checkpointer, and human-in-the-loop, you can model almost any agent workflow honestly — including ones with more than one agent.

If you want to see a fuller multi-agent system (researcher + writer + critic) wired up, see rajpoot.dev — there’s a worked-out repo there.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
