Agent frameworks proliferated in 2024-2025; by 2026 the landscape had clarified. Some are genuinely useful; others are mostly abstraction tax. This post is an honest comparison.

The frameworks

| Framework | Strengths | Weaknesses |
| --- | --- | --- |
| LangGraph | Explicit state machine; observable; mature | Verbose; LangChain ecosystem |
| CrewAI | Role-based multi-agent; declarative | Opinionated; harder to escape |
| OpenAI Agents SDK | Simple; OpenAI-blessed | OpenAI-centric |
| AutoGen | Multi-agent conversations; research-y | Microsoft; less production-tested |
| Pydantic AI | Type-safe; Python-first; clean | Newer; smaller ecosystem |
| Bare-metal Python | Total control; debuggable | You build it |

Bare-metal first

Most “agent” needs are a tool-calling loop:

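import json

from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment


class MaxItersReached(Exception):
    """Raised when the loop hits max_iters without a final answer."""


async def run_agent(messages, tools, max_iters=10):
    for _ in range(max_iters):
        resp = await client.messages.create(
            model="claude-sonnet-4-6",
            messages=messages,
            tools=tools,
            max_tokens=4096,
        )
        # Any stop reason other than a tool request (end_turn, max_tokens, ...)
        # ends the loop; continuing would send an empty tool_result turn.
        if resp.stop_reason != "tool_use":
            return resp

        # Echo the assistant turn back, then answer every tool_use block.
        messages.append({"role": "assistant", "content": resp.content})
        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                # dispatch() is your own router from tool name to function.
                result = await dispatch(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })
        messages.append({"role": "user", "content": tool_results})

    raise MaxItersReached()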

That’s it: about 25 lines for the loop itself. For most agent use cases this is the right starting point. Add observability, error handling, and persistence as you need them.

See LLM Agent Error Recovery and Agent Tool Design.
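The loop calls a dispatch helper that it never defines. Here is a minimal sketch of one, assuming tools are plain async functions kept in a dict. The TOOLS registry, the tool decorator, and the get_weather example are hypothetical names for illustration, not part of any SDK:

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical registry: tool name -> async function. Not part of any SDK.
TOOLS: dict[str, Callable[..., Awaitable[Any]]] = {}


def tool(fn):
    """Register an async function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
async def get_weather(city: str) -> dict:
    # Stand-in for a real API call.
    return {"city": city, "forecast": "sunny"}


async def dispatch(name: str, args: dict) -> dict:
    """Run a tool and return a JSON-serializable result.

    Failures are returned as data rather than raised, so the model can
    read the error and try something else on the next iteration.
    """
    fn = TOOLS.get(name)
    if fn is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"ok": await fn(**args)}
    except Exception as exc:  # report any tool failure back to the model
        return {"error": f"{type(exc).__name__}: {exc}"}


print(asyncio.run(dispatch("get_weather", {"city": "Oslo"})))
```

Returning errors as data is a design choice: a bad argument from the model costs one extra iteration instead of crashing the run.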

When a framework helps

  • Complex state machines: many states, conditional transitions, fanout.
  • Multi-agent: agents collaborating with distinct roles.
  • Persistence: state that survives crashes and restarts (workflow engines beat agent frameworks here; see Temporal).
  • Standardized tracing / debugging.

LangGraph

from typing import TypedDict

from langgraph.graph import StateGraph, END

# classify_query, search_kb, and generate are your own helpers.


class AgentState(TypedDict, total=False):
    question: str
    category: str
    results: list
    response: str


def classify(state):
    state["category"] = classify_query(state["question"])
    return state

def search(state):
    state["results"] = search_kb(state["question"])
    return state

def respond(state):
    # "results" is absent when the search node was skipped.
    state["response"] = generate(state["question"], state.get("results", []))
    return state

graph = StateGraph(AgentState)
graph.add_node("classify", classify)
graph.add_node("search", search)
graph.add_node("respond", respond)

graph.set_entry_point("classify")
# The router returns the name of the next node.
graph.add_conditional_edges("classify", lambda s:
    "search" if s["category"] == "factual" else "respond"
)
graph.add_edge("search", "respond")
graph.add_edge("respond", END)

app = graph.compile()
# Run inside an async function.
result = await app.ainvoke({"question": "..."})

Explicit nodes; explicit transitions. LangSmith integration for tracing. Good when the flow is non-linear.

CrewAI

from crewai import Agent, Task, Crew

researcher = Agent(role="Researcher", goal="Find facts", backstory="...")
writer = Agent(role="Writer", goal="Write draft", backstory="...")
editor = Agent(role="Editor", goal="Polish", backstory="...")

# Recent CrewAI releases require expected_output on every Task.
task1 = Task(description="Research X", expected_output="Sourced facts about X", agent=researcher)
task2 = Task(description="Write draft on X", expected_output="A first draft", agent=writer)
task3 = Task(description="Edit draft", expected_output="A polished final draft", agent=editor)

# Tasks run in list order by default.
crew = Crew(agents=[researcher, writer, editor], tasks=[task1, task2, task3])
result = crew.kickoff()

Role-based and declarative, but debugging is harder because the orchestration is opaque. Best for well-defined multi-step workflows with clear roles.

OpenAI Agents SDK

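from agents import Agent, Runner

# lookup_invoice and refund are your own functions, each wrapped with
# the SDK's @function_tool decorator.
agent = Agent(
    name="Support",
    instructions="Help with billing questions",
    tools=[lookup_invoice, refund],
)

# Run inside an async function (or use Runner.run_sync from sync code).
result = await Runner.run(agent, "Why was I charged twice?")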

Simple. Tied to OpenAI but works with other providers via adapters. Best for OpenAI-first shops.

Pydantic AI

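from pydantic_ai import Agent, RunContext

# Database and Order are your own types (Order being a Pydantic model).
agent = Agent(
    "anthropic:claude-sonnet-4-6",  # provider-prefixed model name
    deps_type=Database,
    output_type=Order,  # called result_type in older releases
)

@agent.tool
async def find_order(ctx: RunContext[Database], order_id: int) -> Order:
    """Look up an order by its ID."""
    return await ctx.deps.get_order(order_id)

# Run inside an async function.
result = await agent.run("Find order 123", deps=db)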

Type-safe. Clean. Good fit for Python teams that want structure without LangChain’s surface area.

Choosing

| Need | Pick |
| --- | --- |
| Simple single-agent loop | Bare metal |
| Type-safe + clean | Pydantic AI |
| Complex state machine | LangGraph |
| Multi-agent roles | CrewAI |
| OpenAI-only | OpenAI Agents SDK |
| Durable, long-running | Temporal + agents |

For 70% of use cases I’ve seen: bare metal or Pydantic AI is the right answer.

What frameworks add

  • Tracing integration (LangSmith, etc.).
  • Tool registries.
  • Multi-agent orchestration.
  • State persistence.

What they cost:

  • Learning curve.
  • Lock-in.
  • Abstraction tax (debugging through layers).
  • Slower iteration (their patterns vs your code).

Anti-patterns

1. Framework for a 50-line script

Loading LangChain to call an LLM with one tool. The framework dwarfs the work.

2. Multi-agent without need

“Three agents collaborate” sounds cool. Often: one agent with three tools is clearer and cheaper.
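As a rough sketch of the single-agent alternative, the three capabilities can be written as three tool definitions handed to one loop. The dict shape follows the tools parameter of the Anthropic Messages API; the make_tool helper and the tool names are invented for illustration:

```python
def make_tool(name: str, description: str, properties: dict, required: list[str]) -> dict:
    """Build one tool definition as a JSON-schema-described function."""
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Hypothetical tools: one agent decides when to call each of them.
tools = [
    make_tool(
        "search_sources",
        "Search for sources on a topic and return short summaries.",
        {"query": {"type": "string"}},
        ["query"],
    ),
    make_tool(
        "save_draft",
        "Store the current draft text under a title.",
        {"title": {"type": "string"}, "body": {"type": "string"}},
        ["title", "body"],
    ),
    make_tool(
        "check_style",
        "Run the house style checker on a piece of text.",
        {"text": {"type": "string"}},
        ["text"],
    ),
]

print([t["name"] for t in tools])
```

One model call sees all three tools and one message history, so there is no hand-off protocol between agents to debug.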

3. Locked in to LangChain everywhere

Every component is a LangChain class. Migrating costs months. Use frameworks at the boundaries, not everywhere.
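One way to keep a framework at the boundary is to have the rest of the application depend on a small interface of your own. This is a sketch under that assumption; Answerer, EchoAnswerer, and handle_request are hypothetical names:

```python
from typing import Protocol


class Answerer(Protocol):
    """The only agent interface the rest of the app may depend on."""

    def answer(self, question: str) -> str: ...


class EchoAnswerer:
    """Stand-in implementation. A framework-backed class would live in
    its own module and be the only place that imports the framework."""

    def answer(self, question: str) -> str:
        return f"You asked: {question}"


def handle_request(agent: Answerer, question: str) -> dict:
    # Application code sees plain strings and dicts, never framework types.
    return {"question": question, "answer": agent.answer(question)}


print(handle_request(EchoAnswerer(), "Why was I charged twice?"))
```

Swapping frameworks then means rewriting one adapter class rather than every call site.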

4. No tracing

Whichever framework you pick, instrument it. See LLM Observability.
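Instrumentation does not have to wait for a vendor SDK. A minimal sketch using only the standard library: a decorator that records one span per call. SPANS and traced are hypothetical names, and the in-memory list is a stand-in for a real exporter such as Langfuse or OpenTelemetry:

```python
import asyncio
import functools
import time

SPANS: list[dict] = []  # hypothetical in-memory sink; swap for a real exporter


def traced(name: str):
    """Record the duration and outcome of an async call as one span."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return await fn(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on success and failure alike.
                SPANS.append({
                    "name": name,
                    "status": status,
                    "duration_ms": (time.perf_counter() - start) * 1000,
                })
        return wrapper
    return decorator


@traced("tool.lookup")
async def lookup(order_id: int) -> dict:
    # Stand-in for a real tool call.
    return {"order_id": order_id, "status": "shipped"}


asyncio.run(lookup(123))
print(SPANS[0]["name"], SPANS[0]["status"])
```

Wrapping the model call and every tool this way is usually enough to see where time and failures concentrate.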

5. Ignoring durability

A long-running agent crashes and loses its state. For anything important, persist state or use a workflow engine.
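The simplest form of persistence is a checkpoint of the message history after every iteration. A sketch, assuming messages are plain JSON-serializable dicts (SDK response objects would need converting first); save_checkpoint and load_checkpoint are hypothetical helpers:

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path: Path, messages: list[dict]) -> None:
    """Write to a temp file, then rename, so a crash mid-write
    never leaves a half-written checkpoint behind."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(messages, f)
    os.replace(tmp, path)


def load_checkpoint(path: Path) -> list[dict]:
    """Return the saved history, or an empty one on first run."""
    if not path.exists():
        return []
    return json.loads(path.read_text())


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "run.json"
    history = load_checkpoint(path)
    history.append({"role": "user", "content": "Why was I charged twice?"})
    save_checkpoint(path, history)
    print(load_checkpoint(path))
```

This covers a single process on one machine. Retries, timers, and multi-step recovery are where a workflow engine earns its keep.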

What I’d ship today

For new agent projects:

  1. Start bare-metal: a tool-calling loop in 200 lines.
  2. Add tracing (Langfuse / OTEL).
  3. Add structured-output validation.
  4. If state grows complex: LangGraph or Pydantic AI.
  5. If multi-agent emerges naturally: CrewAI or hand-rolled coordinator.
  6. For long-running / durable: Temporal + your agent code.

Avoid: starting with a framework “because everyone does.”
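Step 3 in the list above, structured-output validation, can start as a parse function that rejects anything malformed before it reaches business logic. This is a standard-library sketch with a made-up Refund shape; a library such as Pydantic does the same with less code:

```python
import json
from dataclasses import dataclass


@dataclass
class Refund:
    # Hypothetical output shape, for illustration only.
    invoice_id: str
    amount_cents: int


def parse_refund(raw: str) -> Refund:
    """Parse model output into a Refund; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    invoice_id = data.get("invoice_id")
    amount = data.get("amount_cents")
    if not isinstance(invoice_id, str) or not invoice_id:
        raise ValueError("invoice_id must be a non-empty string")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return Refund(invoice_id=invoice_id, amount_cents=amount)


print(parse_refund('{"invoice_id": "inv_1", "amount_cents": 500}'))
```

On a ValueError, a common pattern is to send the error text back to the model and ask for a corrected answer, with a cap on retries.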

Read this next

If you want my bare-metal agent loop + tracing starter, it’s at rajpoot.dev.

