How many tools should I expose to the LLM?

5-15 is the sweet spot. Beyond 20, models start picking poorly and tool definitions blow your token budget. For larger surfaces, consider tool routing or namespaced tools.

Should I let the model call tools in parallel?

Yes for independent reads (fetch user + fetch settings + fetch posts). No for dependent operations or writes — sequential makes the dependency obvious and easier to debug.

LLM Tool Use Patterns in 2026 — Schemas, Validation, and the Loop

Tool use is how LLMs reach beyond text. Done right, agents become useful — they fetch real data, call your APIs, write files. Done wrong, you get hallucinated tool calls, validation hell, and infinite loops. This post is the working set.

Tool definition basics

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"}
        },
        "required": ["city"]
    }
}]

The description is the prompt. Be precise about behavior, edge cases, what NOT to use it for.

Schema design

{
    "name": "search_products",
    "description": "Search the product catalog. Use this when the user mentions a product type or name. NOT for category browsing — use list_categories for that.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query, max 100 chars"},
            "limit": {"type": "integer", "minimum": 1, "maximum": 50, "default": 10},
            "category": {"type": "string", "enum": ["electronics", "books", "clothing"]},
        },
        "required": ["query"]
    }
}

Constraints in the schema (enum, minimum, etc.) reduce LLM mistakes. The description tells the model when to choose this tool over others.

The loop

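import json

from anthropic import AsyncAnthropic  # assumes the official anthropic SDK

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment


class MaxItersReached(Exception):
    """The loop hit max_iters before the model finished."""


async def run(messages, tools, max_iters=15):
    for _ in range(max_iters):
        resp = await client.messages.create(
            model="claude-sonnet-4-6",
            messages=messages,
            tools=tools,
            max_tokens=4096,
        )
        messages.append({"role": "assistant", "content": resp.content})

        # Any stop reason other than "tool_use" (end_turn, max_tokens, ...) means
        # there are no tool calls to answer, so hand the response to the caller.
        if resp.stop_reason != "tool_use":
            return resp

        # Process tool calls; dispatch() is defined in the Validation section
        results = []
        for block in resp.content:
            if block.type == "tool_use":
                try:
                    result = await dispatch(block.name, block.input)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result),
                    })
                except Exception as e:
                    # Report the failure to the model instead of crashing the loop
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"Error: {e}",
                        "is_error": True,
                    })

        messages.append({"role": "user", "content": results})

    raise MaxItersReached()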

See LLM Agent Frameworks.

Parallel tool calls

Modern models can request several tool calls in a single turn:

# resp.content has multiple tool_use blocks
[
    {"type": "tool_use", "id": "1", "name": "get_user", "input": {"id": 42}},
    {"type": "tool_use", "id": "2", "name": "get_orders", "input": {"user_id": 42}},
]

Run them concurrently:

results = await asyncio.gather(*[
    dispatch(block.name, block.input)
    for block in resp.content if block.type == "tool_use"
])

For independent reads: massive latency win. For dependent ops, the model usually serializes naturally.
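
The gather call above returns bare results, so each one still has to be paired with its tool_use id, and one raised exception would discard the others. A minimal sketch of one way to handle both, assuming `dispatch` is your own coroutine (the version here is a hypothetical stand-in) and using `SimpleNamespace` objects in place of SDK content blocks:

```python
import asyncio
import json
from types import SimpleNamespace


async def dispatch(name, args):
    # Hypothetical stand-in for the real dispatcher; one tool fails on purpose.
    if name == "get_orders":
        raise RuntimeError("orders service down")
    return {"tool": name, "args": args}


async def run_tools_concurrently(blocks):
    """Run every tool_use block concurrently and pair results with their ids."""
    calls = [b for b in blocks if b.type == "tool_use"]
    # return_exceptions=True keeps one failure from discarding the other results.
    outcomes = await asyncio.gather(
        *(dispatch(b.name, b.input) for b in calls),
        return_exceptions=True,
    )
    results = []
    for block, outcome in zip(calls, outcomes):  # gather preserves call order
        if isinstance(outcome, Exception):
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": f"Error: {outcome}", "is_error": True})
        else:
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": json.dumps(outcome)})
    return results


blocks = [
    SimpleNamespace(type="tool_use", id="1", name="get_user", input={"id": 42}),
    SimpleNamespace(type="tool_use", id="2", name="get_orders", input={"user_id": 42}),
]
results = asyncio.run(run_tools_concurrently(blocks))
```

Every tool_use block gets exactly one tool_result back, whether the call succeeded or not, which is what the loop needs before it appends the next user message.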

Validation

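from pydantic import ValidationError

# TOOL_SCHEMAS maps tool name -> Pydantic model; TOOL_FNS maps tool name -> coroutine.
async def dispatch(name, args):
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        # Models sometimes invent tool names; report it instead of raising KeyError.
        return {"error": f"Unknown tool: {name}"}
    try:
        validated = schema.model_validate(args)
    except ValidationError as e:
        return {"error": f"Invalid arguments: {e}"}
    return await TOOL_FNS[name](validated)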

Validate at the boundary. Models occasionally hallucinate fields or wrong types — let validation reject them, return the error, let the model retry.
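
To make the reject-and-retry idea concrete without any dependencies, here is a rough sketch that checks arguments against a small subset of the JSON Schema keywords used in the tool definitions (required, type, enum, minimum, maximum, default). `validate_args` is a hypothetical helper, not Pydantic or a full JSON Schema validator; in a real project a Pydantic model or a schema library would do this job.

```python
def validate_args(schema, args):
    """Check args against a small JSON Schema subset.

    Returns (cleaned_args, errors). Unknown fields are reported and
    defaults are filled in for fields the model left out.
    """
    types = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    props = schema.get("properties", {})
    errors, cleaned = [], {}

    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field '{field}'")

    for field, value in args.items():
        spec = props.get(field)
        if spec is None:
            errors.append(f"unknown field '{field}'")
            continue
        expected = types.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if expected and (not isinstance(value, expected)
                         or (isinstance(value, bool) and spec["type"] != "boolean")):
            errors.append(f"'{field}' must be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"'{field}' must be one of {spec['enum']}")
        elif "minimum" in spec and value < spec["minimum"]:
            errors.append(f"'{field}' must be >= {spec['minimum']}")
        elif "maximum" in spec and value > spec["maximum"]:
            errors.append(f"'{field}' must be <= {spec['maximum']}")
        else:
            cleaned[field] = value

    for field, spec in props.items():
        if field not in args and "default" in spec:
            cleaned[field] = spec["default"]
    return cleaned, errors


SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "minimum": 1, "maximum": 50, "default": 10},
        "category": {"type": "string", "enum": ["electronics", "books", "clothing"]},
    },
    "required": ["query"],
}

ok_args, ok_errors = validate_args(SEARCH_SCHEMA, {"query": "usb hub"})
bad_args, bad_errors = validate_args(
    SEARCH_SCHEMA, {"query": "usb hub", "limit": 500, "colour": "red"}
)
```

Returning the error list as the tool result gives the model a specific message to correct against, rather than a generic failure.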

See Structured Output.

Error handling

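# Bad: raise; loop crashes
result = await tool(args)

# Good: return as data; model decides
async def call_tool(tool, args):
    # NotFoundError and RateLimitError stand for whatever your tool layer raises.
    try:
        return await tool(args)
    except NotFoundError:
        return {"error": "not found"}
    except RateLimitError:
        return {"error": "rate limited; try again later"}
    except Exception:
        # Log the details server-side; give the model only a generic message.
        log.exception("tool failed")
        return {"error": "internal error"}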

The model recovers gracefully from tool errors when they are returned as result content. Crashing the loop loses all progress made so far.

Tool naming and grouping

get_user
get_user_orders
get_user_settings
update_user_email

Consistent prefixes; clear actions. The model learns patterns from naming.

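For larger surfaces (20+ tools), namespace. Check your provider's naming rules first: several APIs accept only letters, digits, underscores, and hyphens in tool names, in which case write the names below as db_user_get, http_get, and so on: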

db.user.get
db.user.update
http.get
fs.read

Tool routing

For huge tool catalogs:

# Step 1: ask LLM which "category" of tools it needs
category = await classify(user_query, ["users", "orders", "products", "support"])

# Step 2: only expose those tools
tools = TOOLS_BY_CATEGORY[category]

Cuts context usage; model has fewer choices.
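
Step 2 can be sketched as a small selection function. This is one possible shape, not a fixed recipe: `select_tools`, the `ALWAYS_ON` list, and the tool names are illustrative assumptions, and unlike the snippet above it accepts a list of categories so a query can span more than one.

```python
TOOLS_BY_CATEGORY = {
    "users": [{"name": "get_user"}, {"name": "update_user_email"}],
    "orders": [{"name": "get_user_orders"}],
}
# Hypothetical tool that stays available on every turn.
ALWAYS_ON = [{"name": "ask_clarifying_question"}]


def select_tools(categories, tools_by_category=TOOLS_BY_CATEGORY, always_on=ALWAYS_ON):
    """Return the deduplicated tool list for the chosen categories.

    Unknown categories are ignored. If nothing matches, fall back to the
    full catalog so a bad classification cannot leave the model stuck.
    """
    known = [c for c in categories if c in tools_by_category]
    if not known:
        known = list(tools_by_category)
    selected, seen = [], set()
    for tool in always_on + [t for c in known for t in tools_by_category[c]]:
        if tool["name"] not in seen:
            seen.add(tool["name"])
            selected.append(tool)
    return selected


routed = [t["name"] for t in select_tools(["orders"])]
fallback = [t["name"] for t in select_tools(["billing"])]
```

Falling back to the full catalog trades tokens for robustness. With a very large catalog, falling back to only the always-on tools and asking the user to clarify may be the better choice.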

Tool result formatting

# Bad: dump raw JSON of 1000 records
return json.dumps(huge_response)

# Good: shape for LLM consumption
return {
    "summary": f"Found {len(results)} matching items",
    "items": results[:10],   # first 10
    "total": len(results),
    "more_available": len(results) > 10,
}

Trim. Summarize. Hint the model when there’s more.
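
A reusable version of that shaping step might look like the sketch below. `shape_results` and its parameters are hypothetical; besides capping the item count, it keeps only the fields the model needs and truncates long strings.

```python
def shape_results(records, fields, limit=10, max_chars=200):
    """Keep selected fields, truncate long strings, and report what was cut."""

    def trim(value):
        if isinstance(value, str) and len(value) > max_chars:
            return value[:max_chars] + "..."
        return value

    items = [{f: trim(r.get(f)) for f in fields} for r in records[:limit]]
    return {
        "summary": f"Found {len(records)} matching items, showing {len(items)}",
        "items": items,
        "total": len(records),
        "more_available": len(records) > len(items),
    }


records = [{"id": i, "title": "x" * 500, "internal_blob": "..."} for i in range(1000)]
shaped = shape_results(records, fields=["id", "title"])
```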

For huge results: store and return a token:

async def search(...):
    handle = await store_results_in_cache(big_results)
    return {"handle": handle, "preview": big_results[:5], "total": len(big_results)}

# Model can call get_more(handle, offset)
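
The handle pattern could be filled in roughly as follows. This is a synchronous sketch with an in-process dict as the cache; `store_results`, `search_tool`, and the `get_more` signature are assumptions for illustration, and a real deployment would more likely use a shared cache with expiry.

```python
import uuid

_RESULT_CACHE = {}  # in-memory stand-in for a real cache with a TTL


def store_results(results):
    """Store a result list and return an opaque handle for it."""
    handle = uuid.uuid4().hex
    _RESULT_CACHE[handle] = results
    return handle


def search_tool(all_results, preview_size=5):
    """Return a handle plus a short preview instead of the full result set."""
    handle = store_results(all_results)
    return {"handle": handle, "preview": all_results[:preview_size],
            "total": len(all_results)}


def get_more(handle, offset=0, limit=10):
    """Page through stored results; next_offset is None on the last page."""
    results = _RESULT_CACHE.get(handle)
    if results is None:
        return {"error": "unknown or expired handle; run the search again"}
    offset = max(0, offset)
    page = results[offset:offset + limit]
    next_offset = offset + len(page)
    return {
        "items": page,
        "next_offset": next_offset if next_offset < len(results) else None,
        "total": len(results),
    }


first = search_tool(list(range(23)))
middle = get_more(first["handle"], offset=5)
last = get_more(first["handle"], offset=20)
missing = get_more("nope")
```

Returning `next_offset` spares the model from doing offset arithmetic, and the explicit error for a stale handle tells it how to recover.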

Streaming with tools

async with client.messages.stream(...) as stream:
    async for event in stream:
        if event.type == "content_block_start" and event.content_block.type == "tool_use":
            # Tool call coming
            pass
        # ... handle text + tool_use blocks

Mostly the loop is the same; you can show “thinking…” then “calling get_user()…” for UX.

State across iterations

Some tools need state (a session token, a connection):

class ToolContext:
    def __init__(self, user, db, http):
        self.user = user
        self.db = db
        self.http = http
        self.cache = {}

async def dispatch(ctx, name, args):
    return await TOOL_FNS[name](ctx, args)

Pass context to every tool call. Avoid global state.
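
Here is a small runnable sketch of that pattern, with a decorator-based registry added on top. The `tool` decorator, the dict standing in for a database, and the `reads` counter are illustrative assumptions; the point is that the per-request cache lives on the context, not in a module global.

```python
import asyncio

TOOL_FNS = {}


def tool(name):
    """Register a coroutine under a tool name."""
    def register(fn):
        TOOL_FNS[name] = fn
        return fn
    return register


class ToolContext:
    def __init__(self, user, db):
        self.user = user
        self.db = db
        self.cache = {}  # per-request scratch space shared across iterations


@tool("get_user_settings")
async def get_user_settings(ctx, args):
    key = ("settings", ctx.user)
    if key not in ctx.cache:  # a second call in the same run hits the cache
        ctx.cache[key] = ctx.db[ctx.user]["settings"]
        ctx.db["reads"] += 1
    return ctx.cache[key]


async def dispatch(ctx, name, args):
    if name not in TOOL_FNS:
        return {"error": f"unknown tool '{name}'"}
    return await TOOL_FNS[name](ctx, args)


async def demo():
    # A plain dict stands in for a database connection.
    db = {"alice": {"settings": {"theme": "dark"}}, "reads": 0}
    ctx = ToolContext(user="alice", db=db)
    first = await dispatch(ctx, "get_user_settings", {})
    second = await dispatch(ctx, "get_user_settings", {})
    unknown = await dispatch(ctx, "drop_tables", {})
    return first, second, unknown, db["reads"]


first, second, unknown, reads = asyncio.run(demo())
```

Because each request builds its own `ToolContext`, two concurrent users never share a cache or a session.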

Side-effecting tools

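async def transfer_money(ctx, amount, to):
    # Authorize first, so the user is never asked to confirm a transfer that will be refused.
    if amount > 10000 and not ctx.user.is_admin:
        return {"error": "amount too large for non-admin"}
    # ctx.confirmation is assumed to be set by the application after the user approves,
    # never by the model.
    if not ctx.confirmation:
        return {"awaiting_confirmation": True, "preview": f"Transfer ${amount} to {to}"}
    # ... actually transfer ...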

For dangerous operations: confirmation tokens, authorization checks, audit logs. The model can call; the system enforces. See LLM Guardrails.
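
One way to make "the system enforces" concrete is to bind each approval to the exact arguments and let it cover a single call. The sketch below is an assumption-heavy illustration: `Confirmations`, the plain-dict user, and the list used as an audit log are all hypothetical, and `approve` is meant to be called by the host application when the user confirms in the UI, never by the model.

```python
import json


class Confirmations:
    """Tracks approvals granted by the user through the UI, not by the model."""

    def __init__(self):
        self._approved = set()

    @staticmethod
    def _key(action, args):
        return action + ":" + json.dumps(args, sort_keys=True)

    def approve(self, action, args):
        """Called by the host app after the user confirms this exact call."""
        self._approved.add(self._key(action, args))

    def consume(self, action, args):
        """One-shot check: an approval covers exactly one matching call."""
        key = self._key(action, args)
        if key in self._approved:
            self._approved.remove(key)
            return True
        return False


def transfer_money(confirmations, audit_log, user, amount, to, limit=10_000):
    args = {"amount": amount, "to": to}
    if amount > limit and not user["is_admin"]:
        return {"error": "amount too large for non-admin"}
    if not confirmations.consume("transfer_money", args):
        return {"awaiting_confirmation": True, "preview": f"Transfer ${amount} to {to}"}
    audit_log.append({"user": user["name"], "action": "transfer_money", **args})
    # ... actually transfer ...
    return {"status": "ok"}


confirmations, audit_log = Confirmations(), []
alice = {"name": "alice", "is_admin": False}

r_first = transfer_money(confirmations, audit_log, alice, 50, "bob")
confirmations.approve("transfer_money", {"amount": 50, "to": "bob"})
r_changed = transfer_money(confirmations, audit_log, alice, 5000, "bob")
r_confirmed = transfer_money(confirmations, audit_log, alice, 50, "bob")
r_replay = transfer_money(confirmations, audit_log, alice, 50, "bob")
r_too_big = transfer_money(confirmations, audit_log, alice, 20_000, "bob")
```

Since the approval is keyed on the arguments, the model cannot get a small transfer approved and then reuse that approval for a larger one, and a repeated call needs a fresh confirmation.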

Common mistakes

1. Vague descriptions

“Gets data.” About what? When? Spend time on descriptions; you save tokens AND failures.

2. Tool flood

50+ tools in the prompt. Token bloat; selection errors. Categorize and route.

3. Raising on tool errors

Loop crashes; user sees nothing. Always return errors as data.

4. Returning megabytes

LLM context blows up; cost spikes. Trim or paginate.

5. No max iters

Model loops forever calling the same tool. Always bound.
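
An iteration cap stops the loop eventually; a repeat check can stop it sooner and tell the model why. A minimal sketch, where `RepeatGuard` is a hypothetical helper you would consult before each dispatch:

```python
import json
from collections import Counter


class RepeatGuard:
    """Counts identical (tool name, arguments) calls within one run."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.counts = Counter()

    def check(self, name, args):
        """Return None if the call may proceed, else an error payload for the model."""
        key = (name, json.dumps(args, sort_keys=True))
        self.counts[key] += 1
        if self.counts[key] > self.max_repeats:
            return {"error": f"{name} was already called {self.max_repeats} times "
                             "with these arguments; use the earlier result or try "
                             "a different approach"}
        return None


guard = RepeatGuard(max_repeats=2)
outcomes = [guard.check("get_user", {"id": 42}) for _ in range(3)]
other = guard.check("get_user", {"id": 7})
```

Returning the refusal as a tool result keeps it consistent with the errors-as-data rule: the model sees the message and can change course instead of the run dying silently.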

What I’d ship today

For an agent project:

Pydantic schemas for all tool inputs.
Structured tool definitions with rich descriptions.
Parallel calls for independent ops.
Errors as data, not exceptions.
Bounded result sizes with pagination tokens.
Authorization at tool layer, not in prompts.
Tracing every tool call.
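
For the tracing item, a decorator is often enough to start with. The sketch below records tool name, duration, and outcome to an in-memory list and the standard logger; `traced`, `TRACE`, and the two demo tools are illustrative assumptions, and in production the list would be replaced by whatever tracing backend you already run.

```python
import asyncio
import functools
import logging
import time

log = logging.getLogger("tools")
TRACE = []  # in-memory sink for the sketch; swap for a real tracing backend


def traced(fn):
    """Record name, duration, and outcome of every call to a tool coroutine."""
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return await fn(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # tracing observes failures; it does not swallow them
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            TRACE.append({"tool": fn.__name__, "status": status, "ms": elapsed_ms})
            log.info("tool=%s status=%s ms=%.1f", fn.__name__, status, elapsed_ms)
    return wrapper


@traced
async def get_user(user_id):
    return {"id": user_id}


@traced
async def broken_tool():
    raise ValueError("boom")


async def demo():
    user = await get_user(42)
    try:
        await broken_tool()
    except ValueError:
        pass
    return user


user = asyncio.run(demo())
```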

Read this next

If you want my tool schema library + validation harness, it’s at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
