What makes a good tool description?

Verb-led, specific about what the tool does and doesn't do, explicit about limits ('returns up to 25'), and clear about what the agent should do next. The description IS the prompt the model sees; treat it like one.

Should tools return JSON or text?

Text — pre-formatted for the model to read and synthesize. JSON forces the model to parse and reformat. Text saves tokens and reduces error paths. Reserve JSON for tools that pass output to other tools mechanically.

Designing Tools for AI Agents in 2026 — The Patterns That Work

The biggest factor in whether an agent uses your tools correctly isn’t the model — it’s the tool descriptions. After shipping a few agent products, the patterns are clear. This post is the working set.

Names matter

search is too vague. The agent doesn’t know what it searches.

search_orders_by_email is specific. The agent calls it confidently when the user asks about their orders.

Use verb_object naming and be specific. A search_X and search_Y tool pair works far better than one generic search.
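A convention like this is easy to check mechanically. Here is a minimal sketch, assuming a hypothetical allow-list of verbs (`ALLOWED_VERBS`) and a hypothetical `check_tool_name` helper; neither comes from any framework.

```python
import re
from typing import Optional

# Hypothetical allow-list; swap in the verbs your own tools use.
ALLOWED_VERBS = {"search", "get", "create", "update", "delete", "list"}

# verb_object: a lowercase verb, an underscore, then at least one more word.
NAME_PATTERN = re.compile(r"^([a-z]+)_([a-z0-9]+(?:_[a-z0-9]+)*)$")


def check_tool_name(name: str) -> Optional[str]:
    """Return a complaint about the name, or None if it looks fine."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return f"'{name}' is not verb_object (for example search_orders)."
    verb = match.group(1)
    if verb not in ALLOWED_VERBS:
        return f"'{name}' starts with '{verb}', which is not an allowed verb."
    return None
```

Run it over your tool list in CI so a bare search never ships.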

Descriptions are prompts

{
  name: "search_orders",
  description: "Search the user's orders by email or order ID. Returns up to 25 most recent matches with id, total, status, and date. Use this when the user asks about their order history.",
  input_schema: { ... }
}

Three things in one description:

What it does: search orders by specific keys.
Limits: returns up to 25.
When to use it: when the user asks about orders.

The third is what most teams forget. It’s a usage hint that prevents misuse.
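One way to stop the third part from being forgotten is to make it a required field. A minimal sketch, assuming a hypothetical `ToolDescription` helper that refuses to render unless all three parts are present:

```python
from dataclasses import dataclass


@dataclass
class ToolDescription:
    what: str    # what the tool does
    limits: str  # caps, truncation, ordering
    when: str    # the usage hint that tends to get skipped

    def render(self) -> str:
        parts = [self.what, self.limits, self.when]
        if not all(p.strip() for p in parts):
            raise ValueError("what, limits, and when are all required")
        # Each part becomes one sentence ending in a single period.
        return " ".join(p.strip().rstrip(".") + "." for p in parts)
```

The rendered string is what you would put in the tool's description field.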

Input schemas — be explicit

{
  type: "object",
  properties: {
    query: {
      type: "string",
      description: "Email address or order ID. Empty string returns the most recent orders."
    },
    limit: {
      type: "integer",
      minimum: 1,
      maximum: 50,
      default: 25,
      description: "Number of orders to return."
    },
    status: {
      type: "string",
      enum: ["pending", "shipped", "delivered", "cancelled"],
      description: "Filter by status. Omit to include all."
    }
  },
  required: ["query"]
}

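Every field has a description. Enums replace stringly-typed args where possible. Minimum and maximum bounds restrict values to realistic ranges. Note that a schema only declares those bounds; it is the validator that rejects out-of-range values, and nothing gets clamped automatically.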
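A schema only helps if something enforces it and tells the model what went wrong. The sketch below is a deliberately tiny validator covering just the keywords used above (type, enum, minimum, maximum, required). It is not a full JSON Schema implementation, and `validate_args` is a hypothetical helper; in practice you would likely reach for a real validation library.

```python
from typing import Any

# The schema from above as a Python dict (descriptions omitted for brevity).
SEARCH_ORDERS_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "minimum": 1, "maximum": 50, "default": 25},
        "status": {
            "type": "string",
            "enum": ["pending", "shipped", "delivered", "cancelled"],
        },
    },
    "required": ["query"],
}

PY_TYPES = {"string": str, "integer": int}


def validate_args(args: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return readable problems; an empty list means the args are valid."""
    problems = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"Missing required argument '{name}'.")
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            problems.append(f"Unknown argument '{name}'.")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, PY_TYPES[spec["type"]]) or isinstance(value, bool):
            problems.append(f"'{name}' must be of type {spec['type']}.")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"'{name}' must be one of: {', '.join(spec['enum'])}.")
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"'{name}' must be at least {spec['minimum']}.")
        if "maximum" in spec and value > spec["maximum"]:
            problems.append(f"'{name}' must be at most {spec['maximum']}.")
    return problems
```

Returning the problems as plain sentences means they can go straight back to the model as the tool result.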

Output: text, not JSON

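# Assumes the MCP Python SDK for TextContent; `db` is the application's own data layer.
from mcp.types import TextContent

async def search_orders(query: str, limit: int = 25) -> list[TextContent]:
    rows = await db.search(query, limit)
    if not rows:
        return [TextContent(type="text", text="No orders found.")]
    # One line per order: id, email, total, status, date.
    text = "\n".join(
        f"{r['id']}  {r['email']}  ${r['total']:.2f}  {r['status']}  {r['created_at']}"
        for r in rows
    )
    return [TextContent(type="text", text=text)]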

Pre-formatted text is what the model reads and synthesizes. JSON makes it do two jobs: read the structure, then reformat it for the user. For human-facing agents, text wins.

Error returns

async def search_orders(query: str) -> list[TextContent]:
    if not query.strip():
        return [TextContent(
            type="text",
            text="Query is empty. Provide an email address or order ID."
        )]
    try:
        rows = await db.search(query, 25)
    except DatabaseUnavailable:
        return [TextContent(
            type="text",
            text="Database is temporarily unavailable. Try again in a moment."
        )]
    # ...

Helpful errors lead to graceful agent behavior. The model reads the error and knows what to do next: retry, ask the user, or give up cleanly.
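If you have many tools, repeating the try/except in each one gets old. A minimal sketch of the same idea as a decorator, assuming hypothetical exception classes (`EmptyQuery`, `DatabaseUnavailable`) and returning plain strings rather than any SDK type:

```python
import asyncio
import functools


class EmptyQuery(Exception):
    """Hypothetical: raised when the caller passes a blank query."""


class DatabaseUnavailable(Exception):
    """Stand-in for whatever your database client raises."""


# Exception type -> message telling the model what to do next.
ERROR_MESSAGES = {
    EmptyQuery: "Query is empty. Provide an email address or order ID.",
    DatabaseUnavailable: "Database is temporarily unavailable. Try again in a moment.",
}


def actionable_errors(tool):
    """Wrap an async tool so known failures come back as readable text."""
    @functools.wraps(tool)
    async def wrapper(*args, **kwargs):
        try:
            return await tool(*args, **kwargs)
        except Exception as exc:
            for exc_type, message in ERROR_MESSAGES.items():
                if isinstance(exc, exc_type):
                    return message
            raise  # unknown failures still surface to the caller
    return wrapper


@actionable_errors
async def search_orders(query: str) -> str:
    if not query.strip():
        raise EmptyQuery()
    raise DatabaseUnavailable()  # simulate an outage for the demo
```

Unknown exceptions are re-raised on purpose: a message you did not write is unlikely to tell the model anything useful.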

For native MCP error semantics, see Model Context Protocol Explained.

Idempotency

Tools that mutate state should be safe to call twice:

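# `db` and `Item` belong to the application; get_by_key and create are illustrative.
async def create_order(idempotency_key: str, items: list[Item]):
    existing = await db.get_by_key(idempotency_key)
    if existing:
        return f"Order {existing.id} already exists for this key."
    # Back this with a unique constraint on the key, so two concurrent
    # retries cannot both pass the check above and create two orders.
    new_order = await db.create(items, key=idempotency_key)
    return f"Created order {new_order.id}."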

Agents retry. Without idempotency, retries become double-orders. See Idempotency, Retries, and Exactly-Once Illusions.
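To see the retry behavior end to end, here is a self-contained sketch with a toy in-memory store standing in for the database. `InMemoryOrders` and `create_once` are hypothetical names, and a real system would enforce uniqueness in the database rather than in process memory.

```python
import asyncio


class InMemoryOrders:
    """Toy store standing in for a real database (hypothetical)."""

    def __init__(self) -> None:
        self._by_key: dict[str, int] = {}
        self._next_id = 1

    def create_once(self, key: str) -> tuple[int, bool]:
        """Return (order_id, created). The same key always maps to the same order."""
        if key in self._by_key:
            return self._by_key[key], False
        order_id = self._next_id
        self._next_id += 1
        self._by_key[key] = order_id
        return order_id, True


store = InMemoryOrders()


async def create_order(idempotency_key: str, items: list[str]) -> str:
    # No await between the check and the write, so retries cannot interleave here.
    order_id, created = store.create_once(idempotency_key)
    if created:
        return f"Created order {order_id}."
    return f"Order {order_id} already exists for this key."


async def main() -> list[str]:
    # The agent retries the same call; only one order is created.
    first = await create_order("key-123", ["widget"])
    retry = await create_order("key-123", ["widget"])
    return [first, retry]
```

The second call returns the existing order instead of creating a new one, which is exactly what you want the model to read after a retry.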

Hints, not hand-holding

Found 3 matching orders. Use get_order_details with the order ID for more.

A hint at the end of a response nudges the agent to the next useful action. The model picks up on it, and the user gets richer responses.
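In code, the hint is just one more line appended to the result text. A minimal sketch, assuming a hypothetical `format_search_result` helper and that a get_order_details tool exists alongside the search tool:

```python
def format_search_result(order_ids: list[str]) -> str:
    """Summarize matches and end with a next-step hint for the agent."""
    if not order_ids:
        # Even the empty case can say what to do next.
        return "No orders found. Ask the user to confirm the email address or order ID."
    noun = "order" if len(order_ids) == 1 else "orders"
    lines = [
        f"Found {len(order_ids)} matching {noun}: {', '.join(order_ids)}.",
        "Use get_order_details with the order ID for more.",
    ]
    return "\n".join(lines)
```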

Versioning tools

A tool’s behavior changes; the agent’s “memory” of the description doesn’t. To migrate:

Add the new tool with a new name (search_orders_v2).
Mark the old tool as deprecated in its description: "Deprecated: do not use; prefer search_orders_v2."
Eventually remove the old tool.

Don’t silently change a tool’s contract. Agents that learned the old shape will misbehave.
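The deprecation step can be a one-line transform on the tool definition. A minimal sketch, assuming tools are plain dicts with name and description keys; `deprecate` is a hypothetical helper:

```python
DEPRECATION_NOTICE = "DEPRECATED: do not use; prefer {replacement}. "


def deprecate(tool: dict, replacement: str) -> dict:
    """Return a copy of the tool definition with the notice up front."""
    notice = DEPRECATION_NOTICE.format(replacement=replacement)
    # Put the notice first: it is the part the model must not miss.
    return {**tool, "description": notice + tool["description"]}
```

The original definition is left untouched, so you can keep serving both versions during the migration window.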

Cardinality of tools

Five focused tools beat fifty generic ones. The model picks better when the option set is small and meaningful.

If you have 20 tools, ask: which 8 are most-used? Promote those. Hide the others behind a generic “advanced action” tool that takes a free-form description.
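Picking which tools to promote does not have to be a guess if you already log tool calls. A minimal sketch, assuming the call log is a simple list of tool names; `split_tools` is a hypothetical helper:

```python
from collections import Counter


def split_tools(
    call_log: list[str], all_tools: list[str], keep: int = 8
) -> tuple[list[str], list[str]]:
    """Return (promoted, hidden), ranking tools by how often they were called."""
    known = set(all_tools)
    counts = Counter(name for name in call_log if name in known)
    # Most-called first; ties break alphabetically so the result is stable.
    ranked = sorted(all_tools, key=lambda name: (-counts[name], name))
    return ranked[:keep], ranked[keep:]
```

Tools that were never called land at the end of the ranking, which makes them natural candidates for hiding or removal.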

Common mistakes

1. Tool names that overlap

search and find and lookup — the agent picks randomly. Pick one verb per concept.
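Overlap like this can be caught before the agent ever sees it. A minimal sketch, assuming a hypothetical, hand-maintained list of synonym groups and that tool names start with their verb:

```python
# Hypothetical synonym groups; each group should collapse to a single verb.
SYNONYMS = [
    {"search", "find", "lookup"},
    {"create", "add", "make"},
    {"delete", "remove"},
]


def overlapping_verbs(tool_names: list[str]) -> list[list[str]]:
    """Return the synonym groups for which more than one verb is in use."""
    used = {name.split("_", 1)[0] for name in tool_names}
    return [sorted(group & used) for group in SYNONYMS if len(group & used) > 1]
```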

2. Descriptions copy-pasted from API docs

API docs are written for human developers. Tool descriptions are written for the model. Different audiences.

3. Returning huge JSON

100-row response × verbose JSON = burnt context. Truncate, summarize, paginate.
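Truncation works best when the tool says that it truncated. A minimal sketch, assuming rows are already formatted as text lines; `render_rows` is a hypothetical helper:

```python
def render_rows(rows: list[str], max_rows: int = 25) -> str:
    """Join pre-formatted rows, truncating and saying so when there are more."""
    if not rows:
        return "No orders found."
    shown = rows[:max_rows]
    text = "\n".join(shown)
    remaining = len(rows) - len(shown)
    if remaining > 0:
        # Tell the model the result is partial so it does not treat it as complete.
        text += f"\n... {remaining} more not shown. Narrow the query or request the next page."
    return text
```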

4. Tools that mutate without confirmation

A delete_account tool that just deletes is dangerous. Add a confirmation gate or a human-in-the-loop step (see LLM Security).

5. No observability

When agents misuse tools, you need to see what was called with what args. Log every tool invocation. See LLM Observability.
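A decorator is a low-effort way to get that logging on every tool. A minimal sketch using the standard logging module; the logger name agent.tools and the `logged_tool` helper are assumptions, and the wrapper accepts keyword arguments only so the logged args are always named.

```python
import asyncio
import functools
import json
import logging
import time

logger = logging.getLogger("agent.tools")  # hypothetical logger name


def logged_tool(tool):
    """Log name, args, outcome, and duration for every call to an async tool."""
    @functools.wraps(tool)
    async def wrapper(**kwargs):
        start = time.perf_counter()
        outcome = "ok"
        try:
            return await tool(**kwargs)
        except Exception:
            outcome = "error"
            raise
        finally:
            # One JSON line per invocation, easy to grep or ship to a log store.
            logger.info(json.dumps({
                "tool": tool.__name__,
                "args": kwargs,
                "outcome": outcome,
                "duration_ms": round((time.perf_counter() - start) * 1000, 1),
            }, default=str))
    return wrapper


@logged_tool
async def search_orders(query: str, limit: int = 25) -> str:
    return f"Found 0 orders for {query}."
```

Arguments can contain personal data such as email addresses, so consider redacting before the log line is written.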

Read this next

If you want my tool-design template + eval harness, it’s at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
