How to Build an LLM Agent with Tool Use

An LLM agent is a loop: the model decides to call a tool, the tool runs and returns a result, the model continues with the result in context, and the loop repeats until the model decides to stop. The full pattern in 2026 needs four pieces: a tool definition with a JSON schema, a dispatcher that runs the tool the model picked, a conversation history that grows with each loop iteration, and a stop condition that recognises when the model is done. Anthropic, OpenAI, and Gemini all support this same pattern with slightly different request shapes. I'll walk through runnable code in JavaScript and Python, plus the failure modes that bite in production.

The reason this matters in 2026: agentic workflows have moved from research-paper territory to production. A coding assistant that reads files, runs tests, and edits code is an agent. A customer-support bot that looks up the user's order, checks the warranty status, and drafts a refund is an agent. The pattern is uniform; the tools change per use case.

Jump to:

The agentic loop
Defining tools with JSON Schema
Anthropic: tool_use and tool_result content blocks
OpenAI: tool_calls and tool messages
Gemini: function_calling
Code: JavaScript and Python (Anthropic SDK)
The stop condition
Common failure modes
FAQ

The agentic loop

The whole loop in pseudocode:

code

messages = [{"role": "user", "content": user_question}]
while True:
    response = model.generate(messages, tools)
    if response.stop_reason == "end_turn":      # final answer: stop
        return response.text
    if response.stop_reason == "tool_use":      # tool call: run it, then loop
        messages.append(response.tool_call)     # keep the model's request in history
        result = dispatch(response.tool_call)   # you run the tool, not the model
        messages.append({"role": "tool", "content": result})
        continue

The model emits either a final text response (stop) or a tool call (continue). When it emits a tool call, you run the tool yourself and feed the result back into the conversation. The loop repeats until the model decides it has enough information to give a final answer.

Most workflows finish in 2-10 iterations. Set a hard cap (10-20 steps) to stop runaway loops where the model keeps calling tools without converging.
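To see the loop and the hard cap run without an API key, here is a minimal sketch that swaps the real model for a scripted stand-in. `ScriptedModel`, `FakeResponse`, and the `dispatch` table are all hypothetical names made up for this example; only the control flow mirrors the pseudocode above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    stop_reason: str                      # "tool_use" or "end_turn"
    text: str = ""
    tool_call: dict = field(default_factory=dict)


class ScriptedModel:
    """Stand-in for a real LLM client: replays a fixed list of responses."""

    def __init__(self, script):
        self._script = list(script)

    def generate(self, messages, tools):
        return self._script.pop(0)


def dispatch(tool_call):
    # Hypothetical tool table; a real agent would call actual services here.
    handlers = {"get_weather": lambda args: f"22C in {args['city']}"}
    return handlers[tool_call["name"]](tool_call["input"])


def run_loop(model, question, tools=(), max_steps=10):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):            # the hard cap replaces `while True`
        response = model.generate(messages, tools)
        if response.stop_reason == "end_turn":
            return response.text, messages
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.tool_call})
            messages.append({"role": "tool", "content": dispatch(response.tool_call)})
            continue
        raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
    raise RuntimeError(f"No final answer after {max_steps} steps")


model = ScriptedModel([
    FakeResponse("tool_use", tool_call={"name": "get_weather", "input": {"city": "Tokyo"}}),
    FakeResponse("end_turn", text="It is 22C in Tokyo."),
])
answer, history = run_loop(model, "What's the weather in Tokyo?")
print(answer)          # the scripted final text
print(len(history))    # user turn + tool call + tool result
```

A scripted model like this is also a cheap way to unit-test your dispatcher and stop condition before spending tokens on a real one.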

Defining tools with JSON Schema

Every tool is a JSON Schema description of its parameters. The model reads this schema and decides when each tool is appropriate.

json

{
  "name": "get_weather",
  "description": "Get the current weather for a city. Use this when the user asks about temperature, conditions, or forecast.",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The city name, including country if ambiguous (e.g., 'Paris, France' vs 'Paris, Texas')"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"],
        "default": "celsius"
      }
    },
    "required": ["city"]
  }
}

The descriptions matter as much as the schema. The model leans on the description far more than the name when picking a tool. Vague descriptions ("Get weather data") produce flaky routing; specific descriptions ("Use this when the user asks about temperature, conditions, or forecast") route correctly.
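The same schema that guides the model can also guard your dispatcher. The sketch below checks a tool input against the `input_schema` shown above before running anything. `validate_input` is a hypothetical helper written for this example and only understands `required`, `type`, and `enum`; for full JSON Schema support you would likely reach for a dedicated validator such as the `jsonschema` package instead.

```python
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
    },
    "required": ["city"],
}

# Maps JSON Schema type names to Python types (a deliberately small subset).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_input(schema, data):
    """Return a list of problems; an empty list means the input is acceptable."""
    errors = []
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"missing required field: {name}")
    for name, value in data.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            errors.append(f"unknown field: {name}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors


print(validate_input(WEATHER_SCHEMA, {"city": "Tokyo"}))
print(validate_input(WEATHER_SCHEMA, {"unit": "kelvin"}))
```

Returning the problems as strings, rather than raising, makes it easy to hand them straight back to the model as an error result so it can correct the call.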

Anthropic: tool_use and tool_result content blocks

Anthropic's tool format uses content blocks. The model emits a tool_use block with the tool name and input; you respond with a tool_result block containing the output.

Request includes:

json

{
  "model": "claude-sonnet-4-6",
  "tools": [{ "name": "get_weather", "description": "...", "input_schema": {...} }],
  "messages": [
    { "role": "user", "content": "What's the weather in Tokyo?" }
  ]
}

Model response:

json

{
  "stop_reason": "tool_use",
  "content": [
    { "type": "tool_use", "id": "toolu_01abc", "name": "get_weather", "input": { "city": "Tokyo" } }
  ]
}

You run get_weather("Tokyo"), append the assistant turn above to the message history, then send back:

json

{
  "role": "user",
  "content": [
    { "type": "tool_result", "tool_use_id": "toolu_01abc", "content": "22°C, partly cloudy" }
  ]
}

The model continues and produces a final response.
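Put together, one round trip adds two turns to the history: the assistant turn containing the tool_use block, and a user turn containing the matching tool_result. A minimal sketch using plain dicts shaped like the JSON above; `append_tool_round` and the `run_tool` callback are hypothetical names for this example, not part of any SDK.

```python
def append_tool_round(messages, assistant_content, run_tool):
    """Append the assistant turn and the matching tool_result turn to messages."""
    # 1. Echo the model's own turn back, tool_use blocks included.
    messages.append({"role": "assistant", "content": assistant_content})
    # 2. Answer every tool_use block with a tool_result carrying the same id.
    results = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": run_tool(block["name"], block["input"]),
        }
        for block in assistant_content
        if block["type"] == "tool_use"
    ]
    messages.append({"role": "user", "content": results})
    return messages


messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
assistant_content = [
    {"type": "tool_use", "id": "toolu_01abc", "name": "get_weather", "input": {"city": "Tokyo"}}
]
append_tool_round(messages, assistant_content, lambda name, args: "22C, partly cloudy")
for m in messages:
    print(m["role"])
```

The id is what ties a result to its request, so it has to be copied through unchanged.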

OpenAI: tool_calls and tool messages

OpenAI's format is similar but uses tool_calls and tool role messages.

Request:

json

{
  "model": "gpt-5",
  "tools": [{ "type": "function", "function": { "name": "get_weather", "description": "...", "parameters": {...} } }],
  "messages": [{ "role": "user", "content": "..." }]
}

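Model response includes message.tool_calls: [{ id, type: "function", function: { name, arguments } }]. Note that arguments arrives as a JSON-encoded string, not an object, so parse it before dispatching.

You append the assistant message to the history, then send back one { "role": "tool", "tool_call_id": "...", "content": "..." } message per tool call.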
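As a rough sketch of that exchange using plain dicts in the shape described above: `tool_messages_for` and `run_tool` are hypothetical names for this example, and a real client library may hand you objects rather than dicts.

```python
import json


def tool_messages_for(assistant_message, run_tool):
    """Build the messages to append after an assistant turn that has tool_calls."""
    out = [assistant_message]  # the assistant turn goes back into history first
    for call in assistant_message.get("tool_calls") or []:
        # `arguments` is a JSON string, so decode it before calling the tool.
        args = json.loads(call["function"]["arguments"])
        result = run_tool(call["function"]["name"], args)
        out.append({"role": "tool", "tool_call_id": call["id"], "content": str(result)})
    return out


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
        }
    ],
}
new_messages = tool_messages_for(assistant_message, lambda name, args: f"22C in {args['city']}")
print(new_messages[-1])
```

Wrapping `json.loads` in a try/except is worth considering in production, since a malformed arguments string would otherwise crash the loop.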

Gemini: function_calling

Gemini calls them "functions" instead of "tools", but the pattern is identical: function declarations go in the request, function calls come back in the response, and you reply with a function response.

For multi-provider code, abstract the format conversion behind a small adapter. The agentic loop logic is identical across providers.
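One way such an adapter could look, covering only the Anthropic and OpenAI shapes shown earlier in this article. The neutral tool dict and every function name here are assumptions invented for the sketch; treat it as a starting point rather than a complete abstraction.

```python
import json

# Hypothetical provider-neutral tool shape used only in this sketch.
NEUTRAL_TOOL = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def to_anthropic(tool):
    """Anthropic puts the JSON Schema under `input_schema`."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


def to_openai(tool):
    """OpenAI nests the definition under `function` and calls the schema `parameters`."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
        },
    }


def calls_from_anthropic(content_blocks):
    """Normalise tool_use blocks to (id, name, arguments) tuples."""
    return [(b["id"], b["name"], b["input"]) for b in content_blocks if b["type"] == "tool_use"]


def calls_from_openai(message):
    """Normalise tool_calls to the same tuples, decoding the JSON argument string."""
    return [
        (c["id"], c["function"]["name"], json.loads(c["function"]["arguments"]))
        for c in message.get("tool_calls") or []
    ]


print(to_anthropic(NEUTRAL_TOOL)["input_schema"] == to_openai(NEUTRAL_TOOL)["function"]["parameters"])
```

With both directions normalised, the loop itself only ever sees `(id, name, arguments)` tuples and never needs to know which provider produced them.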

Code: JavaScript and Python (Anthropic SDK)

JavaScript:

javascript

import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();

const tools = [
  {
    name: "get_weather",
    description: "Get current weather for a city.",
    input_schema: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
];

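// Route a tool_use block to its implementation.
// fetchWeather is your own function; it is not defined in this snippet.
async function dispatch(toolUse) {
  if (toolUse.name === "get_weather") {
    return await fetchWeather(toolUse.input.city);
  }
  throw new Error(`Unknown tool: ${toolUse.name}`);
}

async function runAgent(question, maxSteps = 10) {
  const messages = [{ role: "user", content: question }];
  for (let i = 0; i < maxSteps; i++) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      tools,
      messages,
    });
    // Echo the assistant turn back so each tool_result id has a matching tool_use.
    messages.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      return response.content.find((b) => b.type === "text")?.text ?? "";
    }
    if (response.stop_reason !== "tool_use") {
      // e.g. "max_tokens": the turn was cut off, so fail instead of looping.
      throw new Error(`Unexpected stop_reason: ${response.stop_reason}`);
    }
    const toolUseBlocks = response.content.filter((b) => b.type === "tool_use");
    // Run every requested tool concurrently; report failures to the model
    // instead of letting one rejected promise abort the whole agent.
    const toolResults = await Promise.all(
      toolUseBlocks.map(async (tu) => {
        try {
          const output = String(await dispatch(tu));
          return { type: "tool_result", tool_use_id: tu.id, content: output };
        } catch (err) {
          return { type: "tool_result", tool_use_id: tu.id, content: `error: ${err.message}`, is_error: true };
        }
      })
    );
    messages.push({ role: "user", content: toolResults });
  }
  throw new Error("Agent did not converge");
}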

Python:

python

from anthropic import Anthropic
client = Anthropic()

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

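def dispatch(tool_use):
    # fetch_weather is your own function; it is not defined in this snippet.
    if tool_use.name == "get_weather":
        return fetch_weather(tool_use.input["city"])
    raise ValueError(f"Unknown tool: {tool_use.name}")

def run_agent(question: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        # Echo the assistant turn back so each tool_result id has a matching tool_use.
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            # Default to "" so a final turn without text does not raise StopIteration.
            return next((b.text for b in response.content if b.type == "text"), "")
        if response.stop_reason != "tool_use":
            # e.g. "max_tokens": the turn was cut off, so fail instead of looping.
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
        results = []
        for tu in response.content:
            if tu.type != "tool_use":
                continue
            result = {"type": "tool_result", "tool_use_id": tu.id}
            try:
                result["content"] = str(dispatch(tu))
            except Exception as exc:
                # Report the failure to the model so it can retry or change course.
                result["content"] = f"error: {exc}"
                result["is_error"] = True
            results.append(result)
        messages.append({"role": "user", "content": results})
    raise RuntimeError("Agent did not converge")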

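Note the parallel tool calls: the model can emit multiple tool_use blocks in a single response, and each one needs its own tool_result. The JavaScript version runs them concurrently with Promise.all. The Python version above runs them one after another; for the same speedup, switch to async code and asyncio.gather.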
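A minimal sketch of the asyncio.gather approach, kept independent of any SDK so it can run on its own. The tool_use blocks are plain dicts here (the SDK returns objects), and `fetch_weather` is a stub standing in for a real network call; both are assumptions for illustration.

```python
import asyncio


async def fetch_weather(city):
    await asyncio.sleep(0.01)  # stands in for a slow network call
    return f"weather for {city}"


async def dispatch(tool_use):
    if tool_use["name"] == "get_weather":
        return await fetch_weather(tool_use["input"]["city"])
    raise ValueError(f"Unknown tool: {tool_use['name']}")


async def run_tools(tool_uses):
    """Run all tool calls concurrently and return tool_result blocks in order."""
    # return_exceptions=True keeps one failing tool from cancelling the rest.
    outputs = await asyncio.gather(
        *(dispatch(tu) for tu in tool_uses), return_exceptions=True
    )
    results = []
    for tu, out in zip(tool_uses, outputs):  # gather preserves input order
        result = {"type": "tool_result", "tool_use_id": tu["id"]}
        if isinstance(out, Exception):
            result["content"] = f"error: {out}"
            result["is_error"] = True
        else:
            result["content"] = str(out)
        results.append(result)
    return results


calls = [
    {"id": "toolu_1", "name": "get_weather", "input": {"city": "Tokyo"}},
    {"id": "toolu_2", "name": "get_weather", "input": {"city": "Oslo"}},
    {"id": "toolu_3", "name": "nope", "input": {}},
]
results = asyncio.run(run_tools(calls))
for r in results:
    print(r["tool_use_id"], r["content"])
```

To use this inside the agent loop you would also need an async model call; the Anthropic Python SDK ships an `AsyncAnthropic` client for that purpose.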

The stop condition

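The model decides when to stop via the stop_reason field in the response: end_turn means it produced a final text response, tool_use means it wants you to run a tool and continue. Any other value, such as max_tokens (the response was cut off), is neither; treat it as an error rather than looping on it.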

A hard cap on steps (10-20) is the safety net. Without it, an unhappy agent can loop indefinitely. With it, you fail clearly when the model can't converge — better than running up a $400 token bill on an infinite loop.
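If the token bill is the real worry, a step cap can be paired with a token cap. The sketch below is one possible shape for that guard; `Budget` and `BudgetExceeded` are hypothetical names, and the default limits are arbitrary placeholders rather than recommendations.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the agent runs past its step or token allowance."""


@dataclass
class Budget:
    max_steps: int = 10
    max_tokens: int = 50_000
    steps: int = 0
    tokens: int = 0

    def charge(self, input_tokens, output_tokens):
        """Record one model call and raise if either limit is now exceeded."""
        self.steps += 1
        self.tokens += input_tokens + output_tokens
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit of {self.max_tokens} exceeded")


budget = Budget(max_steps=2, max_tokens=100)
budget.charge(10, 5)
print(budget.steps, budget.tokens)
```

With the Anthropic SDK, the numbers to pass in after each call would be `response.usage.input_tokens` and `response.usage.output_tokens`. Because the full history is resent on every iteration, input tokens grow with each step, which is why a token cap can trip well before the step cap does.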

Common failure modes

After running agents in production, these are the failures I see most often:

The model loops on the same tool. It calls search_docs, gets a result, calls search_docs again with a slightly different query, and repeats. Cause: the tool descriptions don't help the model know when it has enough information. Fix: tighten the tool description, or add a "Stop calling this when you have N results" hint to the system prompt.
The model never calls the tool. It just makes up an answer. Cause: the tool description doesn't match how the model thinks about the task. Fix: add explicit triggering language ("When the user asks about X, call tool Y").
The model calls the wrong tool. Two tools with overlapping purposes. Cause: ambiguity. Fix: rename and re-describe so they're orthogonal, or merge into one tool with a parameter.
The tool runs slowly and the agent appears stuck. Nothing is streamed while the tool executes. Cause: tool execution is synchronous and slow. Fix: stream the agent's progress back to the user, or move long-running tools to a background job with status polling.
The model corrupts the conversation history. It generates a malformed tool call that fails to parse. Cause: the model occasionally emits a call that doesn't match the schema; it's rare, but it happens. Fix: validate the tool call shape; on parse failure, send a tool_result with "error: malformed call, please try again" and let the model retry.
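Two of the fixes above can be enforced in the dispatcher itself. The following sketch wraps tool execution so that malformed calls, unknown tools, and tool exceptions all come back as error results instead of crashing the loop, and it caps how many times any one tool may be called per run. `CallLimiter`, `safe_tool_result`, and the limit of two calls are assumptions made for illustration.

```python
class CallLimiter:
    """Counts calls per tool name so a looping model can be told to stop."""

    def __init__(self, max_calls_per_tool=3):
        self.max_calls = max_calls_per_tool
        self.counts = {}

    def over_limit(self, name):
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name] > self.max_calls


def safe_tool_result(tool_use, handlers, limiter):
    """Always return a tool_result block; never let a tool failure escape."""
    result = {"type": "tool_result", "tool_use_id": tool_use.get("id")}
    name, tool_input = tool_use.get("name"), tool_use.get("input")
    try:
        if name not in handlers:
            raise ValueError(f"unknown tool: {name}")
        if not isinstance(tool_input, dict):
            raise ValueError("malformed call, please try again")
        if limiter.over_limit(name):
            raise ValueError(f"{name} call limit reached; answer with what you have")
        result["content"] = str(handlers[name](**tool_input))
    except Exception as exc:
        # The error text goes back to the model so it can retry or give up.
        result["content"] = f"error: {exc}"
        result["is_error"] = True
    return result


handlers = {"search_docs": lambda query: f"3 hits for {query}"}
limiter = CallLimiter(max_calls_per_tool=2)
call = {"id": "t1", "name": "search_docs", "input": {"query": "refunds"}}
print(safe_tool_result(call, handlers, limiter)["content"])
```

A per-tool cap is blunter than detecting near-duplicate queries, but it is simple and catches the "slightly different query each time" pattern that exact-match deduplication would miss.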

What to do next

For the foundational techniques agentic workflows depend on:

How to Write an Effective System Prompt — agents need clear role definitions for tool selection.
How to Get Reliable JSON from an LLM — tool input parameters are essentially structured outputs; same techniques apply.
How to Stop an LLM from Hallucinating — agents that ground via tools hallucinate dramatically less.

For the MCP servers that ship as ready-made tools:

Top 5 MCP Servers Every Developer Should Try in 2026 — filesystem, GitHub, database MCPs are the most common agent tools.

External references: Anthropic tool use guide, OpenAI function calling, Gemini function calling.

Ishan Karunaratne

FAQ

What's the difference between a tool and a function in LLM APIs?

Can the model call multiple tools in one response?

How do I prevent an agent from getting stuck in a loop?

Do I need to use MCP for tool use?

How much does an agentic workflow cost compared to a single call?
