TechEarl

How to Build an LLM Agent with Tool Use

Build an LLM agent with tool use: the agentic loop, the tool-call format on Anthropic, OpenAI, and Gemini, runnable code in JavaScript and Python, plus the common failure modes.

Ishan Karunaratne · 9 min read

An LLM agent is a loop: the model decides to call a tool, the tool runs and returns a result, the model continues with the result in context, and the loop repeats until the model decides to stop. The full pattern in 2026 needs four pieces: a tool definition with a JSON schema, a dispatcher that runs the tool the model picked, a conversation history that grows with each loop iteration, and a stop condition that recognises when the model is done. Anthropic, OpenAI, and Gemini all support this same pattern with slightly different request shapes. I'll walk through runnable code in JavaScript and Python, plus the failure modes that bite in production.

The reason this matters in 2026: agentic workflows have moved from research-paper territory to production. A coding assistant that reads files, runs tests, and edits code is an agent. A customer-support bot that looks up the user's order, checks the warranty status, and drafts a refund is an agent. The pattern is uniform; the tools change per use case.

The agentic loop

The whole loop in pseudocode:

code
messages = [{"role": "user", "content": user_question}]
while True:
    response = model.generate(messages, tools)
    if response.stop_reason == "end_turn":
        return response.text                     # final answer: stop
    if response.stop_reason == "tool_use":
        messages.append(response.tool_call)      # keep the model's request in history
        result = dispatch(response.tool_call)    # run the tool yourself
        messages.append({"role": "tool", "content": result})
        continue

The model emits either a final text response (stop) or a tool call (continue). When it emits a tool call, you run the tool yourself and feed the result back into the conversation. The loop repeats until the model decides it has enough information to give a final answer.

Most workflows finish in 2-10 iterations. Set a hard cap (10-20 steps) to stop runaway loops where the model keeps calling tools without converging.
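The pseudocode and the hard cap can be combined into something runnable. This is a minimal sketch, not any provider's API: `run_loop`, `scripted_model`, and `fake_dispatch` are hypothetical names, and the "model" is a scripted stand-in so the loop can be exercised without a network call.

```python
from typing import Callable


def run_loop(model: Callable[[list], dict], dispatch: Callable[[dict], str],
             question: str, max_steps: int = 15) -> str:
    """Generic agent loop. `model` returns a dict with 'stop_reason'
    plus either 'text' (final answer) or 'tool_call' (keep going)."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = model(messages)
        if response["stop_reason"] == "end_turn":
            return response["text"]
        if response["stop_reason"] == "tool_use":
            call = response["tool_call"]
            # Keep both the request and the result in the history.
            messages.append({"role": "assistant", "tool_call": call})
            messages.append({"role": "tool", "content": dispatch(call)})
            continue
        raise RuntimeError(f"Unexpected stop_reason: {response['stop_reason']}")
    # The hard cap: fail clearly instead of looping forever.
    raise RuntimeError(f"No final answer after {max_steps} steps")


def scripted_model(messages: list) -> dict:
    """Stand-in model: asks for the weather once, then answers from the result."""
    if messages[-1]["role"] == "tool":
        return {"stop_reason": "end_turn",
                "text": f"It is {messages[-1]['content']}."}
    return {"stop_reason": "tool_use",
            "tool_call": {"name": "get_weather", "input": {"city": "Tokyo"}}}


def fake_dispatch(call: dict) -> str:
    """Placeholder tool runner with a canned result."""
    return "22°C, partly cloudy"


print(run_loop(scripted_model, fake_dispatch, "What's the weather in Tokyo?"))
```

Swapping `scripted_model` for a real provider call changes the request and response shapes, not the control flow.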

Defining tools with JSON Schema

Every tool is defined by a name, a description, and a JSON Schema for its parameters. The model reads all three and decides when each tool is appropriate.

json
{
  "name": "get_weather",
  "description": "Get the current weather for a city. Use this when the user asks about temperature, conditions, or forecast.",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The city name, including country if ambiguous (e.g., 'Paris, France' vs 'Paris, Texas')"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"],
        "default": "celsius"
      }
    },
    "required": ["city"]
  }
}

The descriptions matter as much as the schema. The model leans on the description far more than the name when it picks a tool. Vague descriptions ("Get weather data") produce flaky routing; specific descriptions ("Use this when the user asks about temperature, conditions, or forecast") route correctly.
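The same schema is also useful on your side of the loop, to check what the model sends before a tool runs. The sketch below is a hand-rolled check covering only the keywords used in the get_weather example (`required`, string `type`, `enum`, `default`); `validate_input` is a hypothetical helper, and a full JSON Schema validator library would be the more complete choice. Note that `default` is only an annotation in JSON Schema, so the sketch fills it in itself.

```python
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"],
                 "default": "celsius"},
    },
    "required": ["city"],
}


def validate_input(schema: dict, args: dict) -> dict:
    """Check args against a small subset of JSON Schema and apply defaults.

    Raises ValueError with a message that can be sent back to the model.
    """
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            raise ValueError(f"missing required parameter: {name}")
    cleaned = {}
    for name, value in args.items():
        if name not in props:
            raise ValueError(f"unknown parameter: {name}")
        spec = props[name]
        if spec.get("type") == "string" and not isinstance(value, str):
            raise ValueError(f"{name} must be a string")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}")
        cleaned[name] = value
    # Fill in defaults for optional parameters the model left out.
    for name, spec in props.items():
        if name not in cleaned and "default" in spec:
            cleaned[name] = spec["default"]
    return cleaned


print(validate_input(WEATHER_SCHEMA, {"city": "Tokyo"}))
```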

Anthropic: tool_use and tool_result content blocks

Anthropic's tool format uses content blocks. The model emits a tool_use block with the tool name and input; you respond with a tool_result block containing the output.

Request includes:

json
{
  "model": "claude-sonnet-4-6",
  "tools": [{ "name": "get_weather", "description": "...", "input_schema": {...} }],
  "messages": [
    { "role": "user", "content": "What's the weather in Tokyo?" }
  ]
}

Model response:

json
{
  "stop_reason": "tool_use",
  "content": [
    { "type": "tool_use", "id": "toolu_01abc", "name": "get_weather", "input": { "city": "Tokyo" } }
  ]
}

You run get_weather("Tokyo"), then send back:

json
{
  "role": "user",
  "content": [
    { "type": "tool_result", "tool_use_id": "toolu_01abc", "content": "22°C, partly cloudy" }
  ]
}

The model continues and produces a final response.
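The reply step can be sketched with plain dictionaries that mirror the JSON above. `build_tool_results` and `run_tool` are hypothetical names; the SDK returns objects rather than dicts, so treat this as an illustration of the message shape only.

```python
def build_tool_results(response: dict, run_tool) -> dict:
    """Turn every tool_use block in a response into one user message
    made of tool_result blocks, matched by tool_use_id."""
    results = []
    for block in response["content"]:
        if block["type"] != "tool_use":
            continue  # skip text blocks that precede the tool call
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": str(run_tool(block["name"], block["input"])),
        })
    return {"role": "user", "content": results}


# The model response from the example above.
response = {
    "stop_reason": "tool_use",
    "content": [
        {"type": "tool_use", "id": "toolu_01abc", "name": "get_weather",
         "input": {"city": "Tokyo"}},
    ],
}

# Placeholder tool runner with a canned result.
reply = build_tool_results(response, lambda name, args: "22°C, partly cloudy")
print(reply)
```

Every tool_use id in the assistant turn needs a matching tool_result in the next user turn, which is why the helper walks all blocks instead of taking only the first.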

OpenAI: tool_calls and tool messages

OpenAI's Chat Completions format is similar but puts tool_calls on the assistant message and uses tool-role messages for the results.

Request:

json
{
  "model": "gpt-5",
  "tools": [{ "type": "function", "function": { "name": "get_weather", "description": "...", "parameters": {...} } }],
  "messages": [{ "role": "user", "content": "..." }]
}

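Model response includes message.tool_calls: [{ id, type: "function", function: { name, arguments } }]. Note that arguments is a JSON-encoded string, not an object, so parse it before dispatching.

You append the assistant message to the history, then send back one { "role": "tool", "tool_call_id": "...", "content": "..." } message per tool call.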
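A short sketch of that reply step using plain dictionaries in the Chat Completions shape. `build_tool_messages`, `run_tool`, and the id `call_abc` are hypothetical; the part worth noticing is that `arguments` arrives as a JSON string and has to be decoded.

```python
import json


def build_tool_messages(assistant_message: dict, run_tool) -> list:
    """Build one tool-role message per entry in message.tool_calls."""
    out = []
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments is a JSON string
        out.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(run_tool(fn["name"], args)),
        })
    return out


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc",  # hypothetical id
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
    }],
}

# Placeholder tool runner that just echoes the parsed city.
tool_messages = build_tool_messages(
    assistant_message, lambda name, args: f"weather for {args['city']}")
print(tool_messages)
```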

Gemini: function_calling

Gemini calls them "functions" instead of "tools" but the pattern is identical. Function declarations go in the request (functionDeclarations), function calls come back in the response as functionCall parts, and you reply with a functionResponse part.

For multi-provider code, abstract the format conversion behind a small adapter. The agentic loop logic is identical across providers.
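One way to sketch such an adapter is to keep a single neutral tool definition and convert it per provider. The neutral shape and the function names below are my own assumptions. The Anthropic and OpenAI shapes follow the request examples earlier in this article; the Gemini shape uses the REST field name `functionDeclarations` as I understand it, so check it against the current Gemini documentation (the Python SDK uses different casing, and Gemini accepts only a subset of JSON Schema keywords).

```python
# Neutral, provider-independent definition (hypothetical shape).
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def to_anthropic(tool: dict) -> dict:
    """Anthropic puts the schema under input_schema."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


def to_openai(tool: dict) -> dict:
    """OpenAI Chat Completions wraps the definition in a function object."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
        },
    }


def to_gemini(tools: list) -> dict:
    """Gemini groups all declarations into one tools entry (assumed REST shape)."""
    return {
        "functionDeclarations": [
            {"name": t["name"], "description": t["description"],
             "parameters": t["parameters"]}
            for t in tools
        ]
    }


print(to_anthropic(WEATHER_TOOL)["input_schema"] == WEATHER_TOOL["parameters"])
```

The response side needs the mirror image: a function per provider that extracts (id, name, arguments) from the reply, so the loop itself never sees provider-specific shapes.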

Code: JavaScript and Python (Anthropic SDK)

JavaScript:

javascript
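import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment.
const client = new Anthropic();

const tools = [
  {
    name: "get_weather",
    description: "Get current weather for a city.",
    input_schema: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
];

// Placeholder: replace with a real weather lookup.
async function fetchWeather(city) {
  return `22°C, partly cloudy in ${city}`;
}

// Run the tool the model picked.
async function dispatch(toolUse) {
  if (toolUse.name === "get_weather") {
    return await fetchWeather(toolUse.input.city);
  }
  throw new Error(`Unknown tool: ${toolUse.name}`);
}

async function runAgent(question, maxSteps = 10) {
  const messages = [{ role: "user", content: question }];
  for (let i = 0; i < maxSteps; i++) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      tools,
      messages,
    });
    // Echo the assistant turn back so the tool_use ids stay in the history.
    messages.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      return response.content.find((b) => b.type === "text")?.text ?? "";
    }
    if (response.stop_reason !== "tool_use") {
      // e.g. "max_tokens": fail loudly instead of looping on a truncated turn.
      throw new Error(`Unexpected stop_reason: ${response.stop_reason}`);
    }
    const toolUseBlocks = response.content.filter((b) => b.type === "tool_use");
    // Run all requested tools concurrently; report failures to the model
    // as error results instead of crashing the loop.
    const toolResults = await Promise.all(
      toolUseBlocks.map(async (tu) => {
        try {
          return {
            type: "tool_result",
            tool_use_id: tu.id,
            content: String(await dispatch(tu)),
          };
        } catch (err) {
          return {
            type: "tool_result",
            tool_use_id: tu.id,
            content: `Error: ${err.message}`,
            is_error: true,
          };
        }
      })
    );
    messages.push({ role: "user", content: toolResults });
  }
  throw new Error("Agent did not converge");
}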

Python:

python
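from anthropic import Anthropic

# Reads ANTHROPIC_API_KEY from the environment.
client = Anthropic()

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def fetch_weather(city: str) -> str:
    """Placeholder: replace with a real weather lookup."""
    return f"22°C, partly cloudy in {city}"

def dispatch(tool_use):
    """Run the tool the model picked."""
    if tool_use.name == "get_weather":
        return fetch_weather(tool_use.input["city"])
    raise ValueError(f"Unknown tool: {tool_use.name}")

def run_agent(question: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        # Echo the assistant turn back so the tool_use ids stay in the history.
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            return next((b.text for b in response.content if b.type == "text"), "")
        if response.stop_reason != "tool_use":
            # e.g. "max_tokens": fail loudly instead of looping on a truncated turn.
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
        results = []
        for tu in (b for b in response.content if b.type == "tool_use"):
            result = {"type": "tool_result", "tool_use_id": tu.id}
            try:
                result["content"] = str(dispatch(tu))
            except Exception as exc:
                # Report the failure to the model instead of crashing the loop.
                result["content"] = f"Error: {exc}"
                result["is_error"] = True
            results.append(result)
        messages.append({"role": "user", "content": results})
    raise RuntimeError("Agent did not converge")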

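Note the parallel tool calls: the model can emit multiple tool_use blocks in a single response. The JavaScript version runs them concurrently with Promise.all. The synchronous Python version runs them one after another; an async variant (AsyncAnthropic plus asyncio.gather) gets the same speed-up.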
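A minimal sketch of the concurrent tool step in Python, using asyncio.gather. It works on plain dictionaries rather than SDK objects, and `fetch_weather` is a stub that sleeps to simulate I/O, so every name here is illustrative rather than part of the Anthropic SDK.

```python
import asyncio


async def fetch_weather(city: str) -> str:
    """Stub tool: simulates a slow network call."""
    await asyncio.sleep(0.05)
    return f"weather for {city}"


async def dispatch(tool_use: dict) -> str:
    if tool_use["name"] == "get_weather":
        return await fetch_weather(tool_use["input"]["city"])
    raise ValueError(f"Unknown tool: {tool_use['name']}")


async def run_tools(tool_uses: list) -> list:
    """Run all tool calls concurrently; results keep the input order."""
    async def one(tu: dict) -> dict:
        result = {"type": "tool_result", "tool_use_id": tu["id"]}
        try:
            result["content"] = str(await dispatch(tu))
        except Exception as exc:
            # A failing tool becomes an error result, not a crashed batch.
            result["content"] = f"Error: {exc}"
            result["is_error"] = True
        return result

    return await asyncio.gather(*(one(tu) for tu in tool_uses))


results = asyncio.run(run_tools([
    {"id": "toolu_a", "name": "get_weather", "input": {"city": "Tokyo"}},
    {"id": "toolu_b", "name": "no_such_tool", "input": {}},
]))
print(results)
```

Catching exceptions inside `one` matters here: without it, the first failing tool would raise out of gather and the other results would be lost.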

The stop condition

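The model decides when to stop via the stop_reason field in the response: end_turn means it produced a final text response, tool_use means it wants you to run a tool and continue. Other values, such as max_tokens, mean the turn was cut short and should be handled as an error rather than fed back into the loop.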

A hard cap on steps (10-20) is the safety net. Without it, a confused agent can loop indefinitely. With it, you fail clearly when the model can't converge, which is better than running up a $400 token bill on an infinite loop.
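The step cap can be paired with a token cap so that cost is bounded directly. The sketch below is one possible shape, not a library feature: `Budget` and `BudgetExceeded` are hypothetical names and the default limits are arbitrary. With the Anthropic SDK, the counts would come from `response.usage.input_tokens` and `response.usage.output_tokens`.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the agent runs past its step or token allowance."""


@dataclass
class Budget:
    max_steps: int = 15         # arbitrary default
    max_tokens: int = 200_000   # arbitrary default
    steps: int = 0
    tokens: int = 0

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        """Record one model call; raise once either limit is exceeded."""
        self.steps += 1
        self.tokens += input_tokens + output_tokens
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} exceeded")


budget = Budget(max_steps=2, max_tokens=1000)
budget.charge(100, 50)
budget.charge(100, 50)
print(budget.steps, budget.tokens)
```

Calling `charge` right after each model response keeps the check in one place, and the exception carries which limit tripped.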

Common failure modes

These are the failures I see most after running agents in production:

  • The model loops on the same tool. It calls search_docs, gets a result, calls search_docs again with a slightly different query, and repeats. Cause: nothing in the tool description tells the model when it has enough information. Fix: tighten the tool description and add a hint such as "Stop calling this when you have N results" to the system prompt.
  • The model never calls the tool and makes up an answer instead. Cause: the tool description doesn't match how the model thinks about the task. Fix: add explicit triggering language ("When the user asks about X, call tool Y").
  • The model calls the wrong tool. Cause: two tools have overlapping purposes, so the choice is ambiguous. Fix: rename and re-describe them so they're orthogonal, or merge them into one tool with a parameter.
  • A tool runs slowly and the agent appears stuck. Cause: tool execution is synchronous and slow, and nothing streams to the user in the meantime. Fix: stream the agent's progress back to the user, or move long-running tools to a background job with status polling.
  • The model generates a malformed tool call that fails to parse, which corrupts the conversation history if you store it as-is. Cause: rare, but it happens. Fix: validate the tool call shape; on parse failure, send a tool_result with "error: malformed call, please try again" and let the model retry.
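Two of these fixes can be partly enforced in code rather than in prompts. The sketch below shows a guard against repeated calls and a parser that turns a malformed call into an error message for the model. `RepeatGuard` and `parse_call` are hypothetical helpers. The guard only catches exact repeats of the same name and arguments, so the "slightly different query" variant still needs the prompt-level fix.

```python
import json
from collections import Counter
from typing import Optional, Tuple

MALFORMED = "error: malformed call, please try again"


class RepeatGuard:
    """Counts identical (tool, arguments) calls within one agent run."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def check(self, name: str, args: dict) -> Optional[str]:
        """Return an error string for the model, or None if the call is fine."""
        key = (name, json.dumps(args, sort_keys=True))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            return (f"error: {name} was already called with these arguments; "
                    "use the earlier result or give a final answer")
        return None


def parse_call(raw_arguments: str) -> Tuple[Optional[dict], Optional[str]]:
    """Parse JSON arguments; on failure return an error to send as the tool result."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return None, MALFORMED
    if not isinstance(args, dict):
        return None, MALFORMED
    return args, None


guard = RepeatGuard(max_repeats=2)
for _ in range(3):
    verdict = guard.check("search_docs", {"query": "refund policy"})
print(verdict)
print(parse_call('{"city": '))
```

In both cases the error goes back to the model as the tool result, so the conversation history stays well-formed and the model gets a chance to correct itself.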

What to do next

External references: Anthropic tool use guide, OpenAI function calling, Gemini function calling.
