Three reliable ways to make an LLM return JSON that parses every time: use the provider's native structured-output mode (Anthropic tool use, OpenAI Structured Outputs, Gemini's response_schema), validate against a schema with Zod or Pydantic after parsing, and retry with the error message if validation fails. In 2026 the structured-output modes are mature enough that a well-written schema produces valid JSON on the first call essentially 100% of the time, so the retry path is a defensive backstop, not the main flow. I'll walk through all three with runnable code in JavaScript and Python, plus the prompt-engineering tricks for cases where you cannot use a structured-output mode.
The reason "JSON from an LLM" used to be hard: free-form generation occasionally produces trailing commas, smart quotes, comments, or extra prose before or after the JSON. OpenAI's original "JSON mode" was a coarse flag that guaranteed valid JSON but not the right shape. Schema-constrained output came next, starting with OpenAI's Structured Outputs in August 2024, and the major providers now offer a way to constrain a response to a schema so that you get both validity and shape. That's the path to use.
Jump to:
- Anthropic: tool use as structured output
- OpenAI: Structured Outputs with a JSON Schema
- Gemini: response_schema parameter
- Validate the result anyway: Zod and Pydantic
- The retry-with-error pattern
- What to do when structured output isn't available
- FAQ
Anthropic: tool use as structured output
Anthropic's pattern: define a single tool with the schema you want, force the model to call it, then pull the JSON out of the tool-use block. Tool use was designed for function calling but works equally well as a structured-output mechanism.
JavaScript:
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  // Force the model to call this one tool instead of replying with prose.
  tool_choice: { type: "tool", name: "extract_invoice" },
  tools: [{
    name: "extract_invoice",
    description: "Extract structured invoice data from the input",
    input_schema: {
      type: "object",
      properties: {
        invoice_number: { type: "string" },
        total_cents: { type: "integer" },
        line_items: {
          type: "array",
          items: {
            type: "object",
            properties: {
              description: { type: "string" },
              amount_cents: { type: "integer" },
            },
            required: ["description", "amount_cents"],
          },
        },
      },
      required: ["invoice_number", "total_cents", "line_items"],
    },
  }],
  messages: [{ role: "user", content: "Invoice #INV-4521..." }],
});

// The structured data is the input of the tool_use content block.
const result = response.content.find((b) => b.type === "tool_use")?.input;

Setting tool_choice to { type: "tool", name: "..." } forces the model to call that specific tool, so it cannot reply with prose or pick a different action. The tool input normally follows the schema closely, but forced tool use on its own is not a hard guarantee of conformance: a response cut off by max_tokens, for instance, can leave the input incomplete. Check response.stop_reason and validate the result as described in the validation section.
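The same extraction step can be sketched in Python against the raw JSON body, which makes the truncation check explicit. This is a minimal sketch under stated assumptions: the response is treated as a plain dict shaped like the Messages API JSON (a content list of typed blocks plus a stop_reason), and extract_tool_input and sample_response are hypothetical names introduced here for illustration, not part of any SDK.

```python
from typing import Any


def extract_tool_input(response: dict[str, Any], tool_name: str) -> dict[str, Any]:
    """Return the input of the first tool_use block for tool_name."""
    # A response cut off by the token limit may carry an incomplete tool input.
    if response.get("stop_reason") == "max_tokens":
        raise ValueError("response was truncated by max_tokens")
    for block in response.get("content", []):
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block["input"]
    raise ValueError(f"no tool_use block for {tool_name!r} in response")


# Hypothetical response body, shaped like the Messages API JSON.
sample_response = {
    "stop_reason": "tool_use",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "extract_invoice",
            "input": {
                "invoice_number": "INV-4521",
                "total_cents": 12500,
                "line_items": [{"description": "Consulting", "amount_cents": 12500}],
            },
        }
    ],
}

invoice = extract_tool_input(sample_response, "extract_invoice")
print(invoice["invoice_number"])
```

Raising on a missing block, rather than returning None as the optional-chaining version does, keeps a malformed response from flowing silently into later steps.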
OpenAI: Structured Outputs with a JSON Schema
OpenAI's Structured Outputs is the direct equivalent — pass a JSON Schema and set strict: true to get back exactly that shape.
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-5",
  messages: [{ role: "user", content: "Invoice #INV-4521..." }],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "invoice",
      strict: true,
      schema: {
        type: "object",
        properties: {
          invoice_number: { type: "string" },
          total_cents: { type: "integer" },
          line_items: {
            type: "array",
            items: {
              type: "object",
              properties: {
                description: { type: "string" },
                amount_cents: { type: "integer" },
              },
              required: ["description", "amount_cents"],
              additionalProperties: false,
            },
          },
        },
        required: ["invoice_number", "total_cents", "line_items"],
        additionalProperties: false,
      },
    },
  },
});

const result = JSON.parse(response.choices[0].message.content);

Setting additionalProperties: false on every object is mandatory for strict mode: it tells the API "no extra fields, ever". The required arrays must list every property defined (OpenAI does not support optional fields in strict mode, but you can simulate them with ["string", "null"] union types).
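If you already have a looser JSON Schema, the two strict-mode rules above can be applied mechanically. The following is a rough sketch, not an official utility: to_strict_schema and its helpers are hypothetical names, po_number is an invented optional field, and the code only handles schemas built from plain type, properties, and items keywords (no $ref or anyOf).

```python
import copy
from typing import Any


def _types(node: dict[str, Any]) -> list[str]:
    """Return the node's type keyword as a list, whether it was a string or a list."""
    value = node.get("type")
    if isinstance(value, list):
        return value
    return [value] if value is not None else []


def _make_nullable(node: dict[str, Any]) -> None:
    """Turn e.g. "string" into ["string", "null"] to simulate an optional field."""
    types = _types(node)
    if types and "null" not in types:
        node["type"] = types + ["null"]


def _make_strict(node: dict[str, Any]) -> None:
    types = _types(node)
    if "object" in types:
        props = node.get("properties", {})
        originally_required = set(node.get("required", []))
        for name, prop in props.items():
            _make_strict(prop)
            if name not in originally_required:
                _make_nullable(prop)
        node["required"] = list(props)  # every property must be listed
        node["additionalProperties"] = False  # no extra fields allowed
    elif "array" in types and isinstance(node.get("items"), dict):
        _make_strict(node["items"])


def to_strict_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Return an adjusted copy of schema; the input is left untouched."""
    result = copy.deepcopy(schema)
    _make_strict(result)
    return result


loose = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "po_number": {"type": "string"},  # hypothetical optional field
    },
    "required": ["invoice_number"],
}
strict = to_strict_schema(loose)
print(strict["required"], strict["properties"]["po_number"]["type"])
```

Working on a deep copy keeps the original schema usable for validators that do understand optional fields.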
Gemini: response_schema parameter
Gemini accepts a schema directly via response_schema. The shape mirrors a Pydantic model or a TypeScript type.
from google import genai
from pydantic import BaseModel

class LineItem(BaseModel):
    description: str
    amount_cents: int

class Invoice(BaseModel):
    invoice_number: str
    total_cents: int
    line_items: list[LineItem]

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Invoice #INV-4521...",
    config={
        "response_mime_type": "application/json",
        "response_schema": Invoice,
    },
)

invoice = Invoice.model_validate_json(response.text)

The Pydantic model serves as both the schema for Gemini AND the parser on the way back. One source of truth.
Validate the result anyway: Zod and Pydantic
Native structured output is reliable but not infallible. Use a runtime validator on the result to catch the rare edge case and to give you a typed object instead of an any:
JavaScript with Zod:
import { z } from "zod";

const InvoiceSchema = z.object({
  invoice_number: z.string(),
  total_cents: z.number().int().nonnegative(),
  line_items: z.array(
    z.object({
      description: z.string(),
      amount_cents: z.number().int(),
    })
  ),
});

const invoice = InvoiceSchema.parse(rawJsonFromLlm); // throws ZodError if invalid

Python with Pydantic:
from pydantic import BaseModel, Field, ValidationError

class LineItem(BaseModel):
    description: str
    amount_cents: int

class Invoice(BaseModel):
    invoice_number: str
    total_cents: int = Field(ge=0)
    line_items: list[LineItem]

try:
    invoice = Invoice.model_validate_json(raw_json)
except ValidationError as e:
    # Handle the validation error: retry, log, or surface to the user
    ...

A runtime validator can also enforce semantic constraints that the schema sent to the provider cannot express: a total_cents that is suspiciously larger than the sum of line-item amounts, a date in the future, an email that fails your email pattern. These need explicit checks on top of the field types shown here.
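To make the first of those semantic checks concrete, here is a dependency-free sketch that compares total_cents with the sum of the line items. The function name semantic_errors and the max_markup tolerance (room for tax or shipping) are assumptions made for illustration; in a real project the same logic would more likely sit in a Pydantic model_validator or a Zod refine step.

```python
from typing import Any


def semantic_errors(invoice: dict[str, Any], max_markup: float = 0.25) -> list[str]:
    """Return human-readable problems that type-level validation cannot see."""
    errors: list[str] = []
    line_total = sum(item["amount_cents"] for item in invoice["line_items"])
    total = invoice["total_cents"]
    if total < line_total:
        errors.append(
            f"total_cents ({total}) is less than the line-item sum ({line_total})"
        )
    elif total > line_total * (1 + max_markup):
        # max_markup is a hypothetical allowance for tax, shipping, and similar.
        errors.append(
            f"total_cents ({total}) is suspiciously larger than the line-item sum ({line_total})"
        )
    return errors


good = {
    "invoice_number": "INV-4521",
    "total_cents": 12500,
    "line_items": [
        {"description": "Consulting", "amount_cents": 10000},
        {"description": "Travel", "amount_cents": 2500},
    ],
}
bad = {**good, "total_cents": 99900}

print(semantic_errors(good))
print(semantic_errors(bad))
```

Returning a list of messages instead of raising lets the caller feed every problem back to the model in a single retry.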
The retry-with-error pattern
When validation fails, retry with the error message appended to the prompt. The model self-corrects more often than not.
async function getValidatedInvoice(input) {
  for (let attempt = 1; attempt <= 3; attempt++) {
    const raw = await callLLM(input);
    try {
      // Both JSON.parse and InvoiceSchema.parse throw on bad output.
      return InvoiceSchema.parse(JSON.parse(raw));
    } catch (err) {
      if (attempt === 3) throw err;
      input += `\n\nPrevious response had this validation error:\n${err.message}\nReturn corrected JSON only.`;
    }
  }
}

Three attempts is a sensible ceiling. After three the model is unlikely to fix itself; escalate to a different model tier (Sonnet → Opus) or fail loudly.
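For Python readers, the same loop can be sketched with the standard library only. Everything named here is an assumption for illustration: validate_invoice is a tiny stand-in for a Pydantic model, and fake_llm replaces the real API call so the example runs offline. Unlike the JavaScript version, this sketch rebuilds the prompt from the original each time, so only the most recent error is included.

```python
import json
from typing import Any, Callable


def validate_invoice(data: Any) -> dict[str, Any]:
    """Minimal stand-in for a Pydantic or Zod schema check."""
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    if not isinstance(data.get("invoice_number"), str):
        raise ValueError("invoice_number must be a string")
    total = data.get("total_cents")
    if not isinstance(total, int) or isinstance(total, bool) or total < 0:
        raise ValueError("total_cents must be a non-negative integer")
    return data


def get_validated_invoice(
    call_llm: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict[str, Any]:
    base_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        raw = call_llm(prompt)
        try:
            return validate_invoice(json.loads(raw))
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            if attempt == max_attempts:
                raise
            prompt = (
                f"{base_prompt}\n\nPrevious response had this validation error:\n"
                f"{err}\nReturn corrected JSON only."
            )
    raise ValueError("max_attempts must be at least 1")


# Hypothetical model replies: the first has the wrong type, the second is valid.
replies = iter([
    '{"invoice_number": "INV-4521", "total_cents": "125.00"}',
    '{"invoice_number": "INV-4521", "total_cents": 12500}',
])
prompts_seen: list[str] = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


invoice = get_validated_invoice(fake_llm, "Extract the invoice as JSON.")
print(invoice["total_cents"], len(prompts_seen))
```

Passing the model call in as a function keeps the retry logic testable and makes it easy to swap in a stronger model tier for the final attempt.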
What to do when structured output isn't available
For older models, open-source models without schema mode, or providers that don't support it, the fallback pattern is:
- Explicit format instruction in the system prompt: "Return your answer as a single JSON object. No markdown fences, no prose, no comments. The JSON object must have exactly these keys: ..."
- Few-shot examples showing the exact output format you want.
- Strip wrapping whitespace and markdown fences before parsing: raw.trim().replace(/^```(?:json)?\s*|\s*```$/g, "").
- Validate with Zod / Pydantic and retry-with-error.
This is the "old way" — works fine for one-off scripts, not great for production where you want zero parsing errors. Move to a structured-output mode whenever the provider supports it.
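The strip-then-parse step of the fallback can be sketched in Python as follows. This is one possible approach, not a standard recipe: parse_loose_json is a hypothetical helper, and the second stage (decoding the first JSON object found in the text) goes slightly beyond the fence-stripping regex by also tolerating prose around the object.

```python
import json
import re
from typing import Any

# Build the three-backtick fence marker programmatically to keep this listing readable.
FENCE = "`" * 3
FENCE_PATTERN = re.compile(rf"^{FENCE}(?:json)?\s*|\s*{FENCE}$")


def parse_loose_json(raw: str) -> Any:
    """Parse model output that may be wrapped in fences or surrounded by prose."""
    text = FENCE_PATTERN.sub("", raw.strip())
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to decoding the first JSON object, ignoring text around it.
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj


fenced = f'{FENCE}json\n{{"invoice_number": "INV-4521"}}\n{FENCE}'
chatty = 'Sure! Here is the JSON:\n{"total_cents": 12500, "line_items": []}\nLet me know.'

print(parse_loose_json(fenced))
print(parse_loose_json(chatty))
```

The parsed value still needs schema validation afterwards; this step only gets you from messy text to a Python object.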
What to do next
For the LLM-API foundation that lets all this scale:
- How to Cut LLM API Costs with Prompt Caching covers the cost-optimisation pattern that compounds with structured outputs (cache the schema-bearing system prompt).
- How to Choose Between Claude Haiku, Sonnet, and Opus covers tier selection — structured outputs are equally reliable on Haiku and Sonnet for most schemas, so use the cheaper tier where you can.
For the post-extraction validation layer:
- The Regex Cheat Sheet covers the pattern syntax for field-level validation after parse (email, URL, dates, IPs).
External references: Anthropic tool use documentation, OpenAI Structured Outputs, Gemini structured output.





