TL;DR
- Free-form LLM outputs break downstream systems; structured outputs ensure predictable, parseable data.
- Use JSON mode + schema validation for simple cases; function calling for complex actions with parameters.
- Always validate outputs - LLMs can generate syntactically valid but semantically wrong JSON.
- Build recovery paths: retry with stricter prompts, fallback to simpler schemas, or escalate to humans.
Structured Output Patterns for AI Agents: JSON, Function Calls, and Schema Validation
Ask an LLM to extract customer details and you might get: "The customer John Smith from Acme Corp needs help with billing." That's useful for a human but useless for code. Your system needs { "name": "John Smith", "company": "Acme Corp", "issue": "billing" } - structured, predictable, parseable.
Structured outputs force LLMs to respond in specific formats that integrate cleanly with downstream systems. When done right, you get reliable data extraction, predictable action parameters, and code that doesn't break when the model decides to be creative.
This guide covers structured output patterns from basic JSON mode to complex multi-step validation, with production patterns we use at Athenic for agents that process thousands of structured responses daily.
Key takeaways
- Structured outputs trade flexibility for reliability - the trade is almost always worth it.
- JSON mode alone isn't enough; combine with schema validation to catch semantic errors.
- Function calling is structured output for actions - same principles apply.
- Design schemas for failure: what happens when fields are missing or wrong?
Why structured outputs matter
Consider a simple task: extract action items from meeting notes.
Without structured output:
The meeting covered three action items. First, Sarah needs to update the roadmap by Friday.
Second, the team should review the Q3 metrics. Third, schedule a follow-up with the client.
With structured output:
{
"actionItems": [
{ "owner": "Sarah", "task": "Update the roadmap", "deadline": "Friday" },
{ "owner": "Team", "task": "Review Q3 metrics", "deadline": null },
{ "owner": null, "task": "Schedule follow-up with client", "deadline": null }
]
}
The first is human-readable; the second is machine-processable. Your task management system can parse the JSON, create tickets, assign owners, and set deadlines. It cannot do that with prose.
The reliability equation
Free-form outputs have high variance. The same prompt might return:
- "Here are the action items: ..."
- "Action items from the meeting:\n1. ..."
- "I found 3 action items. Sarah should..."
Each format requires different parsing logic. Structured outputs collapse this variance to predictable formats.
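To make the contrast concrete, here is a minimal sketch of consuming the structured form with a single parser. The `ActionItem` type and the `isActionItem` and `parseActionItems` helpers are illustrative names rather than part of any library; the field names follow the example payload above.

```typescript
// Illustrative type mirroring the example payload; the name is an assumption.
interface ActionItem {
  owner: string | null;
  task: string;
  deadline: string | null;
}

// Narrow an unknown value to ActionItem by checking each field's type.
function isActionItem(value: unknown): value is ActionItem {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (typeof v.owner === 'string' || v.owner === null) &&
    typeof v.task === 'string' &&
    (typeof v.deadline === 'string' || v.deadline === null)
  );
}

// One parser covers every response because the shape does not vary.
function parseActionItems(json: string): ActionItem[] {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Expected a JSON object');
  }
  const items = (parsed as Record<string, unknown>).actionItems;
  if (!Array.isArray(items) || !items.every(isActionItem)) {
    throw new Error('Expected actionItems to be an array of action items');
  }
  return items;
}
```

The equivalent for the three prose variants would need a separate pattern per phrasing, and would still miss the next phrasing the model invents.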
Implementation patterns
Four patterns cover most structured output needs, listed from simplest to most robust.
Pattern 1: JSON mode (basic)
Tell the model to respond in JSON. Simplest approach but least reliable.
const response = await openai.chat.completions.create({
model: 'gpt-4o',
response_format: { type: 'json_object' },
messages: [
{
role: 'system',
content: `Extract customer information as JSON with fields:
- name: customer name
- company: company name
- issue: brief issue description`
},
{
role: 'user',
content: 'Hi, this is John from Acme Corp. We have a problem with our billing.'
}
]
});
const data = JSON.parse(response.choices[0].message.content);
// { name: "John", company: "Acme Corp", issue: "billing problem" }
Limitations:
- No schema enforcement - model can add/omit fields
- No type checking - numbers might come as strings
- No default values - missing fields return undefined
Use when: Quick prototypes, low-stakes applications, when you'll manually review outputs.
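Even for prototypes, it can help to avoid a bare `JSON.parse`, since an empty or truncated response will throw. The following is a small sketch of a non-throwing wrapper; `ParseResult` and `safeParseObject` are hypothetical names introduced for illustration. Note that it only checks that the output is a JSON object, so the limitations listed above (extra fields, numbers as strings) still apply.

```typescript
// Result type so callers must handle the failure branch explicitly.
type ParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

// Parse model output without throwing. JSON mode should yield an object,
// but empty or truncated responses can still fail to parse.
function safeParseObject(text: string | null): ParseResult {
  if (text === null || text.trim() === '') {
    return { ok: false, error: 'Empty response' };
  }
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      return { ok: false, error: 'Expected a JSON object' };
    }
    return { ok: true, value: parsed as Record<string, unknown> };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A response such as `{"name":"John","age":"42"}` passes this check even though `age` arrived as a string, which is the gap that schema validation closes.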
Pattern 2: JSON mode + Zod validation
Add schema validation to catch structural and type errors.
import { z } from 'zod';
// Define schema
const CustomerSchema = z.object({
name: z.string().min(1),
company: z.string().min(1),
issue: z.string().min(1),
priority: z.enum(['low', 'medium', 'high']).optional(),
contactEmail: z.string().email().optional()
});
type Customer = z.infer<typeof CustomerSchema>;
async function extractCustomer(message: string): Promise<Customer> {
const response = await openai.chat.completions.create({
model: 'gpt-4o',
response_format: { type: 'json_object' },
messages: [
{
role: 'system',
content: `Extract customer information as JSON:
- name (required): customer's full name
- company (required): company name
- issue (required): description of their issue
- priority (optional): low, medium, or high
- contactEmail (optional): their email if mentioned
Respond ONLY with valid JSON.`
},
{ role: 'user', content: message }
]
});
const raw = JSON.parse(response.choices[0].message.content);
// Validate and parse
const result = CustomerSchema.safeParse(raw);
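if (!result.success) {
console.error('Validation failed:', result.error.issues);
// Throw a plain Error here; swap in your own error class if you have one
throw new Error(`Customer validation failed: ${result.error.message}`);
}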
return result.data;
}
Benefits:
- Type safety at runtime
- Clear error messages for invalid data
- Schema doubles as documentation
- Default values and transformations possible
Pattern 3: OpenAI Structured Outputs (recommended)
OpenAI's native structured output feature guarantees schema compliance.
import { z } from 'zod';
import { zodResponseFormat } from 'openai/helpers/zod';
const ActionItemSchema = z.object({
actionItems: z.array(z.object({
owner: z.string().nullable(),
task: z.string(),
deadline: z.string().nullable(),
priority: z.enum(['low', 'medium', 'high'])
}))
});
async function extractActionItems(meetingNotes: string) {
const response = await openai.beta.chat.completions.parse({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'Extract action items from meeting notes.'
},
{ role: 'user', content: meetingNotes }
],
response_format: zodResponseFormat(ActionItemSchema, 'action_items')
});
// Matches the schema when present; parsed is null if the model refused
return response.choices[0].message.parsed;
}
Key benefits:
- Guaranteed compliance: Model cannot return invalid JSON or wrong types
- Native TypeScript types: No parsing gymnastics
- Refusals surfaced: If the model declines a request, the response carries an explicit refusal instead of schema-shaped output
Limitations:
- Provider-specific: the helper shown here is OpenAI's (as of writing); other providers differ (see Provider comparison)
- Slight latency overhead
- Not all schema features supported (no regex patterns)
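Where a format constraint cannot be expressed in the provider's schema, one option is to check it after parsing. The sketch below assumes deadlines are expected as `YYYY-MM-DD` strings; `ExtractedItem`, `isIsoDate`, and `findBadDeadlines` are hypothetical names used only for illustration.

```typescript
// Illustrative shape; only the fields needed for the check are included.
interface ExtractedItem {
  task: string;
  deadline: string | null;
}

// True only for real calendar dates written as YYYY-MM-DD.
function isIsoDate(value: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) return false;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date rolls invalid days over into the next month, so compare the parts back.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

// Return the indexes of items whose deadline is present but not an ISO date.
function findBadDeadlines(items: ExtractedItem[]): number[] {
  const bad: number[] = [];
  items.forEach((item, index) => {
    if (item.deadline !== null && !isIsoDate(item.deadline)) bad.push(index);
  });
  return bad;
}
```

The returned indexes can feed a retry prompt or a review queue, depending on how strict the downstream system needs to be.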
Pattern 4: Function calling for actions
When the agent needs to take actions with parameters, use function calling.
const tools = [
{
type: 'function' as const,
function: {
name: 'create_task',
description: 'Create a new task in the project management system',
strict: true, // Enable strict schema adherence
parameters: {
type: 'object',
properties: {
title: {
type: 'string',
description: 'Task title'
},
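assignee: {
type: ['string', 'null'],
description: 'Person responsible for the task, or null if not specified'
},
dueDate: {
type: ['string', 'null'],
description: 'Due date in ISO format (YYYY-MM-DD), or null if not specified'
},
priority: {
type: 'string',
enum: ['low', 'medium', 'high']
}
},
// Strict mode requires every property to be listed; optional values are modeled as nullable
required: ['title', 'assignee', 'dueDate', 'priority'],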
additionalProperties: false
}
}
}
];
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'You are a project assistant. Create tasks based on user requests.'
},
{ role: 'user', content: 'Remind me to review the proposal by Friday, high priority' }
],
tools,
tool_choice: { type: 'function', function: { name: 'create_task' } }
});
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
const args = JSON.parse(toolCall.function.arguments);
// { title: "Review the proposal", dueDate: "2025-10-10", priority: "high" }
}
When to use function calling vs structured output:
- Function calling: Agent decides whether/when to take action
- Structured output: You always want structured data back
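Once the model returns a tool call, something has to route it to real code. The following is a minimal sketch of such a dispatcher, assuming a tool call shaped like the `toolCall.function` object used above. The `handlers` registry and `dispatchToolCall` are hypothetical names, and the `create_task` handler is a stand-in for a call into your task system.

```typescript
// Minimal shape of a tool call, mirroring the fields used above.
interface ToolCall {
  function: { name: string; arguments: string };
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical registry; a real create_task would call your task system.
const handlers: Record<string, ToolHandler> = {
  create_task: (args) => `Created task: ${String(args.title)}`,
};

// Route a tool call to its handler. Unknown tools and malformed arguments
// produce an error string rather than an exception.
function dispatchToolCall(call: ToolCall): string {
  const name = call.function.name;
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return `Error: unknown tool "${name}"`;
  }
  let args: unknown;
  try {
    args = JSON.parse(call.function.arguments);
  } catch {
    return 'Error: arguments were not valid JSON';
  }
  if (typeof args !== 'object' || args === null || Array.isArray(args)) {
    return 'Error: arguments must be a JSON object';
  }
  return handlers[name](args as Record<string, unknown>);
}
```

Returning errors as strings is one design choice among several; it lets the result be passed back to the model as a tool message so it can correct itself.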
Schema design best practices
Schema design directly impacts output quality. Poor schemas produce poor results.
Make fields explicit, not implicit
Bad:
const Schema = z.object({
data: z.any() // What goes here?
});
Good:
const Schema = z.object({
customerName: z.string(),
issueCategory: z.enum(['billing', 'technical', 'account', 'other']),
urgency: z.number().min(1).max(5)
});
Use enums for constrained values
When possible values are finite, enumerate them.
const SentimentSchema = z.object({
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number().min(0).max(1),
reasoning: z.string()
});
This prevents creative responses like "somewhat positive" or "mixed feelings".
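For readers who want to see what the enum constraint amounts to at runtime, here is a dependency-free sketch of the same check. `SENTIMENTS`, `isSentiment`, and `toSentiment` are illustrative names; in practice `z.enum` performs this membership test for you.

```typescript
const SENTIMENTS = ['positive', 'negative', 'neutral'] as const;
type Sentiment = (typeof SENTIMENTS)[number];

// Type guard: true only for exact members of the allowed set.
function isSentiment(value: unknown): value is Sentiment {
  return (
    typeof value === 'string' &&
    (SENTIMENTS as readonly string[]).indexOf(value) !== -1
  );
}

// Return the value when it is allowed, otherwise null so the caller
// can decide whether to retry or escalate.
function toSentiment(value: unknown): Sentiment | null {
  return isSentiment(value) ? value : null;
}
```

The check is exact and case-sensitive, so "Positive" is rejected along with "somewhat positive"; normalize before checking if that is too strict for your inputs.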
Handle nullable fields explicitly
Don't use optional when you mean nullable.
// Optional: field might not exist
const A = z.object({
deadline: z.string().optional() // { } or { deadline: "..." }
});
// Nullable: field exists but might be null
const B = z.object({
deadline: z.string().nullable() // { deadline: null } or { deadline: "..." }
});
// Both: might not exist OR might be null
const C = z.object({
deadline: z.string().nullable().optional()
});
For LLM outputs, nullable is usually clearer - the model explicitly indicates "I couldn't determine this" rather than omitting the field.
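The difference is easy to observe on parsed JSON. The sketch below uses a hypothetical `fieldState` helper to show the three cases a consumer has to tell apart.

```typescript
type FieldState = 'missing' | 'null' | 'present';

// Distinguish "key absent" from "key present with a null value".
function fieldState(obj: Record<string, unknown>, key: string): FieldState {
  if (!Object.prototype.hasOwnProperty.call(obj, key)) return 'missing';
  return obj[key] === null ? 'null' : 'present';
}
```

With a nullable field, a `missing` result signals that the model dropped the key, which is worth treating as a validation failure rather than as "unknown".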
Design for partial success
Not every extraction will be complete. Design schemas that capture what's available.
const ContactExtractionSchema = z.object({
contacts: z.array(z.object({
name: z.string(),
email: z.string().email().nullable(),
phone: z.string().nullable(),
role: z.string().nullable(),
confidence: z.number().min(0).max(1)
})),
extractionNotes: z.string().optional() // Model can explain difficulties
});
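One way to act on partial results is to split them by confidence before they reach downstream systems. This is a sketch under stated assumptions: the `Contact` interface mirrors the schema above, `triageContacts` is a hypothetical helper, and the 0.8 threshold is arbitrary. Model-reported confidence is best treated as a rough signal rather than a calibrated probability.

```typescript
// Mirrors the contact shape in the schema above.
interface Contact {
  name: string;
  email: string | null;
  phone: string | null;
  role: string | null;
  confidence: number;
}

// Split contacts into those usable automatically and those needing review.
// The 0.8 default threshold is an arbitrary illustration, not a recommendation.
function triageContacts(contacts: Contact[], threshold = 0.8) {
  const accepted: Contact[] = [];
  const needsReview: Contact[] = [];
  for (const contact of contacts) {
    // A contact with no way to reach them is not actionable, however confident.
    const hasChannel = contact.email !== null || contact.phone !== null;
    if (contact.confidence >= threshold && hasChannel) accepted.push(contact);
    else needsReview.push(contact);
  }
  return { accepted, needsReview };
}
```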
Error handling strategies
Even with structured outputs, things go wrong. Build recovery paths.
Strategy 1: Retry with feedback
When validation fails, retry with the error message.
async function extractWithRetry<T>(
  schema: z.ZodSchema<T>,
  prompt: string,
  input: string,
  maxRetries: number = 2
): Promise<T> {
  // Feedback from the previous failed attempt, appended to the system prompt
  let lastProblem = '';
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const errorFeedback = lastProblem
      ? `\n\nPrevious attempt failed validation:\n${lastProblem}\n\nPlease fix these issues.`
      : '';
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      response_format: { type: 'json_object' },
      messages: [
        { role: 'system', content: prompt + errorFeedback },
        { role: 'user', content: input }
      ]
    });
    let raw: unknown;
    try {
      raw = JSON.parse(response.choices[0].message.content ?? '');
    } catch {
      // Unparseable output counts as a failed attempt instead of crashing the loop
      lastProblem = '- Response was not valid JSON';
      continue;
    }
    const result = schema.safeParse(raw);
    if (result.success) {
      return result.data;
    }
    lastProblem = result.error.issues
      .map(i => `- ${i.path.join('.')}: ${i.message}`)
      .join('\n');
  }
  throw new Error(`Extraction failed after ${maxRetries + 1} attempts:\n${lastProblem}`);
}
Strategy 2: Fallback to simpler schema
If the complex schema fails, try a simpler version.
const DetailedSchema = z.object({
name: z.string(),
email: z.string().email(),
phone: z.string().regex(/^\+?[\d\s()-]+$/), // hyphen placed last so it is always a literal
address: z.object({
street: z.string(),
city: z.string(),
postcode: z.string()
})
});
const SimpleSchema = z.object({
name: z.string(),
contactInfo: z.string() // Just capture whatever's there
});
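// extractWithSchema is assumed to be a helper that calls the model and validates against the given schema
async function extractContact(input: string) {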
try {
return await extractWithSchema(DetailedSchema, input);
} catch (error) {
console.warn('Detailed extraction failed, trying simple:', error);
return await extractWithSchema(SimpleSchema, input);
}
}
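A drawback of a bare try/catch fallback is that the caller cannot tell which schema produced the result. A possible generalization, sketched here with hypothetical names (`Attempt`, `firstSuccessful`), runs any number of attempts in order and reports which level succeeded.

```typescript
// One extraction attempt: a label plus a function that resolves or rejects.
interface Attempt<T> {
  level: string;
  run: () => Promise<T>;
}

// Try attempts in order and report which one succeeded, so downstream code
// can tell rich results from degraded ones.
async function firstSuccessful<T>(
  attempts: Attempt<T>[]
): Promise<{ level: string; value: T }> {
  const failures: string[] = [];
  for (const attempt of attempts) {
    try {
      // Await inside the try block so a rejection falls through to the next attempt.
      return { level: attempt.level, value: await attempt.run() };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${attempt.level}: ${reason}`);
    }
  }
  throw new Error(`All attempts failed (${failures.join('; ')})`);
}
```

Each `run` would wrap one schema-specific extraction call. Logging how often the result comes from a fallback level gives an early signal that the detailed schema or its prompt needs work.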
Strategy 3: Partial extraction
Extract what you can, flag what failed.
interface ExtractionResult<T> {
data: Partial<T>;
failedFields: string[];
confidence: number;
}
async function partialExtract<T>(
schema: z.ZodObject<any>,
input: string
): Promise<ExtractionResult<T>> {
// getLLMResponse is assumed to be a helper that returns the model's raw JSON text
const response = await getLLMResponse(input);
const raw = JSON.parse(response);
const data: Partial<T> = {};
const failedFields: string[] = [];
// Validate each field independently
for (const [key, fieldSchema] of Object.entries(schema.shape)) {
const fieldResult = (fieldSchema as z.ZodType).safeParse(raw[key]);
if (fieldResult.success) {
data[key as keyof T] = fieldResult.data;
} else {
failedFields.push(key);
}
}
const confidence = 1 - (failedFields.length / Object.keys(schema.shape).length);
return { data, failedFields, confidence };
}
Provider comparison
| Feature | OpenAI | Anthropic | Google |
|---|---|---|---|
| JSON mode | ✅ Native | ✅ Via prompt | ✅ Native |
| Structured outputs | ✅ Native (schema) | ❌ Via prompt | ✅ Native |
| Function calling | ✅ Robust | ✅ Robust | ✅ Robust |
| Schema enforcement | ✅ Guaranteed | ❌ Best-effort | ✅ Guaranteed |
Anthropic-specific pattern
At the time of writing, Anthropic does not offer a native structured output mode. Use prompt engineering with an explicit schema, then validate the result.
const response = await anthropic.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: 1024,
messages: [
{
role: 'user',
content: `Extract information from this text and respond with ONLY valid JSON matching this schema:
{
"name": "string (required)",
"company": "string (required)",
"issue": "string (required)"
}
Do not include any text before or after the JSON.
Text: "${input}"`
}
]
});
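// The first content block may not be text, so narrow its type before reading it
const firstBlock = response.content[0];
const text = firstBlock && firstBlock.type === 'text' ? firstBlock.text : '';
// The model may wrap the JSON in extra text, so pull out the outermost {...} span
const jsonMatch = text.match(/\{[\s\S]*\}/);
if (jsonMatch) {
const data = JSON.parse(jsonMatch[0]);
// Validate with Zod
}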
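The greedy pattern above spans from the first `{` to the last `}` in the response, so trailing prose that happens to contain a brace breaks the parse. A possible alternative is a small scanner that returns the first balanced object; `extractFirstJsonObject` is a hypothetical helper, and it still assumes any prose before the JSON contains no `{`.

```typescript
// Return the first balanced {...} span in text, or null if none is found.
// Tracks string literals so braces inside quoted values are not counted.
function extractFirstJsonObject(text: string): string | null {
  const start = text.indexOf('{');
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}') {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // Unbalanced braces: likely a truncated response.
}
```

The extracted span still needs `JSON.parse` and schema validation; the scanner only decides where the candidate object begins and ends.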
Testing structured outputs
Structured outputs need thorough testing - edge cases hide bugs.
describe('Customer extraction', () => {
const cases = [
{
input: 'This is John Smith from Acme Corp with a billing issue',
expected: { name: 'John Smith', company: 'Acme Corp', issue: 'billing' }
},
{
input: "Customer didn't give their name but they're from TechStartup",
// Assumes a schema variant where name is nullable; z.string().min(1) would reject null
expected: { name: null, company: 'TechStartup', issue: expect.any(String) }
},
{
input: 'asdfghjkl', // Nonsense input
shouldFail: true
}
];
cases.forEach(({ input, expected, shouldFail }) => {
it(`handles: "${input.slice(0, 40)}..."`, async () => {
if (shouldFail) {
await expect(extractCustomer(input)).rejects.toThrow();
} else {
const result = await extractCustomer(input);
expect(result).toMatchObject(expected);
}
});
});
});
FAQs
Should I always use structured outputs?
No. For creative tasks (writing, brainstorming), free-form is appropriate. Use structured outputs when downstream systems need to process the response programmatically.
How do I handle arrays of unknown length?
Zod handles this naturally:
const Schema = z.object({
items: z.array(z.object({
name: z.string(),
value: z.number()
})).min(1).max(100) // Optional bounds
});
Can I use nested schemas?
Yes, deeply nested structures work fine:
const OrderSchema = z.object({
customer: z.object({
name: z.string(),
address: z.object({
street: z.string(),
city: z.string()
})
}),
items: z.array(z.object({
product: z.string(),
quantity: z.number()
}))
});
What about recursive schemas?
Zod supports lazy evaluation for recursive types:
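// Declare the type first so the recursive schema has something to refer to
type Comment = { text: string; replies?: Comment[] };
const CommentSchema: z.ZodType<Comment> = z.lazy(() =>
z.object({
text: z.string(),
replies: z.array(CommentSchema).optional()
})
);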
How do I handle model refusals?
OpenAI's structured outputs include refusal handling:
if (response.choices[0].message.refusal) {
console.log('Model refused:', response.choices[0].message.refusal);
// Handle appropriately
}
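Refusal is one of several non-success outcomes worth distinguishing. The sketch below folds them into a single discriminated union; `ParsedChoice`, `Outcome`, and `classifyChoice` are hypothetical names, and the interface lists only the response fields this check relies on.

```typescript
// Minimal shape of the fields involved; not the SDK's full type.
interface ParsedChoice<T> {
  finish_reason: string;
  message: { parsed: T | null; refusal: string | null };
}

type Outcome<T> =
  | { status: 'ok'; data: T }
  | { status: 'refused'; reason: string }
  | { status: 'truncated' }
  | { status: 'empty' };

// Collapse the possible response states into one value the caller can switch on.
function classifyChoice<T>(choice: ParsedChoice<T>): Outcome<T> {
  if (choice.message.refusal) {
    return { status: 'refused', reason: choice.message.refusal };
  }
  // 'length' means the token limit cut the output short.
  if (choice.finish_reason === 'length') return { status: 'truncated' };
  if (choice.message.parsed === null) return { status: 'empty' };
  return { status: 'ok', data: choice.message.parsed };
}
```

A `truncated` result usually calls for a higher token limit or a smaller schema, while `refused` should go to whatever escalation path the application defines.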
Summary and next steps
Structured outputs transform unpredictable LLM responses into reliable data pipelines. Start with JSON mode + Zod validation, graduate to native structured outputs when available, and always build recovery paths for failures.
Implementation checklist:
- Define schemas in Zod for type safety
- Choose output mode (JSON, structured, function calling)
- Implement validation layer
- Build retry and fallback strategies
- Add comprehensive tests for edge cases
Quick wins:
- Wrap existing JSON mode calls with Zod validation
- Use enums instead of free-form strings where possible
- Add .nullable() to fields that might be missing