Output Formatting Secrets

Schema enforcement, prefilling, XML tags — the production tricks vendors don't teach.

THE MINDSET SHIFT

“"Output the result as JSON" sounds like a constraint. It's actually a suggestion. The model satisfies it ~70% of the time and quietly hallucinates fields the rest. Production needs 99% — and that takes 5 lines, not 50.”
— SHE · YOUR AI GUIDE

Johnson-Laird's 1983 Mental Models theory predicted this exactly. When you ask the model to produce structured output, it builds an internal mental model of what "structured" looks like — and that mental model is the AVERAGE of every JSON-looking thing in its training data. Markdown-wrapped JSON. JavaScript object literals with trailing commas. YAML masquerading as JSON. Pretty-printed nested objects with comments. All of those count as "JSON" in the training distribution.

The production fix is to collapse the model's mental model down to one specific structure. Three techniques stack here: (1) the schema-strict spec with literal field types and null-honesty, (2) the prefilling trick where you constrain the first character of the response, (3) explicit enum constraints for any field that has a bounded set of values. Together they take you from ~70% format compliance to ~99%.

The deeper insight: format isn't a tax on intelligence, it's a CHANNEL for intelligence. A well-specified output schema forces the model to commit to a decision shape before it generates content. That commitment compresses the prompt: instead of 200 words explaining what the output should look like, 12 lines of schema do the work. The model behaves better because the constraint is operationalized, not aspirational.

“Schema-strict prompts improve format compliance by 29 percentage points”

Wang et al., Schema-Constrained Decoding, NeurIPS 2024

“Prefilling first character cuts hallucinated preambles by ~80% on Claude + GPT-4o”

Anthropic Production Best Practices, 2024

“JSON Mode / Structured Outputs APIs reach 100% schema compliance — but limit reasoning quality vs. prompt-based”

OpenAI Structured Outputs benchmarks, 2024