Validate Structured Output from AI
Fields checked against the contract: missing ones flagged, invented ones caught, prose around the object detected.
Overview
Structured output drifts in three ways: fields go missing, fields get invented, and prose creeps in around the object. This setup validates an extraction response that shows all three: a friendly "Sure! Here is the extracted data:" preamble, a missing timeline field, and an invented notes field. The validator ranks them by severity (missing = FAIL, invented = WARN, prose = WARN), and the repair prompt addresses each one specifically, applying the null-over-invention rule to the missing field.
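A minimal sketch of the field-set comparison, assuming the response has already been parsed into a dictionary. The `Finding` class, the `compare_fields` function, and the `company` and `budget` fields are illustrative names, not part of the tool; only `timeline` and `notes` come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # "FAIL" or "WARN"
    field: str
    message: str


def compare_fields(expected: list[str], actual: dict) -> list[Finding]:
    """Compare the keys of a parsed object against the expected field list."""
    expected_set = set(expected)
    findings = []
    # Missing: named in the contract but absent from the output -> FAIL.
    for name in expected:
        if name not in actual:
            findings.append(Finding("FAIL", name, "missing field"))
    # Invented: present in the output but not in the contract -> WARN.
    for name in actual:
        if name not in expected_set:
            findings.append(Finding("WARN", name, "invented field"))
    return findings


# Hypothetical contract and response mirroring the example above.
expected = ["company", "budget", "timeline"]
response = {"company": "Acme", "budget": "50k", "notes": "Seems keen"}

for f in compare_fields(expected, response):
    print(f"{f.severity}: {f.message} '{f.field}'")
```

A key that is present with a null value counts as present in this sketch, which keeps it consistent with the null-over-invention rule.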
Workflow
- Reuse your generator's field list: the fields you defined in the extraction prompt are exactly the expected structure here.
- Mind the severity split: missing fields fail and invented fields warn, because your pipeline may tolerate extras but never absences.
- Repair with null discipline: the repair prompt says "use null if the value is unknown; never invent one", so the fix doesn't become a new violation.
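One way the repair step could be assembled from the validator's findings is sketched below. The function name `build_repair_prompt`, its parameters, and the exact instruction wording are assumptions for illustration; the tool's real repair prompt may be phrased differently.

```python
def build_repair_prompt(
    expected: list[str],
    missing: list[str],
    invented: list[str],
    had_prose: bool,
) -> str:
    """Build a repair prompt that addresses each finding specifically."""
    lines = ["Your previous response did not match the required structure. Fix the following:"]
    # Missing fields carry the null rule so the repair does not invent values.
    for name in missing:
        lines.append(
            f'- Add the missing field "{name}". '
            "Use null if the value is unknown; never invent one."
        )
    for name in invented:
        lines.append(f'- Remove the field "{name}"; it is not part of the expected structure.')
    if had_prose:
        lines.append("- Return only the JSON object, with no text before or after it.")
    lines.append("Expected fields, exactly: " + ", ".join(expected))
    return "\n".join(lines)


# Hypothetical findings matching the example response.
print(build_repair_prompt(
    expected=["company", "budget", "timeline"],
    missing=["timeline"],
    invented=["notes"],
    had_prose=True,
))
```

Restating the full field list at the end gives the model the contract again, so the repair is checked against the same structure the validator used.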
Why This Works
- Field-set comparison catches both directions of drift
- Severity weighting mirrors real consumer tolerance
- The null-over-invention repair rule closes the loop the contract opened
Best for
- Extraction pipelines with downstream consumers
- Outputs that must match a field list exactly
- Debugging "where did this field come from?" incidents
Not for
- Defining the extraction fields (that's the Extraction Prompt Generator)
- Checking whether extracted values are true (this is structure validation, not fact-checking)
Use cases
- Auditing extraction output before it enters the CRM
- Catching invented fields that schema-less checks miss
- Stripping conversational wrapping from data responses
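Detecting and stripping conversational wrapping can be sketched with the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and reports where it ends. The helper name `split_wrapper` and the sample response text are hypothetical, and the sketch assumes the response contains a single JSON object.

```python
import json


def split_wrapper(raw: str):
    """Return (parsed_object, text_before, text_after).

    parsed_object is None when no JSON object can be decoded.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode returns the value and the index just past its end.
            obj, end = decoder.raw_decode(raw, start)
            return obj, raw[:start].strip(), raw[end:].strip()
        except json.JSONDecodeError:
            # This brace did not open a valid object; try the next one.
            start = raw.find("{", start + 1)
    return None, raw.strip(), ""


raw = (
    "Sure! Here is the extracted data:\n"
    '{"company": "Acme", "notes": "Seems keen"}\n'
    "Let me know if you need anything else."
)
obj, before, after = split_wrapper(raw)
if before or after:
    print("WARN: prose around the object")
print(obj)
```

Any non-empty text before or after the object is what triggers the prose warning here; the parsed object can then go on to the field comparison on its own.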