Catch Invalid AI Labels
The model answered "Complaints" — your set says "Complaint". One character of drift, one broken dashboard. Caught before it counts.
Structured Output
Paste an AI's output and check it against the structure you expected: real JSON parsing, heading and label checks, a 0–100 health score with a PASS / WARNING / FAIL verdict — and a repair prompt you send straight back to the model. Runs entirely in your browser.
What output is being checked, and against what? E.g. "Validate support ticket classification output." Feeds the repair prompt's requirement line.
Paste the AI's response exactly as it came back — fences, prose, and all. That's what gets validated.
The PRD looks done, but Non-Goals is gone and Requirements jumped above Goals. Section presence and order, checked in one pass.
Run 40 looks nothing like run 1: sections reordered, one came back empty. Detect the drift, repair the run, keep the pipeline honest.
The JSON won't parse and you can't see why. Deterministic cause-sniffing (trailing commas, single quotes, unclosed brackets) and the repair prompt that fixes it.
Don't re-roll the whole response. Send back a surgical prompt that fixes the violations and keeps everything that was right.
Paste the response, get the verdict: real JSON parsing, missing-field detection, and a repair prompt for everything found.
Label in the set? Case exact? Confidence in range? The checks that keep classification output usable for routing.
Heading scan against the skeleton: missing sections, broken order, absent title. Catches the README that shipped without its Examples.
Fields checked against the contract: missing ones flagged, invented ones caught, prose around the object detected.
Did the summary keep its contract? Section presence, preamble detection, and the "Here is the summary" tax, checked and repaired.
Turn messy text into structured data you can trust enough to feed another system: bound the source, extract the fields, force clean JSON, and validate before it flows downstream.
Make any AI task return JSON your code can rely on: define the schema, force the model to it, validate every response, and diff the drift when a model update breaks the shape.
Build a text classification step you can automate on: pull out the unit to classify, assign a label from a fixed set, and validate the label is one you actually allow.
Generate documentation that matches the code instead of drifting from it: have AI explain what the code really does, write it up as structured docs, then validate the format holds.
State the validation goal, pick the expected output type (JSON, YAML, XML, CSV, markdown document, structured summary, classification, or extraction output), and list the expected structure, one item per line: field names for JSON, section headings for documents, allowed labels for classification. Then paste the AI's actual response, exactly as it came back, and click Validate Output.
The engine runs a real check, not a string match: JSON gets parsed (with deterministic cause-sniffing when it doesn't parse), XML gets tag-stack well-formedness checking, CSV gets quote-aware column math, markdown gets a heading scan with order checking, classification gets label-set and confidence validation.
You get a 0–100 health score with a PASS / WARNING / FAIL verdict, every issue listed with its fix, and a repair prompt you paste straight back into the model to get the corrected output. Nothing leaves your browser.
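As an illustration of the classification check described above, here is a minimal sketch of label-set and confidence validation. The function name `checkLabel`, the issue codes, and the assumption that confidence is a number between 0 and 1 are all hypothetical; the tool's actual rules may differ.

```javascript
// Hypothetical helper: checks one classification result against an allowed label set.
// Issue codes and the 0..1 confidence range are assumptions made for this sketch.
function checkLabel(output, allowedLabels) {
  const issues = [];
  const label = String(output.label ?? "").trim();

  if (label === "") {
    issues.push({ code: "missing-label", fix: "return exactly one label from the allowed set" });
  } else if (!allowedLabels.includes(label)) {
    // Distinguish a casing mismatch from a label that is not in the set at all.
    const caseMatch = allowedLabels.find(
      (candidate) => candidate.toLowerCase() === label.toLowerCase()
    );
    if (caseMatch) {
      issues.push({ code: "wrong-case", fix: `use "${caseMatch}" with exact casing` });
    } else {
      issues.push({
        code: "unknown-label",
        fix: `replace "${label}" with one of: ${allowedLabels.join(", ")}`,
      });
    }
  }

  // Confidence is optional here; when present it must be a number in [0, 1].
  if (output.confidence !== undefined) {
    const c = output.confidence;
    if (typeof c !== "number" || Number.isNaN(c) || c < 0 || c > 1) {
      issues.push({ code: "bad-confidence", fix: "return confidence as a number between 0 and 1" });
    }
  }
  return issues;
}

const allowed = ["Complaint", "Question", "Praise"];
console.log(checkLabel({ label: "Complaints", confidence: 0.92 }, allowed));
```

With the allowed set above, "Complaints" is reported as an unknown label and "complaint" as a casing problem, which is the one-character drift the check exists to catch.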
Direction. The other five tools generate prompts that DEFINE what the output should be — schema, fields, labels, sections. This tool checks what the output actually WAS. It's the category's only output-side tool: paste a response, get a verdict and a repair prompt. They define the contract; this one enforces it.
Real, within what a browser can do deterministically: JSON goes through JSON.parse with cause-sniffing for trailing commas, single quotes, and unclosed brackets; XML gets tag-stack well-formedness checking; CSV gets quote-aware cell counting against the header; markdown and summaries get a heading scan with presence, order, and empty-section checks; classification output gets label-set membership, case, and confidence-format checks.
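The JSON path can be sketched roughly as follows: try `JSON.parse` first, and only when it throws, look for the three causes named above. The function name `sniffJson` and the specific heuristics are assumptions for illustration, not the tool's actual implementation, and regex heuristics like these can misfire on unusual input.

```javascript
// Hypothetical sketch: parse first, then sniff for common causes when parsing fails.
function sniffJson(text) {
  try {
    return { ok: true, value: JSON.parse(text), causes: [] };
  } catch (err) {
    const causes = [];

    // A comma directly before a closing brace or bracket.
    if (/,\s*[}\]]/.test(text)) causes.push("trailing comma");

    // A single quote opening a key or value right after a structural character.
    if (/[{,\[:]\s*'/.test(text)) causes.push("single quotes");

    // Count brackets that sit outside double-quoted strings.
    let depth = 0;
    let inString = false;
    for (let i = 0; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        if (ch === "\\") i++; // skip the escaped character
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === "{" || ch === "[") {
        depth++;
      } else if (ch === "}" || ch === "]") {
        depth--;
      }
    }
    if (depth > 0) causes.push("unclosed bracket");

    return { ok: false, error: err.message, causes };
  }
}

console.log(sniffJson('{"a": 1,}').causes);
console.log(sniffJson('{"a": [1, 2').causes);
```

Because the sniffing runs only after a real parse failure, valid JSON that happens to contain a comma followed by a brace inside a string is never flagged.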
A deterministic 0–100: every issue subtracts a weighted penalty — a parse failure costs far more than an unexpected field. 85+ is PASS, 50–84 WARNING, below 50 FAIL. The score is for triage, not for averaging: it tells you whether to ship the output, repair it, or regenerate from scratch.
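A weighted-penalty score of this kind could look like the sketch below. Only the thresholds (85 and 50) come from the description above; the penalty weights, issue codes, and the name `healthScore` are invented for illustration.

```javascript
// Hypothetical penalty weights. The text says only that a parse failure costs
// far more than an unexpected field; these specific numbers are made up.
const PENALTIES = new Map([
  ["parse-failure", 60],
  ["missing-field", 15],
  ["unexpected-field", 5],
]);
const DEFAULT_PENALTY = 10; // assumed weight for any issue code not listed above

function healthScore(issues) {
  const total = issues.reduce(
    (sum, issue) => sum + (PENALTIES.get(issue.code) ?? DEFAULT_PENALTY),
    0
  );
  const score = Math.max(0, 100 - total); // never drop below zero

  // Thresholds from the text: 85+ is PASS, 50-84 WARNING, below 50 FAIL.
  const verdict = score >= 85 ? "PASS" : score >= 50 ? "WARNING" : "FAIL";
  return { score, verdict };
}

console.log(healthScore([{ code: "missing-field" }, { code: "unexpected-field" }]));
```

Under these assumed weights, a single parse failure lands in FAIL on its own, while a stray extra field still passes.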
A ready-to-send prompt built from the issues found: it restates your original requirement, lists each problem with its specific fix ("include 'email'; use null if the value is unknown", "remove the ``` fences"), and instructs the model to return only the corrected output, changing nothing that was already right. Paste it into the same conversation and the model repairs its own response.
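One way to assemble such a prompt is sketched here. The template wording, the `problem` and `fix` field names, and the function name `buildRepairPrompt` are assumptions; the tool's real prompt text is not shown in this document.

```javascript
// Hypothetical sketch: build a repair prompt from the requirement and the issues found.
// Returns null when there is nothing to repair.
function buildRepairPrompt(requirement, issues) {
  if (issues.length === 0) return null;

  const lines = [
    `Original requirement: ${requirement}`,
    "",
    "Your previous response has the following problems:",
    // One numbered line per issue, each paired with its specific fix.
    ...issues.map((issue, i) => `${i + 1}. ${issue.problem} Fix: ${issue.fix}`),
    "",
    "Return only the corrected output. Do not change anything that was already correct.",
  ];
  return lines.join("\n");
}

console.log(
  buildRepairPrompt("Validate support ticket classification output.", [
    { problem: "Field 'email' is missing.", fix: "include 'email'; use null if the value is unknown." },
    { problem: "Output is wrapped in code fences.", fix: "remove the fences." },
  ])
);
```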
Structurally, yes; semantically, partially. The browser has no built-in YAML parser (nothing comparable to JSON.parse), so the validator checks the things that break consumers deterministically: forbidden tabs, multiple documents, code fences, and the presence of your expected top-level keys. For deep YAML schema validation, a dedicated parser in your pipeline is the right tool; this catches the format violations models actually make.
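The four structural checks listed above can be approximated with line-level scans, as in this sketch. It is not a YAML parser; the function name `checkYamlStructure`, the issue strings, and the simple key pattern (unquoted keys starting at column 0) are assumptions for illustration.

```javascript
// Hypothetical sketch: line-based structural checks for YAML output, no parsing.
function checkYamlStructure(text, expectedKeys) {
  const issues = [];
  const lines = text.split(/\r?\n/);
  const FENCE = "`".repeat(3); // markdown code fence marker

  if (lines.some((line) => line.trimStart().startsWith(FENCE))) {
    issues.push("code fence present");
  }
  // YAML does not allow tab characters for indentation.
  if (lines.some((line) => /^\s*\t/.test(line))) {
    issues.push("tab used for indentation");
  }

  // A "---" marker after content has started begins a second document.
  let seenContent = false;
  let extraDocument = false;
  for (const line of lines) {
    if (/^---\s*$/.test(line)) {
      if (seenContent) extraDocument = true;
    } else if (line.trim() !== "" && !line.trim().startsWith("#")) {
      seenContent = true;
    }
  }
  if (extraDocument) issues.push("multiple documents");

  // Top-level keys are taken to be unindented "key:" lines.
  const found = new Set();
  for (const line of lines) {
    const match = /^([A-Za-z_][\w-]*)\s*:/.exec(line);
    if (match) found.add(match[1]);
  }
  for (const key of expectedKeys) {
    if (!found.has(key)) issues.push(`missing top-level key: ${key}`);
  }
  return issues;
}

console.log(checkYamlStructure("name: Ada\n---\nemail: ada@example.com", ["name", "email"]));
```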
No — a PASS with zero issues shows "no issues found" and skips the repair prompt entirely. The tool never invents problems to fix; an honest pass is the goal, not an opportunity for busywork.