Information Extraction Prompt — the Anatomy
The six sections a reliable extraction prompt needs: source guidance, field definitions, extraction rules, missing-data behavior, ambiguity policy, example.
Overview
Most extraction prompts are one sentence and a hope. The ones that survive production have an anatomy: a SOURCE section that tells the model how to read this kind of text, FIELDS that define each piece of information, per-field EXTRACTION RULES (emails validated, dates ISO-formatted, identifiers untouched), a MISSING DATA contract, an AMBIGUITY POLICY, and an example of a valid extraction. This resource loads a job-posting extraction that exercises every section — including a salary_range field whose rule explicitly stops the model from collapsing a range into one number.
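To make the anatomy concrete, here is a minimal sketch of how the six sections could be assembled into one prompt string. The field names, the example values, the section wording, and the `build_prompt` helper are all illustrative assumptions, not the text this resource actually loads. The salary_range rule is adapted from the one quoted in the workflow.

```python
import json

# Hypothetical job-posting fields: name -> (definition, per-field rule).
FIELDS = {
    "job_title": ("Advertised role name.", "Copy as written."),
    "contact_email": ("Address applicants write to.", "Must be a valid email address."),
    "posted_date": ("Date the posting went live.", "Format as ISO 8601 (YYYY-MM-DD)."),
    "salary_range": ("Stated compensation range.",
                     "Extract the range as written; do not collapse it to one number."),
    "job_id": ("Employer's reference code.", "Copy character for character."),
}

# Illustrative example of a valid extraction; the values are made up.
EXAMPLE = {
    "job_title": "Data Engineer",
    "contact_email": "jobs@example.com",
    "posted_date": "2024-03-05",
    "salary_range": "$95,000 - $120,000",
    "job_id": "REQ-00417",
}


def build_prompt(text: str) -> str:
    """Assemble the six sections in order, then append the text to extract from."""
    sections = [
        ("SOURCE", "The text is a job posting. Details may be spread across "
                   "prose, headers, and bullet lists."),
        ("FIELDS", "\n".join(f"- {n}: {d}" for n, (d, _) in FIELDS.items())),
        ("EXTRACTION RULES", "\n".join(f"- {n}: {r}" for n, (_, r) in FIELDS.items())),
        ("MISSING DATA", "If a field is not stated in the text, return null for it."),
        ("AMBIGUITY POLICY", "If you are not sure which value is meant, return null. "
                             "Do not guess."),
        ("EXAMPLE", json.dumps(EXAMPLE, indent=2)),
    ]
    body = "\n\n".join(f"{title}\n{content}" for title, content in sections)
    return f"{body}\n\nTEXT\n{text}"
```

Keeping definitions and rules in one table per field makes it harder to add a field and forget its rule, which is one way to keep the FIELDS and EXTRACTION RULES sections in step.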
Workflow
- Check each section's job
SOURCE teaches reading, FIELDS define meaning, RULES handle the traps, MISSING DATA and AMBIGUITY remove improvisation, the EXAMPLE shows the shape.
- Watch the salary_range rule
"Extract the range as written — do not collapse it to one number." Per-field rules exist for exactly these traps.
- Swap in your own fields
Replace the job-posting fields with yours; the engine derives new rules from the names you choose.
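The salary_range trap in the workflow above can also be checked after the model responds. The following is a minimal sketch, assuming the extracted value arrives as a string. The `check_salary_range` helper and its two heuristics (verbatim match against the source, at least two numbers present) are assumptions made for illustration, not part of this resource.

```python
import re


def _normalize(s: str) -> str:
    """Collapse runs of whitespace so line breaks do not defeat the comparison."""
    return " ".join(s.split())


def check_salary_range(source: str, value: str) -> list[str]:
    """Return a list of problems; an empty list means the value looks like a range as written."""
    problems = []
    # "As written": the extracted value should appear verbatim in the source.
    if _normalize(value) not in _normalize(source):
        problems.append("value does not appear verbatim in the source")
    # "Not collapsed": a range should still contain at least two numbers.
    if len(re.findall(r"\d[\d,]*", value)) < 2:
        problems.append("value contains fewer than two numbers")
    return problems


posting = "Salary: $95,000 - $120,000 per year, depending on experience."
print(check_salary_range(posting, "$95,000 - $120,000"))
print(check_salary_range(posting, "$107,500"))
```

The first call reports no problems. The second reports both, because a midpoint is neither present in the posting nor a range. A check like this only flags suspicious values; postings that state a single figure would need the second heuristic relaxed.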
Why This Works
- Each section closes a specific failure mode instead of adding generic words
- Per-field rules catch the traps generic instructions miss — ranges, formats, identifiers
- A strict ambiguity policy makes "I'm not sure" produce a blank, not a guess
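The same blank-over-guess contract can be enforced on the consuming side. This sketch assumes the extraction comes back as a dict with the hypothetical fields contact_email, posted_date, and job_id. The `clean_record` helper and the deliberately loose email pattern are illustrative assumptions rather than a complete validator.

```python
import re
from datetime import date

# Loose shape check only: something@something.something, no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def clean_record(record: dict) -> dict:
    """Blank out values that break their per-field rule; leave identifiers untouched."""
    cleaned = dict(record)

    email = cleaned.get("contact_email")
    if email is not None and not (isinstance(email, str) and EMAIL_RE.fullmatch(email)):
        cleaned["contact_email"] = None  # invalid email becomes a blank, not a guess

    posted = cleaned.get("posted_date")
    if posted is not None:
        try:
            date.fromisoformat(posted)  # raises on non-ISO strings such as "next Friday"
        except (TypeError, ValueError):
            cleaned["posted_date"] = None

    # job_id is passed through unchanged: identifiers are never reformatted.
    return cleaned


record = {"contact_email": "jobs at example.com", "posted_date": "2024-03-05", "job_id": "REQ-00417"}
print(clean_record(record))
```

Here the malformed email is blanked, the ISO date survives, and the identifier is returned exactly as received.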
Best for
- Anyone whose extraction prompt is currently one sentence long
- Prompts that work in testing and drift in production
- Fields with traps — ranges, relative dates, lists inside prose
Not for
- Quick one-off questions about a text — anatomy is overhead there
- Output-format depth (types, strictness) — pair with the JSON Output Prompt Builder
Use cases
- Learning the structure before writing your own extraction prompts
- Auditing an existing extraction prompt against the six sections
- Logging job postings, listings, or announcements to a spreadsheet