Turn a failed case into a fix — diagnose where in the agent's flow it went wrong, categorize the failure, and point at the prompt, tool, or context that caused it.
Overview
An agent that fails a case isn't useful feedback until you know WHY it failed — bad instruction, wrong retrieval, a tool error, or a reasoning slip. This prompt analyzes a failure: it walks the agent's trace to find where it went off the rails, categorizes the failure type, identifies the likely root cause (prompt, context, tool, or model), and recommends where the fix belongs — so evaluation feeds improvement instead of just a red mark.
Why This Works
Finding the first failure point stops you fixing a downstream symptom
Attributing to a layer tells you where the fix actually belongs
Spotting a pattern turns one failure into a permanent test
Best for
Debugging agent failures during evaluation
Multi-step agents where failures cascade
Turning eval red marks into actionable fixes
Not for
Detecting that a failure occurred — use the Scorecard or Scenario prompts
Code-level debugging — use a debugging prompt
Use cases
Diagnosing why an agent failed a test case
Attributing a failure to prompt, retrieval, tool, or model
Deciding where a fix should go after a bad output
Tip: Save time by exploring related resources and tools that integrate with this workflow.
Found a bug, have a suggestion, or want to report something confusing? Send a short note.
Cookie preferences
NewPrompt uses optional Google Analytics cookies to understand site usage and improve the tools.
The site works normally if you decline analytics cookies.
Read more in our Cookie Policy.