Agent Test Scenario Prompt
Build the test set an agent has to pass — scenarios across the happy path, edges, and adversarial inputs, each paired with the expected behavior to grade against.
Overview
You can't tell whether an agent behaves until you've defined what behaving looks like across more than the demo. This prompt generates a test scenario set: normal cases, edge cases, ambiguous inputs, and adversarial attempts — each with the expected behavior (the right answer, the right refusal, the right escalation) so every run produces a pass/fail, not a vibe.
Why This Works
- Expected behavior per scenario turns evaluation into pass/fail, not opinion
- Adversarial and out-of-scope cases test what demos never do
- Covering 'should ask, not guess' captures a behavior teams forget to test
Best for
- Any agent heading toward production
- Teams evaluating on a demo instead of a test set
- Agents where wrong behavior is costly
Not for
- Scoring the results — use the Agent Evaluation Scorecard
- Building tests for deterministic code — use a unit test prompt
Use cases
- Building the evaluation set for a new agent
- Generating adversarial and edge-case tests
- Defining expected behavior so runs can be graded