AI · 5.0 · 287 ratings

Prompt Debugging — Systematic

Diagnose why a prompt is producing poor output and fix it.

Role-Based · Chain-of-Thought · Constraints

Prompt

**Role:** Prompt engineer who has debugged 500+ prompts that "sometimes work, sometimes don't." You know the difference between a prompt problem and a model problem.

**Context:** Current prompt: [paste]. Model: [Claude/GPT-4/etc.]. Expected output: [what good looks like]. Actual output (representative bad example): [paste]. Frequency: [how often the prompt fails — every time / sometimes / specific input types]. Tested with N inputs: [give examples].

**Task:** Diagnose and fix.

1. Classify the failure: hallucination / incomplete / wrong format / wrong tone / off-topic / refusal / inconsistency / verbose. Different failures need different fixes.
2. Trace the cause: which part of the prompt fails to constrain this behavior? Is the role under-specified? Is the output format missing? Are the constraints buried mid-prompt instead of placed at the end? Are examples missing?
3. Three fixes, ranked: for each, give the change, the expected behavior shift, and the side-effect risk.
4. Recommended fix: pick one and justify it. Show the diff (before / after).
5. Test plan: 3 specific inputs you'd run on the fixed prompt to validate it. For each, state the success criterion.

**Constraints:**
- Diagnose before prescribing — don't jump to "add more examples"
- Fix specificity matters more than fix size
- Test plan covers edge cases, not just the happy path
- If the model is the limit (not the prompt), say so

**Output format:** 5 sections · before/after diff · ≤600 words.
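
The before/after diff and the three-input test plan requested above can also be produced and checked outside the chat. The following is a minimal Python sketch, not part of the prompt itself. It assumes a hypothetical `call_model(prompt, text)` function that you supply for whichever model you use; `fake_model`, the sample prompts, and the success criteria are illustrative placeholders only.

```python
import difflib


def prompt_diff(before: str, after: str) -> str:
    """Return a unified diff between two versions of a prompt."""
    return "\n".join(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="prompt_before",
            tofile="prompt_after",
            lineterm="",
        )
    )


def within_word_limit(text: str, limit: int = 600) -> bool:
    """Check a whitespace-based word count against a limit."""
    return len(text.split()) <= limit


def run_test_plan(call_model, prompt, cases):
    """Run each (input, criterion) pair; return (input, passed) tuples."""
    results = []
    for text, criterion in cases:
        output = call_model(prompt, text)
        results.append((text, bool(criterion(output))))
    return results


# Illustrative data: both prompts and the stub model are assumptions.
before = "Summarize the ticket."
after = "Summarize the ticket.\nReturn exactly 3 bullet points."


def fake_model(prompt: str, text: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "- a\n- b\n- c"


# Three inputs: happy path, empty input, and an unusually long input.
cases = [
    ("short ticket", lambda out: out.count("\n") == 2),
    ("", lambda out: out.startswith("- ")),
    ("very long ticket " * 50, within_word_limit),
]

if __name__ == "__main__":
    print(prompt_diff(before, after))
    for text, passed in run_test_plan(fake_model, after, cases):
        print(repr(text[:20]), "PASS" if passed else "FAIL")
```

Swapping `fake_model` for a real client call and running each case several times gives a rough failure frequency, which is the figure the Context block asks for.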

Recommended models

claude · gpt-4o
