RAG & Knowledge Retrieval
RAG Test-Set Generator From Documents
Generates a graded evaluation set of question-answer-source triples for testing a retrieval pipeline.
Structured-Output · Few-Shot · Step-by-Step
Prompt
ROLE: You are an evaluation engineer building a ground-truth test set for a RAG system.
CONTEXT:
Source documents with IDs: [DOCUMENTS]
Difficulty mix wanted (easy/medium/hard): [DIFFICULTY_MIX]
Number of items to generate: [N]
TASK:
1. From the documents, generate [N] diverse QA items in the requested difficulty mix; the answer to every answerable item must be fully supported by the text.
2. Include a mix of types: single-passage factual, multi-hop (requires combining two passages), and negative cases (questions the corpus cannot answer, expecting a refusal).
3. For each answerable item, record the gold source ID(s) and the exact supporting span; for negative cases, leave both empty.
4. Label each item with difficulty and the reasoning type required.
OUTPUT FORMAT (JSON array):
[{"question": "...", "expected_answer": "...", "gold_source_ids": [...], "supporting_span": "...", "type": "single|multi-hop|negative", "difficulty": "easy|medium|hard"}]
CONSTRAINTS:
- Every answerable item's answer must be verifiable from the cited span; no invented facts.
- Negative-case questions must be realistic and clearly unanswerable from the corpus.
- Avoid trivially keyword-matchable questions for the hard tier; require paraphrase or synthesis.
Recommended models
claude · gpt-4o · gemini
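The OUTPUT FORMAT and CONSTRAINTS in the prompt above can be checked mechanically before a generated test set is trusted. The following is a minimal sketch, not part of the original prompt: the function name `validate_items` and the `documents` mapping (source ID to full text) are hypothetical, and it assumes that negative items carry an empty `gold_source_ids` list and an empty `supporting_span`, which the prompt does not state explicitly.

```python
import json

# Allowed values taken from the OUTPUT FORMAT line of the prompt.
ALLOWED_TYPES = {"single", "multi-hop", "negative"}
ALLOWED_DIFFICULTIES = {"easy", "medium", "hard"}
REQUIRED_KEYS = {
    "question", "expected_answer", "gold_source_ids",
    "supporting_span", "type", "difficulty",
}


def validate_items(raw_json, documents):
    """Return a list of (item_index, problem) tuples; an empty list means no problems.

    raw_json:  the model's output as a JSON string.
    documents: dict mapping source ID -> full document text (assumed shape).
    """
    items = json.loads(raw_json)
    if not isinstance(items, list):
        return [(-1, "top-level value is not a JSON array")]

    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append((i, "item is not a JSON object"))
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append((i, f"missing keys: {sorted(missing)}"))
            continue
        if item["type"] not in ALLOWED_TYPES:
            problems.append((i, f"unknown type: {item['type']!r}"))
        if item["difficulty"] not in ALLOWED_DIFFICULTIES:
            problems.append((i, f"unknown difficulty: {item['difficulty']!r}"))

        ids = item["gold_source_ids"]
        span = item["supporting_span"]
        if not isinstance(ids, list):
            problems.append((i, "gold_source_ids is not a list"))
            continue

        if item["type"] == "negative":
            # Assumption: an unanswerable question cites no source and no span.
            if ids or span:
                problems.append((i, "negative item cites a source or span"))
            continue

        if not ids:
            problems.append((i, "answerable item has no gold source ID"))
        unknown = [s for s in ids if s not in documents]
        if unknown:
            problems.append((i, f"unknown source IDs: {unknown}"))
        # The span must appear verbatim in at least one cited document.
        found = isinstance(span, str) and span != "" and any(
            span in documents.get(s, "") for s in ids if isinstance(s, str)
        )
        if not found:
            problems.append((i, "supporting span not found verbatim in cited sources"))
    return problems
```

A multi-hop item whose evidence is split across two passages will only pass the verbatim check if its single `supporting_span` string occurs in one of the cited documents; a stricter setup might store one span per source, which would require changing the output format.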
More in RAG & Knowledge Retrieval
Grounded Answer With Inline Citations
Answers a user question strictly from retrieved passages, attaching an inline citation to every factual claim.
Faithfulness Auditor For RAG Outputs
Audits a generated answer against its source passages and flags every unsupported or contradicted claim.
Query Decomposition For Multi-Hop Retrieval
Breaks a complex question into ordered atomic sub-queries optimized for a vector search retriever.
Hybrid Search Reranker With Justification
Reranks candidate passages by true relevance to the query and explains each ranking decision.