AI Engineering · 5.0 · 50 ratings
Re-ranker Design Decision
Role-Based · Chain-of-Thought
Prompt
**Role:** RAG engineer who has added re-rankers to 3+ production systems and learned when they help and when they only add latency.

**Context:** A RAG system retrieves the top 50 docs, but only the top 10 are used. The team is considering adding a cross-encoder re-ranker.

**Task:** Decide and design:
1. Quantify the gain: retrieval quality at top-10 (e.g. recall@10 or nDCG@10) with vs. without the re-ranker on a labeled set.
2. Latency cost: re-ranker p95 latency.
3. Dollar cost: re-ranker $ per query.
4. Model selection: which cross-encoder (cohere-rerank, bge-reranker, custom fine-tuned).
5. Hybrid scoring: how the vector similarity and re-ranker scores combine.
6. Caching: which re-rank scores are cacheable.
7. Tradeoff matrix: when to re-rank vs. not.
8. Recommendation, plus the test that proves it.

**Constraints:**
- The re-ranker ships only if it gains ≥5% on the primary retrieval metric.
- The latency budget must be respected (no re-ranker if it pushes p95 over budget).

**Output format:** Decision memo + benchmark numbers + final recommendation.
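As an illustration of the kind of test the prompt asks for, here is a minimal sketch of the ship gate implied by the two constraints. Everything in it is an assumption made for illustration: the function names are hypothetical, recall@k stands in for the "primary retrieval metric", the ≥5% threshold is read as a relative gain over the baseline (the prompt does not say relative or absolute), p95 uses the nearest-rank method, and the hybrid score is a simple weighted sum that assumes both scores are already normalized to [0, 1].

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the labeled relevant docs that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty latency sample."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def hybrid_score(vector_sim: float, rerank_score: float, alpha: float = 0.7) -> float:
    """Weighted sum of the two scores (assumes both are already in [0, 1]).

    alpha is the weight on the re-ranker score; 0.7 is an arbitrary default.
    """
    return alpha * rerank_score + (1.0 - alpha) * vector_sim


def should_ship(
    baseline_metric: float,
    reranked_metric: float,
    reranked_p95_ms: float,
    p95_budget_ms: float,
    min_relative_gain: float = 0.05,  # assumption: the 5% gate is relative
) -> bool:
    """Ship only if the gain clears the threshold AND p95 stays within budget."""
    if baseline_metric <= 0:
        # Relative gain is undefined with a zero baseline; refuse to decide here.
        return False
    gain = (reranked_metric - baseline_metric) / baseline_metric
    return gain >= min_relative_gain and reranked_p95_ms <= p95_budget_ms
```

In use, `recall_at_k` would be averaged over the labeled queries for both pipelines, `p95` computed over measured end-to-end latencies with the re-ranker enabled, and both results passed to `should_ship`. If the team's primary metric is nDCG@10 or MRR rather than recall, only the metric function changes; the gate stays the same.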
Recommended models
claude · gpt-4o