AI Engineering · 5.0 · 50 ratings
Token Budget Audit
Role-Based · Chain-of-Thought
Prompt
**Role:** AI engineer focused on cost optimization. You've cut LLM bills 50-70% at multiple companies by identifying token waste.

**Context:** A product's LLM bill is [$X/month]. Traffic: [Y queries/day]. Current models: [LIST]. The team thinks it's "just expensive" but hasn't audited.

**Task:** Produce the audit:

1. Per-query token breakdown: system prompt, user input, retrieval context, output. Average + p95.
2. Identify the largest line item.
3. Compression opportunities per line item (prompt compression, summarization, caching, smaller models for sub-tasks).
4. Caching analysis: % of queries cacheable, current hit rate, target hit rate.
5. Model routing opportunity: % of traffic that could go to a smaller/cheaper model with no quality loss.
6. Retrieval optimization: chunk-size + top-k tuning to reduce context tokens.
7. Output length: where outputs are longer than needed.
8. Projected savings per intervention, ranked by leverage.

**Constraints:**

- Every recommendation has an expected $ savings and an implementation cost in eng-weeks.
- "Use a smaller model" is acceptable only with the quality test that proves it.
- Identify any change that could degrade quality; mark it explicitly.

**Output format:** Cost-breakdown table + ranked recommendations + projected savings + risks.
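Steps 1, 2, and 8 of the audit above reduce to simple arithmetic over logged token counts, so a minimal Python sketch may help make them concrete. Everything here is an assumption for illustration: the record field names, the helper names (`token_breakdown`, `largest_line_item`, `rank_by_leverage`), the nearest-rank definition of p95, and the reading of "leverage" as monthly dollar savings per eng-week. The sample numbers are invented, not measurements. The largest line item is picked by average token count only; since input and output tokens are usually priced differently, a real audit would likely weight each component by its per-token price.

```python
import math
from statistics import mean

# Hypothetical field names for one logged query; adapt to your own logging schema.
COMPONENTS = ("system_prompt", "user_input", "retrieval_context", "output")


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def token_breakdown(queries):
    """Return {component: {"avg": ..., "p95": ...}} from per-query token counts."""
    return {
        c: {
            "avg": mean(q[c] for q in queries),
            "p95": p95([q[c] for q in queries]),
        }
        for c in COMPONENTS
    }


def largest_line_item(breakdown):
    """Component with the highest average token count (not price-weighted)."""
    return max(breakdown, key=lambda c: breakdown[c]["avg"])


def rank_by_leverage(interventions):
    """Sort interventions by monthly $ savings per eng-week, highest first.

    Assumes every intervention has eng_weeks > 0.
    """
    return sorted(
        interventions,
        key=lambda i: i["monthly_savings_usd"] / i["eng_weeks"],
        reverse=True,
    )


# Illustrative data only.
queries = [
    {"system_prompt": 800, "user_input": 50, "retrieval_context": 2000, "output": 300},
    {"system_prompt": 800, "user_input": 120, "retrieval_context": 3500, "output": 450},
    {"system_prompt": 800, "user_input": 30, "retrieval_context": 1500, "output": 150},
]
interventions = [
    {"name": "prompt caching", "monthly_savings_usd": 3000, "eng_weeks": 1},
    {"name": "model routing", "monthly_savings_usd": 6000, "eng_weeks": 4},
    {"name": "top-k tuning", "monthly_savings_usd": 2000, "eng_weeks": 0.5},
]

breakdown = token_breakdown(queries)
biggest = largest_line_item(breakdown)
ranked = [i["name"] for i in rank_by_leverage(interventions)]

for component, stats in breakdown.items():
    print(f"{component:18s} avg={stats['avg']:8.1f} p95={stats['p95']}")
print("largest line item:", biggest)
print("ranked by leverage:", ranked)
```

Ranking by savings per eng-week is one possible reading of "leverage"; ranking by absolute savings would put model routing first in this toy data, so it is worth stating the chosen definition in the audit itself.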
Recommended models
claude, gpt-4o