
Token Budget Audit

Role-Based, Chain-of-Thought

Prompt

**Role:** AI engineer focused on cost optimization. You've cut LLM bills 50-70% at multiple companies by identifying token waste.

**Context:** A product's LLM bill is [$X/month]. Traffic: [Y queries/day]. Current models: [LIST]. The team thinks it's "just expensive" but hasn't audited.

**Task:** Produce the audit:
1. Per-query token breakdown: system prompt, user input, retrieval-context, output. Average + p95.
2. Identify the largest line item.
3. Compression opportunities per line item (prompt compression, summarization, caching, smaller models for sub-tasks).
4. Caching analysis: % of queries cacheable, current hit rate, target hit rate.
5. Model routing opportunity: % of traffic that could go to a smaller/cheaper model with no quality loss.
6. Retrieval optimization: chunk-size + top-k tuning to reduce context tokens.
7. Output length: where outputs are longer than needed.
8. Projected savings per intervention, ranked by leverage.

**Constraints:**
- Every recommendation has an expected $ savings and an implementation cost in eng-weeks.
- "Use a smaller model" is acceptable only with the quality test that proves it.
- Identify any change that could degrade quality — mark it explicitly.

**Output format:** Cost-breakdown table + ranked recommendations + projected savings + risks.
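
The per-query breakdown (task 1) and the largest line item (task 2) can be measured from request logs before the prompt is run, so the bracketed context is filled with real numbers. The following is a minimal sketch, not part of the prompt itself. The record field names, the sample values, and the prices in `PRICE_PER_MTOK` are all assumptions for illustration; substitute your own log schema and your provider's actual rates.

```python
import math
from statistics import mean

# Hypothetical log schema: one dict of token counts per query.
COMPONENTS = ("system_prompt", "user_input", "retrieval_context", "output")

# Placeholder prices in USD per 1M tokens (assumed, not real provider rates).
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def token_breakdown(records):
    """Return {component: {"avg", "p95", "avg_cost_usd"}} for the given logs."""
    breakdown = {}
    for name in COMPONENTS:
        counts = [r[name] for r in records]
        # Only generated tokens are billed at the output rate.
        kind = "output" if name == "output" else "input"
        avg = mean(counts)
        breakdown[name] = {
            "avg": avg,
            "p95": p95(counts),
            "avg_cost_usd": avg * PRICE_PER_MTOK[kind] / 1_000_000,
        }
    return breakdown


def largest_line_item(breakdown):
    """Component with the highest average cost per query."""
    return max(breakdown, key=lambda name: breakdown[name]["avg_cost_usd"])


# Illustrative sample, not measured data.
sample = [
    {"system_prompt": 1200, "user_input": 80, "retrieval_context": 3000, "output": 300},
    {"system_prompt": 1200, "user_input": 120, "retrieval_context": 5000, "output": 500},
    {"system_prompt": 1200, "user_input": 100, "retrieval_context": 4000, "output": 400},
]

report = token_breakdown(sample)
for name, stats in report.items():
    print(f"{name:18s} avg={stats['avg']:.0f} p95={stats['p95']} "
          f"cost=${stats['avg_cost_usd']:.6f}")
print("largest line item:", largest_line_item(report))
```

Ranking by cost rather than by raw token count matters because output tokens are typically priced higher than input tokens, so the largest token count is not always the largest line item.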

How to use this prompt

1. Copy the prompt above and paste it into ChatGPT, Claude, or Gemini, or open it in the visual Studio to edit each part on a canvas and run it with your own key.
2. Replace any bracketed placeholders with your specifics. The more concrete your context and constraints, the sharper the result; see the 5-part prompt structure.
3. Run it, then refine. Ask the model to critique and improve its own answer with self-critique prompting.
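
When refining in step 3, one way to sanity-check the model's ranked recommendations is to recompute the ranking yourself from the savings and eng-week figures it reports. The sketch below assumes "leverage" means monthly savings per eng-week of implementation effort; the prompt does not define the term, so treat that definition, the `Intervention` class, and the sample numbers as hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    monthly_savings_usd: float  # expected $ savings per month
    eng_weeks: float            # implementation cost in eng-weeks, must be > 0
    quality_risk: bool = False  # True if the change could degrade quality

    @property
    def leverage(self) -> float:
        """Assumed definition: monthly savings per eng-week of effort."""
        return self.monthly_savings_usd / self.eng_weeks


def rank_by_leverage(items):
    """Highest leverage first."""
    return sorted(items, key=lambda item: item.leverage, reverse=True)


# Illustrative numbers only, not measurements.
candidates = [
    Intervention("Cache repeated queries", 4000, 2),
    Intervention("Route simple traffic to a smaller model", 6000, 4, quality_risk=True),
    Intervention("Trim retrieval top-k", 1500, 0.5),
]

ranked = rank_by_leverage(candidates)
for item in ranked:
    flag = " [quality risk]" if item.quality_risk else ""
    print(f"{item.name}: ${item.monthly_savings_usd:,.0f}/mo, "
          f"{item.eng_weeks} eng-weeks, leverage={item.leverage:,.0f}{flag}")
```

If the model's ordering differs from this one, ask it to state the leverage definition it used and to show the savings and effort estimate behind each row.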

Techniques in this prompt

Role-Based

Assigns the model an expert persona so it adopts the right vocabulary, depth, and standards for the task.

Chain-of-Thought

Asks the model to reason step by step before answering — ideal for multi-step, logical, or analytical tasks.

Recommended models

claude, gpt-4o

Build on this prompt

Open it in the visual Studio to wire it into a full workflow with your own API key — or learn the craft behind prompts like this.

More in AI Engineering

RAG vs Fine-tune Decision Memo

Evals Harness Design for [Domain]

System Prompt Audit

Agent Loop Halt-Condition Design