5.5 The Model Ladder

Estimated time: 25 min
Estimated cost: ~$0.50
Tool: Claude Code (all three models)
After this drill, you can run the same task across Haiku, Sonnet, and Opus and make cost-vs-quality trade-offs consciously.

Why this matters

Not every task needs Opus. Not every task is safe to run on Haiku. The model choice is a trade-off between cost, speed, and capability — and making it consciously is part of professional AI use. This drill runs the same task on all three tiers and builds the instinct for matching model to task: Haiku for repetitive operations, Sonnet for standard development work, Opus for complex reasoning and architecture decisions.

How to do it

  1. Choose a task that requires some reasoning but has a verifiable answer

     Good options: explain a complex function in the codebase, refactor a component for clarity, write tests for a specific function, debug a specific error.

  2. Run on Haiku

     claude --model claude-haiku-4-5 "YOUR TASK" — note quality, speed, and cost.

  3. Run on Sonnet

     claude --model claude-sonnet-4-5 "YOUR TASK" — same task, compare.

  4. Run on Opus

     claude --model claude-opus-4-5 "YOUR TASK" — compare all three and build your decision rule.
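The four runs above can be scripted as one loop. A minimal dry-run sketch (it only prints the commands, so nothing is billed until you remove the echo; the placeholder prompt is illustrative):

```shell
# Dry run of the model ladder: print the three commands this drill executes.
# Replace `echo` with direct execution (optionally prefixed with `time`)
# once you are ready to spend the ~$0.50.
PROMPT="Explain what this function does, list edge cases, and suggest improvements"
for MODEL in claude-haiku-4-5 claude-sonnet-4-5 claude-opus-4-5; do
  echo "claude --model $MODEL \"$PROMPT\""
done
```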

The prompt

PROMPT — The Model Ladder Test (run 3 times with different models)
Model: Run on all three: claude-haiku-4-5, claude-sonnet-4-5, claude-opus-4-5
Est. cost: ~$0.50 total (Opus is the most expensive)
Analyze the following code and explain:
1. What it does in plain language (for a non-developer)
2. What could go wrong (edge cases, failure modes)
3. How you would improve it

Code:
[PASTE YOUR CODE SAMPLE]

Success criteria

  • You ran the same task on all three models
  • You can describe the quality difference you observed
  • You have a personal decision rule for when to use each model tier
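One way to capture that decision rule is as a small shell helper. The task categories and the tier mapping below are assumptions seeded from this drill's framing (repetitive / standard / complex); tune them to what you actually observed:

```shell
# Hypothetical decision rule; categories and defaults are assumptions
# to be adjusted based on your own ladder results.
pick_model() {
  case "$1" in
    repetitive) echo claude-haiku-4-5  ;;  # bulk renames, boilerplate, lint fixes
    standard)   echo claude-sonnet-4-5 ;;  # everyday feature and test work
    complex)    echo claude-opus-4-5   ;;  # architecture, tricky debugging
    *)          echo claude-sonnet-4-5 ;;  # when unsure, start mid-tier
  esac
}

pick_model complex   # prints claude-opus-4-5
```

You can then call it inline, e.g. claude --model "$(pick_model standard)" "YOUR TASK".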

Common mistakes

Always defaulting to Opus for everything

Opus on routine tasks is like using a sledgehammer to push in a thumbtack. For most routine work, Haiku or Sonnet performs just as well, at a fraction of the cost.

Choosing models by price alone

Cheap models that fail require expensive re-runs. Match model capability to task complexity — Haiku failing at a complex task costs more than Sonnet succeeding on the first try.
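The re-run trap is easy to see with invented numbers. The per-attempt costs below are purely illustrative, not real pricing:

```shell
# Illustrative only: made-up per-attempt costs in cents, not real pricing.
haiku=2; sonnet=10
# Three failed Haiku attempts, then escalating to Sonnet anyway:
failed_path=$((haiku * 3 + sonnet))
direct_path=$sonnet
echo "retry path: ${failed_path}c vs direct Sonnet: ${direct_path}c"
```

With these numbers the "cheap" path costs 16 cents against 10 for going straight to the capable tier, before counting your own time spent reviewing failed attempts.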