Story Points Are Theatre

Henrik · 6 min read

Your team is sitting in a room. Someone shares their screen showing a Jira ticket. "User can filter dashboard by date range." The tech lead says "I'm thinking a 5." A junior developer holds up a 3. The senior backend engineer says 8. Nobody agrees, so you "discuss" for ten minutes, compromise on a 5, and move to the next ticket. Repeat forty times.

Congratulations. You've just spent three hours generating fictional numbers that will be wrong by Wednesday.

The estimation ritual

Planning poker is one of those practices that sounds reasonable on paper and falls apart in practice. The theory: by having everyone estimate independently and then discussing differences, you surface hidden complexity and arrive at a shared understanding of the work.

The reality: estimates anchor to whoever speaks first. Junior developers defer to seniors. Seniors round up to protect themselves. The actual estimate is usually decided by social dynamics, not technical analysis. And when the estimates are "done," you've produced a number that pretends to be precise but is really just the group's collective anxiety expressed as a Fibonacci number.

Here's what makes it theatre: everyone in the room knows the estimates are unreliable. The project manager knows. The developers know. The stakeholders who see the velocity charts know. But we keep doing it because the ritual feels productive, and the numbers give us something to put on a chart.

What velocity actually measures

Velocity is the sum of story points completed per sprint. It's supposed to measure team throughput. What it actually measures is: how many imaginary points did the team assign to tasks, and how many of those tasks did they finish?

If the team consistently overestimates (which most teams learn to do, because it makes velocity look good and creates buffer), velocity goes up without any actual productivity gain. If the team takes on a genuinely hard problem that they underestimated, velocity drops. Neither case tells you anything useful about whether the team is building the right things or building them well.
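
To make the inflation effect concrete, here is a minimal Python sketch. The ticket names and point values are invented for illustration. The only thing taken from the text above is the definition of velocity as the sum of points on completed stories.

```python
# Hypothetical sprint data: (ticket, estimated points, completed?)
def velocity(stories):
    """Sum of story points for the stories completed in a sprint."""
    return sum(points for _, points, done in stories if done)


def completed_count(stories):
    """Number of stories actually finished, ignoring their point values."""
    return sum(1 for _, _, done in stories if done)


# The same three tickets with the same outcome, estimated two ways.
honest = [
    ("filter-by-date", 3, True),
    ("export-csv", 5, True),
    ("sso-login", 8, False),
]
# Every estimate bumped to the next Fibonacci value "to be safe".
padded = [
    ("filter-by-date", 5, True),
    ("export-csv", 8, True),
    ("sso-login", 13, False),
]

honest_velocity = velocity(honest)
padded_velocity = velocity(padded)

print(f"honest: velocity={honest_velocity}, finished={completed_count(honest)}")
print(f"padded: velocity={padded_velocity}, finished={completed_count(padded)}")
```

In this made-up sprint the padded team reports a velocity of 13 against the honest team's 8, yet both finished exactly two tickets. The chart moved and the work did not.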

The worst outcome is when velocity becomes a target instead of a measurement. "Our velocity was 38 last sprint, let's try for 42." Now you've incentivised the team to inflate estimates, cherry-pick easy work, and avoid risky tasks that might blow up a story mid-sprint. You've turned a descriptive metric into a prescriptive one, and the team optimises for the metric instead of for outcomes.

The alternative: done or not done

When we switched to crates, we dropped estimation entirely. A crate is a piece of work loaded onto a flight. It's done or it's not. There's no "it's 80% done" or "it's a 5-point story and we've completed 3 points' worth." Binary states eliminate the ambiguity that estimation introduces.

"But how do you know if you've taken on too much work?" By looking at the crates and the landing date. If you have twenty crates on a two-week flight, your captain will notice. Not because a burndown chart told them, but because twenty things in two weeks is obviously too many. You don't need estimation to spot an overloaded flight. You need common sense and a person responsible for the outcome.

"But how do you track progress?" You count crates. Six loaded, four delivered, two remaining. Everyone can understand that without a chart. The captain knows which two are left and whether they'll land on time. If they won't, the captain makes a call: drop a crate, extend the flight, or call for help. No velocity discussion required.
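
For readers who think in code, the whole tracking model fits in a few lines. This is a rough Python sketch, not a description of any real tool: the `Crate` class, the `flight_status` helper, and the crate names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Crate:
    """A piece of work on a flight. Hypothetical model for illustration."""
    name: str
    delivered: bool = False  # binary: done or not done, no percentages


def flight_status(crates):
    """Count crates instead of points: loaded, delivered, and what is left."""
    remaining = [c.name for c in crates if not c.delivered]
    return {
        "loaded": len(crates),
        "delivered": len(crates) - len(remaining),
        "remaining": remaining,
    }


# Six crates loaded, four delivered, matching the example in the text.
flight = [
    Crate("date-range-filter", delivered=True),
    Crate("csv-export", delivered=True),
    Crate("empty-state-copy", delivered=True),
    Crate("timezone-bug", delivered=True),
    Crate("audit-log", delivered=False),
    Crate("sso-login", delivered=False),
]

status = flight_status(flight)
print(f"{status['loaded']} loaded, {status['delivered']} delivered, "
      f"{len(status['remaining'])} remaining: {', '.join(status['remaining'])}")
```

There is deliberately no field for size or percent complete. The only question the structure can answer is which crates are still open, which is the question the captain acts on.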

Estimation has a cost

The hidden cost of story point estimation isn't just the time spent in planning sessions - though that's significant. It's the cognitive overhead of maintaining the fiction.

Developers spend mental energy calibrating their estimates to match the team's implicit baseline. "Is this a 3 or a 5?" is a question that has no right answer, but teams act like it does. That energy could be spent on the work itself.

Project managers spend time updating burndown charts, calculating velocity trends, and producing reports that look scientific but aren't. That time could be spent removing blockers or talking to stakeholders about what actually matters.

And the organisation develops a false confidence in timelines derived from these numbers. "Based on our velocity, we'll ship the feature in Sprint 52." No, you won't. You'll ship it when it's done, and the velocity-based projection will be off by anywhere from one sprint to six, just like every other velocity-based projection you've made.

What to do instead

If you must estimate, estimate in time - "this will take roughly a week" - not in abstract points. Time is a unit everyone understands, and it has the honest quality of being obviously imprecise. Nobody pretends a "roughly a week" estimate is precise. But a "5-point story" estimate carries an aura of mathematical certainty that it hasn't earned.

Better yet, skip estimation and focus on scoping. Break work into small, deliverable pieces. A well-scoped crate is one that a single person can complete in a day or two. If it's bigger than that, split it. You don't need to estimate its size - the scoping process itself ensures that each piece is manageable.

The crates section of our handbook goes deeper into how we scope and track work without estimation. It's one of the most freeing changes a team can make - dropping the pretence that estimation adds value, and replacing it with something that actually does: clear ownership, binary completion, and a captain who cares about landing on time.