Chris Wu

What Pareto Doesn't Say

In 1896, Italian economist Vilfredo Pareto made the simple observation that 20% of the population owned 80% of the land in Italy. Originally a remark about the concentration of wealth in an economy, it has since evolved into one of the most quoted, and most misunderstood, rules of thumb.

The 80-20 rule generally states that 80% of the effects in a system can be attributed to 20% of the agents in said system. Illustrative examples of this are:

  • 80% of bugs come from 20% of the features
  • 80% of a company's revenue comes from 20% of the customers
  • 80% of the world's wealth is held by 20% of the population

If your system follows this pattern, the rule is very powerful for deciding where to spend effort across a large system. Focus fixes or feature cuts on that troublesome 20% of features, offer bonuses and premium service to your best customers - and so on.

Let us not forget that rules of thumb are approximations. They are at best social science, at worst guessing. So what are the limits of this model of thinking? I see three issues: data that doesn't follow the model, inability to actually use the model and incorrect interpretation of the model.

First, not every system follows the Pareto principle. Pareto works when items follow a power law distribution. If you are an established publisher then your #1 bestseller will likely dwarf even your #2 bestseller. If you are an independent bookseller, your sales across titles will tend to be flat. This is critical if you are trying to optimize sales effort on 20% of titles and expect 80% return. A simple way to address this is to validate ahead of time that you are, in fact, working with an 80-20 situation.
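One way to do that validation, as a minimal sketch (assuming you have per-item values such as revenue per customer or sales per title; the function name, threshold, and sample numbers are illustrative, not from the original post):

```python
def top_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the top `top_fraction` of items."""
    ordered = sorted(values, reverse=True)
    cutoff = max(1, round(len(ordered) * top_fraction))
    total = sum(ordered)
    return sum(ordered[:cutoff]) / total if total else 0.0

# Example: revenue per customer. If the top 20% account for something close
# to 80% of the total, an 80-20 style optimization is plausible; if the share
# is close to 20%, the distribution is roughly flat and the rule doesn't apply.
revenue = [120, 95, 40, 12, 11, 10, 9, 8, 7, 6]
print(f"Top 20% of customers -> {top_share(revenue):.0%} of revenue")
```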

More subtle is a slavish adherence to the exact 80-20 numbers. The 80 and 20 are no more special than 60-30 - yes, the numbers don't have to add up to 100! In a 60-30 situation you can assert that 60% of your revenue comes from 30% of your customers. This is still a very useful conclusion, as it still lets you focus optimization on that 30% with a disproportionately high expected payoff.

Most importantly, don't assume the Pareto effect applies when your distribution doesn't follow a power law (say, when things are more or less level). Visually: if a plot of the thing you care about drops off steeply after the first few items, you're good. If it looks roughly flat, you probably shouldn't assume you can pull off some 80-20 optimization.
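To make that contrast concrete, here is a small sketch (the 1/k decay, noise range, and sample sizes are arbitrary choices for illustration) comparing a power-law-ish dataset with a roughly level one, using the same kind of top-20% check as above:

```python
import random

def top_share(values, top_fraction=0.2):
    # Same helper as in the earlier sketch.
    ordered = sorted(values, reverse=True)
    cutoff = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:cutoff]) / sum(ordered)

random.seed(1)

# Power-law-ish: the value of item k falls off as 1/k (a Zipf-like curve).
steep = [1000 / k for k in range(1, 101)]

# Roughly level: every item contributes about the same, plus a little noise.
flat = [random.uniform(90, 110) for _ in range(100)]

print(f"steep distribution: top 20% -> {top_share(steep):.0%} of the total")
print(f"flat distribution:  top 20% -> {top_share(flat):.0%} of the total")
```

On the steep data the top 20% of items carry roughly 70% of the total; on the level data they carry only about 20%, and there is no concentrated slice worth singling out.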

Second, you can't always apply more gas to a certain set of customers (or bugs, etc.). If a small set of customers already accounts for much of your revenue, you might have tapped them out. The principle only asserts something about how the data is distributed, not about what you can do to effect change.

Overly focusing on the top 20% means you might miss out on increasing revenue in your bottom 80%. If you can get 50% more revenue from that bottom 80% - which, by the rule, only brings in 20% of your revenue - that's still 10% overall revenue growth, which can be pretty good. Who knows, you might actually be one of these.

The third mistake I see is the outright misapplication of the principle to how a task might be done. Imagine a software project that is expected to take one year. Someone might suggest that in the past they've observed that "people really only need 20% of the features to get 80% of the value". While incredibly hard to measure (not all features take the same time to build, but I digress), it is a conceivably true statement. They might continue with the suggestion: "let's prioritize that 20% of features and then we can stop development after 3 months".

Here the Pareto principle seems to imply that we can unlock significant customer value with only a small fraction of the work! Amazing! Here's the problem: that's not what it says. Pareto only works when you look at the system as a whole, with full information. You can, after the work is complete, list every feature, compare customer usage, and conclude that the completed project satisfies the Pareto principle. What you cannot do is assume that what you identify on the fly is that critical 20%.

Because all projects - software or otherwise - are performed under imperfect information, no person can know a priori what tasks will be part of that effective 20%. Lean (or any iterative method of building) only advises you to build end-to-end experiences to validate value. It does not assert that it is possible to prioritize the highest value elements first - only the riskiest ones.

Making coherent product experiences means picking some non-critical features to support the critical ones. Going in with a Pareto mindset will ensure that those end up on the cutting-room floor. Proper agile, lean, or basic-logic development means delivering a stunted but effective release to test the viability of the product or technology - the so-called cupcake release. That effectiveness is often marred or even nullified if the release lacks basic functionality (e.g. the ability to sign up with a password, enter some personal information, etc.).

The Pareto principle is an interesting observation, but it is prone to overuse just like any other mental model. We should always remember that it only states something about how things are distributed; it can't help you change a system while it is still playing out. It's best limited to post-hoc analysis, not used as some panacea for doing the minimum possible work.