7 posts on Product • Lea Verou

A framework for User-Centered Decisions

26 June 2025 24 min read Report broken page

TBD: Lacks a conclusion, illustrations, and examples.

I was recently asked to describe the guiding principles I would use to weigh three different solutions to a specific user pain point, with an emphasis on the user/customer. There are typically other factors to consider, such as engineering effort, business goals, etc., but these were out of scope in that particular case.

Since most prioritization frameworks include a user-centered / impact component, the framework discussed here can complement them nicely, by simply replacing some of the factors with its outcome. For example, if using RICE, you can use this framework to calculate R×I, then proceed to multiply by C/E as usual (being mindful of units).

The Three Axes

Utility and Usability (which Nielsen groups under Usefulness) are considered the core pillars of Product-led Growth. However, both Utility and Usability are short-term metrics and do not consider the bigger picture, so only using them as a compass could result in short-sightedness. I think there is also a third axis, which I call Evolution, and refers to how well a feature fits into the bigger picture, by examining how it relates to the product’s past and (especially) future.

Alt text

Utility (aka Impact): How many use cases and user pain points does it address, how well, and how prominent are they?
Usability: How easy is it to use? Evaluating usability at the idea stage is tricky, as overall usability will vary depending on how each idea is implemented, and there is often a lot of wiggle room within the same idea. At this stage, we are only concerned with aspects of usability inherent in the idea itself.
Evolution: How does it relate to features that we have shipped in the past and features we may ship in the future? Being mindful of this prevents feature creep and ensures the mental models exposed via the UI remain coherent.

These are not entirely independent, there are complex interplays between them:

Evolution affects Usability: Features that fit poorly into the product’s past and future will create later usability issues. However, treating it as a separate factor helps us catch these issues much earler, and at the right conceptual level.
Utility and Usability can often be at odds: the more powerful a feature is, the more challenging it is to make it usable.

Now let’s discuss each axis in more detail.

Utility

Utility measures the value proposition of a feature for users. It can be further broken down to:

Raising the ceiling: What becomes possible? Does it enable any use cases for which there is no workaround?
Lowering the floor: What becomes easier? Does it provide a better way to do something for which there is already a workaround? How big is the delta?
Widening the walls: Does it serve an ignored audience or market? Does it broaden the set of use cases served by the product?
Use Case Significance: How important are the use cases addressed?

While this applies more broadly, it is particularly relevant and top priority for creative tools.

In evaluating the overall Utility of an idea, it can often be helpful to list primary and secondary use cases separately, and evaluate Significance for them separately.

Primary & Secondary use cases

Primary use cases are those for which a solution is optimal (or close), and have often been the driving use cases behind its design. This is to contrast with secondary use cases, for which a solution is a workaround. Another way to frame this is friction: How much friction does the solution involve for each use case? For primary use cases, that should be close to 0, whereas for secondary use cases it will be higher.

A good design will ideally have a healthy amount of both. Lack of secondary use cases could hint that the feature may be overly tailored to specific use cases (overfitting).

The north star goal should obviously be to address all use cases head-on, with zero friction. But since resources are finite, enabling workarounds buys us time. There is far less pressure to solve a use case for which there is a workaround, than one that is not possible at all. The latter contributes to churn far more directly.

It is not unheard of to ship a feature with a low number of primary use cases, simply because it has a high number of secondary use cases, and will buy us time to work on better solutions for them. In these cases, Evolution is even more important: when we later have addressed all these use cases head-on, does this feature still serve a purpose?

Use Case Significance

This is a rough measure of how important the features addressed are. This needs to be evaluated holistically: an incremental improvement for a common interaction is far more impactful than a substantial improvement for a niche use case.

Some ways to reason about it may be:

Frequency: How frequently do these use cases come up in a single user journey?
Reach: What percentage of users do they affect?
Criticality: How much do they matter to users? Are they a nice-to-have or a dealbreaker?
Vision: How do the use cases relate to the software’s high level goals?

Vision may at first seem more related to the business than the user. However, when software design loses sight of its vision, the result can be a confusing, cluttered user experience that doesn’t cater to any use case well.

Usability

There are many ways to break usability down into independent, quantifiable dimensions. I generally go with a tweaked version of the one I first learned at MIT’s UI Design & Implementation course I took in 2016 (and then taught in 2018 and replaced in 2020 😅), bringing it one step closer to the original Nielsen dimensions by re-adding Satisfaction:

Learnability: How easy is it for users to understand?
Efficiency: Once learned, is it fast to use?
Safety (aka Errors): Are errors few and recoverable?
Satisfaction: How pleasant is it to use?

Some examples of usability considerations and how they relate to these dimensions:

Learnability

Compatibility: Does it re-use existing concepts or introduce new ones?
Internal Consistency: How consistent is it with the way the rest of the product works?
External Consistency: How consistent is it with the environment (other products, related domains, etc.)?
Memorability: When users return to the design after a period of not using it, how easily can they reestablish proficiency?

Efficiency

Speed: How many steps does it take to accomplish a task and how long does each step take?
Cognitive Load: How much mental effort does it require?
Physical Load: How much physical effort does it require?

Safety

Error-proneness: How hard is it for users to make mistakes?
Error severity: How severe are the consequences of mistakes?
Recoverability: How easy is it to recover from mistakes?

Satisfaction

Aesthetics: How visually pleasing is it?
Ergonomics: How comfortable is it to use?
Enjoyment: How fun is it to use?

Satisfaction is a bit of an oddball. First, it has limited applicability to certain types of UIs, e.g. non-interactive text-based UIs (programming languages, APIs, etc.). Even where it applies, it can be harder to quantify. But most importantly, when deciding between ideas there is rarely enough signal to gauge satisfaction. If it’s not helpful for your use case, just leave it out.

Each idea will rarely have universally worse or better usability than another. More commonly, it will be better in some dimensions and worse in others. To evaluate these tradeoffs, we need to understand the situation and the user.

The situation

“Situation” here refers to the use case plus its context.

The more repetitive or common the task, the higher the importance of Efficiency. For example text entry is an area where efficiency needs to be optimized down to individual keystrokes or minute pointing movements. On the other end of the spectrum, for highly infrequent tasks, users don’t have time to develop transferable knolwedge across uses and thus Learnability is very important (e.g. tax software, visa applications). Last, the more there is at stake, the more important Safety becomes. Some examples of cases where Safety is top priority would be missile launches, airplane navigation, healthcare software on a macro scale, or privacy, data integrity, finances on a micro scale.

There is granularity here as well. For example, a visa application is used infrequently enough that learnability matters far more than efficiency for the product in general. However, if it includes a question where it expects the user to enter their last 20 international trips, efficiency for trip entry is important.

Sometimes, two factors may genuinely be equally important. Consider a stock trading program used on a daily basis by expert traders. Lost seconds translate to lost dollars, but mistakes also translate to lost dollars. Is Efficiency or Safety more important?

Note that there are also interplays between different dimensions: the more effort a task involves (efficiency), the more high stakes a mistake is perceived to be (safety). You have likely experienced this: a lengthy form losing your answers feels a lot more frustrating than having to re-enter your email in a login form.

The user

As a general rule of thumb, novices need learnability whereas for experts other dimensions of usability are more important. But who is an expert? Expert in what?

Application expertise is orthogonal to domain expertise. Tax software for accountants needs good learnability in terms of application features, but can assume familiarity with tax concepts (but not necessarily recall). Conversely, tax software for regular taxpayers needs both: as software that is typically only used once a year, learnability in terms of application features is top priority. But abstracting and simplifying tax concepts is also important, as most users are not very proficient in them.

Generally speaking, the more we can rely on training, the less important learnability becomes. This is why airplane cockpits are so complex: pilots have spent years of training learning to use these UIs, so efficiency and safety are prioritized instead (or at least should be — sadly that is not always the case).

That said, there is often an opportunity for disruption here, by taking a product that has the potential to bring value to many but currently requires lengthy training, and creating one that requires little to none. Creator tools are prime candidates for this, with no-code/low-code tools being a flagship example right now. However, almost every mainstream technology went through this kind of democratization at some point: computers, cameras, photo editing, video production, etc.

This distinction does not only apply to the product as a whole, but also individual product areas. For example, an onboarding flow needs to prioritize learnability regardless of the priorities of the rest of the product.

Evolution

Evolution is a bigger picture measure of how well a proposed feature fits into the product’s past, present and future, with an emphasis on the latter, since relationship to the past and present is also the Internal Consistency component of Learnability.

When evaluating compatibility with potential future evolution, it’s important to not hold back. Ten years down the line, when today’s implementation constraints, technology limitations, or resource limits are no more, what would we ship and how does this feature relate to it? Does it have a place in that future, is it entirely unnecessary, or — worse — does it actively conflict with it?

This is to avoid feature creep by ensuring that features are not designed ad hoc, but they contribute towards a coherent conceptual design.

The most common way for a feature to connect to the product’s past, present, and future is by being a milestone across a certain axis of progress:

Level of abstraction (See Layering):
- Is it a shortcut to a present or future lower level primitive?
- Is it a lower level primitive that explains existing functionality?
Power: Is it a less powerful version of a future feature?
Granularity: Is it a less granular version of a future feature?

If we have a north star UI, part of this is to consider whether a proposed feature is compatible with it or actively diverges.

A feature could also be entirely orthogonal to all existing features and still be a net win wrt Evolution. For example, when it helps us streamline UX by allowing us to later remove another feature that has been problematic.

Weighing tradeoffs

While all three are very important, they are not equally important. In broad strokes, usually, Utility > Usability > Evolution. Here’s why:

Utility > Usability: If a product does not provide value, people leave, even if it provides a fantastic user experience for the few and/or niche use cases it actually serves.
Usability > Evolution, since Evolution is a long-term / more speculative concern, whereas Usability a more immediate / higher confidence one.

Depending on the product and the environment however, this trend could be reversed:

Competition: If a product is competing in a space where use cases are already covered very well, but by products with poor usability, Usability becomes more important. In fact, many successful products were actually usability innovations: The Web, Dropbox, the iPhone, Zoom, and many others.
Mutability: Change is always hard, but for some products it’s a lot harder, making a solid Evolution path more important. Web technologies are an extreme example: it is almost impossible to remove or change anything, ever, as there are billions of uses in the wild, no way to migrate them, and no control over them. Instead, changes have to be designed as new technologies or additions to existing ones.
Complexity: The more complexity increases, the more important it becomes to keep further increase at bay, so Evolution becomes more important.

Ok, now make the darn decision already!

So far we’ve discussed various tradeoffs, so it may be unclear how to use this as a framework to make actual decisions.

Decision-making itself also involves tradeoffs: adding structure makes it easier to decide, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure if the previous step failed to provide enough clarity. For simple, uncontroversial decisions, just discussing the three axes can be sufficient, and the cost-benefit of more structure is not worth it. But for more complex higher stakes decisions, a more structured approach can pay off.

Let’s consider the goals for any scoring framework:

Compare and contrast: Make an informed decision between alternatives without being lost in the complexity of their tradeoffs.
Drive consensus: It is often easier for a team to agree on a rating or weight for an individual factor, than the much bigger decision of which option to go with.
Communicate: Provide a way to communicate the decision to stakeholders, so they can understand the rationale behind it.

Calculating things precisely (e.g. use case coverage, significance, reach etc.) is rarely necessary for any of them, and thus not a good use of time. Remember that the only purpose of scores is to help us compare alternatives. They have no meaning outside of that context. In the spirit of an iterative approach, start with a simple 1-5 score for each factor, and only add more granularity and/or precision if that does not suffice for converging to a decision.

We can use three tables, one for each factor, with a row for each idea. Then the columns are:

Utility

Primary use cases
Secondary use cases
Utility Score (1-5)

Usability

Learnability
Efficiency
Safety
Usability Score (1-5)

Evolution

Past & Present
Future
Evolution Score (1-5)

We fill in the freeform columns first, which should then give us a pretty clear picture of the score for each factor.

Finally, using the 3:2:1 ratio mentioned above, the overall score would be:

O v e r a l l_S c o r e = \frac{3 \cdot U t i l i t y_S c o r e + 2 \cdot U s a b i l i t y_S c o r e + 1 \cdot E v o l u t i o n_S c o r e}{3 + 2 + 1}

Template: User-Centered Decision Worksheet

I have set up a Coda template for this, which you can copy and fill in with your own data.

Why Coda instead of something like Google Docs or Google Sheets?

I don’t have to repeat each idea in the multiple tables, I can set them up as views and they update automatically
Rich text (lists, etc) within table cells make it easier to brainstorm
One-click rating widgets for scores (great when iterating)
I can output the overall score for each feature with a formula, and it updates automatically. No need to clumsily copy-paste it across cells either, I can just define it once for the whole column. I can even use controls for the weights that are outside the table entirely.
This may be subjective, but I find Coda docs more well designed than any alternative I’ve tried.

Screenshot of Coda tooltip — As a bonus, I can then even @-mention each feature in the rest of the doc, and hovering over it shows a tooltip with all its metadata!

Tradeoff scorecard: The ultimate decision-making framework

26 June 2025 5 min read Report broken page

Every decision we make involves weighing tradeoffs, whether that is done consciously or not. From evaluating whether an extra bedroom is worth $800 extra in rent, whether being able to sleep lying down during a long flight is worth the $500 upgrade cost, to whether you should take a pay cut for that dream job.

For complex high-stakes decisions involving many competing tradeoffs, trying to decide with your gut can be paralyzing. The complex tradeoffs that come up when designing products ^[1] fall in that category so frequently that analytical decision-making skills are considered one of the most important skills a product manager can have. I would argue it’s a bit broader: analytical decision-making is one of the most useful skills a human can have.

Structured decision-making is a passion of mine (perhaps as a coping mechanism for my proneness to analysis paralysis). In fact, one of the very first demos of Mavo (the novice programming language I designed at MIT) was a decision-making tool for weighed pros & cons. It was even one of the two apps our usability study participants were asked to build during the first Mavo study. I do not only use the techniques described here for work-related decisions, but for any decision that involves complex tradeoffs (often to the amusement of my friends and spouse).

Screenshot of the Decisions Mavo app — The Decisions Mavo app, one of the first Mavo demos, is a simple decision-making tool for weighed pros & cons.

Before going any further, it is important to note a big caveat. Decision-making itself also involves tradeoffs: adding structure makes decisions easier, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure only if the previous step failed to provide clarity. Each step progressively builds on the previous one, minimizing superfluous effort.

For very simple, uncontroversial decisions, just discussing or thinking about pros and cons can be sufficient, and the cost-benefit of more structure is not worth it. Explicitly listing pros and cons is probably the most common method, and works well when consensus is within reach and the decision is of moderate complexity. However, since not all pros and cons are equivalent, this delegates the weighing to your gut. For more complex or controversial decisions, there is value in spending the time to also make the weighing more structured.

The tradeoff scorecard

What is a decision matrix?

A decision matrix, also known as a scorecard, is a table with options as rows and criteria as columns, with a column in the end that calculates a score for each option based on the criteria. These are useful both for selection, as well as prioritization, where the score is used for ranking options. In selection use cases, the columns can be specific to the problem at hand, or predefined based on certain principles or factors, or a mix of both. Prioritization tends to use predefined columns to ensure consistency. There is a number of frameworks out there about what these columns should be and how to calculate the score, with RICE (Reach × Impact × Confidence / Effort) likely being the most popular.

The Tradeoff Scorecard is not a prioritization framework, but a decision-making framework for choosing among several options.

Qualitative vs. quantitative tradeoffs

Typically, tradeoffs fall in two categories:

Qualitative: Each option either includes the tradeoff or it doesn’t. Thing of them as tags that you can add or remove to each option.
Quantitative: The tradeoff is associated with a value (e.g. price, effort, number of clicks, etc.)

Not all tradeoffs are all equal. Even for qualitative tradeoffs, some are more important than others, and the differences can be quite vast. Some strengths may be huge advantages, while others minor nice-to-haves. Similarly, some weaknesses can be dealbreakers, while others minor issues.

We can model this by assigning a weight to each tradeoff (typically a 1-5 or 1-10 integer). But if quantitative tradeoffs have a weight, doesn’t that make them quantitative? The difference is that the weight applies to the tradeoff itself, and is applied the same to each each option, whereas the value of quantitative tradeoffs quantifies the relationship between tradeoff and option and thus, is different for each. Note that quantitative tradeoffs also have a weight, since they also don’t all matter the same.

In diagrammatic form, it looks a bit like this: Simplified UML-like diagram showing that each tradeoff has a weight, but the relationship between option and quantitative tradeoff also has a value

These categories are not set in stone. It is quite common for qualitative tradeoffs to become quantitative down the line as we realize we need more granularity. For example, you may start with “Poor discoverability” as a qualitative tradeoff, then realize that there is enough variance across options that you instead need a quantitative “Discoverability” factor with a 1-5 rating. The opposite is more rare, but it’s not unheard of to realize that a certain factor does not have enough variance to be worth a quantitative tradeoff and instead should be modeled as 1-2 qualitative tradeoffs.

The overall score of each option is the sum of the scores of each individual tradeoff for that option. The score of each tradeoff is often simply its weight multiplied by its value, using 1/-1 as the value of qualitative tradeoffs (pro = 1, con = -1).

While qualitative tradeoffs are either pros or cons, quantitative tradeoffs may not be universally positive or negative. For example, consider price: a low price is a strength, but a high price is a weakness. Similarly, effort is a strength when low, but a weakness when high. Calculating a score for these types of tradeoffs can be a bit more involved:

For ratings, we can subtract the midpoint and use that as the value. E.g. by subtracting 3 from a 1-5 rating we get value from -2 to 2. Adjust accordingly if you don’t want the switch to happen in the middle.
For less constrained values, such as prices, we can use the value’s percentile instead of the raw number.

Explicit vs implicit tradeoffs

When listing pros and cons across many choices, have you noticed that there is a lot of repetition? First, several options share the same pros and cons, which is expected, since they are alternative solutions to the same problem. But also because pros and cons come in pairs. Each strength has a complementary weakness (which is the absence of that strength), and vice versa.

For example, if one UI option involves a jarring UI shift (a bad thing), the presence of this is a weakness, but its absence is a strength! In other words, each qualitative tradeoff is present on all options, either as a strength or as a weakness. The decision of whether to pick the strength or the weakness as the primary framing for each tradeoff is often based on storytelling and/or minimizing effort (which one is more common?). A good rule of thumb is to try to avoid negatives (e.g. instead of listing “no jarring UI shift” as a pro, list “jarring UI shift” as as con).

It may seem strange to view it this way, but imagine you were trying to compare and contrast five different ideas, three of which involved a jarring UI shift. You would probably list “no jarring UI shifts” as a pro for the other two, right?

This realization helps cut the amount of work needed in half: we simply assume that for any tradeoff not explicitly listed, its opposite is implicitly listed.

Putting it all together

Your choice of tool can make a big difference to how easy this process is. In theory, we could model all tradeoffs as a classic decision matrix, with a column for each tradeoff. Quantitative tradeoffs would correspond to numeric columns, while qualitative tradeoffs would correspond to boolean columns (e.g. checkboxes).

Indeed, if all we have is a grid-based tool (e.g. spreadsheets), we may be stuck doing exactly that. It does have the advantage that it makes it trivial to convert a qualitative tradeoff to a quantitative one, but it can be very unwieldy to work with.

If our tool of choice supports lists within cells, we can do better. These boolean columns can be combined into one column as a list of all relevant tradeoffs. Then, a separate table can be used to define weights for each tradeoff (and any other metadata, e.g. optional notes).

I currently use Coda for these tradeoff scorecards. While not perfect, it does support lists in cells, and has a few other features that make working with tradeoff scorecards easier:

Thanks to its relation concept, the list of tradeoffs can be actually linked to their definitions. This means that hovering over each tradeoff displys a popup with its metadata, and that I can add tradeoffs by selecting them from a popup menu.
Conditional formatting allows me to color-code tradeoffs based on their type (strength/weakness or pro/con) and weight (lighter color for smaller impact).
Its formula language allows me to show and list the implicit tradeoffs for each option (though there is no way to have them be color-coded too).

There are also limitations however:

While I can apply conditional formatting to color-code the opposite of each tradeoff, I cannot display implicit tradeoffs as nice color-coded chips, in the same way as explicit tradeoffs, since relations can only display the primary column.
Weights for quantitative tradeoffs have to be baked in the formula (there are some ways to make them editable, but )

I use product in the general sense of a functional entity designed to fulfill a specific set of needs or solve particular problems for its users. This does not only include commercial software, but also things like web technologies, open source libraries, and even physical products. ↩︎

Evolution: The missing component of Product-Led Growth

26 June 2025 6 min read Report broken page

TBD: Lacks a conclusion, illustrations, and examples.

What is Product-Led Growth?

In the last few years, Product-Led Growth has seen a meteoric rise in popularity. The idea is simple: instead of relying on sales and marketing to acquire users, you build a product that sells itself. As a usability advocate, this makes me giddy: Prioritizing user experience is now a business strategy, with senior leadership buy-in!

NN/G considers Utility and Usability the core components of Product-led Growth, which Nielsen groups under a single term: Usefulness. Utility refers to how many use cases are addressed, how well, and how significant these use cases are. If you thought that sounds very much like the RI in RICE, you’d be right, they are indeed roughly the same concept, from a different perspective. Usability, as you probably know, refers to how easy the product is to use, and can be further broken down into individual components, such as Learnability, Efficiency, Safety, and Satisfaction.

Indeed, maximizing Utility and Usability are crucial for creating products that add value. However, both suffer from the same flaw: they are short-term metrics, and do not consider the bigger picture over time. It’s like playing chess while only thinking about the next move. You could be making excellent choices on each turn and still lose the game. Great Utility and Usability alone do not prevent feature creep. We can still end up with a convoluted user experience that lacks a coherent conceptual model; all it takes is enough time.

Therefore, I think there is also a third component, which I call Evolution. Evolution refers to how well a feature fits into the bigger picture of a product, by examining how it relates to its past, present and future (or, more accurately, its various possible futures). By prioritizing features higher when they are part of a trajectory or greater plan and deprioritizing those that are designed ad hoc we can limit complexity, avoid feature creep, and ensure we are moving towards a coherent conceptual design.

Introducing entirely new concepts is not an antipattern by any means, that’s how products evolve! However, it should be done with caution, and the bar to justify such features should be much higher.

The three axes are not entirely independent. Evolution will absolutely eventually affect Usability. The whole point of treating Evolution as a separate axis is that this allows us to catch these issues early and prevent them in the making. By the time conceptual design issues create usability problems, it’s often too late. The changes required to fix the underlying design are a lot more substantial and costly.

The weight of Evolution

The importance of Evolution was really drilled into me while designing web technologies, i.e. the technologies implemented in browsers that web developers use to develop websites and web applications. We do not have a name for it, but the consideration is very high priority when designing any feature for the Web.

In general, Utility and Usability matter more than Evolution. Just like in chess, the next move is far more important than any subsequent move. The argument this post is making is that we should look further than the current roadmap, not that we should stop looking at what’s right in front of us. However, there are some cases where Evolution may become equally important as the other two, or even more.

Low mutability is one such case. Change is always hard, but for some products it’s a lot harder. Web technologies are an extreme example, where you can never remove or change anything. There are billions of uses in the wild, that you have no control over, and no way to migrate users. You cannot risk breaking the Web. Instead, changes must be designed as either additions to existing technologies, or (if substantial enough) as entirely new technologies. The best you can hope for is that if you deprecate the old technology, and you heavily promote the new one, over many years usage of the old technology will drop below the usage threshold that allows considering removal (< 0.02%!). I have often said that web standards work is “product work on hard mode”, and this is one of the reasons. If you do product work, pause for a moment and consider this: How much harder would shipping be if you knew you could never remove or change anything?

Another case is high complexity. Many things that are complex today began as simple things. The cost of adding features without validating their Evolution story is increasing complexity. To some degree, complexity is the fate of every successful product, but being deliberate about adding features can curb the rate of increase. Evolution tends to become higher priority as a product matures. This is artificial: keeping complexity at bay is just as important in the beginning, if not more. However, it is often easier to see in retrospect, after we’ve already felt the pain of increasing complexity.

The value of a North Star UI

In evaluating Evolution for a feature, it’s useful to have alignment on what our “North Star UI(s)” might be.

A North Star UI is the ideal UI for addressing a set of use cases and pain points in a perfect world where we have infinite resources and no practical constraints (implementation difficulty, performance, backwards compatibility, etc.). Sure, many problems are genuinely so hard that even without constraints, the ideal solution is still unknown. However, there many cases where we know exactly what the perfect solution would be, but it’s simply not feasible, so we need to keep looking.

In these cases, it’s useful to document this “North Star UI” and ensure there is consensus around it. You can even do usability testing (using wireframes or prototypes) to validate it.

Why would we do this for something that’s not feasible? First, it can still be useful as a guide to steer us in the right direction. Even if you can’t get all the way there, maybe you can close enough that the remaining distance won’t matter. And in the process, you may find that the closer you get, the more feasible it becomes.

Second, it ensures team alignment, which is essential when trying to decide what compromises to make. How can we reach consensus on the right tradeoffs if we are not even aligned on what the solution would be if we didn’t have to make any compromises?

Third, it builds team momentum. Doing usability testing on a prototype can do wonders for getting people on board who may have previously been skeptical. I would strongly advise to include engineers in this process, as engineering momentum can literally make the difference between what is possible and what is not.

Last, I have often seen “unimplementable” solutions become implementable later on, due to changes in internal or external factors, or simply because a brilliant engineer had a brilliant idea that made the impossible, possible. In my 11 years of designing web technologies, I have seen this happen so many times, I now interpret “cannot be done” as “really hard — right now”.

Mini Case study 1: CSS Nesting Syntax

My favorite example, and something I’m proud to have personally helped drive is the current CSS Nesting syntax, now shipped in every browser. We had plenty of signal for what the optimal syntax was for users (North Star UI), but it had been vetoed by engineering across all major browsers due to prohibitive performance, so we had to design around certain parsing constraints. The original design was quite verbose, actively conflicted with the NSUI syntax, and had poor compatibility with another related feature (@scope). Instead of completely diverging, I proposed a syntax that was a subset of our NSUI, just more explicit in some (common) cases. Originally discussed as “Lea’s proposal”, it was later named “Non-letter start proposal” but became known as Option 3 from its position among the five options considered. After some intense weighing of tradeoffs and several user polls and surveys, the WG resolved to adopt that syntax.

Once we got consensus on that, I started trying to get people on board to explore ways (and brainstorm potential algorithms) to bridge the gap. A few other WG members joined me, with my co-TAG member Peter Linss perhaps being most vocal. We initially faced a lot of resistance from browser engineers, until eventually a couple Chrome engineers closed on a way to implement the north star syntax 🎉, and as they say, the rest is history.

It was not easy to get there, and required weighing Evolution as a factor. There were diverging proposals that in some ways had better syntax than that intermediate milestone. If we only looked at the next move, if we had only used Utility and Usability to guide us, we would have made a suboptimal long-term decision.

Evaluating Evolution

To evaluate Utility, we can look at the use cases a feature addresses, and how significant they are. Evaluating Usability is also a matter of evaluating its individual components, such as Learnability, Efficiency, Safety, and Satisfaction. This can be done via usability testing, or heuristic evaluation, and ideally both. But how do we evaluate Evolution for a proposed feature?

How well it fits with the product’s past and present overlaps with Usabilty (through Internal Consistency, a component of Learnability), but is also important to consider.

When evaluating how well a feature fits into the product’s future, we can use the north star UI if we have one, as well as other related features that could plausibly be shipped in the future (e.g. have already been discussed, or are natural evolutions of existing features).

Does this feature connect to the product’s past, present, and future across a certain axis of progress? For example:

Level of abstraction (See Layering):
- Is it a shortcut to a present or future lower level primitive?
- Is it a lower level primitive that explains existing functionality?
Power: Is it a less powerful version of a future feature?
Granularity: Is it a less granular version of a future feature?

Other considerations:

Opportunity cost: What does introducing this feature prevent us from doing in the future?
Simplification: What does it allow us to remove?

TBD: Lacks a conclusion, illustrations, and examples.

The Hovercar Framework for Deliberate Product Design

25 June 2025 13 min read Report broken page

Many teams start with the MVP. But what if the key to shipping great products wasn’t starting small — but starting big? Could great products start at the finish line?

You may be familiar with this wonderful illustration and accompanying blog post by Henrik Kniberg about good MVPs:

It’s a very visual way to illustrate the age-old concept that that a good MVP is not the one developed in isolation over months or years, grounded on assumptions about user needs and goals, but one that delivers value to users as early as possible, so that future iterations can take advantage of the lessons learned from real users.

From Hovercar to Skateboard

I love Henrik’s metaphor so much, I have been using a similar system to flesh out product requirements and shipping goals, especially early on. It can be immediately understood by anyone who has seen Henrik’s illustration, and I find it can be a lot more pragmatic and flexible than the usual simple two tiered system (core requirements and stretch goals). Additionally, I find this fits nicely into a fixed time, variable scope development process, such as Shape Up.

🛹 The Skateboard aka the Pessimist’s MVP: What is the absolute minimum we can ship, if need be? Utilitarian, bare-bones, and somewhat embarrassing, but shippable — barely. Anything that can be flintstoned gets flintstoned.
🛴 The Scooter aka the Realist’s MVP: The minimum product that delivers value. Usable, but no frills. This is the target.
🚲 The Bicycle aka the Optimist’s MVP: Stretch goals — UX polish, “sprinkles of delight”, nonessential but high I/E features. Great if we get here, fine if we don’t.
🏍️ The Motorcycle: Post-launch highest priority items.
🚗 The Car: Our ultimate vision, taking current constraints into account.
🏎️ The Hovercar aka the North Star UI: The ideal experience — unconstrained by time, resources, or backwards compatibility. Unlikely to ship, but a guiding light for all of the above.

Please note that the concept of a North Star UI has no relation to the North Star Metric. While both serve as a guiding light for product decisions, and both are important, the North Star UI guides you in designing the product, whereas the North Star Metric is about evaluating success. To avoid confusion, I’ll refer to it as “North Star UI”, although it’s not about the UI per se, but the product vision on a deeper level.

The first three stages are much more concrete and pragmatic, as they directly affect what is being worked on. The more we go down the list, the less fleshed out specs are, as they need to allow room for customer input. This also allows us to outline future vision, without having to invest in it prematurely.

The most controversial of these is the last one: the hovercar, i.e. the North Star UI. It is the very antithesis of the MVP. The MVP describes what we can ship ASAP, whereas the North Star describes the most idealized goal, one we may never be able to ship.

It is easy to dismiss that as a waste of time, a purely academic exercise. “We’re all about shipping. Why would we spend time on something that may not even be feasible?” I hear you cry in Agile.

Stay with me for a moment, and please try to keep an open mind. Paradoxical as it may sound, fleshing out your North Star can actually save you time. How? Start counting.

Core Idea

At its core, this framework is about breaking down tough product design problems into three more manageable components:

North Star: What is the ideal solution?
Constraints: What prevents us from getting there right now?
Compromises: How close can we reasonably get given these constraints?

One way to frame it is is that 2 & 3 are the product version of tech debt.^[1]

It’s important to understand what constraints are fair game to ignore for 1 and which are not. I often call these ephemeral or situational constraints. They are constraints that are not fundamental to the product problem at hand, but relate to the environment in which the product is being built and could be lifted or change over time. Things like:

Engineering resources
Time
Technical limitations (within reason)
Performance
Backwards compatibility
Regulatory requirements

Unlike ephemeral constraints, certain requirements are part of the problem description and cannot be ignored. Some examples from the case studies below:

Context Chips Survey UI: Efficiency and discoverability
CSS Nesting Syntax: Conciseness and readability

While these may be addressed differently in different solutions, it would be an oxymoron to have a North Star that did not take them into account.

Benefits

1. It makes hard product problems tractable

Nearly every domain of human endeavor has a version of divide and conquer: instead of solving a complex problem all at once, break it down into smaller, manageable components and solve them separately. Product design is no different.

This process really shines when you’re dealing with the kinds of tough product problems where at least two of these questions are hard, so breaking it down can do wonders for reducing complexity.

2. It makes the product design process robust and adaptable

By solving these components separately, our product design process becomes can more easily adapt to changes.

I have often seen “unimplementable” solutions become implementable down the line, due to changes in internal or external factors, or simply because someone had a lightbulb moment.

By addressing these components separately, when constraints get lifted all we need to reevaluate is our compromises. But without this modularization, our only solution is to go back to the drawing board. Unsurprisingly, companies often choose to simply miss out on the opportunity, because it’s cheaper (or seems cheaper) to do so.

3. It facilitates team alignment by making the implicit, explicit

Every shipping goal is derived from the North Star, like peeling layers off an onion. This is whether you realize it or not.

Whether you realize it or not, every shipping goal is always derived from the North Star, like peeling layers off an onion. In some contexts the process of breaking down a bigger shipping goal into milestones that can ship independently is even called layering.

The process is so ingrained, so automatic, that most product designers don’t realize they are doing it. They go from hovercar to car so quickly they barely realize the hovercar was there to begin with. Thinking about the North Star is taboo — who has time for daydreaming? We must ship, yesterday!

But the hovercar is fundamental. Without it, there is no skateboard — you can’t reduce the unknown. When designing it is not an explicit part of the process, the result is that the main driver of all product design decisions is something that can never be explicitly discussed and debated like any other design decision. In what universe is that efficient?

A skateboard might be a good MVP if your ultimate vision is a hovercar, but it would be a terrible minimum viable cruise ship — you might want to try a wooden raft for that.

A wooden raft, then a simple sailboat, then a speedboat, then a yacht, and finally a ship. — A skateboard may be a great MVP for a car, but a terrible MVP for a cruise ship.

Making the North Star taboo doesn’t make it disappear (when did that ever work?).

It just means that everyone is following a different version of it. And since MVPs are products of the North Star, this will manifest as difficulty reaching consensus at every step of the way.

The product team will disagree on whether to ship a skateboard or a wooden raft, then on whether to build a scooter or a simple sailboat, then on whether to work on a speedboat or a yacht, and so on. It will seem like there is so much disconnect that every decision is hard, but there is actually only one root disconnect that manifests as multiple because it is never addressed head on.

Two people arguing. One has a speech bubble with a skateboard, the other a speech bubble with a wooden raft. The first also has a thought bubble with a car, the second a thought bubble with a ship. — When the North Star is not clearly articulated, everyone has their own.

Here is a story that will sound familiar to many readers:

A product team is trying to design a feature to address a specific user pain point. Alice has designed an elegant solution that addresses not just the problem at hand, but several prevalent longstanding user pain points at once — an eigensolution. She is aware it would be a little trickier to implement than other potential solutions, but the increase in implementation effort is very modest, and easily offset by the tremendous improvement in user experience. She has even outlined a staged deployment strategy that allows it to ship incrementally, adding value and getting customer feedback earlier.

Excited, she presents her idea to the product team, only to hear engineering manager Bob dismiss it with “this is scope creep and way too much work, it’s not worth doing”. However, what Bob is actually thinking is “this is a bad idea; any amount of work towards it is a waste”. The design session is now derailed; instead of debating Alice’s idea on its merits, the discussion has shifted towards costing and/or reducing effort. But this is a dead end because the amount of work was never the real problem. In the end, Alice wants to be seen as a team player, so she backs off and concedes to Bob’s “simpler” idea, despite her worries that it is overfit to the very specific use case being discussed, and the product is now worse.

Arguing over effort feels safer and less confrontational than debating vision — but is often a proxy war. Additionally, it is not productive. If the idea is poor, effort is irrelevant. And once we know an idea is good and believe it to our core, we have more incentive to figure out implementation, which often proves to be easier than expected once properly investigated. Explicitly fleshing out the Hovercar strips away the noise and brings clarity.

When we answer the questions above in order and reach consensus on the North Star before moving on to the compromises, we know what is an actual design decision and what is a compromise driven by practical constraints. Articulating these separately, allows us to discuss them separately. It is very hard to evaluate tradeoffs collaboratively if you are not on the same page about what we are trading off and how much it’s worth. You need both the cost and the benefit to do a cost-benefit analysis!

Additionally, fleshing the North Star out separately ensures that everyone is on the same page about what is being discussed. All too often have I seen early design sessions where one person is discussing the skateboard, another the bicycle, and a third one the hovercar, no-one realizing that the reason they can’t reach consensus is that they are designing different things.

4. It can improve the MVP via user testing

Conventional wisdom is that we strip down the North Star to an MVP, ship that, then iterate based on user input. With that process, our actual vision never really gets evaluated and by the time we get to it, it has already changed tremendously.

But did you know you can actually get input from real users without writing a single line of code?

alt text

Believe it or not, you don’t need to wait until a UI is prototyped to user test it. You can even user test a low-fi paper prototype or even a wireframe. This is widely known in usability circles, yet somehow entirely unheard of outside the field. The user tells you where they would click or tap on every step, and you mock the UI’s response by physically manipulating the prototype or showing them a wireframe of the next stage.

Obviously, this works better for some types of products than others. It is notably hard to mock rich interactions or UIs with too many possible responses. But when it does work, its Impact/Effort ratio is very high; you get to see whether your core vision is on the right track, and adjust your MVP accordingly.

It can be especially useful when there are different perspectives within a team about what the North Star might be, or when the problem is so novel that every potential solution is low-confidence. No-one’s product intuition is always right, and there is no point in evaluating compromises if it turns out that even the “perfect” solution was not actually all that great.

5. It paves the way for future evolution

So far, we have discussed the merits of designing our North Star, assuming we will never be able to ship it. However, in many cases, simply articulating what the North Star is can bring it within reach. It’s not magic, just human psychology.

Once we have a North Star, we can use it to evaluate proposed solutions: How do they relate to it? Are they a milestone along a path that ends at the North Star? Do they actively prevent us from ever getting there? Prioritizing solutions that get us closer to the North Star can be a powerful momentum building tool.

Humans find it a lot easier to make one more step along a path they are already on, than to make the first step on an entirely new path. This is well-established in psychology and often used as a technique for managing depression or executive dysfunction. However, it applies on anything that involves humans — and that includes product design.

Once we’re partway there, it naturally begs the question: can we get closer? How much closer? Even if we can’t get all the way there, maybe we can close enough that the remaining distance won’t matter. And often, the closer you get, the more achievable the finish line gets.

In fact, sometimes simply reframing the North Star as a sequence of milestones rather than a binary goal can be all that is needed to make it feasible. For an example of this, check out the CSS Nesting case study below.

Case studies

In my 20 years of product design, I have seen ephemeral constraints melt away so many times I have learned to interpret “unimplementable” as “kinda hard; right now”. Two examples from my own experience that I find particularly relevant below, one around Survey UI, and one around a CSS language feature.

Context Chips and the Power of Engineering Momentum

The case study is described at length in Context Chips in Survey Design: “Okay, but how does it feel?”. In a nutshell, the relevant bits are:

Originally, I needed to aggressively prioritize due to minimal engineering resources, which led me to design an extremely low-effort solution which still satisfied requirements.
The engineer hated the low-effort idea so much, he prototyped a much higher-effort solution in a day, backend and all. Previously, this would have been entirely out of the question.
Once I took the ephemeral constraints out of the question, I was able to design a much better, novel solution, but it got pushback on the basis of effort.
Prototyping it allowed us to user test it, which revealed it performed way better than alternatives.
Once user testing built engineering momentum and the implementation was more deeply looked into, it turned out it did not actually require as much effort as initially thought.

Here is a dirty little secret about software engineering (and possibly any creative pursuit): neither feasibility nor effort are fixed for a given task. Engineers are not automatons that will implement everything with the same energy and enthusiasm. They may implement product vision they disagree with, but you will be getting very poor ROI out of their time.

Investing the time and energy to get engineers excited can really pay dividends. When good engineers are excited, they become miracle workers.

In fact, engineering momentum is often, all that is needed to make the infeasible, feasible. It may seem hard to fit this into the crunch of OKRs and KPIs but it’s worth it; the difference is not small, it is orders of magnitude. Things that were impossible or insurmountable become feasible, and things that would normally take weeks or months get done in days.

One way to build engineering momentum is to demonstrate the value and utility of what is being built. All too often, product decisions are made in a vacuum, based on gut feelings and assumptions about user needs. Backing them up with data, such as usability testing sessions is an excellent way to demonstrate (and test!) their basis. When possible, having engineers observe user testing sessions firsthand can be much more powerful than secondhand reports.