Why Beats What: Prioritize Qualitative User Research Over Quantitative
- Jakob Nielsen
- 13 min read
Summary: Qualitative research is your only path to design improvement. Watching users struggle reveals solutions; surveys and metrics don’t. Testing with 5 users enables rapid iteration: test, fix, repeat. Focus groups lie, observed behavior tells truth. AI systems require even more qualitative focus, as their randomness undermines statistical validity. Mature UX teams can spend 10% on metrics; beginners need 100% qualitative.

Qual beats Quant for user research. (GPT Image-1)
Short song about this article (YouTube, 2 min.)
Let me be perfectly clear, so you can stop reading now if you are in a hurry: your user research budget, your team’s time, and your intellectual energy must be overwhelmingly dedicated to qualitative user research methods.
For 36 years, I have held that qualitative studies are the only way to gain deep insight into the usability of a product: simply watch users attempting to do real tasks with the design. They are the only way to discover why your design succeeds or, more commonly, why it fails. This fundamental truth has been the central tenet of my work from the outset, and the recent AI explosion has only strengthened my conviction.

Listen and observe users, and you shall learn. (GPT Image-1)
Quantitative methods, which measure what users did (e.g., success rates, time on task) or how they felt on a numeric scale, are of distinctly secondary importance. They can alert you to the presence of a problem, but they offer absolutely no guidance on how to fix it.
Knowing that 73% of users failed to complete a purchase is a useless statistic for a designer. Knowing why they failed is everything: was the problem that they couldn’t find the “checkout” button, or that the shipping costs were a surprise, or that the form timed out, or a fourth issue? The redesign would be different in each of these cases.
The slogan that should be printed and posted on the wall of every UX department is simple: Why Beats What. This is not just a catchy phrase; it is the foundational principle for all effective usability engineering and the only reliable path to a positive return on your investment in user experience. Knowing what happened is analytics. Knowing why it happened is insight. We are in the insights business. (Apologies to UserTesting for stealing their slogan, but it’s the truth.)

We’re in the insights business, and qualitative insights drive business success. (GPT Image-1)

Insight shapes innovation, and you gather the most insights for the buck from qualitative user research. (GPT Image-1)

Why Beats What: in UX research and design, metrics whisper, but meaning roars. If you want to print this as a poster for your team area, feel free to do so. (GPT Image-1)
My philosophy on user research was forged in the 1980s, when budgets were small, driven by a practical need for speed, efficiency, and real-world impact. The principles of “discount usability” are not about cutting corners; they are about maximizing the value of every research dollar and every minute spent. This entire system logically leads to the prioritization of qualitative methods.

Discount usability is a matter of spending your UX budget wisely, no matter its size. The ROI is so much higher from qualitative research that it should be your focus. (GPT Image-1)
UXR budgets are bigger now than when I started the discount usability movement, but we also have many orders of magnitude more UI designs to improve. Being fast and cheap still counts.
Finding Flaws, Not Figures: The “5 Users” Principle Revisited
My 1989 recommendation, that you only need to test with 5 users, is perhaps the most famous and most misunderstood piece I have ever written. The mathematical model Tom Landauer and I presented four years later shows a clear law of diminishing returns:
Testing with a single user reveals, on average, about 31% of the usability problems you will ever find in usability-testing an interface.
Testing with 5 users reveals around 85% of the problems.
Testing with 15 users reveals close to 100% of the problems.
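These three data points follow from the mathematical model in the Nielsen–Landauer paper: the expected share of problems found with n test users is 1 − (1 − L)^n, where L is the average probability that a single user reveals any given problem (about 0.31 in the paper's dataset; real projects vary). A minimal sketch of that curve:

```python
def problems_found(n_users, l=0.31):
    """Expected share of usability problems found by testing n users,
    per the Nielsen-Landauer model: 1 - (1 - L)^n."""
    return 1 - (1 - l) ** n_users

for n in (1, 5, 15):
    print(f"{n:2d} users -> {problems_found(n):.0%}")
# 1 user  -> 31%
# 5 users -> 84% (the "around 85%" above)
# 15 users -> 100% (rounded; the exact value is 99.6%)
```

The steep early slope of this curve is the whole argument: the sixth through fifteenth users mostly re-confirm problems the first five already exposed.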

For practical design projects, testing with 5 users will suffice to discover the usability insights you need to create the next iteration of the design, and thus set the stage for the next usability test. (GPT Image-1)
The crucial point, which many people miss, is that this is a principle for qualitative insight discovery, not for statistical sampling. The goal of this type of study is not to prove a hypothesis with 95% confidence; the goal is simply to find usability problems. Once you have observed three out of five users stumble over the same confusing icon, you have found a problem. It doesn’t matter if the “true” percentage of users who would stumble is 55% or 65%. You have identified a flaw in the design, and your job is to fix it. Continuing to test with more users to watch them fail in the same way is a waste of time and money.

Watch closely, and you can learn a great deal from a small number of users. (GPT Image-1)
This discount usability approach is inherently qualitative. It favors running many small, fast tests over one large, slow, and expensive one. The most cost-effective way to spend a budget for 15 users is not to run a single 15-user study. It is to run three separate studies with 5 users each. This enables iterative design:
Test the initial design with 5 users.
Find the most critical usability problems.
Fix those problems.
Test the revised design with 5 new users to see if your fixes worked and to find the next set of problems.
Repeat.
This entire cycle of rapid, iterative improvement is fueled by direct, qualitative observation. Large-scale quantitative measurement is too slow, too expensive, and provides the wrong kind of information (the “what,” not the “why”) to fit into this agile and efficient development model.
[Note that I write “agile” with a lower-case “a” instead of the capitalized “Agile,” because I’m not referring to any particular development methodology. Fast qualitative usability insights help steer the course of design, no matter which process you employ.]

Speed is your moat in the AI world, where anybody can build anything. The faster you achieve PMF (product–market fit), and the faster you adjust to changing conditions, the higher your chance of survival. Iterative design is the way to go, and you can iterate much faster with qualitative user research than with metrics. (GPT Image-1)
This preference for speed and informality is another pillar supporting a qualitative focus. The value of user research is realized only when it leads to an improved design. A 100-page report filled with statistics, charts, and detailed transcripts that takes weeks to write will often sit on a shelf, unread. Its findings become obsolete before they can be implemented.
In contrast, a “top-5 problems” email sent to the development team the afternoon of a one-day study can lead to changes being checked into the codebase the very next day.
The goal must always be to shorten the cycle between observation and action. This requires prioritizing the rapid communication of qualitative insights over the painstaking documentation of quantitative metrics.
Performance Over Preference: Watching What Users Do
A foundational rule of usability is this: watch what users do, don’t just listen to what they say. People are notoriously poor at predicting their own behavior or even explaining it accurately after the fact. Behavioral data is almost always more valuable than preference data.

Subjective preference data (how much people say they like something) versus objective performance data (how well the thing actually works for people) are both interesting, but they are separate. When the goal is to design a better-working product, it’s insufficient to simply listen to what users say. (GPT Image-1)
In a 1994 paper I wrote with Jonathan Levy, “Measuring Usability: Preference vs. Performance,” we demonstrated that what users say they like often does not correlate with how well they can actually use a design. It is common for users to give higher subjective ratings to a visually appealing design, even if that same design causes them to take longer and make more errors on critical tasks.

Don’t go by what users say; that’s often wrong, because they don’t know what they need. You must watch real user behavior and deduce for yourself how to design something that will be better for the users. This will rarely be what they asked for. (GPT Image-1)
This is one of the strongest arguments for prioritizing qualitative observation over quantitative surveys and focus groups. A focus group might tell you that users find your new color scheme “modern” and “clean.” A usability test will show you that the low-contrast text in that same color scheme is unreadable and causes users to abandon their tasks in frustration. Which finding is more valuable? The answer is obvious. The most valuable data comes from seeing a user struggle, not from hearing them rate their satisfaction on a 7-point scale.

There’s truth in clicks: what users actually do when confronted with your design will quickly prove whether it’s good or bad. (GPT Image-1)
These principles from the discount usability era are not merely a collection of cost-saving tips. They form a coherent philosophical system. The goal of usability is to improve design. Improvement requires knowing what is wrong and, crucially, why it is wrong. The “why” is revealed by observing user behavior, which is qualitative research. To maximize the return on investment, these improvements must be made quickly and cheaply. This means running small, iterative tests and communicating the findings rapidly. This entire system, from goal to execution, is built on a foundation of qualitative, observational methods. Any organization that claims to be “agile” or “lean” but is not prioritizing small-scale, iterative qualitative testing has fundamentally misunderstood the principles of efficient product development.
We have a limited amount of time with our users. The time budget for direct user exposure is even more precious than our money budget. We should spend almost the entire user-time budget extracting nuanced, qualitative insights. It is a waste to spend half of a rare user session having them fill out long questionnaires that provide shallow, quantitative data that is not actionable.
The Modern Imperative: Why “Why” Is Paramount in the Age of AI
Some might think that our new, powerful AI technology would render my old principles obsolete. The opposite is true. AI makes the case for deep, qualitative research stronger than ever before.
AI’s Probabilistic Nature Creates Chaos for Quantitative Measurement
Traditional user interfaces are deterministic: click the same button, and the same thing happens every time. Traditional quantitative usability testing assumes a stable system where the UI is the main variable. AI introduces a second, random variable: the AI’s probabilistic response. The same prompt can yield different results on different occasions. A user’s success is now a function of both the UI and their “luck” with the AI’s output. This introduces a massive, uncontrollable source of variance into every user interaction.

AI is probabilistic: every time you roll the dice, you get a new result. User research should not be concerned with whether the user rolls a Yahtzee: individual results are irrelevant. Design should focus on systematic issues, such as whether the goggles help the duck fly through storms, or whether they make it harder for the duck to see where it’s flying and thus make it harder for it to avoid storms. (GPT Image-1)
This inherent randomness is a disaster for most quantitative research methods. It dramatically increases the statistical noise in the data, which makes it much harder to achieve statistically significant results. If one user succeeds at a task because the AI gave them a brilliant answer by pure chance, and another user fails because the AI “hallucinated,” what does an A/B test of the button label tell you? Very little. The variability from the AI’s performance will often overwhelm any small effect from the UI change you are trying to measure (unless you have such huge sample sizes that the study becomes prohibitively expensive).
Measuring simple outcomes, such as success rate, becomes far less meaningful. The truly valuable research question is no longer “Can users succeed?” but “How do users navigate, interpret, and make sense of this probabilistic system?” This is a question about sense-making, mental models, and coping strategies, all of which are best explored through qualitative, think-aloud observation. Teams building AI products who rely on A/B testing and analytics as their primary research method are flying blind. They are measuring the noise of a random system rather than the signal of human behavior.
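A back-of-the-envelope power calculation shows how badly AI randomness inflates sample sizes. The numbers below are illustrative assumptions, not figures from this article: suppose a UI change lifts task success from 70% to 75%, but success also requires a lucky AI response, which arrives only 60% of the time. The observable effect shrinks from 5 percentage points to 3, and the required A/B sample size more than triples.

```python
import math

def n_per_group(p1, p2, alpha_z=1.96, power_z=0.84):
    """Approximate per-group sample size for detecting a difference
    between two proportions (normal approximation, 95% confidence,
    80% power)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((alpha_z + power_z) ** 2 * variance / (p1 - p2) ** 2)

# Deterministic UI: the 5-point effect is directly observable.
print(n_per_group(0.70, 0.75))             # ~1,250 users per variant

# Probabilistic AI: success also needs a "good" AI response (60% chance),
# so the observable rates shrink to 42% vs. 45%.
print(n_per_group(0.70 * 0.6, 0.75 * 0.6))  # ~4,300 users per variant
```

Smaller observable effects sit in the denominator squared, which is why even modest AI-induced dilution makes quantitative testing prohibitively expensive.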
For this reason, the balance between qual and quant research tips more in favor of qualitative studies when AI enters into the picture.
The New Goal: Identifying Patterns of Interaction
With AI, the object of our study changes. We are no longer testing a static, predictable artifact. We are observing how a human being copes with an uncertain, probabilistic partner. The goal of a qualitative AI study is not to document a single bad AI response. It is to identify the underlying patterns of user behavior when they encounter certain types of AI behavior.
For example, our research should seek to answer questions like:
What mental models do users form about the AI’s capabilities and limitations?
How do users react when the AI provides a factually incorrect but confident-sounding answer?
What strategies do users employ to course-correct the AI when it goes astray?
How does a single bad experience impact a user’s long-term trust in the system?
Answering these questions requires a high-IQ user researcher who can perform “double pattern matching”: recognizing patterns of behavior across multiple users as those users encounter various patterns of AI behavior. This is the very definition of deep qualitative insight.

User research with AI systems requires double pattern-matching: identifying patterns of user behavior (and mental models) as users encounter patterns of shifting (and sometimes unreliable) AI behaviors, with the goal of redesigning the AI patterns. (GPT Image-1)
Aligning Research Methods with UX Maturity
As a general rule, I recommend allocating 90% of a user-research budget to qualitative methods and 10% to quantitative. However, applying this rule universally is a mistake. A company’s research strategy should align with its current stage of UX maturity. Attempting to use methods that are too advanced for your organization’s culture, skills, and infrastructure is a recipe for wasted money and diminished influence.
Spending 10% of your budget on Quant is a sound target, but it is only appropriate for companies at a high level of UX maturity. For organizations at earlier stages, the allocation must be different. Attempting large-scale quantitative benchmarking when your company doesn’t even run all design changes through a 5-user qual test is like trying to get a pilot’s license before you’ve learned to drive a car.
The following discussion provides a clear, scannable guide for leaders to assess their stage and allocate resources accordingly. It will prevent you from misallocating your budget by providing a stage-by-stage roadmap for building a research practice. Before you ask, “How should we spend our budget?” you must first ask, “What is our true maturity level?”
Low UX Maturity: 0% Quant Spend
You have no budget and no influence. Your only goal is to find a few glaring, undeniable usability catastrophes to convince one manager that UX is worth a tiny investment. Quantitative research is an unaffordable and irrelevant luxury.

If your organization has low UX maturity, remain standing on the one leg of qualitative research: cheaper, faster, and (most important of all) more persuasive, so that management will gradually award you the budget to move up the maturity ladder. (GPT Image-1)
The focus must be on finding big problems cheaply to justify getting a real UXR budget. Every dollar you can lay your hands on without an approved formal budget must go toward finding actionable design flaws that, when fixed, provide clear value. Wasting money on metrics at this stage is counterproductive.

Numbers tally, but stories talk. Storytelling is a powerful way to make usability findings memorable for stakeholders, and the best stories often emerge from qualitative research. (GPT Image-1)
Medium UX Maturity: 5% Quant Spend
At this stage, your company has an official UX design process with a more-or-less defined UX research process, which has been allocated a small budget. Qualitative observation is still king, but you can start allocating up to 5% of your budget to Quant. Five percent of a small research budget produces a tiny quant budget, probably only enough for simple methods like a post-task satisfaction question (such as the Single Ease Question) to start building a baseline. These numbers should be for tracking over time, not for driving design decisions.

I have primarily promoted usability testing in this article because it’s the most cost-effective qualitative research method and the easiest to learn for organizations with low UX maturity. But many other great qualitative methods should be embraced as you gain higher maturity and stronger research skills, the primary one being field studies: go to the users’ home or office and watch their natural behavior in their everyday habitat. (GPT Image-1)
High UX Maturity: 10% Quant Spend
The organization now has the stability, skill, and management support to run larger quantitative studies like benchmark tests, targeted A/B tests, or large-scale surveys to complement the deep insights from ongoing qualitative work.
Count yourself lucky if your UX efforts are mature enough to run a sophisticated, mixed-methods research program. Quantitative data should be used strategically to track overall UX health and identify areas that require deeper qualitative investigation. The budget for quant can increase because a high-maturity company has the expertise to use it wisely.
Ultimately, at the highest level, quantitative data (e.g., market trends, lifetime value) and qualitative data (e.g., in-depth ethnographic understanding of user lives) are seamlessly integrated to drive every aspect of the business. The budget will then be allocated by strategic initiative, not by research methodology. Even so, watch that you don’t overspend on Quant, which is always hungry for more resources.

When your company has matured enough to employ advanced user research, I recommend allocating up to 10% of the research budget to quantitative methods. (GPT Image-1)
Recommendations: How to Invest Your Next UX Dollar
Here are my main recommendations:
Prioritize “Why” Above All Else. Allocate the vast majority of your time, money, and brainpower to qualitative, observational studies. Your primary goal is to understand the reasons for user behavior. This is the only path to truly innovative and successful design. Remember: Why Beats What.

It’s worth repeating my favorite UX slogan one more time: Why Beats What. Watch users’ behavior, don’t just listen to what they say in a survey. (GPT Image-1)
Embrace Frugality and Iteration. It is always better to conduct three separate tests with 5 users each than a single test with 15 users. The goal is rapid, iterative improvement fueled by a steady stream of fresh qualitative insights. This is the essence of discount usability.
Use Quant to Track, Not to Discover. Reserve quantitative methods for when you have a mature team and a relatively stable product. Use metrics to monitor the health of your user experience over time and to identify “smoke,” such as a drop in a key metric, that signals the need for a qualitative “firefighter” to investigate and find the cause of the issue.
Assess Your UX Maturity, Honestly. Before planning any research or setting budgets, determine your organization’s UX maturity. Do not fool yourself. Overestimating your maturity is a common cause of wasted research budget and failed UX initiatives. Ask yourself whether you want to use advanced methods for your personal enjoyment or whether your team has truly exhausted the insights to be gained by much more straightforward and cheaper (qualitative) means.
The 90/10 Rule Is a Goal, Not a Starting Point. The 90/10 budget split between qualitative and quantitative research is a sound guideline for a mature UX organization. Do not attempt to apply it if you are at an earlier stage. Your resource allocation must evolve with your capabilities, starting at 100% qualitative to find problems and prove value. Only gradually should you introduce quantitative methods when your team has the skill and your organization has the culture to support them properly.

Since humans (or badgers) first sat around the campfire and exchanged stories, storytelling has been a powerful means of communication. People remember stories, whereas statistics often end up in an unread pile. The best stories often emerge from qualitative observations, offering colorful anecdotes. (It’s an ethical mandate only to use anecdotes that align with your broader qualitative analysis.) (GPT Image-1)
Qualitative research is the workhorse of UXR. Keep riding it for success. Above all, resist the lure of number fetishism. Don’t squander resources chasing tiny percentage gains when there are opportunities for much larger gains from fixing glaring issues found by qualitative studies. Design research is closer to ethnography than physics; its power lies in understanding people, not in counting clicks.

Remember that we’re in the insights business when doing user research. Qualitative studies are usually the best way to create profits for your company. (GPT Image-1)