Usability tests are essential to creating valuable, successful products. However, they can become a headache if you don’t run them efficiently, and that starts with recruiting the right number of usability testing participants.
The 5-user rule: Cost-effective and optimal usability testing
According to the reputable Nielsen Norman Group, ‘testing with 5 people lets you find almost as many usability problems as you’d find using many more test participants.’
The logic behind their ‘5-user’ suggestion is that as you test more and more people, you uncover fewer new insights at a higher cost. After about 5 people, additional participants mostly repeat the usability issues already found and add very few new ones.
So it’s economical and optimal to test just enough participants to get sufficient insights at a low cost. Thus, the 5-user rule.
Laura Faulkner analyzed this rule further in her study. Her results show how the percentage of usability problems found changes as the number of participants grows:
Number of Participants | Minimum % Found | Mean % Found |
5 | 55 | 85.55 |
10 | 82 | 94.686 |
15 | 90 | 97.050 |
20 | 95 | 98.4 |
30 | 97 | 99.0 |
40 | 98 | 99.6 |
50 | 98 | 100 |
According to this table, testing with 5 users will be enough in most cases: on average they find about 86% of the problems, although in the worst case only 55%. So if you are testing a part of the UI where user errors would have a significant impact, it might be a good idea to test with 10 or 15 participants to raise the minimum percentage above 80%.
If we consider the increase in problem detection rate per additional user, the most cost-effective number would be closer to 10. This number represents a substantial improvement over just five users while avoiding the diminishing returns seen as you approach 20 users. Thus, for most projects, using around 10 participants may provide the best balance between cost and effectiveness in usability testing.
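To make the "gain per additional user" reasoning concrete, here is a minimal Python sketch that takes the rows of the table above and computes how many percentage points of the mean detection rate each extra participant adds between consecutive rows. The variable and function names are illustrative only, and the figures are simply copied from the table.

```python
# (participants, minimum % found, mean % found), copied from the table above
faulkner_rows = [
    (5, 55, 85.55),
    (10, 82, 94.686),
    (15, 90, 97.050),
    (20, 95, 98.4),
    (30, 97, 99.0),
    (40, 98, 99.6),
    (50, 98, 100.0),
]


def gain_per_added_participant(rows):
    """Return (from_n, to_n, mean-% gain per extra participant) for consecutive rows."""
    gains = []
    for (n_prev, _, mean_prev), (n_next, _, mean_next) in zip(rows, rows[1:]):
        per_participant = (mean_next - mean_prev) / (n_next - n_prev)
        gains.append((n_prev, n_next, per_participant))
    return gains


for n_prev, n_next, gain in gain_per_added_participant(faulkner_rows):
    print(f"{n_prev:>2} -> {n_next:>2} participants: "
          f"+{gain:.2f} percentage points per extra participant")
```

The step from 5 to 10 participants adds far more per person than any later step, which is the pattern the paragraph above relies on.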
When to use more or fewer than 5 participants
Of course, the 5-user rule does not always apply. Some types of product need a larger number of participants to produce meaningful test results. For example, e-commerce websites can be much more complex than software products, so testing with only five participants may uncover just 35% of their usability problems.
You may also need more or fewer than 5 participants for your usability study if you:
Have different target audience segments
When a product has multiple user groups that behave differently, you’ll need to test participants who represent each group.
For instance, a marketplace app would need participants for both the buyer and seller profiles. In this case, you don’t need to have five participants for each user group since there would be overlapping observations, and you could use as few as 3 or 4 participants for each group.
The overlap of the findings will differ based on the similarity of the user subgroups you need to test. The more similar the groups are, the higher the overlap will be.
Need quantitative data
In quantitative usability studies, your results must be statistically significant before you can trust them. For this, you will need a minimum of 20 to 40 test participants.
Since the focus is on measurable metrics and not qualitative findings, you will need many participants to get enough data that can accurately predict the behaviors of your overall target users.
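One way to see why a measured metric needs more participants is to look at the uncertainty around it. The sketch below is an illustration under stated assumptions, not a method taken from this article: it uses the adjusted Wald (Agresti-Coull) interval, with z = 1.96 for roughly 95% confidence, to compare the range around an 80% task completion rate measured with 5 participants and with 40. The function name and the example numbers are hypothetical.

```python
import math


def adjusted_wald_interval(successes: int, n: int, z: float = 1.96):
    """Approximate confidence interval for a completion rate (adjusted Wald).

    Assumption: z = 1.96 corresponds to roughly 95% confidence.
    """
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Hypothetical results: the same 80% observed completion rate at two sample sizes
for successes, n in ((4, 5), (32, 40)):
    low, high = adjusted_wald_interval(successes, n)
    print(f"{successes}/{n} completed: plausible range {low:.0%} to {high:.0%}")
```

With 5 participants the plausible range is very wide, so the number says little about your overall user base; with 40 it narrows considerably.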
Use an Agile UX process
In an Agile UX approach, you run multiple usability tests as the product develops since it is an iterative process. In this development style, you conduct tests, use the insights obtained to make changes, and then test the new version – and on and on it goes.
So rather than using five or more participants for each test, you can run multiple tests with as few as 3 participants each, since there would be overlapping insights and discoveries. Using fewer participants also helps to save costs on your study.
Pros of using 5±2 participants in your usability study
There have been different optimal numbers of participants suggested by researchers over time. Having a minimum of 3 participants ensures that you capture diversity in your user group while having as many as 7 participants ensures that you uncover almost all the usability problems in the product.
So the ideal number of participants lies in the range of 3 to 7 for the following reasons:
1. It makes the most of the law of diminishing marginal returns
In the usability testing context, the law of diminishing marginal returns states that your study will reach a point where each additional participant yields fewer new insights. A graph put forward by the Nielsen Norman Group illustrates this.
[Figure: graph illustrating diminishing returns in usability testing. Source: NN/g]
Here,
- Testing the first participant gives fresh new insights,
- Testing a third participant uncovers many problems the first two participants may have missed,
- Testing the sixth participant still uncovers new problems the first five participants may have missed,
- But by the 12th participant, there would be little to no new insights as most usability problems have already been discovered.
So for practical reasons, testing between 3 and 7 participants gives you that sweet spot where you obtain sufficient insights without falling into the diminishing marginal returns trap.
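The shape of this curve is often modeled with the formula published by Nielsen and Landauer: share of problems found = 1 - (1 - L)^n, where n is the number of participants and L is the share of problems a single participant uncovers. The sketch below assumes L = 0.31, the average value NN/g reports; that value is not stated in this article and will differ from product to product, so treat the output as a rough illustration only.

```python
def share_of_problems_found(n_users: int, detection_rate: float = 0.31) -> float:
    """Estimated share of usability problems found after testing n_users.

    Assumption: detection_rate = 0.31 is the average single-user detection
    rate reported by NN/g; your own product may behave differently.
    """
    return 1.0 - (1.0 - detection_rate) ** n_users


for n in (1, 3, 5, 7, 12):
    print(f"{n:>2} participants: about {share_of_problems_found(n):.0%} of problems found")
```

Under this assumption, the gain from each added participant shrinks quickly, which is why the 3 to 7 range captures most of the value.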
2. Testing 3 to 7 users is cost-efficient
When developing a product, one main reason usability testing faces a lot of pushback from stakeholders is its cost. Depending on the type of usability test, the number of participants recruited, and the product type, among other factors, study costs can range from as little as $25 to $10,000 (U.S. dollars) and even much more.
From recruiting agency fees to software costs and procuring incentives for your participants like gift cards, there are a lot of expenses and hidden costs incurred in running a single usability test. For example, if you are running a moderated usability test, you need to include the cost of the moderator’s time as well. And since costs rise as the number of participants increases, testing a few participants helps you minimize costs and maximize your budget.
Pro tip: Before you recruit your target audience, run a pilot test with 1-2 participants to check the clarity and comprehensiveness of your tasks, questions, and the study overall. This will allow you to eliminate issues before the study goes live, minimizing the risk of collecting invalid or unusable data and wasting time and money.
3. Testing 3 to 7 users is time-efficient
Usability tests often take a lot of time, from setup to test sessions to analyzing the results. Depending on the number of participants and the type of study conducted (moderated or unmoderated), a study can take anywhere from a day to many days to complete.
It is estimated that conducting usability tests could take anywhere from 11 to 48 hours for only 5 participants. So rather than spending a lot of time testing many participants, test only a few who will give you more or less the same results in less time.
Does the 5-user rule hold for different UX research methods?
UX research methods include focus groups, surveys, A/B tests, card sorting and many more, so the type of method carried out plays a huge role in determining the number of participants selected. For instance, card sorting has a different optimal number of participants compared to eye-tracking or usability testing.
How many participants for qualitative research?
Qualitative methods rely on personal accounts, not on numerical data. Therefore, you don’t need as many participants as with quantitative methods, since reaching statistical significance is not one of the goals of this research approach. The rule of 5 users should work rather reliably with these types of studies.
Website or mobile usability testing
With website usability testing or mobile app testing, you can safely stick to around 5 participants to uncover most issues. Go as high as 8 or 9 if you have the resources and want to go more in-depth. However, make sure you don’t overwhelm any one participant with too many questions. If you need to test a complex website or app, split the testing process into multiple sessions instead.
How many participants for quantitative research?
Quantitative research focuses on cold, hard numerical data. However, such results can only be trusted once they reach statistical significance. For this to happen, you need to gather many more responses than the “golden” 5.
Card Sorting
Card sorting is a generative method that aims to find out how people organize and find their way around content. Because people’s mental models vary widely, you would ideally need significantly more than 5 participants to get representative results.
The Nielsen Norman Group proposes testing with 15 participants, three times the suggested standard of 5, while other researchers insist on testing 20 to 30 participants to get sufficient insights. The difference is that 15 participants would give you a correlation of 0.90 between the study results and the real-life results, while 20 participants would give you a correlation of 0.93.
However, if you’re dealing with a large project and have a lot of risks involved, you could test as many as 30 participants.
Tree testing
Tree testing is another quantitative method, so you need a larger sample of testers to uncover statistically significant results. The Nielsen Norman Group recommends at least 50 participants. The ideal number for a tree test lies somewhere between 50 and 150.
Preference test
Preference testing is another research method, in which testers help you decide between 2 (or more) design options. As a quantitative method, it relies on a large pool of responses to provide a reliable result. The Nielsen Norman Group suggests 40 participants as a good number for preference testing; other sources quote 20 to 30 as also being acceptable.
When running a Preference Test with UXtweak, we use the chi-square goodness-of-fit test (https://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm) to calculate the statistical significance of the results. The significance scale is as follows:
- 99%-100% – The results are surely statistically significant
- 95%-98% – The results are probably statistically significant
- 90%-94% – The results tend toward statistical significance
- 51%-89% – The results probably aren’t statistically significant
- 0%-50% – The results aren’t statistically significant
To calculate the significance properly, you need to gather at least 5 times as many responses as there are designs tested in a task.
To ensure statistical significance for different numbers of designs, we suggest gathering at least the following numbers of responses:
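As a rough illustration of how such a percentage can be derived, here is a minimal Python sketch of a chi-square goodness-of-fit test on preference votes. It assumes the null hypothesis that every design is equally preferred and reports 1 minus the p-value as the significance percentage; the exact calculation UXtweak uses is not described here, so the function names, the equal-preference assumption, and the example vote counts are all hypothetical. The tail probability is computed with standard closed-form expressions for integer degrees of freedom so that the example needs only the standard library.

```python
import math


def chi_square_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite correction series
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * normal_pdf * total


def preference_significance(votes: list[int]) -> float:
    """Return 1 - p for a goodness-of-fit test against equal preference (assumed null)."""
    designs = len(votes)
    responses = sum(votes)
    if designs < 2:
        raise ValueError("need at least 2 designs to compare")
    if responses < 5 * designs:
        raise ValueError("need at least 5 responses per tested design")
    expected = responses / designs
    statistic = sum((v - expected) ** 2 / expected for v in votes)
    return 1.0 - chi_square_sf(statistic, df=designs - 1)


# Hypothetical votes for two designs, 69 responses in total
print(f"{preference_significance([45, 24]):.1%}")
print(f"{preference_significance([35, 34]):.1%}")
```

A clear split between designs lands near the top of the scale, while a near-even split lands near the bottom, even though both examples use the same number of responses.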
Number of designs to compare in a single task | Recommended number of respondents for the task |
2 | 69 |
3 | 89 |
4 | 98 |
5 | 108 |
6 | 116 |
You can learn more about statistical significance with UXtweak’s Preference Test here.
Mixed methods
Some methods cannot be categorized as inherently qualitative or quantitative, because they can be run either way depending on the approach you take.
Prototype usability testing
If you are running your Prototype test in a fashion similar to website usability testing (screen recording with think-aloud, ideally a face cam as well), then around 5 participants are enough.
Pro tip: UXtweak Prototype Testing Tool allows you to run usability tests with think-aloud.
However, if you decide to rely on the quantitative approach, focusing on path aggregations and heatmaps, then 5 participants won’t be anywhere near enough. In this case, you should aim for at least 25 participants. For both approaches, you can always go slightly higher to be sure.
Eye-tracking
An eye-tracking study can produce both qualitative and quantitative data, so the type of data you want affects the number of participants you need. In the real world, you would most likely run eye-tracking research in a qualitative format, since gathering enough participants to generate statistically significant heatmaps would simply be too expensive. For qualitative data, Pernice and Nielsen suggest using 5 participants, who can then be tested for a longer duration.
Heatmaps
If you’re looking to generate heatmaps, Pernice and Nielsen suggest recruiting and testing as many as 39 participants. Since heatmaps are classified as quantitative tests, you need more responses to ensure the results are statistically significant.
Survey
Survey research can also be approached as either a qualitative or a quantitative technique for gathering data from a group of respondents.
In a survey, the number of respondents also matters. If you don’t recruit enough, you risk getting an inaccurate result. This is especially the case for quantitative surveys consisting of single-choice, multiple-choice, or scale-based questions. According to the Nielsen Norman Group’s research, these types of studies require an average of 40 respondents. If you choose to run your survey in a more qualitative manner (just a few open-ended questions), then the rule of 5 users should apply to a reasonable extent.
How many respondents does your study need?
The number of participants a usability study requires depends on the study question, the test’s design, and the level of confidence that researchers want to place in the results. As a general rule of thumb, 3 to 7 participants is the commonly recommended sample size for usability testing.
However, a larger number of participants may be required depending on the research method and the type of data you’re looking to gather. In usability tests where you’re gathering qualitative data about user behavior, 3 to 7 participants may be enough, but quantitative research studies are different. In card sorting, for example, you may need around 30 to 40 respondents, and sometimes even more, to get accurate statistics.
Conclusion: The perfect number doesn’t exist
Researchers have debated this question for decades, and there is still no consensus on the number of participants needed for usability studies. Several factors, such as the product type or size, the stage in the usability life cycle, the tasks conducted, the skill of the researcher, and even the personality of the participants, all affect the appropriate number of participants for your study.
But one thing all UX researchers agree on is that there is no one-size-fits-all number of participants for your usability study. So always take your own context into consideration.