In the first four episodes of the AI in UX research series, we asked industry experts questions about the AI-UX research relationship. Make sure not to miss the previous episodes and the final report from our AI in UX research survey.
Our series covers the following topics:
- Episode 1: Is the rise of AI use a benefit or a detriment to UX research?
- Episode 2: What would be the one aspect of UX research that is best compatible with using AI?
- Episode 3: Can UX researchers remain market viable if they don’t choose to adopt AI?
- Episode 4: Thoughts on AI-generated responses / AI-based users
- Final report: Results of the AI in UX research survey
In this episode, we will look at the answers of industry experts to the question:
“What are your general thoughts on AI-generated responses / AI-based users?”
With the first question we wanted our experts to provide a general opinion, their general stance on the AI discourse within the scope of UXR. With the second question we wanted them to think of at least one aspect where AI can truly be a benefit. With the third question we wanted to look at the impact a wide introduction of AI could have on the job stability of the members of our UX community. The final question took on the most polarizing and possibly controversial topic: using generative AI instead of human participants.
We had our own stance on this topic, but we wanted to provide as objective a view as possible by gathering the opinions of industry experts for you. The responses we received were not as polarized as we thought that they might be, but they confirmed the reputation of this topic as controversial.
Here are the industry experts and thought leaders we asked for their opinions:
- Darren Hood, MSUXD, MSIM, UXC
- Caitlin D. Sullivan
- Dr Gyles Morrison MBBS MSc
- Stéphanie Walter
- Kevin Liang
- Debbie Levitt, MBA
- Joel Barr
- Nikki Anderson-Stanier, MA
- Julian Della Mattia
- Kelly Jura
- Ben Levin
In the next paragraphs, we will list the answers we managed to gather and at the end, we will let you in on our stance as well.
Darren Hood, MSUXD, MSIM, UXC
The only place I can think of where AI-generated responses could possibly be useful is for the development of proto personas. It should be noted that there’s currently a “war” being waged against personas-at-large by clients and stakeholders, so this method must be approached carefully.
Caitlin D. Sullivan
Founder of User Research Consultancy and UX Research Advisor. You can find Caitlin on her LinkedIn.
I look to AI-generated responses as interesting indicators and summaries of information out there in the world, but usually on a superficial level. I don’t see them yet as reliable enough to say they can be the sole source of information for research we can trust.
At this point, I am firmly against the concept of AI-based users, because there are too many contexts, scenarios, life experiences that I don’t believe AI can “understand” in a way that replicates the responses and behavior that a real user can produce. That makes no sense to me, to be honest. The entire point of running any kind of discovery with users is to understand what drives their behavior, which is based on their life experience.
Dr Gyles Morrison MBBS MSc
Clinical UX Strategist and UX Mentor. You can find Gyles on his LinkedIn.
I generally think AI responses should only be used for questions one would ask Google or another search engine. It can help with literature review as long as the system doesn’t hallucinate.
Getting generative AI to come up with user responses is literally generating stories the AI thinks are right. It’s completely unacceptable and shouldn’t be tolerated.
Stéphanie Walter
I don’t think they should be used at all. There are a couple of issues. First, user research is about talking to the right users, at the right time. AI-generated response tools today are mostly based on LLMs, trained on data that is, well, most presumably not from your users. The dataset is usually a black box: you have no idea what it was precisely trained on. Why would you want to make a design decision based on data that doesn’t reflect your actual target audience?
The second issue is the way it works. When you ask something of a tool based on an LLM, the answer is constructed by asking “what is the next word that is statistically plausible after this one?”. It is not trying to answer the question, but to find the next word that works best in the sentence. So here again, do you want to make design or product decisions based on that?
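The point above about “the next statistically plausible word” can be made concrete with a deliberately tiny sketch. This is not how a real LLM works internally (those use neural networks over token sequences, not raw word counts), but it illustrates why the output is a plausible continuation of the prompt rather than an answer to it:

```python
# Toy illustration (not a real LLM): pick the next word purely by how
# often it followed the previous word in some "training" text.
from collections import Counter, defaultdict

training_text = (
    "the user taps the button the user opens the menu "
    "the user taps the screen"
).split()

# Count which word follows which in the training data.
follows = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    follows[prev][nxt] += 1

def next_word(word):
    """Return the statistically most plausible next word, or None."""
    candidates = follows[word]
    return candidates.most_common(1)[0][0] if candidates else None

# The "answer" is just the most frequent continuation in the data,
# not an understanding of what any user actually wants or does.
print(next_word("the"))  # → "user" (the most common follower of "the")
```

Scaled up by many orders of magnitude, this is still the core loop: the model continues your prompt with whatever its training data makes statistically likely, which is Stéphanie’s second objection in a nutshell.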
Kevin Liang
People (including PMs and designers) have asked about AI tools like “synthetic users” – about using them in place of primary research. Synthetic Users was created by product managers to conduct user research with “AI users” to validate their product ideas.
As an aside, we should stop approaching development with the mindset of validating something, at least, I don’t like that word “validate”. In essence, semantically, that biases us to go looking for stuff that works, rather than going in with an empirical, testing mindset.
Anyway, with S.U. you’re essentially chatting with a bot about your ideas. And a bot that has “averaged” others’ responses, which may amplify invalid research that was provided into the AI, or worse, amplify stereotypes. Whatever comes out of such tools should be treated as secondary research.
Primary research, in its definition, means collecting data straight from the source. I would assume that your source should be humans, not AI. So my question to you would be: are you designing for AI as your end user? Or humans as your end user?
Debbie Levitt, MBA
These tools might call themselves UX research, CX research, or user research, but they’re not, because those types of research, and especially qualitative research, require us to interface with our actual customers, users, or target audiences. We don’t want AI to represent target audiences because AI is not sophisticated enough at this time to represent them well. We have problems of racism, sexism, ableism, and things like that in AI.
We already don’t understand our customers very well. We don’t take the time to do that. We claim we have empathy, but we don’t want to deeply research our users because someone decided that was too slow or it takes too much time and money. Customers are begging to be understood so that we can make something better for them, and bring more value for the money they’re paying us. But we’ve decided, let’s just do this as fast as possible, which is a mismatch.
While there are machine learning systems that claim they can represent your users, my question is, are they representing your users well? It’s best to put our users into segments, buckets, or typologies by their behavior. I’m a 51-year-old woman and let’s say I’m going to shop for a car. What if I shop for a car the same way that a 20-year-old man does? Is someone going to say, “No, no, no, we have to present this and advertise this and build this differently because that’s a 51-year-old woman!”? If we were to focus more on segmenting by behaviors, then we wouldn’t be so caught up on how old someone is, how much money they make, or whether or not they’re married.
These are old-fashioned ways of looking at our users. It’s really left over from when marketing and advertising just wanted to promote stuff to white men and sometimes white women. Back then, the main questions were, “How much money did they earn? Where did they live?” AI has inherited all of this, and that’s what’s going to be baked into your fake AI users. If you wanted to create an “AI user” based on human behavior and decision-making, are we really there yet? Can it do that? Can that AI hear about your potential idea and say, “Oh yes, this person would like that.”
We don’t know, and even our users don’t know. We survey them all the time, we call them up and we say, “Hey, we’ve got this idea. Do you like it? Would you buy it?” We get lots of yes. And then people at a meeting say cool, 87% of people said they want this, let’s build it. We have no idea what people envisioned when they said yes. They might want what they envisioned – a product or a service or a task – and we don’t know what’s in their head. We only know that they generally like the idea of where that solution might go. “Hey, would you like a more comfortable backpack for traveling?” Sure. Hey, everybody wants a comfortable backpack. Then you design a backpack that you think is great, and it turns out it’s not right for your users. Now what? Did we really get to know those users? Did we understand their tasks? Did we understand their decisions? Did we understand their environment and context? Did we understand their disabilities, diagnoses, or conditions?
It’s nearly insulting to ask AI to pretend to be these things. That’s also why you tend to see these AI users’ services and systems mostly advertised to Marketing people and Product Managers. They know they’re not going to win over the UX people. They know that the UX people are going to look at this and say, “Why do you want to replace my users, the precious users that I so desperately want to know better, with an algorithm that pretends to be them based on minimal and often demographic information?” These companies know that they’re unlikely to sell it to us. So they sell it to someone else on the basis of speed and cost, not necessarily on the basis of accuracy. Just read some of these companies’ terms and conditions, and you can see they don’t want to be responsible for any of the advice they give you, but they would love to sell you some AI advice.
Joel Barr
Joel is a Lead User Researcher. You can find him on his LinkedIn.
AI-Generated responses as users are fake. There is no such thing as an AI-based user. Any company that tells an enterprise otherwise is asking for that business to trade the family cow for a handful of magic beans. (Ask Jack how that worked out for him.) A C-suite or PM that falls for it deserves neither title nor position to serve humanity as a business leader, because it demonstrates that they/those people ultimately don’t care about the user. They care about pushing useless, poor products out the door for a buck, damn the user and what they need.
Nikki Anderson-Stanier, MA
AI generators lack so much context, empathy, flexibility, and nuance, so solely utilizing AI-based insights can have a very negative impact on making the best decisions for users and for a business. Not only are AI-generated insights and users way too vague, they don’t take into account the context behind the study, such as the goals, open-ended interview questions, or precise participant criteria. They also have a hard time taking business goals into account and making sure the most relevant data they surface is tied to those goals.
Additionally, we have to be very careful when inputting data into AI as user research data is meant to be highly confidential!
Julian Della Mattia
Using AI-based users for research has the same value as going to a fortune teller for life advice. The AI tools out there are language models, so there are numerous things they won’t be able to reproduce. If real humans are sometimes unable to properly articulate their needs or thoughts, do you think a language model would?
Kelly Jura
We design products that cater to the needs of multidimensional humans. The idea of AI-based users makes me uneasy, as AI is based on data, and that data can be biased. If AI users are built on biased data, there is a risk of further excluding underrepresented groups when designing products.
Ben Levin
UX Researcher & Strategist. Managing Partner, Chamjari. You can find Ben on his LinkedIn.
Lately, there’s been some brouhaha about the idea of using Generative AI as an “artificial user”.
At the outset, I’ll say I’m highly skeptical of the utility of such a thing, in most contexts. But let’s keep an open mind.
Automated testing tools are not new, but they have relied on the ability of a QA engineer, for example, to write a test script, which an automated set of software can interpret and execute.
So as an extension of that kind of “user”, there’s a possibility that an “AI user” could have some utility. As I’ve discussed in the past, this really is more akin to asking a text generation tool to complete the sentence, “a user would accomplish [specific task] by…” and waiting for the completion.
There’s a world in which a Generative AI tool could speed up some aspects of usability testing, and maybe even preference testing for a well-defined use case (i.e., one for which you are not developing a completely novel solution). Again, remember that LLMs are text completion engines; they are searching an information space for a “best match” for the query you’re presenting. Think of them as “best practice evaluators.”
The more you are working on a novel problem or use case for which there is no known ideal solution, the more generic and less useful an artificial user will be.
In a nutshell, this is a sort of Möbius strip kind of problem: the closer we get to having a general intelligence AI that can serve as a useful Artificial User, the closer we will be to having that same AI design good solutions to user problems that won’t need testing in the first place.
Our two cents
We agree with basically every general opinion that our experts provided. To sum it all up, we think all that needs to be said is that these responses/participants simply aren’t human. This lack of humanity has multiple aspects:
- Compared to the cold, hard, calculating AI, humans are simply chaotic. They like to decide based on a whim just as often as based on logic, and it’s hard to tell up front which side will “win”. Humans make decisions that are objectively incorrect or bad; AI would never do that, because its inherent aim is to find the “correct” answer.
- AI is biased. Yes, you could argue that humans are all biased as well, but that’s why you are conducting the studies with multiple participants, to compensate for this bias. If you generate multiple responses from one model trained on one dataset by one team of experts, you will have the same bias hidden in every answer. Also, you can always ask a human respondent additional questions to learn more about how biased they are, why they are biased, and how their biases may influence their behavior and decisions. AI is not able to answer these questions.
- Creativity, curiosity, improvisation, and intuition are strictly human attributes, which cannot be simulated (at least not yet). When you are conducting research with human participants, they will find unusual solutions to your tasks. These solutions may not be the most efficient or 100% correct, but they provide you with a perspective you might not have discovered in a hundred years, and one AI certainly wouldn’t surface either. Many of you must remember a moment from one of your studies when you asked your user why they did something or behaved in a certain way and they simply answered: “Because it felt right”. This is an interaction you will never get from a mathematical construct.
- The whole is greater than the sum of the parts. This has been true since the times of Ancient Greece. If you put all the demographic attributes your user should have into an LLM and ask it to behave like this persona, you will get the sum of the parts. The missing aspects are what make humans human. Yes, you can ask your model to be a 50-year-old taxi driver from NYC and it will make decisions based on what it thinks it means to be this person, but only a 50-year-old taxi driver from NYC knows what it means to be them, to be the whole.
- Each human is unique. We all know that if we want our research to truly cover all aspects of our user base, we need to address this uniqueness. From different socioeconomic backgrounds, through all the unique forms of neurodiversity, all the way to conditions such as dyslexia or daltonism: you cannot simulate this. Generating an average answer would destroy this uniqueness.
To simplify it all: you are creating products for humans, so talk to them!
What to look forward to?
Next, we will bring you a comprehensive report containing the results of the survey on how the UX community views the current state of AI. Stay tuned!
Let us know your answer to our question in the comments on our LinkedIn!