3 Wishes for AI UX
Summary: A visionary course for AI's future requires three transformative shifts: UX as the linchpin of AI development, a hybrid interface marrying GUI and natural language, and Super-AI integration.
I dreamt that I was visited by the Lilac Fairy. She granted me the customary 3 wishes, stipulating that I could not wish for additional wishes. Faced with this constraint, my aspirations crystallized into three transformative shifts in AI that will help the world the most.
The Lilac Fairy, iconic from the ballet The Sleeping Beauty, will hopefully bestow upon us three paradigm shifts in AI User Experience. (Ballerina by Leonardo.)
Gold Medal Wish: UX Driving AI Products
By far, the most crucial wish is #1: having strong UX involvement in planning, designing, and implementing all AI products, from foundation models to vertical applications. This is my gold medal goal, and if it happens, the silver and bronze goals will be less critical because similar improvements will follow due to a strong UX impact on AI.
Ideally, I would wish for UX to drive the development of all future AI products, such that they meet user needs and are easy to use. However, that’s too much of a fairy tale dream to happen any time soon. Even traditional software is rarely developed this way, and the labyrinthine arena of AI is currently the geeks’ playground. All we can realistically hope for is that business requirements make AI companies think more about their customers and that they hire competent UX teams to influence (but not drive) product development.
Right now, all the prominent AI products have terrible usability. They often defy elementary design principles that even a fledgling UX designer would know to avoid. The glaring faux pas committed by platforms like ChatGPT and Midjourney are nothing less than an insult to the collective wisdom of the UX community.
Immediate UX Opportunities
AI companies only need to hire a minimal UX team and conduct the cheapest qualitative usability studies with 5 users to derive a plethora of actionable insights. Some of these design improvements will be fast and cheap to implement. Others will require major redesigns of the applications. Finally, some deep UX advances will be difficult and expensive to implement because they must be built on top of a redesigned AI engine with human-centered capabilities.
This mirrors the evolution of web usability. The indispensable role of search engines in managing the web's burgeoning data became evident almost immediately after the web transitioned to a graphical interface. But it wasn't until Google introduced PageRank in 1998 that search became accessible to the average user.
The current state of AI UX is so miserable that two days of usability testing will identify plenty of low-hanging fruit. (I will happily advise OpenAI and Midjourney on how to do this. For free, no less, since I will save more than two days of my life if these products would only improve their usability.)
The cliché of the low-hanging fruit is true for all current AI products. (Generally, it’s true for any design that has never been subjected to usability testing: one or two days of testing will give you enough insights to improve the user experience dramatically.) In fact, for Midjourney, some fruit is so low that it’s on the ground, ready to be picked up, as demonstrated by the examples in my article about current AI product usability. (Fruit tree by Leonardo.)
Long-Term Opportunity: Trillions of Dollars at Stake
Even though the low-hanging fruit is real and should be picked first, the most critical issue for the future of artificial intelligence is for these companies to rapidly advance up the UX maturity scale and build high-powered UX teams and repeatable UX processes. Productivity gains of several hundred percent are easily within reach for the world’s users (who will soon include virtually every company in the world) if AI products are built from the ground up to serve human needs and respect human limitations and capabilities.
The gain to the world economy from true AI UX will be somewhere between one trillion and ten trillion dollars per year. The investment needed to achieve these gains will be about $100 M annually for OpenAI or other vendors of big foundation models and $10 M annually for Midjourney or other vendors of specialized tools. This sounds expensive, but the total cost across all main AI projects will be less than 1% of the world’s gains. While most of this economic surplus will accrue to companies implementing improved AI products, the vendors will realize some of it from larger sales volume at higher prices. The vendors will make much more than they need to spend.
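To make the "less than 1%" claim concrete, here is a minimal back-of-the-envelope sketch in Python. The gain range and the per-vendor budgets come from the paragraph above. The number of vendors in each group is not stated anywhere in this article, so the two counts below are purely hypothetical placeholders that you should replace with your own guess.

```python
# Figures taken from the article (US dollars per year).
GAIN_LOW = 1_000_000_000_000        # one trillion: low end of the estimated gain
GAIN_HIGH = 10_000_000_000_000      # ten trillion: high end of the estimated gain
COST_FOUNDATION = 100_000_000       # UX budget for a big foundation-model vendor
COST_SPECIALIZED = 10_000_000       # UX budget for a specialized-tool vendor

# ASSUMPTION: hypothetical vendor counts, not from the article.
N_FOUNDATION = 10
N_SPECIALIZED = 50

total_cost = N_FOUNDATION * COST_FOUNDATION + N_SPECIALIZED * COST_SPECIALIZED

# Cost as a share of the gain, for the pessimistic and optimistic gain estimates.
share_of_low_gain = total_cost / GAIN_LOW
share_of_high_gain = total_cost / GAIN_HIGH

# The budget ceiling implied by "less than 1%" of the low-end gain.
one_percent_budget = GAIN_LOW // 100

print(f"Total annual UX investment: ${total_cost:,}")
print(f"Share of low-end gain:  {share_of_low_gain:.3%}")
print(f"Share of high-end gain: {share_of_high_gain:.3%}")
print(f"1% of low-end gain:     ${one_percent_budget:,}")
```

Under these assumed counts, the total investment stays well below 1% of even the low-end gain. The claim would still hold with several times as many vendors.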
But even rich companies like OpenAI or Microsoft can’t just open the checkbook and write a check for $100 M to be spent improving their user experience. It takes time to achieve the big gains. The deep redesigns will most likely require more than 5 years of foundational research, many design attempts, and substantial rewrites of the underlying technology.
But remember the story of the French general whom Napoleon had awarded a château. He wanted majestic chestnut trees planted along the road to his new front door. However, his master gardener told him, “Mon Général, it’ll take a hundred years for these trees to grow tall.” To which the General responded, “In that case, plant them today!”
The anecdote of the General and the tall trees reminds us that some of the most significant and impactful actions will not bear fruit immediately but are nonetheless essential for long-term success. (Château by Leonardo.)
Whether you identify more with the French general or the low-hanging fruit gatherer isn’t the question. AI companies must embrace both perspectives: immediate action to fix their worst usability problems and sustained investment to build a high-maturity UX capability.
Silver Medal Wish: Hybrid UI for AI, Mostly GUI-Based
My second wish is something that can be achieved reasonably soon, though to do it well will require more UX effort than a two-day usability test followed by some quick fixes of the worst flaws. We must abandon the prompt-based UI, which presents an articulation barrier to most users.
The current linear UI requires endless scrolling, which complicates common user behaviors such as Accordion Editing (expanding or shortening the AI’s initial response) and Apple Picking (selecting individual elements from multiple past answers from the AI). This cannot stand. We need a 2-dimensional graphical user interface that supports point-and-click, selection with the mouse, visible menus and dialog boxes, and other interaction techniques that have proven superior for 40 years since the introduction of the Macintosh in 1984.
We should retain some element of natural-language prompting to support intent-based outcome specification. This is the first new UI paradigm in 60 years and a major advance made possible by AI’s ability to create new responses from short or long prompts. Indeed, we should embrace the probabilistic nature of AI, which is one of the main reasons it supports unlimited creativity, leading to the realization that ideation is free with AI. We can ask it to provide as many good ideas as we like.
Thus, we want both:
A GUI that supports classic usability virtues like visibility of system state. GUI controls enable users to fine-tune the interaction and control changes at any desired level of granularity.
Prompts that free users from having to specify all the steps and instead shortcut the process of closing in on the goal through natural-language support.
In other words, I call for a hybrid user interface for interacting with and controlling AI, combining the best of these two UI paradigms. Such a UI will be worth a silver medal and bring us a long way toward the goal of high-usability AI. (At least the hybrid AI UI will be an advance if designed by competent UX professionals who employ the recognized UX design process with plentiful user testing driving an iterative design process. If “designed” by an engineer over the weekend, the hybrid UI will probably be no better than the current AI products.)
Bronze Medal Wish: Integration of Separate AI Models into a Super-AI
As a third wish, I would like to see integrated solutions rather than the current state of separate products for each type of work product. Supposedly, ChatGPT will be integrated with DALL-E 3 soon, which hopefully will mean that it will be possible to use proper natural language input to specify image generation.
For example, it should be possible to tell the AI where to place various objects in relation to each other. (This was impossible when I created the visual below to show my 3 medals. I needed endless iterations and false attempts before I could get the medals to appear in the relative positions of the award podium at the Olympic Games with gold on top and in the middle, silver to the left, and bronze to the right.)
The history of computing shows the superiority of integrated software, with the bestselling product of the last 30 years being Microsoft Office. Integration is also a significant selling point of enterprise software like SAP and Salesforce. But with AI, each product is a world unto itself and doesn’t talk to other AI products. Users are left with rudimentary copy-paste controls and must export half-baked AI deliverables into external applications for fine-tuning and integration. This must stop.
We need a super-AI that can create and understand text, images, video, audio, and any other media form. It must also integrate real-time data from the web and be able to read and process the user’s own data, whether that is an individual user’s data from his or her PC or the cumulative data of an entire enterprise.
My three medals are to be awarded for vastly improved user experience of AI. (Medals by Leonardo.)
Three Goals, One Outcome: More Users & Profits
To summarize, my 3 main goals for AI user experience are:
Gold: employ UX professionals and apply UX methods to design AI products.
Silver: a hybrid AI UI, mixing GUI controls and prompting.
Bronze: integrating separate AI apps into a single super-app — or alternatively developing seamless integration between the various apps so that they work together.
The first of these is the easiest, in that we know exactly what to do from 50 years of experience integrating UX with complex software development. (You don’t have to read my work; you can read recommendations from thousands of younger UX professionals whose advice echoes 90% of what I have been saying for 35 years.) Ironically, it may be the least likely to happen because hardcore engineers lead the current AI companies without appreciating human factors.
Whenever I complain about usability defects in current AI products, I’m met with the same comeback: ChatGPT was the fastest product in history to reach 100 M users, in only 2 months (even TikTok required 9 months for this feat). Since so many people use it, it can’t be so bad — the same for Midjourney, which has about $5 M in revenue per employee. Despite my many negative comments on Midjourney’s UI, I root for them because I want independent companies in the running for the next computing paradigm instead of leaving it all to the tech giants who want to rule humanity instead of helping us.
Undoubtedly, the AI revolution of 2023 is impressive, and the products have tremendous utility. Witness this article’s illustrations, which I generated in a few minutes in Leonardo and Midjourney. In the past, my pieces were always text-only because I couldn’t draw. But I can prompt and winnow, which is how old people have regained their creativity through AI. A vast service to our aging society right there.
If the AI products didn’t have superior utility, they would have been long forgotten because of their terrible usability. The usefulness of a product is the combination of its utility and usability, and the two can trade off against each other. (Though a truly winning product will score high on both metrics.)
My metaphor equates a product's usability flaws to a hurdle the user must overcome. If the hurdle is low, people will gladly jump, and almost everybody can clear it. A high hurdle will only be attempted by highly motivated users, and only the most athletic (cognitively strong) can clear it.
Runners in the Olympic finals are very motivated to jump and will likely clear the hurdle. But athletes may not jump high enough if they don’t care about the race. It’s the same with users’ willingness to struggle to overcome the barrier imposed by substandard usability in a user interface. (Runners by Midjourney.)
The impact of poor usability can be seen in the DAU/MAU metric: daily active users divided by monthly active users. This is a measure of whether usage has become habitual. For generative AI, this metric is only 14%, according to Sequoia Capital, whereas good consumer apps are in the 60–65% range. Even though many people use generative AI, most of them don’t use it very often, indicating that the products aren’t good enough.
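For readers who want to compute this metric for their own product, here is a minimal Python sketch. The function name and the user counts are hypothetical. The counts were chosen only so that the ratio reproduces the 14% figure cited above, while the 60–65% benchmark range is the one quoted from Sequoia Capital.

```python
def dau_mau_ratio(daily_active_users: int, monthly_active_users: int) -> float:
    """Return DAU/MAU as a fraction between 0 and 1."""
    if monthly_active_users <= 0:
        raise ValueError("monthly_active_users must be positive")
    return daily_active_users / monthly_active_users


# ASSUMPTION: made-up user counts that yield the 14% ratio cited in the article.
genai_ratio = dau_mau_ratio(14_000_000, 100_000_000)

# Benchmark range for good consumer apps, as quoted in the article.
GOOD_APP_LOW, GOOD_APP_HIGH = 0.60, 0.65

# How many times more habitual usage would be at the benchmark level.
gap_low = GOOD_APP_LOW / genai_ratio
gap_high = GOOD_APP_HIGH / genai_ratio

print(f"Generative AI DAU/MAU: {genai_ratio:.0%}")
print(f"Gap to good consumer apps: {gap_low:.1f}x to {gap_high:.1f}x")
```

A ratio near 1 would mean that almost every monthly user shows up every day. A low ratio means that most users only drop in occasionally.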
Going by Sequoia’s numbers, people would use AI roughly 4 to 5 times as often if its usability matched that of good consumer apps (60–65% versus 14%); I will round this to 5. By weaving AI more seamlessly into daily routines, tech companies could significantly ramp up their pricing. The raw number of users would also grow substantially if the products were easier to use. Let’s say by another factor of 5, even though I think the full potential is much larger on a worldwide scale. Multiplying the two scaling factors leads me to an estimate of 25 times higher use of current AI products if they could improve their user experience.
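The estimate above is simply the product of two factors, which the following Python sketch spells out. The function and variable names are hypothetical. Both factors of 5 are the rough guesses stated in the text, not measured values, and the second calculation shows how the result shifts if the frequency gain is taken unrounded from the low end of the 60–65% benchmark.

```python
def usage_multiplier(frequency_gain: float, user_growth: float) -> float:
    """Total growth in usage = (more use per user) x (more users)."""
    return frequency_gain * user_growth


# The article's round numbers.
FREQUENCY_GAIN = 5.0   # each user uses AI about 5 times as often
USER_GROWTH = 5.0      # about 5 times as many users (a deliberately cautious guess)

headline = usage_multiplier(FREQUENCY_GAIN, USER_GROWTH)

# Sensitivity check: unrounded frequency gain from the DAU/MAU figures (60% / 14%).
conservative = usage_multiplier(0.60 / 0.14, USER_GROWTH)

print(f"Headline estimate:     {headline:.0f}x")
print(f"Conservative estimate: {conservative:.1f}x")
```

Even the conservative variant lands above 20x, so the rounding does not change the overall conclusion.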
Will this translate to a 25x revenue growth? Doubtful, given the new users’ economic demographics. But a 10x revenue increase? Absolutely achievable. Time to get cracking and make AI straightforward.