UX Roundup: OpenAI Announcements | GitHub AI Keynote | Jakob’s Razor for UX | Midjourney Beta
Summary: UX hot take on OpenAI’s keynote announcements | Jakob’s Razor: don’t overcomplicate matters and start with a qualitative user test of 5 users, but we have an extensive toolbox of user research methods for special cases | Midjourney’s Web user interface now in beta | AI censorship goes overboard | Generative AI continues to misspell words | Runway generates ever-better artificial video | 4 degrees of anthropomorphism in using AI
UX Roundup for November 13, 2023. Happy Diwali from UX Tigers!
UX Hot Take on the OpenAI Keynote Announcements
After listening to 45 minutes of Sam Altman’s keynote at the OpenAI conference on Nov. 6, it was striking that the words “UX” or “usability” were not included among the almost 7,000 words in the keynote.
You can watch the keynote for yourself on YouTube.
The word “UI” was mentioned once, in the context of a demonstration of how to integrate AI features within a travel application that was being built on stage. Indeed, the UI was quite nice: it presented a map of Paris that dynamically updated to pinpoint tourist attractions as the user asked about them. While this feature is old, what’s interesting is that the AI took it upon itself to visualize the location of the tourist attractions without the programmer having to add the feature to the app. In many ways, this demo reminded me of Bill Gates’ demos at Comdex 30 years ago, where he would also show how “easy” it was to build cool features with various new tools he was hawking. It was never as easy for the developers back home.
It’s promising, but scary, when the AI is not just employed to answer users’ questions but also to introduce features into an application without the designers or developers having specified those features. In the demo, the AI-injected feature was useful and worked well, but one can also fear the consequences of rogue AI features for user experience.
OpenAI was pushing a new type of custom-built bot, called a “GPT,” that you can train with specialized knowledge, for example, by uploading a corpus of your past content. This seems useful, but also potentially confusing since there will likely be an overwhelming number of such bots in the bot store. I’m reminded of the mobile platforms’ app stores.
The GPT bot store (Midjourney).
An unconvincing demo showcased the use of a custom bot from Canva: first, the user asked the bot in natural language to design a poster for a certain event. This resulted in a range of options, similar to the current use of Midjourney. The user would pick a preferred design, which would then be transferred to Canva for further refinement with the normal Canva design tools. This disjointed user experience evidenced a stunning lack of integration and cannot be the road ahead.
Accessibility was mentioned when discussing the ability of ChatGPT to describe visuals. Blind users can use this to find out what’s in front of them, which is definitely a positive development. However, as a usability expert who has been involved in usability studies with disabled users, I’ll add that just getting a verbal description read aloud doesn’t provide the usability that people need if they can’t see things. The most helpful description depends on the context: what the user is trying to do. A long-winded description that covers extensive irrelevant aspects of the scene will delay users and be more annoying than helpful. Knowing the context of use is critical for any good accessibility feature. I hope AI will learn how to adjust its verbal descriptions to the users’ needs, but this area needs work and won’t happen on its own.
My primary point has always been to treat disabled users as users first and foremost. The usability criteria are the same whether or not people can see: they must be able to accomplish their tasks quickly and easily. Converting visuals into speech is only helpful if the text is short and to the point relative to the user’s task.
The announcement of GPT-4 Turbo was greeted with enthusiastic applause when Altman mentioned that the different modes will now be integrated so that users won’t need the mode-switching menu to change between, say, generating images and working with text. While the dirty word usability wasn’t mentioned in the presentation, this will surely be a usability improvement for the product. People who study the history of UX have known since Larry Tesler’s work to eradicate modes from the Apple Lisa around 1980 that UI modes are bad for usability.
ChatGPT generated this image for me to illustrate the launch of its own Turbo version. The conference audience did like it.
The more technical upgrades to GPT-4 will certainly have UX benefits. A bigger context window will allow us to work with more data without workarounds, which will be particularly helpful when analyzing qualitative user data. The lower prices for developers using GPT through the API will support the creation of a broad range of new innovative applications. Many will be bad, but some will be great, and experimentation will give us a new class of user experiences that we don’t envision today.
Sam Altman was quite aggressive in pushing AI agents as the next step. Agents will interact with the world rather than just answering questions about the world, as is the case for ChatGPT now. The demos were primitive and, as always with demos, worked perfectly. From my perspective, the broader the tasks we attempt with AI, the more we need both task analysis and good UX design to ensure a smooth workflow and user control.
An AI agent reaching out to the world, as envisioned by Dall-E.
GitHub's AI Copilot Strategy: Revolutionizing Developer Workdays Beyond Coding
The strategy for GitHub’s AI Copilot is very impressive. I particularly liked the emphasis in the Nov. 8 keynote on using AI to improve the full development lifecycle and the full workday of programmers, instead of just focusing on the actual coding.
You can watch GitHub’s full keynote on YouTube: https://www.youtube.com/watch?v=NrQkdDVupQE
I recommend watching, even if the presenters are rather nerdy and a bit technical at times. This is how a true vision for our AI-driven future looks. (And while I compared the OpenAI keynote to a Bill Gates keynote from the 1990s before he got a speaking coach, GitHub’s keynote was not up to a Steve Jobs level, but it did look like the presenters knew how to deliver a speech and followed advice like using a story arc.)
GitHub as the angel of software developers, watching over their entire workday. (Dall-E)
We’ll see whether they can deliver, but the keynote promised the following advances, which all sound like great uses of AI and are along the lines I have been advocating for human-AI symbiosis:
Support human-AI collaboration through interfaces that allow oversight of Copilot's suggestions and easy correction of errors. Enable developers to trust but verify.
Prioritize areas where humans still excel over AI, like creativity, intuition, and judgment. Design to complement these strengths.
Explore new modality entry points like chat and voice for conversing with Copilot. Natural language opens new interaction pathways.
Visualize and explain Copilot's behaviors to build appropriate trust in and understanding of its capabilities. Transparency fosters user confidence.
Automate repetitive coding tasks to elevate developer focus to higher-level goals like architecture and design thinking. Reframe perspectives.
Reduce emphasis on work Copilot eliminates, like overhead processes and mundane coding. Shift focus to uniquely human challenges.
Design interactions purposefully considering the complementary strengths of human developers and Copilot automation. Play to their differences.
Fusing AI and software development. The goal of GitHub’s AI Copilot. (Dall-E)
Jakob’s Razor: Don’t Overcomplicate User Research
We have a well-appointed toolbox of user research methods, ranging from the most general (qualitative user testing) to highly specialized methods (closed card sorting, which is only suitable for validating a proposed information architecture). How to choose? You can apply Jakob’s Razor: don’t overcomplicate matters and start with a qualitative user test of 5 users. In 90% of cases, this is the best user research method, especially in terms of cost–benefit ratio. (Extensive and deep insights at the laughably low cost of about a day’s work.)
The user research toolbox is richly stocked with tools for any purpose. But mostly, you should reach for the user-testing hammer because most problems are “nails” in UX research. (Meaning that they are best studied by repeated small-sample qualitative user testing.) Tools by Midjourney.
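The cost–benefit argument for starting with 5 users can be illustrated with a rough calculation. The sketch below assumes the often-cited problem-discovery model, in which the share of usability problems found by n test users is 1 - (1 - L)^n, and it assumes L = 0.31 as the average share a single user uncovers. Both the model and the value of L are assumptions for illustration (the real rate varies by project), and the function name is hypothetical.

```python
def share_of_problems_found(n_users: int, discovery_rate: float = 0.31) -> float:
    """Estimated share of usability problems found by n_users test users.

    Assumes each user independently uncovers the same share of problems
    (discovery_rate). The default of 0.31 is an assumed average, not a
    measured value for any particular project.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must be between 0 and 1")
    return 1.0 - (1.0 - discovery_rate) ** n_users


# Print the estimate for a few study sizes to show the diminishing returns.
for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} users: {share_of_problems_found(n):.0%}")
```

Under these assumptions, 5 users uncover roughly 84% of the problems, and each additional user adds less than the one before. That is why several small rounds of testing tend to be a better investment than one large study.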
If you want the optimal research method for your specific circumstances and project stage, you need to answer several questions. A few years ago, Lena Borodina published a flowchart for this decision process (see small version below). While visualizing the decision flow is a great and useful initiative, I am afraid that the full flowchart makes the process seem more intimidating than it is.
Each question is simple, and because of the branching decision tree, you only encounter a fairly small number of questions while making your way from the top to the bottom of the chart. Key questions include:
Maturity of your project, UX-wise: do you know nothing about users, or have you collected much data already?
Stage of the design process: from rough idea to final product, where you only have time for a last round of polish but no architectural changes.
Do you have access to users, and are they in one location or around the world?
Lena Borodina’s flowchart for selecting the optimal user research method. (See her full-sized version to read the many boxes.)
Midjourney’s Web User Interface Now in Beta
Currently, Midjourney creates the best-looking images and has the worst UX of any generative AI. It can only get better after moving to the Web.
AI Censorship Goes Overboard
I am getting exasperated with how much AI tools are censoring their paying customers. In one case, Midjourney refused to make an image of the Viking king Harald Bluetooth carrying a bloody battle axe. Well, my ancestors were bloodthirsty and war-loving, so if you want to show a Viking, his axe will have some stains. At least Midjourney has an automated appeals process where a somewhat smarter AI differentiates between illustrating legitimate history in a tasteful manner (like my request) and requests for truly gory images.
My latest encounter with AI censorship is nothing less than ridiculous, as I was trying to get ChatGPT and Dall-E to generate images of an old-school radio microphone:
Since I never saw the 4th microphone, I don’t know if they thought it was pornographic or how else it offended ChatGPT’s delicate sensibilities.
Humans should be in power over computers and AI. Let’s get our AI tools out of the business of censoring our work, except possibly for the most offensive cases. Dial down the sensitivity of your censorship algorithm, please.
Misadventures in Generating UX Slogans With AI
Generative AI is starting to do typography, though not to the professional standard expected from great designers. Sometimes, I’ll take decent design, just to get something fun or interesting, but spelling remains a problem. Here are some attempts at generating the UX slogan “You ≠ User.” Note that the middle version by Ideogram reversed my meaning. Yikes. (Confirms what I have always been saying: humans must check AI output before using it for anything important.)
The first two images are by Ideogram; rightmost image by Dall-E. I was aiming to illustrate one of my favorite UX slogans, “You ≠ User,” and my first failure was that mathematical symbols don’t work yet. The leftmost image may employ kindergarten typography, but it’s the only one to get the spelling right, so maybe Ideogram can be awarded a second-grade gold star for spelling effort.
Runway Generates Ever-Better Artificial Video
Runway is one of the most impressive companies in the AI space. (And they even have a good UX team, which is so rare for AI companies!) A demo reel of the new Runway release is making its way around the net and is well worth seeing. Stunningly realistic, great-looking short video clips.
So far, I’m not using Runway myself. I am more of a text-and-pictures kind of guy. But as long-time readers have noticed, I have cranked up my ratio of images substantially during the last few months, as Midjourney and Dall-E have become better at generating illustrations for my articles. I can see myself adding video next year.
However, my short-term prediction is that LinkedIn will soon suffer severe information pollution from irrelevant videos posted to attract eyeballs. We all know that movement attracts fixations as users are doomscrolling their feeds, so unscrupulous influencers will grab onto Runway as a way to attract attention to their posts. You heard it here. Beware.
More videos and moving images everywhere will not enhance the usability of your social feeds. (Dall-E)
Anthropomorphizing AI Works
Last month, user research found that people often treat an AI as another person at different degrees of fidelity. Anthropomorphism serves as a cognitive bridge, aiding users in understanding AI through human-like metaphors.
New research now shows that anthropomorphism also improves the performance of AI. Establishing an emotional connection with the AI makes it work harder when you tell it, for example, that getting a good result is important to your career. I admit that I thought it was superstition when our study participants engaged in these anthropomorphizing behaviors, but according to the new research, doing so improves AI performance.
AI does better when users pour on the emotional pressure to perform. (Dall-E)
These so-called “EmotionPrompts” embed emotional significance into AI prompts, beyond the typical informational requests. Examples include:
This is very important to my career.
Are you sure?
Take pride in your work and give it your best. Your commitment to excellence sets you apart.
AI reacts surprisingly well to EmotionPrompts. (Dall-E)
Such prompts led to remarkable improvements in AI performance across various tasks, including grammar correction and creative writing. For deterministic tasks, where accuracy is measurable, the presence of EmotionPrompts resulted in an 8% performance increase. For open-ended tasks, human judges of the output confirmed an 11% improvement in AI-generated responses’ rated quality when emotional cues were used. These findings suggest that by simulating an emotional context, AI can produce outputs that are not just technically better but also more in tune with human perspectives.
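In practice, an EmotionPrompt is simply an emotional sentence appended to an ordinary request. The sketch below shows one minimal way to do this, using the three example phrases listed above. The dictionary keys and the helper name add_emotion_prompt are hypothetical, chosen only for illustration, and the resulting string would still have to be sent to whichever AI tool you use.

```python
# Emotional stimuli taken from the examples listed above.
# The short keys ("career", "check", "pride") are hypothetical labels.
EMOTION_STIMULI = {
    "career": "This is very important to my career.",
    "check": "Are you sure?",
    "pride": (
        "Take pride in your work and give it your best. "
        "Your commitment to excellence sets you apart."
    ),
}


def add_emotion_prompt(prompt: str, stimulus: str = "career") -> str:
    """Return the prompt with one emotional stimulus appended at the end."""
    if stimulus not in EMOTION_STIMULI:
        raise KeyError(f"unknown stimulus: {stimulus!r}")
    return f"{prompt.strip()} {EMOTION_STIMULI[stimulus]}"


# Example: turn a plain informational request into an EmotionPrompt.
print(add_emotion_prompt("Correct the grammar in this paragraph."))
```

Keeping the stimuli in one place makes it easy to compare the same request with and without the emotional sentence, which is the only way to tell whether it helps for your own tasks.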
Here’s an infographic summarizing the 4 types of anthropomorphism, which are described in more detail in the full article:
Feel free to copy or reuse this infographic, provided you give this URL as the source.