Summary: Educational use of AI-generated songs | Weaknesses of manual UX design | New in-depth course on AI and human-computer interaction | Webinar with Jakob now online | UX certification programs integrating AI | Filmmakers attempt to make AI videos with Sora, with disappointing results
UX Roundup for March 29, 2024. Happy Easter. (Midjourney)
Educational Use of AI-Generated Songs
Suno has launched a teachers’ program, granting free song-creation credits to educators. Suno is currently the leader in AI-generated songs. You can paste your own lyrics and have it set them to music, or you can let it create the entire song (lyrics, music composition, and performance) based on a short description of what you want.
Here are two songs I just made in a few seconds, both based on Jakob Nielsen’s 5th usability heuristic, error prevention:
Danger no More, a rather literal interpretation of the heuristic
Safe and Sound, a more metaphorical interpretation of the heuristic
It’s certainly motivational for children to get the instant gratification of listening to a song about their prompt, less than a minute after entering it. Since Suno creates two songs for each prompt, it could also lead to great classroom or group discussions about the relative merits of the two, including different interpretations, as in my example.
In its X post announcing the new teachers’ program, Suno mentioned some other ideas for using song generation in schools: 🧠 a custom song about calculus (similar to my example) and 🎂 personalized songs for student birthdays. Both are great examples of AI-driven multi-craft specialized content and the changing nature of authorship in the AI era.
Generating songs with AI is likely to have many educational benefits, especially for younger children. However, I can easily see it used at the college level as well, for example in discussing songs about my usability heuristics. (Midjourney)
Weaknesses of Manual UX Design
I just came across an article about a project at Salesforce four years ago called Einstein Designer. This project aimed to produce individualized user interfaces through Generative UI. First, it’s good to be reminded that generative AI didn’t start with GPT-4 in March 2023 (or even GPT-3.5 in November 2022). Others were working on this class of useful AI products before then, even if more as experiments than as products.
Second, Einstein Designer showcases some of the potential for Generative UI, in moving beyond what we can do with manually-designed user interfaces. In a recent article, I mentioned the likely benefits for disabled users from moving to Generative UI instead of relying on the expensive (thus low-ROI, thus usually neglected) current approach to accessibility.
Lazy human designer, busy AI work crew. Metaphorical depiction of the difference between manual UI design and Generative UI design. (Ideogram, followed by Leonardo upscale)
The problem is that manual design is expensive, meaning that too few variations get explored. We know that experimentation with a profusion of design alternatives is the way to usability, so we need to get more design variations made more cheaply.
The Salesforce project published a striking demonstration of the downsides of expensive manual design. In trying to quantify current best practices for web design, they analyzed about 1,000 big websites. The font sizes used were predominantly even: 76% of text had an even font size, whereas only 24% had an odd font size.
Font sizes between 11 and 20 were the most common in the Salesforce sample of 1,000 big websites. The extreme bias in favor of even font sizes is obvious when color-coding the chart with different colors for odd and even. It beggars belief that 14 and 16 would truly be as superior to 15 as this empirical evidence of what manual design produces would indicate. (Data source: Sönke Rohde’s article about the Salesforce Einstein Designer project, June 23, 2020.)
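As an illustration of how simple this kind of parity tally is (this is my own toy sketch, not Salesforce’s analysis code, and the font sizes are stand-in data):

```python
# Toy sketch: tally the parity of font sizes sampled from websites.
# `font_sizes_px` is stand-in data, not the Salesforce dataset.
from collections import Counter

font_sizes_px = [14, 16, 15, 12, 16, 14, 13, 18, 16, 14]

parity = Counter("even" if size % 2 == 0 else "odd" for size in font_sizes_px)
total = sum(parity.values())
for label in ("even", "odd"):
    print(f"{label}: {parity[label] / total:.0%}")
```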
This extreme bias in favor of even font sizes is almost certainly the result of laziness on the part of the web designers. Of course, what I call “lazy” is really just the optimal allocation of time in a manual design project. It’s too expensive to experiment with all font sizes, so designers stick to the ones included as choices in the font-size menu. The benefits of changing the font size by one point are relatively small, so most designers don’t bother.
Lazy design does leave money on the table, though, and the hope is that when Generative UI makes design cheaper, we will get more fully-optimized designs in the world.
Generative UI may result in equal rights for odd font sizes, making it more likely that we ship profit-optimizing designs, rather than giving preference to the design choices that are easiest in manual design. (Ideogram)
New In-Depth Course on AI and Human-Computer Interaction
During Year 1 of the AI era, we suffered from a dearth of training options about the human side of AI. (There were always plenty of courses on the machine side of things.)
Luckily, things are looking up in Year 2 of the AI era: for last week’s newsletter I was able to find 4 courses on AI & UX that I can recommend. Just a week later, I’m now able to add a 5th recommended course:
May 15, 2024: In-person full-day course at the CHI conference in Hawaii: Human-Computer Interaction and AI. ($75 add-on fee if already attending the conference.)
This new course suffers from an embarrassment of riches in terms of speakers and expertise, with 5 super-smart instructors:
Daniel M. Russell, expert on sensemaking, adjunct faculty Stanford University, former Principal UX Researcher at Google and Senior Research Manager at IBM
Chinmay Kulkarni, Senior Staff Engineer at Google, UX for Gemini
Elena L. Glassman, Assistant Professor at Harvard University, focusing on big data
Hari Subramonyam, Assistant Professor at Stanford University, focusing on learning, creativity, and sensemaking
Nikolas Martelaro, Assistant Professor at Carnegie Mellon University, focusing on augmenting designers’ capabilities through new technologies
5 high-talent speakers crowding the podium: I worry about information overload, but you won’t be bored that day. (Ideogram)
Webinar with Jakob Now Online
Two days ago, I was on a live webcast produced by the UX Design Institute. Quick turnaround, because the recording is already available to watch on YouTube (64 min. video).
Four points I made in this webcast:
We will likely see a shift from focusing on conversion rates to focusing on loyalty rates, as SEO and search-fueled traffic decline: users will prefer AI-driven answer engines, which will probably refer less than 1/4 of the traffic that traditional search used to send. Customer journeys will have a different starting point in the future.
In the century-long journey of UX from birth (1950) to mature discipline (2050), we’re about 30% of the way in terms of design quality. We’re 74% of the way in number of years, so I predict accelerating usability improvements, now that UX is getting cheaper and more widely deployed due to AI (see next bullet). We have 26 years to achieve the remaining 70% in usability.
Right now, knowledge workers (including UX’ers) get about a 40% productivity lift from using AI. (Which is why you’ll absolutely deserve to be fired if you don’t acquire AI skills pronto.) In 5-10 years, this is likely to reach 100%, or a doubling of what each UX professional can achieve in a year. This does not imply unemployment (except for any obstinate creatures who refuse to change with the times). In fact, I predict there will be 3x as many UX jobs in the world in 10 years. Multiply these two numbers (2x productivity × 3x the staff) to get 6 times as much UX work done per year in 10 years; see the back-of-envelope arithmetic after these bullets. That’s how we’ll realize the higher quality growth predicted in my second bullet. The need is there, and as the price of UX drops dramatically, the demand will come, including from many new product categories we can’t even envision today.
As AI takes over more of the lower-level tasks in UX research and design, UX professionals can focus on higher-level problems, which will likely lead to higher job satisfaction.
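For readers who want to check my arithmetic, here is a back-of-envelope sketch of the numbers behind the second and third bullets (all values are the estimates stated above, not measurements):

```python
# The century-long journey of UX (second bullet).
years_total = 2050 - 1950            # 100 years from birth to mature discipline
years_elapsed = 2024 - 1950          # where we are now
print(years_elapsed / years_total)   # 0.74 -> 74% of the way in years

# Productivity x staffing (third bullet).
productivity_multiplier = 2          # 2x output per UX professional in 5-10 years
staffing_multiplier = 3              # 3x as many UX jobs in 10 years
print(productivity_multiplier * staffing_multiplier)  # 6x as much UX work per year
```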
Watch the show for the full story behind these bullets.
I was on a webcast expertly hosted by Rachael Joyce from the UX Design Institute. (Real photo, photo credit Michael Möller.)
UX Certification Programs Integrating AI Skills
Besides producing webcasts with me, the UX Design Institute has an interesting series of uncommonly-in-depth and detailed UX certification programs:
Professional Diploma in UX Design (130 hours, $3,950)
Professional Certificate in User Research (48 hours, $2,400)
Professional Certificate in Content Design (30 hours, $2,400)
I was very excited to see that all 3 certification programs include a module on how to use AI (for UX design, user research, and content, respectively). I think this is the way forward: to view AI as an integrated part of our professional skills, not as something strange or separate.
Even while I applaud the UX Design Institute for its forward-thinking curriculum design, I still think that for a few more years there’ll be a need for narrowly targeted, AI-specific UX courses (like the one I recommended above), as we upskill the many people who already have substantial UX knowledge and don’t want a big, integrated training program.
Integrated curriculum design (left) constructs a single large sheet of knowledge that gives students a firm conceptual understanding of the full picture. This is probably the best approach in the long run to teaching how to integrate AI with UX work. Separated curriculum design (right) cuts the knowledge into smaller, separate pieces that can be consumed (learned) individually. This may be the better way in the short run to add a specific knowledge item (such as how to use AI) for people who already understand most of the older knowledge elements. (Ideogram)
Filmmakers Attempt to Make AI Videos with Sora: Disappointing
OpenAI has released 7 new AI-generated videos made with its still-prerelease Sora product. The initial demos (a stampeding woolly mammoth; miniature pirate ships fighting in a coffee cup) were intriguing and impressive in sheer video quality, exactly because they were unpretentious. They just wanted to show us a few seconds of amazing video that could not exist in the real world.
The new videos are pretentious and “artistic,” made by professional filmmakers. This equals boring. Nobody watches film school projects except for the professors who have to grade the art films, and these new videos were of equivalent (lacking) entertainment value.
The only video I liked was the first one, about a man with a balloon for his head. The actual video quality was poor and made the balloon look like an amateur-hour clip-on special effect, rather than an organic part of the person in the movie. But it was fun and told a bit of a story, surreal as it was.
My conclusion from watching one mediocre (good story, bad special effects) and 6 boring AI videos: AI video has huge potential, but for now stick to visualizing things that are inherently interesting with a purpose external to the video. For example, that woolly mammoth video would be great for educational use, particularly in elementary schools where it could bring the past to life.
I am also thrilled about the educational uses of video creation, as students will soon be able to make their own videos to illustrate topics they learn about in school.
Videos that people will enjoy watching intrinsically (as opposed to for some extrinsic benefit, such as education) are much harder to create. We know that millions of people have the talent to make enjoyable videos that users watch purely for pleasure: close to a million hours of fresh video content are uploaded to YouTube every day, and YouTube videos are watched for a cumulative one billion hours daily. (From which we can derive that the average video is watched about a thousand times.) TikTok users upload 34 million new videos daily.
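The parenthetical derivation is simple division on the rounded figures above:

```python
# Rounded figures from the paragraph above.
hours_uploaded_per_day = 1_000_000      # close to a million hours uploaded
hours_watched_per_day = 1_000_000_000   # one billion hours of daily watch time

# Each uploaded hour is watched ~1,000 times, so the average video
# is watched about a thousand times (assuming full watch-through).
print(hours_watched_per_day / hours_uploaded_per_day)  # 1000.0
```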
While the quality of YouTube and TikTok videos is variable, to say the least, there’s an effectively infinite amount of good content for any given viewer and his or her tastes. (I.e., more than 24 hours of content you’ll like are published each day.)
We know humans have the talent to create watchable videos. Once AI takes over the technical demands of video production, I expect even more captivating videos that are worth watching. We will experience a flowering of multi-media-format creation, with specialized content for every taste.
Sadly, the current batch of OpenAI videos is not it.
I’ve seen complaints on X that the current version of Sora consumes about 12 minutes of H100 compute per minute of video generated, which some people interpreted as “only big companies can afford to create AI video.” Quite the opposite: even at current prices, this equates to about US $0.40 per minute of generated video. (Lambda Labs charges $2 per hour of H100 compute.)
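The arithmetic, using Lambda’s posted $2/hour H100 rate as the price assumption:

```python
h100_minutes_per_video_minute = 12   # the compute figure complained about on X
h100_dollars_per_hour = 2.00         # Lambda Labs' on-demand H100 price

cost_per_video_minute = (h100_minutes_per_video_minute / 60) * h100_dollars_per_hour
print(f"${cost_per_video_minute:.2f} per minute of generated video")  # $0.40
```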
Let’s remember that the amount of compute needed for a certain amount of AI tends to be cut in half every 8 months due to software improvements. (It’s still early days in AI software, so developers are still at the very beginning of the optimization learning curve.) Further, the new NVIDIA Blackwell chip uses about 1/4 the electricity of Hopper for the same compute. While power isn’t the only cost of AI compute, it’s a major part. The newer, denser chips will also require less data center space and less support staff.
Combine all these, and it’s not unreasonable to expect the cost of AI video generation to drop to 10 cents per minute in a year and 1 cent per minute in 3 years.
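These are order-of-magnitude guesses, but the software trend alone gets most of the way there. Here is a rough compounding check, treating the 8-month halving as the only cost driver and letting hardware gains (such as Blackwell-class power savings) close the remaining gap:

```python
base_cost = 0.40       # $ per generated minute today (computed above)
halving_months = 8     # software cuts compute needs in half every 8 months

for months in (12, 36):
    cost = base_cost / 2 ** (months / halving_months)
    print(f"{months} months: ${cost:.3f} per minute")
# 12 months: $0.141 -> ~10 cents once hardware gains are added
# 36 months: $0.018 -> ~1 cent once hardware gains are added
```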
We know that AI creation is strongly iterative, as the creator revises the prompt and uses inpainting and other features to tweak the results. Initially, more iteration will be needed for video than for image generation, since prompt adherence will likely be weaker and automated aesthetics will be poorer. (But see how quickly Midjourney has improved image quality — expect similar results in video aesthetics with progressively less need for extreme iterations in the future.)
Right now, let’s say that 50 iterations are needed to produce a minute of video good enough to upload to YouTube and attract enough views to be worth your time. This will cost $20 per minute of published video.
In 3 years, I expect that the iteration need will drop to 10 iterations per published minute, equating to a cost of 10 cents per minute. (Dropping the cost by a factor of 200 in 3 years due to a combination of improved UX leading to fewer iterations, improved hardware consuming less electricity and other costly resources, and improved software requiring less compute for a given outcome.)
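Putting those iteration estimates together with the per-minute generation costs from above:

```python
# Today: 50 iterations at $0.40 per generated minute.
cost_today = 50 * 0.40        # $20 per published minute

# In 3 years: 10 iterations at ~$0.01 per generated minute.
cost_in_3_years = 10 * 0.01   # $0.10 per published minute

print(round(cost_today / cost_in_3_years))  # 200 -> a 200x cost reduction
```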
Right now, it’s certainly not a problem for even the smallest company (or a serious individual creator) to pay $20 per minute of published video. Manual video production is vastly more expensive, and yet companies upload millions of minutes of video every day.
10 cents per minute in 3 years, and even a high school kid can pay for as much video creation as he or she feels like unleashing on the world.
The floodgates of user-generated video (and other content formats) are about to open even wider with AI-augmented authorship. (Midjourney)