Jakob Nielsen

UX Roundup: AI Songs | 5-AI Combo | Jakob Live Thursday | AI User Interviews | 100 Articles

Summary: AI-generated songs: Long vs. short | Combining contributions from 5 AI tools into one design | Jakob live on ADPList this Thursday | Conducting user interviews at scale with AI | Jakob has published 100 articles since May 2023 reboot

 

UX Roundup for February 19, 2024: focus on AI music. (Midjourney)


AI-Generated Songs: Long vs. Short

I am presenting live with Sarah Gibbons on ADPList on February 22. (See subsequent news item.) To announce this event, I made this song (1:29 min.):



I regret making this a 4-verse song lasting 89 seconds. Users viewing the song on LinkedIn listened for only about 25 seconds on average. (Some listened all the way through, but many dropped off after maybe 10 seconds.) Suno’s default song format, with 2 verses lasting 40-60 seconds, would have been a better choice and might have generated longer mean viewing times by presenting a less discouraging progress bar.


My normal guidelines for Internet video recommend 2 minutes as a good length that will retain many viewers, whereas 4 minutes is pushing it unless you have exceptional content. However, this is an AI-generated song, not a Chuck Berry tune. The truth is that AI is still not good enough to create great songs. A decent song will do when it has highly specialized content that is intriguing for a narrowly targeted audience (e.g., UX freaks). 


Here’s a shorter two-verse song I made very quickly, using ChatGPT to shorten the original lyrics and Suno to set it to music:


Watch on Instagram (you probably have to click the unmute button): https://www.instagram.com/reel/C225nEPOFAH/


My recommendation: until AI-generated songs get better, stick with very short songs.


Until AI songs approach the quality level of human songs, they should be kept short and focused on long-tail style domain-specific content. (Midjourney)


Combining Contributions from 5 AI Tools into One Design

As you can see from my example of a short song above, the default music videos you get from Suno have very busy backgrounds that make it hard to read the lyrics. Also, I thought the default video for my long song was too boring to keep viewers’ interest for those long 89 seconds. Therefore, I decided to create my own video. This, I should never have done.


I first made the lyrics with ChatGPT, simply asking for a 4-verse rock song about the ADPList session. I provided ChatGPT with the two speakers’ basic bios from the event announcement, as well as a short description of the event. After a few quick edits, I had basic lyrics for a good song. So far, about 3 minutes spent, mostly because I needed to reprompt for better rhymes before making those final edits.


I imported the lyrics into Suno and, following my principle that ideation is free with AI, made 16 different versions of the song in different styles (“lively rock,” “1950s rock and roll,” “classic rock and roll”). Like most current AI systems, Suno is slow, so I mainly did email while waiting for new songs to be ready for a listen. This took about half an hour (not counting the waiting time spent away from Suno). Some songs were so bad that they could be discarded after 10-20 seconds of listening. Others included bugs, such as singing “UX” as one syllable instead of pronouncing the two letters separately.


Song composition with Suno is a breeze: it doesn’t quite make you into a K-pop idol, but almost. (DallE)


I also made a few illustrations with my favorite AI graphics tools: DallE, Ideogram, and Midjourney. These were versions of images I had made for other projects and thus only took around 20 minutes more.


In total, I spent less than an hour creating the digital assets for my song project. While not too bad, this pales in comparison to the few minutes needed to get a default song, particularly if you’re willing to let Suno write the lyrics as well as compose and perform the music. In my case, I wanted more specifics in the prompt than Suno supports and more control over the details of the lyrics, which is why I turned to ChatGPT for this component.


So far, so good. The real horror is the 3 hours I needed to stitch the assets together into an actual video using Adobe Premiere Elements. I could actually have completed the project in less than an hour, except for one little detail: I wanted subtitles. It’s beyond me that an expensive video editor from a high-end software provider like Adobe does not have a dedicated subtitle feature: it should be easy to recognize the words in a video and create initial subtitles that could then quickly be corrected by the user for accuracy and tweaked for design if so desired.


The current video editing user experience is a usability abomination. This conceptual image I made with DallE captures the user’s feelings, if not the literal reality.


However, Adobe Premiere Elements doesn’t do captions. One has to manually produce each line, color-correct it to be legible against the shifting backgrounds of a video, and endlessly adjust the timings for when each line should appear and disappear. I spent about two hours just on the subtitles, and I only did this because it’s strongly against my principles to publish a video without subtitles.
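As an aside, the “recognize the words and create initial subtitles” step can already be approximated with open-source speech-to-text. Here is a minimal sketch (not the workflow I used for the video above), assuming the openai-whisper package and ffmpeg are installed; the file names are placeholders. It produces a draft .srt file whose text and timings a human would then correct:

```python
# Sketch: auto-generate a draft .srt subtitle file from a video's audio track.
# Assumes the open-source "openai-whisper" package (pip install openai-whisper)
# plus ffmpeg; "song.mp4" and "song.srt" are placeholder file names.
import whisper

def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 00:01:02,500."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

model = whisper.load_model("base")        # small model; larger ones are more accurate
result = model.transcribe("song.mp4")     # whisper extracts the audio via ffmpeg

with open("song.srt", "w", encoding="utf-8") as srt:
    for i, seg in enumerate(result["segments"], start=1):
        srt.write(f"{i}\n{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}\n")
        srt.write(seg["text"].strip() + "\n\n")
```

The segment timings and line breaks that come out of this are rough, so a human cleanup pass remains necessary; that is exactly the “generate first, correct afterward” workflow a video editor ought to support.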


A video professional could have completed this editing job faster, but one of the key goals of AI-generated songs is to populate the long tail of domain-specific content. Thus, we should assume that most song creators will be specialists in some other topic and not dedicated video pros.


Lesson: The AI part of this project worked, and it was easy to gather components from 5 different AI tools, utilizing the relative strengths of each. You need to have a well-stocked toolbox of AI tools because they can each do different things. (Sadly, the old-school non-AI software failed me; next time, I may try CapCut.)


One AI tool does not fit all. You need a well-stocked toolbox so that you can reach for the best AI service for each project — or for each component of a project. (Midjourney)


Live on ADPList This Thursday

I will appear in a live discussion with superstar designer Sarah Gibbons, hosted by ADPList, this Thursday, February 22, 2024, at 10 AM USA Pacific time. (See the corresponding time in your time zone.)


🎟️ The event is free, but advance registration is required 🎫



Conducting User Interviews at Scale With AI

User interviews are a powerful data collection method, especially during the early stages of a project, when you don’t have a design prototype that can serve as the basis for user testing. The downsides are the time interviews take and the need to synchronize the participant’s and the interviewer’s availability.


A new service by Outset offers to conduct user interviews by AI, which eliminates both of these problems. Even better, interviews can be conducted in text form or by video. Video is likely superior because most people don’t like to type, and asking questions through video and recording spoken answers will likely elicit more in-depth responses.


I say “likely,” because I don’t actually know how well this service works. If you give it a try, please let me know in the comments or by email how it goes.


AI can conduct and analyze user interviews at scale, turning qual data into quant. (Midjourney)


Outset claims to be able to generate follow-up questions through AI. I am a bit skeptical about whether these follow-up questions are as good as those from a skilled human interviewer. (Though, realistically, many companies don’t have a skilled interviewer on staff, even if they have a good user research team, because interviewing is a fairly rare skill.)


I am more optimistic about another touted application of AI for these automated interviews: transcribing the answers, auto-classifying them into categories, and counting how many respondents gave each answer. This automated processing will allow for user interviews at scale, with hundreds or thousands of respondents if you can recruit that many. This again transforms qual data into quant, which is one of the true benefits of using AI in user research.
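To make the qual-to-quant step concrete, here is a minimal sketch of how such auto-classification and counting might work, assuming the OpenAI Python client (v1.x). This is not Outset’s actual pipeline; the categories and answers are hypothetical placeholders:

```python
# Sketch: classify free-text interview answers into fixed categories and count them.
# Assumes the "openai" Python package (v1.x) and an OPENAI_API_KEY in the environment.
# This is not Outset's pipeline; categories and answers are hypothetical placeholders.
from collections import Counter
from openai import OpenAI

client = OpenAI()

CATEGORIES = ["price", "ease of use", "performance", "support", "other"]

def classify(answer: str) -> str:
    """Ask the model to map one answer to exactly one category label."""
    response = client.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[
            {"role": "system",
             "content": f"Classify the user's answer into exactly one of: {', '.join(CATEGORIES)}. "
                        "Reply with the category name only."},
            {"role": "user", "content": answer},
        ],
        temperature=0,
    )
    label = response.choices[0].message.content.strip().lower()
    return label if label in CATEGORIES else "other"

answers = [
    "Honestly it just takes too many clicks to finish a task.",
    "The subscription is more than we can justify.",
    "Pages take forever to load on our office network.",
]

counts = Counter(classify(a) for a in answers)
print(counts)  # e.g. Counter({'ease of use': 1, 'price': 1, 'performance': 1})
```

The counting step is trivial once each free-text answer has been mapped to a fixed category; the research skill lies in choosing categories that actually match the study’s questions.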


To reiterate: this is not an endorsement of Outset. I simply think their service sounds very promising. Please do let me know what you think.


In the future, I envision customizable settings for AI interviewers to change their interviewing style, depending on the needs of the current research project. Do we need a completely objective and neutral interviewing style, or will it be better to have AI empathize with the user to elicit more emotional answers? It may be harder for human interviewers to change their style and keep a consistent approach through a series of interviews, so a style setting could become a competitive advantage for AI-driven user interviews.


100 Articles Published

This newsletter is my 100th article published since I started writing again on May 17, 2023, after a 10-year hiatus during which I suppressed my creativity in favor of growing the business of a consulting company.


All 100 articles I wrote during this 279-day period are published on uxtigers.com, whereas the newsletter has slightly fewer issues, since I didn’t start the email channel until June 21, 2023.


Celebrating 100 UX articles written in 279 days. About 180,000 words or two full books. (Midjourney)
