UX Roundup: Latent Affordances | Automating Competence | Entry-Level UXR Job | Direct Manipulation | One-Shot AI Slideshow | Veo 3.1 | $100M Avatars
- Jakob Nielsen

- Oct 20
- 12 min read
Summary: Latent affordances | Automating best-practice dissemination by having AI learn from top staff | Entry-level user researcher job opening at LEGO | Direct manipulation: the opera | One-shotting a narrated slideshow with Google’s NotebookLM | New video model Veo 3.1 | Avatar company HeyGen scores $100M annual run rate

UX Roundup for October 20, 2025: Many of this week’s news items relate to the process of going from idea to video with AI. (GPT Image-1)
Latent Affordances
New explainer video about latent affordances: a way to hit the middle ground for discoverability in feature-rich designs, between overly loud design that shows perceived affordances for everything and overly quiet design that hides everything inside an empty chat box.
I received another of the infamous notices that I had violated Google’s supposed “community” standards when I tried to create a B-roll clip of Goldilocks, the fairytale character, for this video. (Goldilocks famously thought that one bowl of porridge was too cold, one was too hot, but one was just right!)

Goldilocks banned by Google. Luckily, the Chinese Seedream model is less censored than the American Google, so it had no problems rendering Goldilocks for this illustration. (Seedream 4)
One more example of AI censorship gone overboard. Hard to think of anything more wholesome than Goldilocks and her visit to the three bears’ house. (Luckily, I was able to modify the prompt to bypass the “community” censorship, so my final video does include the B-roll clip I had envisioned.)

It’s time to stand up for our heritage and insist that AI doesn’t oppress it. Humanity’s heritage is bigger than the platitudes that dominate certain Silicon Valley circles. (Seedream 4)
Automating Competence
At its recent DevDay, OpenAI showcased some of its internal AI tools:
- GTM Assistant (Go-To-Market) for helping salespeople close deals
- OpenHouse to help employees locate other staff with specific expertise
- Support Agent to resolve customer support trouble tickets
These are not products, but they are interesting because they show what a leading-edge AI-Native company can do to improve its internal processes. In particular, these tools are based on the idea of codifying best practices. Instead of teaching an AI system generic rules, OpenAI is encoding what top human performers already do, then scaling that behavior.
OpenAI calls this “best-of behavior distillation.” The system identifies the habits and phrasing patterns of the company’s top performers and models them as templates. In essence, it learns how experts communicate, not just what they say.

“Best practices” has become a trite management-consulting term, because there’s so much more to excellent performance than a checklist of guidelines. However, AI is now becoming capable of capturing more nuance, such as why top performers do something, rather than just the specifics of what they did. (GPT Image-1)
If Sophie, a high-performing account executive, consistently closes deals using a particular demo narrative, that approach becomes part of the assistant’s internal playbook. AI analyzes how she phrases emails, how she frames demos, and how she sequences follow-ups, and then makes those patterns available to the rest of the team. New hires can instantly learn from Sophie’s success patterns without endless mentoring sessions.
This technique of codifying tacit knowledge (what management theorists call “organizational learning”) turns informal craftsmanship into reproducible skill. It’s the difference between “shadowing a master” and having the master’s playbook on tap through AI.
OpenAI begins by identifying its most effective people and capturing their implicit knowledge. That knowledge becomes structured data, comprising patterns, templates, best practices, and contextual cues. The AI agents then generalize those behaviors to the rest of the organization.
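To make the idea of “structured data” concrete, here is a rough sketch of what a distilled playbook entry might look like. This is my own illustration; the field names and the TypeScript form are assumptions, not anything OpenAI has published.

```typescript
// Hypothetical sketch of how distilled expert behavior might be structured.
// The schema and field names are my own, not OpenAI's.

interface PlaybookEntry {
  expert: string;           // whose behavior the pattern was distilled from
  situation: string;        // the context in which the pattern applies
  pattern: string;          // the reusable move, phrased as guidance
  examplePhrasing: string;  // verbatim wording that worked in practice
  contextualCues: string[]; // signals that this is the right pattern to use
}

const demoNarrative: PlaybookEntry = {
  expert: "Sophie",
  situation: "First demo call with a mid-size prospect",
  pattern: "Open with the customer's own metrics before showing any features",
  examplePhrasing:
    "You mentioned ticket volume doubled last quarter; here is how that looks in the dashboard.",
  contextualCues: ["prospect has quantified their pain", "technical buyer on the call"],
};
```

The point is not the exact schema, but that once tacit habits are expressed in a machine-readable form like this, an AI assistant can retrieve and apply them at scale.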
The process resembles what management scientists call benchmark replication: observing what top performers do differently and turning that into operational policy. But unlike traditional process manuals, these AI tools operate dynamically. They learn from every interaction and can adapt the benchmark as circumstances change.
In almost all jobs, a few employees vastly outshine all the others. While “shadowing the master” is a way to transfer some of their skills, this old apprenticeship-like approach puts a heavy burden on those masters, who are your top producers and should be focused on producing even more. In contrast, AI is relatively cheap (and will soon be dirt cheap), so companies can afford to repeatedly prompt their more middling employees with lessons distilled from the masters.
The Support Agent applies similar principles to customer service. Traditional customer support systems use scripted flows or static FAQs, which is why customers with a problem usually resent being routed to a computer rather than a human agent. OpenAI’s version introduces continuous learning loops: every time the system interacts with a user, it evaluates the quality of its response and updates its internal models accordingly.

Customer support agents juggle so many tasks that even human agents usually provide bad service. However, the best agents shine, and AI can learn what works and scale it across automated support systems. (GPT Image-1)
The process works through evaluation loops: periodic sampling of interactions to assess relevance, tone, and problem resolution. When the agent’s answer falls short, human reviewers flag it, and the correction becomes training data for the next iteration. Over time, the system gets better at predicting what users actually mean, not just what they literally type.
This approach turns customer support into a closed feedback system. Instead of relying solely on prewritten macros, Support Agent adapts based on outcomes, like whether a customer left a positive rating or whether a ticket required escalation. OpenAI tracks classic performance indicators such as deflection rate (how many issues are resolved automatically), time-to-resolution (TTR), and customer satisfaction (CSAT). Each metric becomes a control lever for refinement.
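To make those indicators concrete, here is a minimal sketch of how they could be computed from a batch of support tickets. The ticket fields and the code are my own illustration, not OpenAI’s implementation.

```typescript
// Minimal sketch of the three support metrics mentioned above.
// The Ticket fields are assumptions for illustration only.

interface Ticket {
  resolvedAutomatically: boolean; // true if no human escalation was needed
  minutesToResolution: number;
  csatScore?: number;             // 1-5 rating, if the customer left one
}

function supportMetrics(tickets: Ticket[]) {
  const deflectionRate =
    tickets.filter(t => t.resolvedAutomatically).length / tickets.length;

  const avgTimeToResolution =
    tickets.reduce((sum, t) => sum + t.minutesToResolution, 0) / tickets.length;

  const rated = tickets.filter(t => t.csatScore !== undefined);
  const csat = rated.length
    ? rated.reduce((sum, t) => sum + (t.csatScore ?? 0), 0) / rated.length
    : NaN;

  return { deflectionRate, avgTimeToResolution, csat };
}

// Example batch: deflection 2/3, average resolution 16 minutes, CSAT 4.0
console.log(supportMetrics([
  { resolvedAutomatically: true, minutesToResolution: 4, csatScore: 5 },
  { resolvedAutomatically: false, minutesToResolution: 38, csatScore: 3 },
  { resolvedAutomatically: true, minutesToResolution: 6 },
]));
```

In a continuous-improvement loop, these numbers would be recomputed after each evaluation cycle; a drop in deflection or CSAT is the trigger for sampling more interactions and retraining.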
All three tools share a unifying architectural idea: instrumented feedback loops. Each system measures its own performance and feeds the data back into improvement cycles. In machine learning terms, this is known as reinforcement through evaluation. In organizational design terms, it’s continuous improvement at machine speed.
The advantage is that these loops keep the systems aligned with business goals. For instance, if Support Agent’s helpfulness starts to drift (say it begins answering quickly but inaccurately), evaluation sampling would detect the problem and adjust the weights accordingly. If GTM Assistant’s recommendations become outdated due to a new pricing model, the system would identify inconsistencies between the top reps’ updated documents and its old playbooks, prompting retraining.
This is where OpenAI’s “AI on AI” philosophy shows up most clearly. The tools don’t just execute tasks: they study their own outputs, benchmark them, and evolve. Each becomes a living experiment in applied AI governance.
AI becomes organizational infrastructure. Instead of automating jobs, such AI tools are beginning to automate competence. They’re turning intangible know-how into tangible systems. What OpenAI demonstrated at DevDay wasn’t just a suite of internal tools; it was a new discipline: AI Operations.
Remember that OpenAI has access to internal AI tools that are not as limited by restricted compute as the tools they give us peons. This means that these internal tools could be as much as a full year ahead of what you can build. AI is measured in a kind of “dog years” where one AI year is worth more than a full decade’s worth of advancement in non-AI technology. Thus, OpenAI’s advantage from having direct access to next-generation AI cannot be overestimated. Still, we will all have this level of AI in a year, so it’s time to start planning and designing the tools now.
Entry-Level User Researcher Job Opening at LEGO
Entry-level jobs are getting to be rarer than hen’s teeth, but LEGO has announced an opening for an entry-level user researcher in Copenhagen, Denmark. The desired experience is modest: for example, an internship in user research. Application deadline: October 22.
Copenhagen is my hometown, and it has a high quality of life, but beware that (as I say in one of my recent songs) it is cold, wet, and dark half the year.
I love that LEGO lists a discount on LEGO purchases as one of the employee benefits. To be honest, if you don’t have a playful nature, this is probably not the job for you. When I visited the LEGO Copenhagen office, I was struck by the absolutely fun and charming work environment. And by the great people, which matters even more for entry-level folks, because you will learn much more from your colleagues in your first job than you ever did at university, which remains terrible at teaching useful UX skills.

LEGO usability study: This could be you, since they currently have a job opening for a junior user researcher. (Seedream 4)
Direct Manipulation: The Opera
I made an opera aria about Direct Manipulation (YouTube, 4 min.). I asked for the music to be “in the style of a Mozart opera,” but I must admit that AI music is not yet as good as Mozart. Maybe in 10 years? For now, even though I’m a big Suno fan, it’s really only great at rock music, country, and some jazz: all genres where it has plenty of training data. However, this operatic aria is better than what I could generate only a month ago (which I didn’t publish, because it was not good enough).

Direct manipulation is a key interaction technique. We now have an opera about it. (Or at least one aria.) (GPT Image-1)
For more in-depth understanding, read my full article about Direct Manipulation from 2023. Even though this article is more than two years old, it’s still valid, because these GUI design principles don’t change. (Though they will become less important as traditional UI fades into the background and users mainly interact with their AI agents.)
Direct manipulation is a GUI interaction style where users engage with visual objects on screen as if handling physical items in the real world. Instead of typing commands or navigating menus, users directly interact through pointing, clicking, and dragging, receiving immediate visual feedback. Examples include dragging files into folders, resizing windows by grabbing edges, or moving sliders to adjust values.
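As a minimal illustration of the interaction pattern (this code is not from the article, and the element name is made up), a draggable object in a web UI boils down to tracking the pointer and updating the object’s position on every move, so the user gets continuous visual feedback:

```typescript
// Minimal browser sketch of direct manipulation: the user grabs an object
// and it follows the pointer, with immediate visual feedback on every move.
// Assumes an absolutely positioned element with id "card" (hypothetical).

const card = document.getElementById("card") as HTMLElement;
let offsetX = 0;
let offsetY = 0;

card.addEventListener("pointerdown", (e: PointerEvent) => {
  // Remember where inside the object the user grabbed it.
  offsetX = e.clientX - card.offsetLeft;
  offsetY = e.clientY - card.offsetTop;
  card.setPointerCapture(e.pointerId);
});

card.addEventListener("pointermove", (e: PointerEvent) => {
  if (!card.hasPointerCapture(e.pointerId)) return;
  // Immediate feedback: the object tracks the pointer on every move,
  // so the user sees the result of each incremental action right away.
  card.style.left = `${e.clientX - offsetX}px`;
  card.style.top = `${e.clientY - offsetY}px`;
});

card.addEventListener("pointerup", (e: PointerEvent) => {
  if (card.hasPointerCapture(e.pointerId)) {
    card.releasePointerCapture(e.pointerId);
  }
});
```

The usability-critical part is the pointermove handler: every incremental action produces an immediate, visible result that the user can evaluate and correct on the fly.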

Direct manipulation translates real-world skills and knowledge to the computer world and greatly strengthens UI metaphors. (GPT Image-1)
This approach increases usability by reducing cognitive load and making interfaces intuitive. Users don’t need to memorize commands or translate intentions into abstract syntax. Immediate visual feedback enables learning through experimentation and reduces errors. By leveraging people’s understanding of physical objects, direct manipulation creates natural, predictable interactions that help users build proficiency quickly through spatial and muscle memory.

(GPT Image-1)
Direct manipulation was enabled by the computer mouse, which is the fastest and most precise way of moving icons and drag handles around on a computer screen. Today, most direct manipulation operations are performed on touch screens, with one or two fingers as the pointing device. While two fingers (for example, for pinch-zooming) are superior to the mouse’s single selection point for some operations, single-finger touches are awkward compared to mouse point-click-drag operations. Direct manipulation is such a superior interaction technique that it has survived this shift to an inferior pointing device.

Snapping is one way of alleviating the fat-finger problem when using direct manipulation on touchscreens. (GPT Image-1)
Listen to Direct Manipulation: The Opera.
One-Shotting a Narrated Slideshow with Google’s NotebookLM
Google’s NotebookLM has updated what they call “video overviews,” though the feature is better described as a narrated slideshow, because it only shows still images with a voiceover.
They now use Google’s cutting-edge Nano Banana image model to generate the slides, and you have a choice of visual style, including watercolor and “heritage” (old-fashioned illustration style). You can also choose between short (about 2 minutes) and long (6–8 minutes) overviews.
That’s it! Pick the illustration style and long/short, and two clicks later (plus a very long wait), you have a complete video based on your source material. No editorial work needed. No cinematography, manuscript editing, or B-roll production. These videos are a one-shot deal. Narration and illustrations are amazingly on point.

One-shot video making with NotebookLM: choose the style and length, and that’s all the human input needed to make a video. The wait for the video to generate is very long, though. (GPT Image-1)
I tried making three different slideshows about my recent article “Slow AI: Designing User Control for Long Tasks.” This is a lengthy article, running 7,400 words, and I must say that NotebookLM effectively extracted the key points. Even the 2-minute overview provides the viewer with a fair idea of my main points, at only 6% of the word count.
- Brief slideshow explainer, in watercolor visual style (2 min., 407 words)
- Longer slideshow explainer, in heritage visual style (8 min., 1,430 words)
For comparison, watch the music video I edited myself, based on that same article, but with close attention to both the script and the B-roll design, rather than leaving all the editorial decisions up to AI. (YouTube, 4 min.)
There is a lot to be said for one-shotting a video fully with AI: no human effort needed (other than writing the original article, in my case) except for having the agency to will it into being. The NotebookLM automated slideshows are very good (especially the voiceover, which does beat ElevenLabs in emotional projection).

The reason one-shot video creation works in NotebookLM is that the Nano Banana image model is crazy good at prompt adherence. That said, I suspect Google of throwing additional compute at the banana for these videos (which is why the response time is so slow), because the images I generate myself with this model have not been nearly as perfect. Since there is no human editing of NotebookLM’s videos, every single slide has to be right, or the entire video is for nothing. (GPT Image-1)
As a creator, I must say that it’s less satisfying to simply click and be done. It is more fun to curate the B-roll clips, for example, even though I definitely appreciate not having to spend days on location for a few seconds of a fisherman steering his boat. And, of course, the cost savings make all the difference between the video actually being made and remaining a figment of the imagination.
On the other hand, I see many AI influencers posting advice on how to set up a fully automated workflow to generate social media posts with no human input. Since virality is somewhat of a crapshoot, producing 20 (or a thousand) different videos from the same idea improves the probability that one of them goes viral and collects millions of impressions and thousands of dollars in revenue share. That’s not my game, however: I create for the joy of it, not to make money.
Veo 3.1 Released
Google’s video model is now out in version 3.1, touting slightly improved image quality and a few new features.
As a short test of character consistency, I made a video montage where the Greek goddess of love, Aphrodite, explains usability and Jakob's Law of the Internet User Experience (YouTube, 41 seconds).
I generated a photo of Aphrodite with Grok and used this as an “ingredient” for the 5 Veo 3.1 clips that were cut together for this montage. The goddess looks the same throughout, but her voice changes, so we still don't have full character consistency.
The Veo “ingredients” are visuals that you upload to be included in the video generation. This is different from the older “start frame,” which simply made a video begin with a still image you uploaded. (If you watch my recent music video, you’ll see that all the dance sequences are generated with the same start frame.) In this case, I could use the same photo of Aphrodite in several videos, but then upload different accessories (such as a handbag from my brand) to have her interact with them in different clips.
I originally wanted Aphrodite to appear in the video with her little son, Cupid, who is traditionally depicted with the bow and arrow he uses to make people fall in love. Unfortunately, Google does have overly strict censorship, so it doesn’t allow you to upload photos of minors for use in videos. Of course, Cupid is more than 2,000 years old, but as one of the immortal gods, he still looks young.
I made a compilation video comparing Sora 2 with Veo 3.1 rendering variations of a recent viral AI video idea (Instagram, 2 min.). Despite the extensive hype for Sora, I think Veo held up well. Sora goofed frequently in these clips (and I’m only showing you the best).
HeyGen $100M ARR
If you’ve watched my videos, you may have noticed that I animated most of them with HeyGen, which I think is the best current avatar tool. (For example, Sora 2 is useless as long as one cannot upload realistic-looking photos to be used as the avatars.)
HeyGen recently announced that it has reached an annualized run rate (ARR) of $100 million. That’s real money, showing that AI avatars are a real business satisfying a real customer need. HeyGen went from $1M to $100M in ARR in 29 months, which is fast, but fairly typical for the better AI companies.
Congratulations to Joshua Xu and his team for this achievement. The product is certainly not perfect yet, but compare the avatar video I made with HeyGen in December 2024 with one I made in early October 2025. Immense quality progress in only 10 months. These guys are executing! The only moat is speed.

One of the most popular sayings in Silicon Valley right now is that “the only moat is speed,” where the moat is a metaphor for a defense against competitors. With AI, anybody can build anything, but doing it faster will win. (Nano Banana)
Is HeyGen perfect? Not yet. Every time I make a video, I have a list as long as my arm of details that it messed up. For example, in the opera video I introduced above, the lip sync was poor while the avatar was singing in the operatic convention of melisma, where a single syllable is drawn out and sung while moving through a succession of pitches. (A prime example is Händel’s Messiah.) Admittedly, opera singing is not the main use case for the corporate avatars that HeyGen was mainly developed for.
(For another blooper in my opera, check the audience seated in the rearmost rows during the ovation at the end of the video. That one is the fault of Kling 2.5 Turbo, not HeyGen’s Avatar IV.)
While there are still bloopers in current AI content, the progress during 2025 has been astounding. Projecting forward, 2026 will be stunning.



