UX Roundup: GUI Heroes | Great AI Videos | AI Won’t Kill Software | Bing GEO | AI Capability Growth | Google AI UX | One-Shot Video | GPT 5.4 Writing
- Jakob Nielsen
Summary: Using Nano Banana 2 to render the heroes of the graphical user interface as a monument | Two examples of great AI (short) videos | AI won’t kill software | Bing reports AI citations of your website | AI task horizons keep increasing | Google simplifies its AI product architecture | One-shot instructional AI video from NotebookLM | GPT 5.4 is probably a better writer than 5.2

UX Roundup for March 9, 2026 (Nano Banana 2)
Using Nano Banana 2 To Render the Heroes of the Graphical User Interface as a Monument
It’s a surprisingly hard challenge for current AI image models to render people as a granite monument. Here’s my best attempt to change Mount Rushmore into a monument to the heroes of the graphical user interface, using Nano Banana 2:

Heroes of the graphical user interface: Doug Engelbart (inventor of the computer mouse), Larry Tesler (modeless UI, cut-and-paste, the Lisa GUI), Steve Jobs (the Macintosh, touchscreen GUI), and Marc Andreessen (Internet GUI), rendered with Nano Banana 2.
Compare with the Mount Rushmore of AI leaders I made with Nano Banana Pro in November, and I think you’ll agree that Nano Banana 2 is a modest improvement over NB Pro, as I discussed in last week’s newsletter. The heads still don’t truly look carved out of a mountain, though, except maybe for Doug Engelbart.
(See my separate article for the full story of how the GUI came to be and a time-travel “class photo” of even more heroes of that saga.)
Great AI Shorts
Two good examples of what’s possible with current AI video models:
“The Bone Throne” by Lennard Smith (YouTube, 5 min.)
“Chloe vs. History” (Instagram channel)
Bone Throne is a short fantasy-style drama set in a desert village constructed around the skeleton of a truly impressive, gigantic dinosaur-like creature. The visuals are stunning, and the action scenes are compelling. (The lip-sync is less so, and the dialogue leans on too many clichés, though it works well enough.) It was made with the two Chinese models Seedance 2 and Kling 3, which are currently the leading video models. (I have planned several video projects that I want to make with Seedance 2, but I am still waiting on access.)
“Chloe” is a slightly annoying Instagram influencer (AI-generated) who posts travel videos recorded at various points in history, from the building of the pyramids to the Black Death to the French Revolution. I guess she’s annoying to me because I don’t like that influencer vlog style, but clearly many (presumably younger) users like the channel, since it’s racking up views. The historical settings are a great way of bringing the past to life and should definitely be used in schools, where the vlog format will appeal to many kids.

My mashup of “Bones” and “Chloe.” The monster skeleton is even more impressive in the film. Also, so far, Chloe only time-travels to real historical locations, not fantasy settings. (Nano Banana 2)
In my completely self-made combination of these two different AI video universes, the giant skeleton looks nowhere near as awesome as the one in Lennard Smith’s video. I tip my hat to his visual skills in making his skeleton. The image I’m showing you here is the best of 116 generations with Nano Banana Pro, Nano Banana 2, and Seedream 5, and it’s still not as good as Smith’s skeleton. (Of course, I don’t know how many images he went through.)
On balance, these two examples show that the best AI videos have now progressed to the point where they are enjoyable on their own merits, rather than just as demonstrations of an emerging technology. They are indications that McKinsey was right in predicting a $60 billion disruption to the legacy movie industry in the report I discussed in last week’s newsletter.
While I like these short-format AI videos and find them entertaining enough to watch (and recommend to you), they’re only good because they are limited to a few minutes. Technically, we could make a 100-minute feature film with AI today, and each scene would look great, but the story (plot and dialogue) would be insufficient to hold the audience’s attention for that long.
We need better writing for AI movies to truly take over. Sadly, recent releases of large language models are not living up to their name, because they have not emphasized language and writing. Instead, all the labs are chasing better software coding, mathematics, and scientific problem-solving, including medicine. I’m all in favor of STEM (I got my Ph.D. from the Technical University of Denmark, after all), and I expect to benefit from AI-accelerated drug discovery and improved medical diagnosis in a few years. But I also want better creative writing from AI.
Good stories and writing have always been the foundation of any form of filmed entertainment, and human-driven production is no guarantee of quality: the latest Star Trek series is basically unwatchable.

For AI film and video to meet its $60 billion potential, the next generation of AI models needs better creative writing. (Nano Banana Pro)
AI Won’t Kill Software
Stock market valuations of major software companies have dropped each time new AI coding tools and agent services like OpenClaw and Perplexity Computer have made it easier to “vibe code” custom-made software.
Alex Immerman and Santiago Rodriguez recently wrote a good article arguing that, despite the many social media posts about this “SaaSpocalypse,” software companies as a whole won’t die. I don’t want to give you investment advice, because who knows whether the stock of any individual software company will be up or down next quarter. No doubt, several legacy software vendors will indeed collapse over the next few years as we accelerate toward superintelligence. But some will pivot hard enough to make it, and more importantly, new AI-native software vendors will emerge.
Overall, software is here to stay as a commercial venture. In that ultimate conclusion, I agree with Immerman and Rodriguez.

(Nano Banana 2)
As shown in the infographic above, the main reason to be optimistic about the software industry’s prospects (if not individual companies) is that when something gets cheaper, people buy more of it. There are endless unmet needs in the world because software used to be so expensive and convoluted to build. The total market for cheap software will be much bigger than the software market has ever been.
Furthermore, even though companies can vibe code their own solutions, and will do so for many specialized problems, the moats (protection against being outcompeted) are strong for many software companies. Some software has network effects, where the more it is used, the better it gets. Others benefit from proprietary data, making them combined data-code solutions. Finally, even though switching costs will be lower because AI agents will help customers move to new vendors, switching won’t be free, meaning that companies have an incentive to stay with vendors that treat them well and don’t overcharge.
Legacy software vendors routinely violated those two norms (treating customers well and not overcharging), creating substantial scope for creative destruction, where new AI-native startups replace the old software companies that can’t wean themselves off the old ways of doing business. One classic disruption strategy, as Immerman and Rodriguez explain, is “counterpositioning”: the new company adopts a business model that the incumbent finds unattractive to compete against. A typical AI-enabled example is charging customers for each problem solved, rather than the traditional SaaS model of charging by the “seat” (each authorized user).

There is a path from the current “port of panic” for software vendors to a future state of a vastly expanded software industry. (Nano Banana 2)
Bing Gives GEO Visibility Guidance
As users increasingly turn from traditional search engines to AI answer engines with their questions, the importance of having your brand and website represented in AI answers is growing. This is sometimes referred to as GEO (Generative Engine Optimization), an analogy to the old need for SEO (ranking highly in search results).
While it’s easy to measure the position of a website for relevant queries in various search engines, it’s hard to collect equivalent metrics for AI-driven answers. You can record on your own server how frequently the main AI services read your content. For example, on my own website (uxtigers.com), AI access has been increasing at an annualized rate of 1,257% this year. However, the clickthrough rate is abysmal, at only 0.5%, calculated as the number of visitors referred by AI services divided by the number of pages these services read.
However, we know users are reluctant to click through to sources while reading AI answers. They only do so when they are suspicious about an answer or when further details are very important to them. (And in the latter case, they are more likely to ask the AI to elaborate than to visit the source document.)
The flow is as follows: User asks a question → AI scrapes (hopefully) content on your website for information that may help answer the question → AI (maybe) cites this content in its answer → The user (rarely) clicks on a citation in the AI answer.
The middle piece of the puzzle is missing: how many times does your content appear in the AI-written answers? AI services typically read 3–4 times as many sources as they cite for each user question, but you can’t estimate your AI visibility by dividing your AI scraping analytics by 3.5. The consult-to-citation ratio varies depending on how much each AI model likes your content.
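For illustration, this funnel arithmetic can be sketched in a few lines of Python. The raw counts below are invented placeholders; only the 0.5% clickthrough ratio and the roughly 3.5 reads-per-citation figure come from the discussion above, and the “naive ceiling” is exactly the unreliable estimate just cautioned against:

```python
# Hypothetical monthly analytics for one site (made-up counts; only the
# resulting 0.5% clickthrough ratio matches the figure quoted above).
pages_read_by_ai = 200_000       # AI crawler fetches, from your server logs
visitors_referred_by_ai = 1_000  # visits arriving via AI citations

clickthrough_rate = visitors_referred_by_ai / pages_read_by_ai
print(f"clickthrough rate: {clickthrough_rate:.1%}")  # → 0.5%

# The missing middle of the funnel: citations. Dividing scraping counts by
# an average read-to-cite ratio (~3.5) yields only a rough ceiling, because
# the real consult-to-citation ratio varies by site and by model.
average_reads_per_citation = 3.5
naive_citation_ceiling = pages_read_by_ai / average_reads_per_citation
print(f"naive citation ceiling: {naive_citation_ceiling:,.0f} citations")
```

This makes the gap concrete: server logs give you the top of the funnel, referral traffic gives you the bottom, and only a tool like Bing’s (below in this section) measures the citations in between.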
Luckily, Microsoft’s Bing has released a tool to help you answer this question, in the form of a new “AI Performance” tab in the Bing Webmaster Tools. The new tool shows how many times a website is cited by Microsoft Copilot and “Partners,” and breaks these statistics down by individual pages. Bing also shows what they call “grounding queries,” which are the AI’s key phrases used to retrieve your content when it was cited. (This is often different from the user’s original question, and shows the AI’s interpretation of what it needs to locate and discuss in the answer.)
Unfortunately, Bing is distinctly a minority player. For traditional search, Bing has a market share of 11% in the U.S. and 5% worldwide (compared with Google at 84% in the U.S. and 90% worldwide). Among AI answer engines, Microsoft Copilot is estimated to have a 3.3% market share, compared with ChatGPT at 80% and Perplexity at 8%.

Microsoft is currently a small player in the AI space, but knowing how Microsoft AI treats your content is better than knowing nothing about how AI treats it. Just keep your eye out for contrasting data from the bigger set of AI services. (Nano Banana 2)
Thus, knowing about Bing is not nearly as important as knowing about Google, ChatGPT, and Perplexity. But knowing about Bing is better than knowing nothing, which is the current alternative. Microsoft’s AI is likely somewhat similar to the other AI services, since all the frontier AI models are roughly the same. Give Bing’s new AI Performance tool a try!
One website published a case study of how it was cited by AI according to the new Bing tool. The most striking result is that a single page accounted for 69% of this site’s AI citations and that the top four pages summed to 90%. The entire rest of the site only got 10% of AI citations. If your site shows a similar AI concentration, you might want to focus your GEO efforts on those pages, or at least that type of content. (This is in contrast to the common SEO advice to focus on vast masses of specialized “long tail” content.)
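If you export per-page citation counts from the new Bing tool, checking for this kind of concentration takes only a few lines of Python. The counts below are invented; only the pattern (top page around 69%, top four around 90%) mirrors the case study:

```python
def citation_shares(citations_per_page: list[int]) -> list[float]:
    """Cumulative citation share, with pages sorted from most- to least-cited."""
    counts = sorted(citations_per_page, reverse=True)
    total = sum(counts)
    cumulative, running = [], 0
    for c in counts:
        running += c
        cumulative.append(running / total)
    return cumulative

# Hypothetical Bing AI Performance export: one dominant page, a few strong
# pages, and a long tail of rarely cited content.
pages = [690, 80, 70, 60, 40, 30, 20, 10]
shares = citation_shares(pages)
print(f"top page: {shares[0]:.0%}, top four: {shares[3]:.0%}")
# → top page: 69%, top four: 90%
```

If the first few entries of the cumulative list dominate, as here, concentrating your GEO work on those pages (or that content type) is the natural experiment to run.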

Limited data shows an extreme concentration in AI citations of website content. (Nano Banana 2)

Bing’s new AI Performance tool helps fine-tune your GEO strategy for visibility in AI answers. The two main caveats are that (a) Bing is a small player, so to the extent that other AI engines differ, the findings won’t generalize, and (b) there is not yet sufficient evidence for the focus-on-top-content strategy to be a definitive best practice. But it’s worth trying. (Nano Banana 2)
AI Task Horizons Keep Increasing
The latest data from METR shows continued improvements in AI task horizons. The TH50 for February 2026 is now 12 hours, and the TH80 is 1 hour 10 minutes. These task horizons indicate the duration of expert human labor needed to perform a task that AI can do with a certain probability of success (50% and 80%, respectively).
Compare this with the December 2025 task horizons, which were TH50=6 hours and TH80=1 hour 6 minutes.
TH50 doubled in 2 months, which is faster than the recent trend line of AI capabilities doubling every 4 months. However, one data point doesn’t make a new trend. (Conversely, TH80 barely increased with this single new data point; again, not valid to generalize from.)
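For the curious, here is the doubling-time arithmetic behind these comparisons as a small Python sketch. It assumes smooth exponential growth between the two measurement dates, which is a strong simplification for two data points:

```python
import math

def doubling_time_months(old_horizon: float, new_horizon: float,
                         months_elapsed: float) -> float:
    """Implied doubling time, assuming exponential growth between measurements."""
    return months_elapsed / math.log2(new_horizon / old_horizon)

# TH50: 6 hours (Dec 2025) → 12 hours (Feb 2026)
print(doubling_time_months(6, 12, 2))  # → 2.0 months, faster than the 4-month trend

# TH80: 66 minutes → 70 minutes over the same period
print(round(doubling_time_months(66, 70, 2), 1))  # ≈ 23.6 months, i.e., nearly flat
```

The asymmetry between the two horizons is why a single new data point can look dramatic on one metric and unremarkable on the other.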
Why should you care about tasks where AI fails half the time? Because they are often worth doing with AI, as long as we keep human review followed by possible human execution as a backup option. (On the other hand, human review should be avoided for tasks that AI virtually always does better than humans, because human involvement will make things worse in these cases. However, this is currently only true for either very small tasks or very specialized tasks like driving a car.)
Let’s assume that a human takes an hour to review the AI’s performance of one of those tasks that requires 12 hours of human time to perform manually. In most cases, reviewing something takes much less time than creating it. For example, I spend less than a minute reviewing a set of 4 one-page comic strips that would probably have taken a human comic book artist around 4 hours each (a total of 16 hours) to draw manually.
Under these assumptions, here’s how this task performance will play out, only counting the human time expenditure:
50% of the time: AI does the task correctly the first time. The human spent 1 hour reviewing it and concluding that it was fine.
25% of the time: AI fails the first time but succeeds on the second try. The human spent 2 hours reviewing both of these attempts.
25% of the time: AI fails repeatedly, and the human gives up and performs the task manually. In this case, the human spent 14 hours: two hours on reviewing the failed AI attempts, plus 12 hours to do the job manually.
The average human work time across these three scenarios is: 0.5 × 1 + 0.25 × 2 + 0.25 × 14 = 4.5 hours. Much less than the 12 hours it would take if a human performed the task manually every time.
There are obviously many other scenarios. If it takes 3 hours instead of 1 to review the AI results, the average human workload becomes 7.5 hours. Still better than 12 hours, but not overwhelmingly so.
On the other hand, we could also let AI have 3 attempts before we give up on it. In that case, the average human workload becomes 3.25 hours instead of 4.5 hours.
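The scenarios above generalize to a simple expected-value model. Here is a minimal Python sketch of that arithmetic, assuming (as in the worked example) independent attempts with a 50% success rate, a fixed review time per attempt, and a manual fallback after the last failure:

```python
def expected_human_hours(review_h: float, manual_h: float,
                         p_success: float = 0.5, max_attempts: int = 2) -> float:
    """Expected human hours: review every AI attempt, do the task
    manually if all attempts fail."""
    total = 0.0
    p_reach = 1.0  # probability that attempt k is needed at all
    for k in range(1, max_attempts + 1):
        # AI succeeds on attempt k: the human has reviewed k attempts.
        total += p_reach * p_success * (k * review_h)
        p_reach *= 1 - p_success
    # All attempts failed: reviewed them all, then did the job manually.
    total += p_reach * (max_attempts * review_h + manual_h)
    return total

print(expected_human_hours(1, 12))                  # → 4.5 hours
print(expected_human_hours(3, 12))                  # → 7.5 hours
print(expected_human_hours(1, 12, max_attempts=3))  # → 3.25 hours
```

Plugging in your own review time, manual time, and success rate shows quickly whether delegating a given task to AI pays off, and how many retries to allow.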
The question of how many times to let AI attempt the task before we give up will depend on many factors.
First, how long does the AI need to run to perform the task, and can we do something else while it’s running? If the AI is slow and the human can’t do other work in parallel, that counts in favor of allowing fewer AI attempts, because each one causes a delay.
Second, when reviewing the AI output, can the human judge whether the AI will, in fact, have a chance of doing better next time (possibly after tweaking the prompt) or whether this task is something that current AI is doomed to fail every time? If we think AI is doomed, then don’t give it a second chance! If we think a tweak will make a second attempt likely to succeed, then go ahead!


This news item summarized as an infographic and a comic strip. Which format do you prefer? (Nano Banana 2)
Google Simplifies Its AI Offerings
One of my 18 predictions for AI & UX in 2026 was that Google would have to simplify the UX architecture of its AI, which is a confused mess. This has now begun: Google has announced the sunsetting of Whisk and ImageFX, effective April 30, with their features moving to Flow.
If you don’t understand these three names and what each one was supposed to do for users, you are not alone!
Basically, all three services were slightly different ways of generating images, with Flow also generating videos. Unifying them into a single service is a step toward better usability of Google’s AI, because it makes the offerings easier to comprehend.
About time, yes. But also, Bravo Google for acknowledging defeat and making its AI a little more usable.
More is needed, though. This step transforms Google’s AI product architecture from a hot mess into a lukewarm mess.

Google's AI usability is getting better, but still needs more work. (Nano Banana Pro)
One-Shot AI Video
Google’s NotebookLM product is an undervalued AI service for making sense of large amounts of content. Last week, they added the feature to make a so-called “Cinematic Overview Video” of the content you store in a “notebook” (their name for a content repository the AI can act on).
For now, the supposedly “cinematic” effects are fairly primitive: a small amount of simplistic B-roll animation plus slides made with Nano Banana (Pro or 2, I don’t know) with a voice-over. However, these videos will likely improve every few months until they rival the best educational or documentary videos produced by the likes of the BBC and the Open University in their heyday.
To experience these new 100% AI video capabilities, check these two explainer videos I made based on my article “The Capability Maturity Model for AI in Design”:
Video with no style specified (YouTube, 6 min.)
Specifying a charcoal sketching style (Instagram, 11 min.)
To see how far we have come in one-shot AI video, compare these two new videos with one I made in May 2025 (only 10 months ago!) about two of my usability heuristics (YouTube, 2 min.). I hesitate to give you this link, because the old one-shot video is honestly terrible, and I only published it at the time as an experiment.

In a few years, all films and videos will be pure AI productions, including writing, direction, cinematography, sound, and postproduction and editing. Of course, all the “actors” will be 100% AI-rendered animations, but since humans like to watch other humans, I suspect that most AI videos will feature characters that look fully human, except for a few funny animal animations. (Nano Banana 2)
The two new videos NotebookLM made as pure AI creations, with only the underlying information provided by a human (and a style direction for the second video), are pretty good, but not as good as the videos I have made myself. For example, compare with “Old Professionals Stay Creative With AI” (YouTube, 3 min.) or “History of the Graphical User Interface” (YouTube, 5 min.). I am particularly pleased with the music in the first video and the B-roll clips in the second.
However, the new videos required 10 seconds of my expensive human time (other than writing the article, which was a separate project), whereas the ones I made myself required about 10 hours each. Seconds instead of hours make for completely different economics. If I had a monetized YouTube channel, it would now be extremely profitable to make explainer videos for anything I write. However, I create for fun rather than for money, and it is not as satisfying to merely push a button to make a video.
As an example of what’s now possible, here’s a great workflow if you want to learn something new:
Take out a Google Ultra subscription. Yes, this costs $200, but you’ll save that money on your very first learning project.
Ask NotebookLM to perform Deep Research on whatever you want to learn. When I did this for one UX topic, it identified 50 good articles. Instead of reading that overwhelming amount of content, keep going to the next step.
Have NotebookLM create a 10-minute cinematic overview video. Go away for 30 minutes while the video is cooking. When you return, you’ll often find that watching this video is all you need.
If you want more detail, ask it to synthesize an in-depth elaboration from the sources of anything you want to explore further. You can even make this synthesis a new document and select it as the basis for yet another explainer video if you don’t like to read.
For deeper learning, have NotebookLM create a quiz and flashcards. Maybe even a slide deck featuring funny animal characters. I must hand it to Google: it has created a wide range of AI-powered learning resources, so there is now no excuse for staying ignorant about anything you have ever wanted to learn.
As impressed as I am about the new Cinematic Overview Videos compared with previous one-shot AI video solutions, they can get much better. Obvious improvements include better B-roll, including more elaborate animations, as well as animating the entire video rather than the current extensive use of slide-show images. This may be enough for videos that are only intended for one-time personal use, when learning a new topic. Instead of investing time in improving a one-time video, you are better off spending any extra time requesting follow-up videos about details, such as the individual steps in a process.
It’s amazing in itself that it’s now feasible to talk about throwaway “one-time view” videos made for one person to watch only once!
However, for videos that you want to publish, and maybe even monetize, additional steps are required to step up the quality: I would want features for iterative design of a video, similar to the way I can shape a song in Suno by many rounds, rather than one-shotting its first song attempt. For example, identify a specific video segment and issue directions for modifications (shorter, longer, different in emphasis or style).

AI can now one-shot decent instructional videos with no human involvement. I do love my funny animals. (Nano Banana Pro)
GPT 5.4 Writes Better
There is no good benchmark for writing quality, so the AI labs keep chasing the math and programming performance benchmarks. The new GPT 5.4 model does indeed score well on these metrics and has already solved a math problem that eluded the world’s top human mathematicians for twenty years.
But I am more interested in AI’s performance on creative tasks, such as writing. (For image generation and video, AI was so primitive until recently that it only takes a glance to realize how much better the current models are than the ones we had a year ago.)
Lacking reliable metrics, we’re left with how we feel about AI-generated content. Many AI influencers say that GPT 5.4 is now the best writer among the frontier models. Even though I have not yet given it an extensive writing tryout, I tend to agree. For sure, GPT 5.4 is substantially better than 5.2.
The preceding 3 paragraphs were written manually by me (as was this paragraph, for that matter). As a test of GPT 5.4, I asked it for a more compelling rewrite. See how you like it:
[Over to GPT 5.4]
We still lack a credible benchmark for writing quality, so the AI labs continue to race after whatever can be counted most neatly: math scores, coding tests, and other metrics that fit cleanly into a leaderboard. By those standards, GPT 5.4 is already impressive. It posts strong results across the usual benchmarks and has even solved a math problem that resisted the world’s best human mathematicians for two decades.
That is noteworthy, but it is not what interests me most. I care more about creative performance, especially writing, because writing is where intelligence becomes visible to ordinary users. In image generation and video, progress has recently become so dramatic that no benchmark is needed. One quick glance is enough to see that today’s models are vastly better than the awkward systems we were using only a year ago.
Writing is harder to judge. There is no agreed-upon measuring stick, so we are left with the oldest evaluation method of all: human reaction. Does the prose feel alive? Does it hold attention? Does it sound as if someone meant what they said? On that more subjective but ultimately more important criterion, many AI influencers now argue that GPT 5.4 is the strongest writer among the frontier models.
I have not yet put it through a full writing workout, but my early impression points in the same direction. GPT 5.4 does seem to write better. And compared with GPT 5.2, the improvement is not subtle. It is substantial.

GPT 5.4 improves its writing skills. (Nano Banana 2)
