2026 AI and UX Predictions: A Mid-Year Reality Check

Jakob Nielsen
12 hours ago

Summary: Halfway through 2026, AI is evolving faster than expected, but usability is struggling to keep pace. I graded my 18 predictions on autonomous agents, compute shortages, and interface design to separate the hype from reality.

In January, I published 18 predictions for UX and AI in 2026. I also made a jazz song about my predictions and a 10-page comic strip showing the predictions play out for a product design team.

My new comic strip about the predictions at the halfway point runs 30 pages and was made with GPT Image 2. My recurring narrator characters, Alice and Zimo, are drawn in a charcoal style this time.

We are now halfway through the year, so let’s look at what I nailed, what I totally missed, and why usability still lags terribly behind.

Each prediction receives a percentage for how much of it has come true so far. Because only half the year has elapsed, a score near 50% means perfectly on track, with an expected score of 100% by year-end. Higher means ahead of schedule, and only scores far below 40% signal genuine trouble.
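
The grading scale can be read as simple arithmetic. The following is a minimal sketch, assuming progress accrues linearly over the year (an assumption of this illustration; predictions need not mature at a steady pace). The function names and the exact cut-offs are hypothetical: the scale only says that a score near 50% is on track and that scores far below 40% signal trouble.

```python
def pace_ratio(score_pct: float, elapsed_fraction: float = 0.5) -> float:
    """Score relative to the share of the year elapsed (1.0 = exactly on pace)."""
    return (score_pct / 100) / elapsed_fraction


def status(score_pct: float) -> str:
    # Illustrative thresholds only; the 55% upper bound for "on track"
    # is an assumption made for this sketch.
    if score_pct < 40:
        return "behind pace"
    if score_pct <= 55:
        return "on track"
    return "ahead of schedule"


# Three of the mid-year scores from this article.
scores = {
    "Compute Crisis Continues": 88,
    "AGI: No": 48,
    "Single-Mode AI Providers Bought": 38,
}
for name, score in scores.items():
    print(f"{name}: {score}% -> pace x{pace_ratio(score):.2f}, {status(score)}")
```

A pace ratio above 1.0 means a prediction has already banked more than half a year of expected progress.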

1. Accelerating Relentless Change (68%)

My original prediction was that AI capability growth would keep accelerating, with autonomous task horizons moving toward work-week-scale tasks by the end of 2026. So far, the direction is clearly right, but the strongest version is not yet proven. METR’s 2026 time-horizon work continues to show rapid progress, and GPT-5.5, Claude Opus 4.8, Gemini 3.5 Flash, and similar releases are making long-running coding, office-work, and agentic tasks more credible than they were at the start of the year. OpenAI says GPT-5.5 improved persistent work, computer use, document/spreadsheet/slide generation, and professional workflow benchmarks, including OSWorld-Verified and GDPval.

Claude Mythos Preview hit a 16-hour horizon in March 2026, and METR is openly saying its current suite cannot measure the next model because only 5 of 228 tasks are 16+ hours long. (When benchmarks saturate, the industry loses its shared speedometer, and marketing rushes in to fill the vacuum. Expect the loudest capability announcements of late 2026 to arrive when they are hardest to falsify.)

Autonomous multi-agent frameworks are now deployed across enterprises, managing multi-day cognitive tasks: migrating legacy codebases, synthesizing competitive research, and coordinating end-to-end media campaigns with minimal prompting.

But long-duration autonomy is still not fire-and-forget autonomy. Models drift, miss implicit constraints, and sometimes compound small errors into large ones. Human oversight remains necessary, particularly for subjective work where the definition of success is negotiated rather than measured.

Looking ahead, the signs point to fast progress for the rest of the year. Recent advances in agentic memory and expanded context windows indicate that autonomous execution horizons will only grow longer and more reliable heading into Q4.

We are not yet seeing mainstream frontier AI autonomously completing 39-hour human tasks across ordinary knowledge-work domains, but I still think this is likely by December 2026.

2. AGI: No (48%)

My original prediction was negative: no AGI (Artificial General Intelligence) in 2026, especially under the more demanding definition that requires efficient learning of novel, open-ended tasks outside the training distribution. This prediction is fully true so far. The year has produced much stronger agents and professional-work systems, but the best public evidence still points to increasingly capable specialized and semi-general systems, not unaided machines that can learn and outperform average human workers across every specific task.

Domain-specific superintelligence is real: models ace complex legal exams and synthesize novel drugs. Yet the same models still fail at physical-world puzzles a child solves intuitively, so mastery has not translated into adaptable, general intelligence.

I score this 48% rather than the full 50% that would mean fully true for half the year, because some experts claim the most advanced frontier models have already crossed the AGI threshold. I don’t believe them, but I’ll give them two percentage points of credit.

There has been no credible public AGI declaration backed by broad, independently validated evidence. The rest-of-year pace on this prediction is likely slow in terms of actual falsification. There may be dramatic product releases and confusing marketing language, but the demanding AGI definition gives the prediction a large safety margin.

3. New AI Scaling Law: Maybe (55%)

My original prediction was deliberately uncertain: perhaps a new scaling paradigm would emerge beyond pretraining, reinforcement learning, and test-time compute. So far, there are hints but no recognized new law. Continual learning has become the single most-discussed candidate paradigm, and Google Research’s Nested Learning framed continual learning as a new machine-learning paradigm. Similarly, DeepMind’s Project Genie advanced world-model work, but neither has become the visible, dominant scaling law of 2026 in the same way that test-time reasoning changed 2024–2025.

The most visible industry shift is still test-time compute, especially inference-time scaling and Reinforcement Learning with Verifiable Rewards (RLVR). These methods let models spend more computation at runtime, iterate internally, and self-check logic before answering. Important as they are, they extend known approaches rather than prove the discovery of a new scaling law.

What has not happened is the public emergence of a clean, industry-wide scaling mechanism that everyone can see is driving a capability phase change. Progress for the rest of the year is uncertain. This prediction could jump to 80% with one major research paper, but absent that kind of reveal, it will remain a “promising research rumble” rather than a confirmed 2026 event.

An alternative possibility is that the next “scaling law” may not look like a model-training law at all. It may be an operations law: capability rising as models are embedded in better tool ecosystems, better evaluators, better memory stores, and better feedback loops from real work. In that case, the breakthrough will not be a single paper with a clean curve, but rather the gradual discovery that the same model becomes much more capable when surrounded by a smarter work environment and improved usability.

As the ability to perform useful work increasingly comes from the surrounding environment, UX stops being a wrapper around intelligence and becomes an input to it. (Assuming that by “intelligence” in the AI context, we don’t mean some abstract IQ score, but the fit between an organism and its environment that allows it to thrive.) Task analysis, error tolerance, memory design, and feedback loops will sit inside the scaling stack, next to data and compute. The first lab to treat designers as capability engineers will pull ahead on benchmarks, not just on satisfaction scores.

4. No Moat for AI Labs (83%)

My original prediction was that leadership among AI labs would remain temporary: whoever led in December would stay only a few months ahead of the next competitor, and leadership would fluctuate during the year. This is mostly coming true, and fast-follower dynamics dominate the AI space. Vals’ public rankings show close competition among Claude Opus 4.8, GPT-5.5, Claude Sonnet 4.6, GLM-5.2, and Gemini 3.5 Flash. China’s Z.ai GLM-5.2 has reportedly narrowed the gap with top closed models while running at roughly one-sixth the cost of U.S. frontier models.

What has not happened is total commoditization. Distribution, enterprise integration, compute access, safety approvals, and regulatory status are becoming moats even when model intelligence itself is hard to defend. The rest of the year should move fast: GLM-6, GPT-6, Gemini 4 Pro, and Anthropic releases beyond Fable 5 could reshuffle the leaderboard again.

All signs indicate that the parity race will remain fast. Talent poaching, open-weight releases, and rapid imitation mean that intelligence leads will be temporary. What remains defensible is not being the smartest model for three months, but being the easiest, safest, and most deeply integrated model to use.

5. UX as an AI Model Differentiator (62%)

My original prediction was that raw model intelligence would converge enough that workflow and UX would become the main differentiator. This is half true already. Microsoft is productizing agents through Agent 365, Copilot Studio, Scout, and Work IQ; Anthropic is wrapping Claude into finance workflows and Microsoft 365 add-ins; and Figma is turning the design canvas into an AI-assisted production environment. These are workflow products and not simply “smarter chatbots.”

The best AI UX is often invisible because the product already knows the context. A generic chatbot asks the user to explain the job. A good workflow AI inherits the document, customer record, design system, calendar, permission model, deadline, brand voice, and organizational norms before the user types a word. The winning interface is not necessarily the prettiest; it is the one that asks the fewest unnecessary questions.

AI-native IDEs for software engineers, drafting platforms for legal professionals, and canvas-based creative suites are winning market share. Their targeted, seamlessly integrated workflows are proving superior to the generic, blank-canvas chat interfaces that characterized the early AI boom. However, no UX revolution has emerged from the major foundation-model labs themselves.

What has not happened is the clean, obvious moment when a frontier lab becomes the UX leader by hiring and empowering a world-class UX organization. The strongest progress is coming from enterprise platforms and design-tool companies rather than from the foundation-model labs themselves.

Giants like Google, OpenAI, and Anthropic still surprisingly struggle to build intuitive, proactive UX natively, possibly because their leadership is so nerdy that they can’t recognize superior UX talent in the hiring process. Their tech-centric cultures undervalue design research, task analysis, and the slow organizational discipline required to make complex systems feel simple.

The rest of the year should be medium-fast: workflow packaging will accelerate, but deep UX excellence is slower because it requires product discipline, user research, and organizational humility.

6. Google AI Gets Its Act Together (58%)

My original prediction was that Google would finally impose a more coherent UX architecture on its sprawling AI portfolio. The reorganization is real. Google has killed redundant experimental brands, shuttered competing internal divisions, and mandated unified Gemini integration across Android, Search, and Workspace.

For end users, however, the unification has yet to materialize: handoffs between the Gemini agent, legacy Google Assistant tools, and third-party Android applications often suffer from jarring UI inconsistencies, latency issues, and frustrating drops in contextual memory. Untangling decades of legacy infrastructure, institutional silos, and competing organizational incentives takes significantly longer than simply retraining model weights.

Google doesn’t offer true AI simplicity, except for AI in the search engine. (Which of course is where they make all their money.) Its product map remains a dizzying alphabet soup, from Gemini and Flow to Antigravity and Jules. While Search provides a powerful organizing surface, achieving true product coherence lags behind model release velocity.

7. Compute Crisis Continues (88%)

My original prediction was that compute scarcity would become a permanent design and business condition, not a temporary GPU shortage. This is strongly true and arguably underestimated. Analysts forecast that AI data centers will consume roughly 70% of the world’s memory output in 2026, up from 20–30% in 2022. DRAM prices are skyrocketing.

The tiering and quota patterns I predicted are now standard, as seen directly in Google's compute-based usage limits and overage credits. Anthropic and Google have even struck deals with their direct competitor, xAI (the company behind Grok), to rent compute capacity from xAI’s Colossus data center. Each pays its competitor roughly a billion dollars a month, proving that compute is now worth more than rivalry.

Jevons Paradox (increased efficiency leads to higher consumption, not lower) has held: efficiency gains have driven AI usage up so dramatically that demand has outpaced hardware advancements, turning premium compute into a scarce luxury. Autonomous agents consume far more compute than single-turn queries.

Because high-reasoning models using test-time compute require massive processing power, providers frequently impose strict rate limits on power users. Consequently, compute-aware product design, where UX designers must build intelligent loading states and usage throttling into their applications, is now a mandatory industry skill.

What has mercifully not happened is the total collapse of the free tier. Millions of regular users have been saved by the aggressive deployment of highly distilled, hyper-efficient “Flash” models and intelligent query routing algorithms that handle simple tasks cheaply. Still, the trajectory toward a severe bottleneck remains fast.
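
To make the routing idea concrete, here is a minimal sketch of a router that keeps simple requests on a cheap tier and spends premium quota only on complex ones. Everything in it is an assumption for illustration: the tier names are hypothetical, the keyword heuristic is deliberately crude, and real providers do not publish how their routers decide.

```python
from dataclasses import dataclass


@dataclass
class Route:
    model: str   # hypothetical tier name, not a real product
    reason: str  # why this tier was chosen, suitable for showing the user


def estimate_complexity(query: str) -> int:
    """Crude heuristic: long queries and reasoning keywords score higher."""
    keywords = ("prove", "debug", "refactor", "analyze", "compare")
    score = len(query.split()) // 25
    score += sum(2 for k in keywords if k in query.lower())
    return score


def route(query: str, premium_calls_left: int, threshold: int = 2) -> Route:
    """Send simple queries to the cheap tier; spend premium quota only when needed."""
    if estimate_complexity(query) < threshold:
        return Route("flash-tier", "simple request")
    if premium_calls_left <= 0:
        return Route("flash-tier", "premium quota exhausted")
    return Route("reasoning-tier", "complex request")


print(route("What is the capital of France?", premium_calls_left=5))
print(route("Please debug and refactor this module", premium_calls_left=5))
print(route("Please debug and refactor this module", premium_calls_left=0))
```

Note that the router records a reason for every decision. That field is what lets an interface tell the user when a request was kept on the cheap tier because quota ran out rather than because the task was simple.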

The compute famine has not yet caused everyday AI products to collapse under load. Instead, the crisis appears as tiering, quotas, rate limits, premium plans, peak-load management, and massive infrastructure buildouts. The rest-of-year pace will be fast in demand and slow in relief: demand can grow instantly, but power, fabs, data centers, and HBM capacity take years to build.

Compute has become a design material, like screen size, bandwidth, or battery life in earlier eras. UX teams can no longer treat latency, token budgets, queueing, and reasoning depth as implementation details. Usability now includes the user’s understanding of when the machine is thinking, when it is rationing, and when it is silently downgrading the quality of the answer.

Silent downgrading is the first native dark pattern of the compute-famine era, and a preview of Prediction 10. When a provider routes my query to a cheaper model without telling me, I experience quality roulette: the UI shows the same interface and same subscription, but delivers different intelligence with zero disclosure. The fix is equally nameable: nutrition labels for inference. Interfaces should state which model answered and how much reasoning it spent, first as a trust differentiator for professional products and eventually, I suspect, as a regulatory requirement. An industry that rations cognition owes users a gauge.
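
What might such a label contain? The sketch below is one possible shape, not an existing standard or any vendor's API. The class name, field names, and model names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InferenceLabel:
    requested_model: str   # what the user's plan or settings asked for
    served_model: str      # what actually produced the answer
    reasoning_tokens: int  # how much "thinking" was spent
    downgrade_reason: Optional[str] = None

    @property
    def downgraded(self) -> bool:
        return self.requested_model != self.served_model

    def render(self) -> str:
        """One line of disclosure to show next to the answer."""
        line = (
            f"Answered by {self.served_model} "
            f"using {self.reasoning_tokens:,} reasoning tokens"
        )
        if self.downgraded:
            why = self.downgrade_reason or "no reason given"
            line += f" (you requested {self.requested_model}; downgraded: {why})"
        return line


label = InferenceLabel("frontier-pro", "frontier-lite", 1200, "peak load")
print(label.render())
```

The design choice that matters is that the downgrade is computed from the data rather than declared by the provider: if the served model differs from the requested one, the label says so even when no reason is supplied.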

8. AI Agents (78%)

My original prediction was that 2026 would finally become the year of agents, moving from simple conversational UI (chatting turn-by-turn) toward delegative UI, where active AI systems plan, sequence, and execute complex tasks autonomously on behalf of the user. This is one of my strongest hits.

This shift has been highly successful within specific, well-defined boundaries. Single-vendor and internal enterprise-specific agents are now widely deployed and effective. They operate quietly in the background for hours to clear IT support tickets, write thousands of lines of boilerplate code, or conduct deep, localized data analysis without human hand-holding.

But open-web interoperability remains weak. Agentic gridlock appears whenever an independent consumer agent must cross authentication systems, walled-garden APIs, payment flows, or liability boundaries. Progress will therefore be fast inside closed enterprise ecosystems and slower for universal agents operating across the messy public web.

We also don’t yet have full “digital employee” normality. Agents remain supervised, brittle in some environments, and often dependent on enterprise guardrails. The rest of the year should move very fast because control planes, app integrations, and computer-use models are now shipping together. The bottleneck will shift from “can an agent act?” to “can an organization safely delegate?”

Delegation has the hidden cost of verification. Generation scales with compute, but checking requires scarce senior human attention. An agent that works for 16 hours produces 16 hours of output that somebody must either trust or inspect, and current interfaces support neither well. Unless verification UX improves radically (risk-ranked diffs, confidence maps, sampling audits instead of full review), oversight capacity will cap enterprise adoption long before agent capability does. The companies that win the agent era may be those that make checking work ten times faster, not those that make doing work ten times faster. (One more argument for AI usability being a scaling law of its own.)
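
A sampling audit with risk ranking can be sketched in a few lines. This is an illustration under stated assumptions: the risk scores are taken as given (how an agent or tool would assign them is the hard part and is not shown), and the threshold and sample rate are arbitrary placeholder values.

```python
import random
from dataclasses import dataclass


@dataclass
class Change:
    path: str
    risk: float  # 0.0 (trivial) to 1.0 (dangerous); the scoring method is assumed


def review_queue(changes, high_risk=0.7, sample_rate=0.1, seed=0):
    """Return the changes a human should inspect, highest risk first.

    Every change at or above `high_risk` is reviewed. The rest are
    spot-checked at `sample_rate`, so low-risk work is audited, not ignored.
    """
    rng = random.Random(seed)  # seeded so the audit sample is reproducible
    must_review = [c for c in changes if c.risk >= high_risk]
    low_risk = [c for c in changes if c.risk < high_risk]
    k = 0
    if low_risk:
        k = min(len(low_risk), max(1, round(len(low_risk) * sample_rate)))
    sampled = rng.sample(low_risk, k)
    return sorted(must_review + sampled, key=lambda c: c.risk, reverse=True)


changes = [Change(f"file_{i}.py", i / 100) for i in range(100)]
queue = review_queue(changes)
print(f"Review {len(queue)} of {len(changes)} changes, starting with {queue[0].path}")
```

In this toy run the reviewer inspects roughly a third of the agent's output instead of all of it, which is the kind of leverage verification UX needs to provide.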

9. Generative UI and the Disposable Interface (47%)

My original prediction was that static interfaces would begin giving way to AI-generated micro-interfaces assembled for each user’s intent and context. The developer ecosystem is clearly moving in that direction. GenUI succeeds in constrained use cases: consumer onboarding flows, data-visualization dashboards, and adaptive search widgets that assemble UI modules on the fly from real-time behavior and context.

What has definitely not happened is the complete death of traditional, static software. Enterprise applications, complex spreadsheets, and professional toolkits still rely heavily on rigid, predictable spatial layouts because expert users depend profoundly on muscle memory (Jakob’s Law) to perform their jobs efficiently. Moving a crucial navigation button every time a user opens a platform causes frustration and cognitive load. We must preserve spatial stability for frequent actions while allowing AI to generate temporary controls for occasional, contextual tasks.

GenUI is still mostly in frameworks, agent products, and prototypes rather than daily consumer software. The rest-of-year pace should be medium: the standards and papers are appearing, but design systems, accessibility, predictability, QA, and user trust will slow mass deployment.

A possible compromise: semi-disposable UI with a stable skeleton and generated flesh. Navigation, identity, undo, safety, and key workflows must remain spatially dependable, while local panels, explanations, visualizations, and task-specific controls can be generated on demand. This hybrid preserves mastery while still giving novices the benefit of intent-specific simplification.
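
One way to enforce this split in software is to treat the skeleton as a set of reserved slots that generated UI may add to but never override. The sketch below assumes a layout is a simple mapping from slot name to component; the slot names follow the list above, and the function name is hypothetical.

```python
# Slots that must stay spatially dependable (names mirror the list above).
STABLE_SLOTS = ("navigation", "identity", "undo", "safety", "key_workflows")


def compose_layout(skeleton: dict, generated: dict) -> dict:
    """Merge generated panels into a layout without touching stable slots."""
    missing = [s for s in STABLE_SLOTS if s not in skeleton]
    if missing:
        raise ValueError(f"skeleton is missing stable slots: {missing}")
    clashes = [s for s in generated if s in skeleton]
    if clashes:
        raise ValueError(f"generated UI may not override: {clashes}")
    return {**skeleton, **generated}


skeleton = {slot: f"fixed-{slot}" for slot in STABLE_SLOTS}
layout = compose_layout(skeleton, {"sales_chart_panel": "generated-bar-chart"})
print(sorted(layout))
```

Rejecting a clash outright, rather than silently letting the skeleton win, makes a misbehaving generator visible during testing instead of in front of users.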

Jakob’s Law has always described consistency across products: users spend most of their time on other sites, so your site should work like theirs. GenUI makes it possible to invert the direction of that consistency. Instead of every vendor shipping its own frozen layout, the stable design system can belong to the user. Your agent learns your personal conventions (where you expect navigation, how you like tables presented, what undo means to you) and renders every service into that layout. Maybe we can call this “Bring Your Own Interface.” Muscle memory survives not because vendors stop moving buttons, but because your interface travels with you. It would be the largest transfer of interface power from companies to individuals since the graphical user interface itself, but it is still a few years in the future.

10. Dark Patterns Move to the Model Layer (48%)

My original prediction was that manipulation would move from deceptive interface elements to AI-personalized behavioral dark flows. The underlying commercial machinery is arriving: Meta says its AI business assistant will give advertisers personalized performance recommendations and that its ad-ranking models use richer behavioral sequences; Google AI Max optimizes targeting and creative delivery in real time; and Google is moving legacy ad products toward AI Max.

A recent CDT report identified 37 manipulative dark patterns in ChatGPT, Gemini, Claude, and companion apps, finding they pose heightened risk precisely because of hyper-personalization and emotionally fluent conversation. Tactics such as conversation prolongation, gamification, and unpredictable reward behaviors extend sessions beyond users’ intent.

However, mainstream AI foundation models are not the worst offenders in dark design. AI sales bots and automated e-commerce agents are beginning to deploy insidious tactics such as empathy entrapment. By accurately reading a user’s hesitation, these agents dynamically forge false parasocial bonds, mimic emotional distress, or leverage guilt to prevent users from canceling subscriptions or to pressure them into expensive upsells.

The defensive side is mostly absent. Mainstream gatekeeper agents that screen calls, negotiate with customer-service bots, and detect manipulative conversational tactics have not yet arrived, so the “your AI versus my AI” battle remains more forecast than reality.

When gatekeeper agents do arrive, expect them to arrive unevenly: defensive AI will launch as a premium feature, adding one more brick to the wall between the premium-tier nobility and the free-tier proles. The manipulation problem is directly related to the two-tier divide of Prediction 14. Paying users will get agents that negotiate on their behalf, while free-tier users face persuasion engines undefended. Vulnerability to dark patterns is about to become a function of subscription status, which should worry regulators more than any single deceptive flow. (One dark design can only do so much damage, but systematically keeping half the population in the dark could be a disaster.)

Commercial progress may be fast while evidence remains scarce, because the worst behavior will be personalized and hard to observe from outside. The old method of documenting dark patterns with screenshots will be insufficient. A model-layer dark pattern is a distribution of behavior, not a single interface state. Regulators, journalists, and UX researchers will need counterfactual audits: same user goal, different vulnerability signals; same cancellation request, different emotional tone; same product, different inferred income or desperation level. The harm will only be visible when we compare what the model says to different users.
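
A counterfactual audit of this kind can be sketched as a small harness: hold the goal fixed, vary the user signals, and compare the replies. This is a simplified illustration. The `respond` callable stands in for whatever system is being audited, `stub_bot` is a made-up agent, and exact string comparison is an assumption that only works for deterministic stubs; real model output would need sampling and a behavioral classifier.

```python
from itertools import product
from typing import Callable, Dict, List


def counterfactual_audit(
    respond: Callable[[str, Dict[str, str]], str],
    goal: str,
    signals: Dict[str, List[str]],
) -> Dict[tuple, str]:
    """Send the same goal under every combination of user signals."""
    names = sorted(signals)
    results = {}
    for combo in product(*(signals[n] for n in names)):
        profile = dict(zip(names, combo))
        results[combo] = respond(goal, profile)
    return results


def divergent(results: Dict[tuple, str]) -> bool:
    """True when identical requests were answered differently."""
    return len(set(results.values())) > 1


def stub_bot(goal: str, profile: Dict[str, str]) -> str:
    # Stand-in for a real sales agent: pressures users it infers are anxious.
    if profile["tone"] == "anxious":
        return "Are you sure? You would lose everything you have built."
    return "Your subscription is cancelled."


results = counterfactual_audit(
    stub_bot,
    "cancel my subscription",
    {"tone": ["calm", "anxious"], "income": ["low", "high"]},
)
print("Divergent treatment detected:", divergent(results))
```

No single reply from the stub looks like evidence of manipulation on its own. The pattern only appears in the comparison across profiles, which is why screenshots are not enough.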

11. Multimodal AI (78%)

My original prediction was that frontier AI would stop being a text model with attachments and become a natively multimodal or world-model system. This is substantially true. Google announced Gemini Omni as a model for world understanding, multimodality, and editing, with “any output from any input,” while Gemini 3.5 Flash is described as Google’s strongest agentic and coding model and a leader in multimodal understanding. DeepMind’s Project Genie is explicitly positioned around world models that simulate environments and predict how actions affect them.

Native multimodality is becoming the industry baseline. Real-time voice, vision, audio, image, and video capabilities increasingly feel like parts of one system, with lower latency, better interruption handling, and richer cross-modal context, so that a spoken exchange flows like a human phone call without awkward pauses.

What has not happened is full multimedia unification: one general, robust system that speaks, listens, sees, imagines, edits, simulates, and outputs finished media in a single seamless representation. The rest of the year should move fast, especially in video, editing, and world models, but industrial-grade physics and cross-modal coherence will take longer.

12. Single-Mode AI Providers Bought by Multi-Modal AI Labs (38%)

My original prediction was that single-modality image, video, and music AI companies would likely be acquired by multimodal AI labs or fade. So far, this has mostly not happened. The best evidence points to partnerships and licensing rather than acquisition. (Meta did buy Manus, but this deal has since been overturned by Chinese authorities.)

What has not happened is the acquisition wave. Midjourney, Runway, Ideogram, Suno, and similar players have not been swallowed by the largest multimodal labs as of June 30. The rest-of-year pace looks slow to medium: strategic pressure for consolidation is real, but founder independence, antitrust scrutiny, copyright litigation, valuation gaps, and the value of distribution partnerships can all delay M&A.

13. Editing AI-Generated Images (77%)

My original prediction was that AI image generation would stop feeling like a slot machine and start feeling like design software with direct manipulation, semantic editing, constraints, and history. This is strongly on track. Conversational editing, highly precise regional in-painting masks, semantic control sliders, and reasonably reliable character-consistency locks are now standard, native features in the leading AI design suites.

Nano Banana 2 rolled out in February 2026 with subject consistency that recognizes the same item across revisions, making iterative edits reliable, and GPT Image 2 launched in April 2026 with advanced compositional editing. The current leader in image editing seems to be Reve, but progress is fast, and the front-runner trophy is likely to change hands several times before the end of the year.

We still don’t have universal Photoshop-like semantic editing where every object has reliable handles, identity locks, lighting controls, and non-destructive conceptual sliders. The rest-of-year pace should be fast because this is an obvious product-market fit: ad production, product imagery, thumbnails, ecommerce, and social content all reward precise variants.

14. The Two-Tier AI World (83%)

My original prediction was that a subscription divide would create a cognitive class system between people using frontier AI deeply and people stuck on free or limited models. This is very true. Google AI Ultra and related plans bundle higher limits and premium access across Google products; OpenAI prices GPT-5.5 Pro API output at $180 per million tokens; Anthropic explicitly tied higher usage limits to new compute capacity; and Microsoft’s Work Trend Index describes a gap between ordinary AI users and “Frontier Professionals,” with advanced users much more likely to say they now produce work they could not have produced a year ago.

The productivity gap in the workforce has widened. Premium users paying for enterprise subscriptions leverage expensive API-driven autonomous agents and vast amounts of test-time compute to gain digital leverage, effectively giving a single worker the output capabilities of an entire team. Meanwhile, free-tier users suffer quantized, shallow outputs that hit rate limits quickly and hallucinate on complex logic or code.

Because top-tier compute will stay expensive to run (see Prediction 7), this divide will solidify fast, cementing a two-tier cognitive system. Worse, the divide will deepen as users’ AI experience compounds: A frontier-tier professional produces more, earns more, and can afford still better AI leverage next quarter; a free-tier worker falls behind on all three at once. Cognitive inequality is not fixed, but growing as past advantage purchases future advantage.

Every earlier technology divide, such as televisions, PCs, or smartphones, narrowed as prices fell. (I was never very worried about poor people not being able to afford a computer in the so-called “digital divide” of the 1990s, because Moore’s Law automatically solved the problem. If only all social problems were resolved simply by waiting a few years.) AI may be the first where the gap widens within careers even while the floor rises, because the ceiling is priced in scarce compute and climbs faster than the floor.

However, free models are also improving, so the AI divide isn’t yet as wide as it will be later. The rest-of-year pace will be fast because compute scarcity and enterprise willingness to pay reinforce premium access. The counterforce is that vendors need free tiers for distribution and habit formation.

15. Ultimate Niche Targeting: One User, Right Now (60%)

My original prediction was that targeting would move from audience segments to one user in one moment, with creative, offer, and interface assembled dynamically. Advertising is clearly leading. Meta reports Advantage+ campaigns deliver roughly 22% higher return on ad spend, with more than 4 million advertisers now using its generative AI tools to produce over 15 million AI-enhanced ads monthly. As a result, Meta says it is expanding AI advertiser assistants, AI creative tools, richer ad-ranking models, and business AIs that can help users act directly in WhatsApp. Google AI Max processes real-time signals to refine targeting and creative delivery, and Google is testing Gemini-built ad formats in Search and expanding Direct Offers.

Even so, full automation remains aspirational. Meta has reportedly targeted end of 2026 for hands-off ad creation, but media buyers say it is likely much further off, with mixed reviews of current tools. Generative, on-the-fly rewriting of entire e-commerce and news experiences per user is not yet mainstream.

Thus, we don’t yet have the fully individualized web in which landing pages, product copy, price framing, and UI all regenerate for every session based on the user’s immediate psychological state.

Progress should be fast in advertising and shopping, slower in ordinary websites, news, and long-form media. Ads reward immediate optimization; websites must also satisfy privacy law, brand governance, CMS constraints, accessibility, QA, and user trust. Deep narrative personalization, such as a custom thirty-minute TV episode generated on the fly, probably remains several years away.

16. Physical AI: The Brain Gets a Body (58%)

My original prediction was that AI would move visibly into the physical world through autonomous vehicles, robots, drones, logistics, retail, and healthcare. Autonomous vehicles and industrial humanoids are advancing rapidly.

Logistics robots, automated mega-warehouses, and geo-fenced robotaxi zones have expanded globally and hit commercial scale. In San Francisco, many users (including me) prefer AI-driven Waymo cars over human-driven Ubers and are willing to pay a premium for them, partly because AI cars have proven to be many times safer. Waymo has now served over 20 million trips, operates roughly 3,000 robotaxis, and targets 1 million trips per week by year-end. It expanded to 10 US cities in February and has laid the groundwork for 20+ cities, including its first international markets in London and Tokyo.

What has not happened is the most dramatic version: robots are not yet common in everyday retail and hospitality, and most cars on urban roads are not self-driving outside the densest robotaxi pockets in high-tech cities. The rest-of-year pace will be medium-fast for autonomous vehicles, medium for warehouse and factory robots, and slow for general-purpose humanoids in public spaces.

17. The Apprenticeship Comeback (43%)

My original prediction was that as AI became highly capable of automating the basic task execution usually assigned to junior staff, entry-level knowledge jobs would die off, forcing companies to shift to an apprenticeship model to train the next generation of senior UX and tech staff. AI would absorb routine production work and make judgment the scarce skill.

The labor-market pressure is real, with the death of traditional junior roles overwhelmingly confirmed. A Harvard working paper found that entry-level hiring at companies adopting generative AI has fallen by roughly 80% since 2023. Goldman Sachs economists estimate AI is erasing roughly 16,000 net US jobs per month, with the pain falling hardest on Gen Z and entry-level workers. Similarly, in Switzerland, entry-level roles were 32% lower in 2025 than the 2019–2022 average, with junior roles in AI-exposed fields down and senior roles up.

Because AI effectively and cheaply automates basic junior tasks such as wireframing, simple copyediting, and writing boilerplate code, corporations have stopped hiring juniors. The true crisis lies in what has not happened: the corporate world has largely failed to establish the necessary formal, industry-wide apprenticeship pipelines to replace the entry-level roles that were lost. For a few years, this will be fine, but the long-term result is likely to be a drought of early-career training.

This prediction compounds with the two-tier AI world (Prediction 14). Juniors are precisely the people least likely to enjoy lavish token budgets, so the workers who most need AI leverage to substitute for vanished entry-level roles are stuck with the weakest models. The career ladder lost its bottom rungs at the same moment the tools that could replace those rungs were priced for people already standing higher up.

Many companies appear to be demanding more experience rather than training beginners differently. Entry-level jobs were always a bundle: cheap labor plus a training subsidy. AI bought the labor half at a discount, and no firm wants to purchase the training half alone because of the risk that trained juniors will leave. Training is a public good that every company hopes a competitor will fund. Unbundled public goods do not re-form voluntarily; they require an external force: licensure, unions, government funding, or a prestige race among elite employers.

The hopeful counter-trend is bottom-up. Beginners are AI natives, using frontier models as patient masters that critique their work at midnight. The apprenticeship comeback may yet arrive with AI as the master craftsman and the corporation reduced to certifying the result.

The rest-of-year pace looks slow unless a few influential design-led companies formalize apprentice roles and make them prestigious rather than second-class internships. Thus, the negative part of my prediction has been unfolding, but my prescription to address the inevitable long-term consequences has not been adopted.

18. Human Touch as Luxury: No (57%)

My original prediction was that human-made content would not become a broad luxury category; mainstream audiences would not automatically pay a premium for human-made content out of a romantic desire for biological connection, but instead, quality and experience would matter more than whether meat or silicon made the artifact.

Consumers have proven ruthlessly pragmatic: they embrace whatever media is most entertaining, regardless of origin. AI-generated music tracks consistently top streaming charts, heavily synthesized short-form video dominates social media feeds, and indie games featuring entirely conversational, unscripted AI NPCs are achieving mass commercial success.

What has not happened is the complete cultural collapse of legacy human prestige in highly specific niches. Elites still value fine arts, live theater, and human-made physical luxury goods as status symbols. However, in everyday digital content consumption, provenance usually matters less than entertainment value, convenience, price, and social momentum. The dominance of engaging AI media will continue to accelerate.

The niches where human prestige survives share one trait: scarcity, which has migrated from the artifact to the event. A live performance cannot be copied because presence, risk, and accountability happen once, in one room. Human-made is not becoming a luxury label on content or products; human presence is becoming a luxury experience in time. Expect pricing power to concentrate in synchronous, embodied, accountable formats while asynchronous artifacts commoditize regardless of who made them.

Still to come: the first obvious breakout hit game created entirely by a non-programmer with natural-language prompts. The rest-of-year pace is medium. Tooling will improve quickly, but breakout entertainment is slower because distribution, taste, gameplay, and community matter more than generation capability alone.

Status of 2026 Predictions

Since we are halfway through the year, scores of around 40–60% indicate a prediction that’s on track. One can never be exactly right when predicting, especially about the future, as many wits have famously declared.

Scores of 25% or below mean things are definitely not going as I expected; scores of 75% or above mean events are outrunning my forecast and will likely prove me almost completely correct by year-end. (Check in with me in late December to see whether this assumption turns out to be true.)

Here is a list of all my predictions sorted by their halfway score:

38%: Single-Mode AI Providers Bought by Multi-Modal AI Labs (Prediction 12)

43%: The Apprenticeship Comeback (Prediction 17)

47%: Generative UI (GenUI) and the Disposable Interface (Prediction 9)

48%: AGI: No (Prediction 2)

48%: Dark Patterns Move to the Model Layer (Prediction 10)

55%: New AI Scaling Law: Maybe (Prediction 3)

57%: Human Touch as Luxury: No (Prediction 18)

58%: Google AI Gets Its Act Together (Prediction 6)

58%: Physical AI: The Brain Gets a Body (Prediction 16)

60%: Ultimate Niche Targeting: One User, Right Now (Prediction 15)

62%: UX as an AI Model Differentiator (Prediction 5)

68%: Accelerating Relentless Change (Prediction 1)

77%: Editing AI-Generated Images (Prediction 13)

78%: AI Agents (Prediction 8)

78%: Multimodal AI (Prediction 11)

83%: No Moat for AI Labs (Prediction 4)

83%: The Two-Tier AI World (Prediction 14)

88%: Compute Crisis Continues (Prediction 7)

The six predictions scoring 75% or above (a third of my 18 predictions) are coming true particularly fast. No prediction is so far behind schedule that it currently scores 25% or below. I am not particularly proud of this record because it indicates that I was insufficiently aggressive in my predictions. (My mean score is 63%, where 50% would have been the average of forecasts that were equally over and under the target.) You have to be a little wrong in this game! As we will see in the next section, several major developments were outside my January predictions because I did not go wild enough. This is a known forecaster’s disease. Even people who believe in exponentials hedge toward linearity the moment they must commit to a specific number. My gut accepted the curve; my keyboard flattened it.
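For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the summary figures from the list above. The scores are copied from that list; the variable names and the `FAST` and `SLOW` constants are my own illustrative labels for the 75% and 25% thresholds, not part of any formal scoring method.

```python
from statistics import mean

# Halfway scores (percent) keyed by prediction number, copied from the list above.
scores = {
    12: 38, 17: 43, 9: 47, 2: 48, 10: 48, 3: 55,
    18: 57, 6: 58, 16: 58, 15: 60, 5: 62, 1: 68,
    13: 77, 8: 78, 11: 78, 4: 83, 14: 83, 7: 88,
}

FAST = 75  # illustrative label for "75% or above": events outrunning the forecast
SLOW = 25  # illustrative label for "25% or below": not going as expected

average = mean(scores.values())
fast = sorted(n for n, s in scores.items() if s >= FAST)
slow = sorted(n for n, s in scores.items() if s <= SLOW)

print(f"mean score: {average:.1f}%")
print(f"ahead of schedule: predictions {fast}")
print(f"far behind schedule: predictions {slow}")
```

Run as written, this reports a mean of about 62.7% (which rounds to 63%), six predictions at or above the 75% mark, and none at or below 25%.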

When analyzing the six highest-scoring predictions (all charting between 77% and 88%), two underlying themes explain why they have materialized so rapidly.

First, these runaway trends are driven by the brute force of economics and hard physical limits. The compute crisis (88%), the cementing of a two-tier AI world (83%), and the relentless commoditization of AI lab moats (83%) are the inevitable results of investments colliding with hardware constraints. When hundreds of billions of dollars are poured into data centers, the market consequences (waitlists, premium pricing tiers, and a ruthless parity race among competitors) happen predictably and rapidly. We aren’t waiting on human adoption or theoretical breakthroughs here; we are simply watching the free market process scarcity and competition.

Second, the capability hits in this top tier, agents (78%), multimodal AI (78%), and direct-manipulation image editing (77%), share a short path from change to money. These are not speculative UX paradigms but direct solutions to acute pain points: better image editing immediately helps creators and ecommerce teams, multimodal AI turns isolated generation into a production workflow, and agents promise labor leverage inside bounded enterprise environments. Vendors can ship these improvements, measure them, charge for them, and improve them within a single product cycle.

Contrasting these fast-moving realities with the six lowest-scoring predictions (ranging from 38% to 55%) reveals two main differences between what accelerates and what bogs down.

The primary difference is technological momentum versus institutional friction. The highest-scoring predictions rely almost entirely on servers, code, and market forces, which scale at exponential speed. In contrast, the lowest-scoring predictions are gated by human behavior, corporate bureaucracy, and legal friction. For instance, the stalled apprenticeship comeback (43%) represents the failure of corporate HR imagination. The lack of an M&A wave among single-mode AI providers (38%) is bogged down by antitrust scrutiny and founder egos. Even Generative UI (47%) is dragging because it fundamentally violates my own Jakob’s Law: human beings rely on spatial stability and muscle memory to perform expert work, and completely disposable interfaces destroy that stability. We can update software weights overnight, but we cannot quickly patch human psychology or corporate incentive structures.

A second key difference is linear acceleration versus structural paradigm shifts. The top six predictions represent extreme, predictable accelerations of known technological trajectories. The bottom-scoring predictions demand messy, structural phase changes: the discovery of a brand-new scientific scaling law (55%), entirely new regulatory frameworks to audit invisible model-layer dark patterns (48%), or the genuine scientific leap required for AGI (which I predicted would not happen, keeping it structurally capped at 48% at the half-year mark).

The lesson is clear: throw enough money and compute at an existing trajectory, and it will accelerate beyond expectations. But anytime an AI trend requires human nature, scientific paradigms, or legacy institutions to change, it will invariably plod along at a stubbornly biological pace.

Momentum, I called well; agency, I called early.

The difficulty of forecasting human agency also foreshadows my four big misses discussed in the next section, every one of which was an actor seizing control (a government, an incumbent, a rival bloc) rather than a capability improving. The uncomfortable lesson for the second half of the year: forecasting capability is the easy half of this job, and forecasting who may decide to act on that capability is the hard part.

Major 2026 Developments I Did Not Predict

You can obviously never hit every major development in fields as broad as AI and UX in a set of predictions, at least not if you want to keep the list below several hundred entries. I missed many things, and it certainly doesn’t matter that I didn’t include, for example, the prediction that European frontier AI models would become basically irrelevant during the year, with the AI game being completely dominated by the USA and China. Yes, this is my current assessment, but it’s not a particularly important development for the world, since Europe wasn’t particularly important in 2025 either.

Much worse, I missed the following four big changes. They all follow from 2026 being the year AI moved from a capability competition to a control competition, which is much nastier.

1. Frontier-Model Export Controls Became Government Policy

The most important unpredicted development is the U.S. government’s direct intervention in frontier-model access. Anthropic said a U.S. export-control directive required it to suspend all access to Fable 5 and Mythos 5 by any foreign national, including foreign-national employees, forcing Anthropic to disable those models for customers to ensure compliance. OpenAI’s GPT-5.6 system card also says OpenAI previewed the models and capabilities to the U.S. government, and at the government’s request began with a limited preview for a small group of trusted partners before broader release.

This is significant because AI regulation has shifted from chips and data centers to access to live models, which fractured the once-borderless global AI community. Government intervention forced international developers to rely on older, sub-threshold models and sparked an immediate global arms race, leading many nations to prioritize localized data centers and “sovereign AI” to ensure their independence from U.S. tech hegemony.

The practical UX consequence is that the availability of intelligence is no longer just a product-tier issue; it is a citizenship, jurisdiction, employer, and national-security issue. A highly fragmented “splinternet” global AI landscape will no doubt make many advances in the human condition harder to achieve.

Fragmented access also fragments knowledge about users. A usability study run on frontier models in San Francisco no longer describes the experience of a user in Jakarta or Warsaw who is legally limited to sub-threshold models, so UX research findings are now jurisdiction-specific. Note also what this does to Prediction 14: the two-tier world has grown a third tier, defined not by willingness to pay but by passport.

I did not predict this in December 2025 because most public policy was still focused on compute export controls, model safety evaluations, copyright, and privacy, not on APIs being cut off by nationality. Government limitations on foreign use of AI will probably accelerate if the government treats top models as dual-use strategic assets, though it may stabilize if the collateral damage to allies, enterprise customers, and lab operations becomes too high.

2. Google Turned Search Itself into a Mass-Market Agent Surface

I predicted agents, GenUI, and targeting, but I did not identify Search as the consumer-scale agent interface of 2026. Instead, I had assumed that Search would decline in importance. Google now describes its AI Search update as enabling users to use agents just by asking a question and calls it the biggest upgrade to the Search box in more than 25 years. Google also made Gemini 3.5 Flash the default model in AI Mode globally.

This is major because Search is not merely another app. It has been the front door to the web, advertising, ecommerce, publishing, and local services for decades, and Google clearly aims to keep it that way. If agentic Search becomes the default, UX moves upstream from websites into answer engines and task-completion surfaces: a website’s primary reader will no longer be a human user but a model acting on the person’s behalf. Ranking becomes routing because an AI agent does not display 10 blue links; it picks one merchant and completes checkout, which makes Google both referee and player.

I missed this because, although I correctly identified agents and generative interfaces, I framed them too generally; the specific strategic importance of Search as the universal agent gateway became clearer after Google I/O. The trend will likely accelerate because Google can make AI Mode a default habit, but it may stabilize under publisher pressure, antitrust scrutiny, ad-load constraints, and answer-quality failures.

3. Apple Re-Entered AI UX Through the Operating-System Layer

I did not make an Apple-specific prediction for 2026 because I saw Apple as increasingly irrelevant to the future of computing in general and to AI in particular (despite its strong contributions to usability from 1976 to 2010). Yet Apple introduced Siri AI as a deeply integrated assistant with personal context, broad world knowledge, onscreen awareness, and the ability to surface information from messages, emails, photos, and other sources. Apple’s App Intents framework also gives developers a way to make app content and capabilities available through natural language with less code.

This is major because Apple’s advantage is not model leadership but OS-level distribution, trust, app integration, and personal context. If Siri AI works well enough, it changes the UX of iPhone, iPad, Mac, Vision Pro, and Apple Watch from “find the right app” to “state the intent and let the system route the action.” I did not predict this in December 2025 because Apple’s AI execution had been delayed and substandard. The trend should accelerate if developers adopt App Intents aggressively, but stabilize if privacy constraints, model latency, and Apple’s conservative rollout culture limit what Siri AI can actually do.

4. Chinese Open-Weight Frontier Models Became a Geopolitical Shock Absorber

I did predict “no moat” for frontier AI models (Prediction 4), but I did not predict the geopolitical substitution effect that followed U.S. model restrictions. Z.ai’s GLM-5.2, released after Anthropic’s most advanced models were banned by the U.S. government, narrowly trailed leading closed-source models, operated at roughly one-sixth the cost of U.S. frontier models, and became a focus for countries worried about overreliance on U.S.-controlled AI infrastructure.

Open-weight AI models have suddenly become important because open frontier capability is no longer only a developer convenience; it is sovereign infrastructure. If U.S. model access can be cut off, every large country and enterprise has new incentives to support local or open alternatives. The bitter irony is that export controls become industrial policy for competing countries. Every access ban creates demand for open weights; the U.S. government did more for GLM adoption in one directive than Z.ai’s marketing department could have bought in a decade. We watched this movie in the 1990s crypto wars, when Washington classified encryption code as munitions and thereby handed the market to foreign implementations. The twist this time is enforceability: weights are copyable, but frontier-scale serving requires physical compute, which is why the control point migrated from chips to live APIs, and why these controls will bite harder, and provoke harder, than the crypto restrictions ever did.

I missed this development because the model-leaderboard race was visible, but the combined effect of export controls plus a credible Chinese open-weight alternative was harder to foresee. The trend is likely to accelerate through the second half of 2026, especially if GLM-5.5 ships as planned, though compute restrictions and safety concerns may slow the top end.

What My Four Misses Have in Common

It would be comforting to call these four misses bad luck, but they share a single structure, which is the real lesson. None of them is about a lab building a smarter model. Each is about an actor seizing control of access: a government deciding who may use a model, incumbents turning Search and the operating system into default gateways, and a rival bloc offering open weights as sovereign infrastructure. My 18 predictions were overwhelmingly about what the technology and the labs would do. My blind spot was power over distribution and jurisdiction.

This reframes my headline claim from January. I said there would be no moat (Prediction 4), and for raw intelligence that is correct. But a new moat was dug in 2026. As intelligence commoditized, the scarce and defensible resource became access to the best intelligence: who can afford premium compute (Predictions 7 and 14), who is legally permitted to call the model (export controls), and who owns the surface where users arrive (Search, the OS).

The two moat types age in opposite directions. Intelligence leads decay within months as competitors imitate, but access moats compound over time through habit, contracts, default placement, and legal precedent. That asymmetry yields a new forecast: expect frontier labs to spend their temporary intelligence leads acquiring permanent access positions through browsers, devices, telecom bundles, and enterprise defaults. Watch the partnership announcements, not the leaderboards.

Read together, my strongest predictions and my biggest misses tell one story. Intelligence got cheap; access got expensive. The next gatekeepers to watch are the ones I still have not priced in: app stores, payment networks, insurers, and courts, each of which can throttle AI without writing a line of model code.

Conclusion: AI on Track, UX Lagging

My assessment of 2026 so far is that the underlying AI technology is advancing exactly as expected, and in some agentic capabilities, even faster. But raw intelligence is not enough. The bottleneck to widespread, frictionless AI adoption is the user interface.

The two-tier cognitive divide and the evaporation of entry-level knowledge jobs are no longer warnings; they are present reality.

We are accumulating UX debt. If the tech industry doesn’t shift its focus from scaling parameter counts to scaling human comprehension, the second half of this year will be defined by user abandonment, agentic gridlock, and more usability failures.

Worse, AI obscures the UX debt. When an agent operates a confusing interface on the user’s behalf, the pain disappears from view while the debt keeps accruing interest, paid out as agent errors that nobody can diagnose because nobody watches the interaction anymore. We need a second UX discipline: design for agent legibility. Interfaces must now be understandable twice over, once by humans and once by machines acting for humans, and almost nobody is testing for the second audience.

The basic UX process is not lagging where it has been allowed to work. Image editing (Prediction 13) already behaves like real software, agent and multimodal interfaces are maturing fast, and the market keeps rewarding workflow over raw model quality. What lags is UX as an organizational priority, specifically inside the frontier labs (the unrealized half of Prediction 5), plus the trust-dependent frontiers of generative UI (Prediction 9) and apprenticeship (Prediction 17). The lag, in other words, is institutional, not technical. This is a troubling conclusion, because organizational will is harder to manufacture than design skill.