Deep Currents 01.01.26
Happy New Year! Welcome to the first Deep Currents of 2026, a monthly curated digest of breakthroughs, product updates, and significant stories from the world of generative AI. The holiday period brought another flood of releases across every category, but it also brought something else: a wave of consolidation that signals where this industry is headed. When the companies with cash decide it's faster to buy than build, that's a market maturing in real time.
Reading the Currents
If you want the full rundown of updates, scroll down to the complete listing. Otherwise, here's what I think matters most from this month's developments...
Meta paid $2 billion to catch up
The biggest story of the past month is Meta's acquisition of Manus, the Singapore-based AI agent startup that went from launch to $125 million in annual revenue in nine months. Meta reportedly closed the deal in just ten days at a valuation of over $2 billion. Closing a deal this big in under two weeks suggests serious scrambling to finalize something before the end of the year.
The context makes the desperation clear. After the Llama 4 release flopped earlier this year, Meta has been conspicuously absent from the agent race while competitors have been busy releasing more and more capable (and profitable) models. They've also been searching for revenue streams beyond advertising, a business model that looks increasingly fragile as attention fragments across platforms. Manus solves both problems: it gives Meta an instant agent capability and comes with actual paying customers.
Meta plans to keep Manus running as a standalone service while integrating its technology into Facebook, Instagram, and WhatsApp. If that integration works, AI agents could become the new way people interact with Meta's social platforms. Instead of scrolling feeds, you might have a virtual agent handling tasks like checking Facebook or Instagram for updates on your behalf. I'm not sure people actually want that, or how it would benefit Meta's advertising business. What's more likely is that it becomes a premium option for professional marketers to generate more posts with less effort and to automate post engagement.
The consolidation wave is here
Beyond Meta's Manus acquisition, December brought three more significant deals. Cursor acquired Graphite, a code review platform, with plans to integrate it into their AI-powered editor. Nvidia signed a $20 billion "non-exclusive licensing agreement" with Groq and brought over key leadership including founder Jonathan Ross. And Whispr, makers of the Flow voice-to-text technology, acquired Yapify, a voice AI agent for email.
Four acquisitions in a single month suggest the industry is shifting from build mode to buy mode. The companies with resources have decided that acquiring proven teams and products is faster than developing capabilities in-house. This is what happens when a technology sector starts consolidating around winners with deep pockets.
Generative UI is the wild card
Google released A2UI this month, a new open source project for building generative user interfaces powered by LLMs. They've also launched a website with the official A2UI spec at a2ui.org. This builds on the Generative UI capabilities Google introduced with Gemini 3 last month, but now it's becoming infrastructure rather than an experimental novelty.
The implications are still fuzzy, which is what makes it so interesting to me as a veteran web designer and agency director. If AI can generate functional interfaces on the fly, what happens to the traditional design-to-development pipeline? What happens to the websites and apps we navigate today, with their rigid information architectures? Will users expect this level of personalization from all websites in the future? Google is betting that generating experiences on demand will eventually replace navigating to pre-built pages, at least for some things. Whether users actually want that is another matter, but the infrastructure is getting built regardless, and it represents a completely new paradigm for web design.
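To make the paradigm concrete, here's a minimal sketch of the general pattern A2UI formalizes: the model emits a declarative description of an interface as structured data, and a trusted client renders it into real components. The schema and renderer below are invented for illustration and are not the actual A2UI spec; they just show the shape of the idea.

```python
# Illustrative only: a made-up declarative UI payload, not the real A2UI schema.
# The pattern: the LLM returns structured data describing an interface,
# and a trusted renderer maps each node type to a real component.

ui_spec = {
    "type": "column",
    "children": [
        {"type": "heading", "text": "Order #1042"},
        {"type": "text", "text": "Arriving Thursday"},
        {"type": "button", "label": "Track package", "action": "track_order"},
    ],
}

def render(node: dict, indent: int = 0) -> str:
    """Walk the spec and emit HTML. A real client would build native widgets."""
    pad = "  " * indent
    kind = node["type"]
    if kind == "column":
        inner = "\n".join(render(child, indent + 1) for child in node["children"])
        return f"{pad}<div class='column'>\n{inner}\n{pad}</div>"
    if kind == "heading":
        return f"{pad}<h2>{node['text']}</h2>"
    if kind == "text":
        return f"{pad}<p>{node['text']}</p>"
    if kind == "button":
        # Actions are dispatched by name, so the model never ships executable code.
        return f"{pad}<button data-action='{node['action']}'>{node['label']}</button>"
    raise ValueError(f"unknown node type: {kind}")

print(render(ui_spec))
```

The design point worth noticing: the model produces data, not code, so the client stays in control of what can render and what each action is allowed to do.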
OpenAI is building a platform
OpenAI launched a new app store framework this month, complete with submission guidelines and a developer platform. What used to be called "connectors" are now just "apps," and a new app directory in ChatGPT makes them easier to find. This is the iOS playbook: build the dominant interface, then let third parties extend it while taking a cut of the value they create.
The positioning shift is subtle but significant. OpenAI isn't just selling access to models anymore. They're trying to become the operating system layer that sits between AI capabilities and end users. If developers build on ChatGPT rather than building their own interfaces, OpenAI captures the relationship with users even when they're using someone else's functionality.
Open and efficient models keep pace
Ai2 released Olmo 3.1, demonstrating that fully open source models can still improve quickly and stay competitive when you open the black box instead of hiding it. Z.ai's GLM-4.7 topped benchmarks for open-source coding systems and comes surprisingly close to Claude Sonnet 4.5 across reasoning and agentic tasks.
Meanwhile, the push for efficiency continues. Mistral's Magistral Small can do reasoning on a MacBook Pro with 32GB of RAM, Google's Gemini 3 Flash delivers speed without sacrificing capability, and Nvidia's Nemotron 3 family comes in three sizes for different deployment needs. These developments serve two overlapping audiences: enterprises looking to cut costs with efficient on-premises deployments, and developers who want capable models they can run locally. Not the flashiest category, but this is where practical adoption happens.
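For a sense of what running locally actually looks like, here's a minimal sketch using the llama-cpp-python bindings with a quantized checkpoint. The file name and settings are placeholders for illustration, not Mistral's recommended configuration.

```python
# A minimal local-inference sketch using llama-cpp-python
# (pip install llama-cpp-python). The model path is a placeholder:
# substitute whichever quantized GGUF build you download.
from llama_cpp import Llama

llm = Llama(
    model_path="./magistral-small-q4.gguf",  # hypothetical file name
    n_ctx=8192,        # context window; larger values use more RAM
    n_gpu_layers=-1,   # offload all layers to Metal/GPU where available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the tradeoffs of local inference."}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```

A 4-bit quantized model in this size class fits comfortably within 32GB of RAM, which is exactly the point of this category.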
We're learning how agents actually get used
Two research studies this month offered real data on agent usage patterns. Google and MIT found that using more agents on tasks doesn't always generate better results, though it always costs more. The takeaway: multi-agent systems work best when tasks are broken down cleanly into smaller subtasks. Otherwise, a single capable agent with tools performs better.
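The distinction maps onto a concrete architectural choice. Here's an illustrative sketch of the two shapes the study compares (my own pseudo-structure, not the researchers' code): a planner that fans subtasks out to worker calls, versus one agent that keeps the whole task plus a toolbox. The call_llm stub stands in for whatever model API you use.

```python
# Illustrative sketch of the two architectures; not the study's code.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def multi_agent(task: str) -> str:
    """Planner splits the task and fans subtasks out to worker calls.
    Wins when subtasks are truly independent; every extra call adds token cost."""
    subtasks = call_llm(f"Split into independent subtasks, one per line:\n{task}").splitlines()
    partials = [call_llm(f"Solve this subtask:\n{s}") for s in subtasks]
    return call_llm("Combine these partial results into one answer:\n" + "\n".join(partials))

def single_agent(task: str, tools: dict) -> str:
    """One agent keeps the whole task in context and calls tools inline.
    Often stronger when the task doesn't decompose cleanly."""
    transcript = task
    for _ in range(5):  # bounded tool-use loop
        reply = call_llm(transcript)
        if not reply.startswith("TOOL:"):
            return reply
        name, _, arg = reply.removeprefix("TOOL:").partition(" ")
        transcript += f"\n[{name} returned: {tools[name](arg)}]"
    return reply
```

The cost asymmetry is visible in the structure: the multi-agent path makes at least len(subtasks) + 2 model calls no matter what, while the single agent only pays for the tool calls it actually needs.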
Harvard and Perplexity published a separate study showing that agents are primarily being adopted by digitally savvy, higher-income knowledge workers who use them for practical task execution rather than casual conversation. Over half of all usage clusters around productivity and research workflows. The agent revolution, such as it is, is happening in offices rather than living rooms.
The Full Stream
Okay, now for the full rundown of releases across every category that mattered this month. I've reordered the categories alphabetically this time, in an effort to bring some order to the chaos.
Acquisitions
Cursor announced the acquisition of code review platform Graphite, with plans to integrate it into their AI-powered code editor.
Meta acquired Manus for over $2 billion, closing the deal in ten days. They plan to keep running Manus as a separate service while integrating it into Facebook, Instagram, and WhatsApp.
Nvidia signed a $20 billion "non-exclusive licensing agreement" with Groq and acquired key talent including founder Jonathan Ross and president Sunny Madra.
Whispr, makers of the Flow voice-to-text technology, acquired Yapify, the voice AI agent for email.
Agents
Google opened a waitlist for CC, an experimental AI assistant powered by Gemini that connects to Gmail, calendar, and files to send personalized morning summaries. Notably, it's only available on personal Google accounts for now.
Manus released version 1.6, which builds mobile apps from descriptions, edits images with point-and-click precision, and runs on a new Max agent that completes complex tasks in one shot.
Zoom launched AI Companion 3.0 with agentic workflows that join meetings to take notes and connect to Google, Microsoft, and Slack workspaces.
AI Usage Research
Google and MIT published a study finding that adding more agents doesn't always improve results, though it always costs more. Multi-agent systems work best when tasks decompose cleanly; otherwise a strong single agent with tools performs better.
Harvard and Perplexity published research showing that agents are primarily adopted by knowledge workers for productivity and research tasks rather than casual conversation.
Audio
Meta released SAM Audio, a model that can isolate specific sounds from audio or video files using text descriptions, visual clicks, or timeline selections.
xAI introduced the Grok Voice Agent API, allowing developers to build voice tech using the company's top-ranking speech-to-speech model.
Ecommerce
Klarna launched the Agentic Product Protocol as an open standard to make products more discoverable by AI agents.
OpenAI announced the app store for ChatGPT along with new submission guidelines and a developer platform.
Education
OpenAI launched a learning hub for journalists and publishers called the OpenAI Academy for News Organizations, with on-demand courses and guidance on responsible AI use.
Generative UI
Google released A2UI, a new open source project for building generative user interfaces powered by LLMs. They've also launched the official A2UI spec.
Images and Video
Alibaba unveiled Wan2.6, a multimodal model that generates up to 15 seconds of HD video with dialogue, storyboarding, and character reference capabilities.
Black Forest Labs released Flux.2[max], the top-performing model in the Flux.2 family, optimized for professional marketing and design with advanced character consistency, spatial reasoning, and text rendering.
Figma rolled out AI-powered image editing tools for erasing objects, isolating objects, and expanding images.
OpenAI released GPT Image 1.5, which is 4x faster and significantly better at following instructions. The new model achieved first place on both Artificial Analysis and LM Arena leaderboards for text-to-image and editing. They also finally fixed the awful yellowish tint.
Qwen released Qwen-Image-Edit-2511, an enhanced image editing model with notably better character consistency and text rendering, and Qwen-Image-2512, which offers enhanced human realism, finer detail, and better text rendering.
LLMs
Ai2 released Olmo 3.1, showing that fully open models can still improve quickly and stay competitive with proprietary systems when you open the black box instead of hiding it.
Google released Gemini 3 Flash, which is much faster than 3 Pro while retaining reasoning capabilities for when you need a good answer quickly.
Minimax released M2.1, an open weights model update focusing on usability across more programming languages and typical work scenarios.
Mistral released Magistral Small 2509, a small but powerful multimodal model with reasoning capabilities that can run on a MacBook Pro with 32GB of RAM.
Nvidia released the Nemotron 3 family of high-performance open source models in three sizes, with a unified architecture that combines reasoning and non-reasoning capabilities while supporting six languages.
OpenAI launched a new app store framework with submission guidelines and a developer platform for third-party apps. Connectors are now called apps, and a new directory makes them easier to discover.
Z.ai released GLM-4.7, a coding-focused model that tops benchmarks for open-source systems and approaches Claude Sonnet 4.5 and GPT 5.1 across agentic, reasoning, and coding tasks.
Specialized Tools
Mistral launched OCR 3, a multilingual document-reading model that converts handwritten notes, scanned forms, historical text, and complex tables into clean text, claiming the top spot across OCR benchmarks.
Translation
Google launched a beta for live speech-to-speech translation with headphones in the Google Translate app, now powered by Gemini.
UX Research
Keplar launched an AI moderator that gathers insights from natural conversations with real people.
Vibe Coding
Anthropic shipped several updates for Claude Code including syntax highlighting for diffs, prompt suggestions, a plugins marketplace, and shareable guest passes.
Google integrated the Opal app builder into the Gemini Gems Manager, allowing users to create reusable mini apps directly in Gemini.
OpenAI released GPT-5.2-Codex, claiming significant improvements over 5.1 in handling large codebases and in cybersecurity capabilities.
If you made it this far, congratulations! The last few weeks of 2025 certainly weren't quiet, and the coming year will likely be equally frenetic and full of surprises. As always, please reach out if you have questions or thoughts to share.
Cover image created with Midjourney.