Deep Currents 05.05.2025

Welcome to the third instalment of Deep Currents, a monthly curated digest of breakthroughs, product updates, and helpful articles from the rapidly evolving world of generative AI.
These are the things that stood out to me as impactful, as a design director in IT trying to stay on top of the field. Hopefully this post will help you keep your head above water too.
Okay, let's dive into this month’s currents…
Frontier models + LLMs
Another month, another round of updates from the top AI model makers…
- OpenAI recently expanded ChatGPT’s memory capabilities, enabling it to reference your chat history in order to provide more personalized responses. It’s available to Pro and Plus accounts outside the EU, and it’s optional: to turn it off (or on), go to the Personalization settings and look for the “Reference chat history” toggle.
- OpenAI also rolled out o3 (the full version) last month, replacing o1 as their top reasoning model, alongside the new o4-mini and o4-mini-high models, which are tuned more for coding and visual reasoning.
- Google Gemini’s Deep Research tool is now powered by Gemini 2.5 Pro Experimental for Gemini Advanced users, making it a far more capable research agent that compares favorably to OpenAI’s version.
- Claude Research is Anthropic’s new Deep Research-style feature for Claude. They also rolled out integrations that let Claude reference your email and calendar, plus a more expensive subscription plan for high-use professionals called Max, which is available at two price points depending on how much Claude time you need.
- Meta released Llama 4, the next generation of its open-weight, natively multimodal family of language models. It caught some flak amid accusations that Meta had tried to game the model-ranking leaderboards. Meta also launched a standalone AI app that tracks everything it possibly can about you to feed its advertising algorithms.
Design + Images
The past month was a busy one for the labs making image-generation tools, with a number of big releases and updates.
- The biggest update by far was the long-awaited Midjourney v7 launch:
- Midjourney v7 was initially rolled out as an “alpha” back in early April, with updates coming almost weekly since then as the developers continue to add features and make improvements.
- Big new features include: a speedy draft mode for quick concepting; a conversational mode that lets you have an ongoing chat with the model to refine a direction; and most recently an all-new “omni-reference” feature that allows you to upload or select an image you want included in your generated images. This last item is a powerful upgrade from the old character-reference feature, as it can be applied to virtually anything, including objects or even logos, in addition to people and characters.
- Beyond all these big new features, Midjourney has also raised the bar on aesthetic quality, long a priority for the company. Just last week they added two more “experimental” aesthetic parameters (--exp and --q 4) that enhance both the richness and detail of images.
In other image-generating news…
- Ideogram upgraded their 3.0 model, with improved prompt adherence and better text accuracy. Additionally, they made their model available via API (in beta) for use in other tools like ComfyUI or for developers to integrate into their own tools.
- Leonardo added Flux Element Training, which lets users consistently apply their own style, or incorporate a person or product, across generations using the Flux model.
- Canva, who acquired LeonardoAI last summer, has gone all-in on AI with Visual Suite 2.0 and Canva AI.
- Gemini Advanced now lets you edit images much like ChatGPT does, including both Gemini-generated and user-uploaded images.
- And for everyone who’s been using ChatGPT to generate Studio Ghibli and action-figure meme pics, you’ll be relieved to know that OpenAI just added a new library tab that gathers all your image creations in one place, so you no longer have to sort through past conversation threads. And like Ideogram, OpenAI has made its image model available via API too.
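For developers curious what that API access looks like, here is a minimal Python sketch using OpenAI’s Images API. It assumes the `openai` package is installed and an `OPENAI_API_KEY` is set in your environment; the `gpt-image-1` model name and base64 response shape reflect the public API docs, but treat the details as a starting point rather than gospel.

```python
# Minimal sketch: generating an image via OpenAI's Images API.
# Assumes: `pip install openai` and OPENAI_API_KEY in the environment.
import base64


def build_request(prompt: str, size: str = "1024x1024") -> dict:
    """Assemble the parameters for an image-generation call."""
    return {"model": "gpt-image-1", "prompt": prompt, "size": size}


def generate_image(prompt: str, outfile: str = "out.png") -> None:
    """Call the API and save the returned base64 image to disk."""
    from openai import OpenAI  # imported here so the sketch loads without the SDK
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(**build_request(prompt))
    image_bytes = base64.b64decode(result.data[0].b64_json)
    with open(outfile, "wb") as f:
        f.write(image_bytes)
```

Usage would be as simple as `generate_image("a watercolor lighthouse at dawn")`, which writes the result to `out.png`.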
Motion + Video
The AI video world has been heating up lately as well, with all the big players continuing to push the technology forward.
- Runway introduced Gen-4 Turbo, a faster version of Gen-4 Alpha that can generate 10-second clips in just 30 seconds. They also launched a powerful new feature called Gen-4 References, which allows you to generate consistent characters, locations, and lighting based on photos, generated images, 3D models, or even selfies. This is a game-changer for AI movie-making.
- Leonardo launched Motion 2.0, which lets users generate video from a text prompt or from any of their generated images, right from within the new AI Creation interface.
- Google released Veo 2, its state-of-the-art video generation model, in the Gemini app for Advanced plan users, as well as in Whisk and AI Studio.
- A small AI video lab called Sand AI is taking a completely different approach with their new Magi video model that uses an autoregression technique. Instead of generating an entire video upfront, it goes frame by frame, taking the previous shot into account before creating the next one. This technique helps it achieve better character and style consistency.
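To make the autoregressive idea concrete, here is a toy Python sketch (not Sand AI’s actual code; every name in it is illustrative). The point is simply that each new frame is produced as a function of the frames before it, which is how this approach maintains character and style continuity.

```python
# Toy illustration of autoregressive video generation: each frame is
# conditioned on the history of frames generated so far, rather than
# denoising the whole clip in one pass.
from typing import Callable, List

Frame = List[int]  # stand-in for a real image tensor


def generate_video(first_frame: Frame,
                   next_frame: Callable[[List[Frame]], Frame],
                   num_frames: int) -> List[Frame]:
    """Roll out a clip frame by frame; `next_frame` sees all prior frames."""
    frames = [first_frame]
    for _ in range(num_frames - 1):
        frames.append(next_frame(frames))  # condition on history
    return frames


# Dummy "model": each new frame nudges every pixel of the previous frame,
# showing how continuity falls out of conditioning on what came before.
clip = generate_video([0, 0], lambda hist: [p + 1 for p in hist[-1]], 4)
# clip is [[0, 0], [1, 1], [2, 2], [3, 3]]
```

In a real system the `next_frame` callable would be a learned model and the frames would be tensors, but the control flow is the same.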
Music
AI-generated music is having a moment, with many notable artists protesting while others jump onboard. Until the courts get around to dealing with the thorny issues of copyright, consent, and eventually licensing, the industry seems content to keep making improvements.
- Suno, the original game-changing AI music generation company, just launched its v4.5 model, which claims to generate more expressive music, offer greater variety and accuracy across genres, and produce richer vocals. They also promise better prompt adherence and faster generation speeds with the new model.
- Udio recently launched a faster model called 1.5 Allegro, along with a new feature called Styles that lets you upload an audio sample or snippet of a song and generates a track based on it. Styles is currently available to Pro users only, but all users get the new Allegro model.
- Speaking of styles, musician Imogen Heap has partnered with Jen to offer users preset filters based on her music style. Unlike Suno and Udio, Jen’s default audio model is trained on 100% licensed and copyright-free music, and their StyleFilters feature allows musicians to license their sound and earn royalties, offering a novel business model that could catch on.
Voice
As voice technology continues to permeate everyday products, new models are coming out every month. Here are the latest releases:
- Amazon unveiled Nova Sonic, a foundation model for voice-based apps, claiming it to be more accurate at transcription than GPT-4o and particularly good at handling background noise and multiple speakers.
- A Korean startup called Nari Labs released Dia, an open-source text-to-speech model that claims to exceed the capabilities of leading commercial offerings like ElevenLabs and Sesame. Remarkably, it was developed by two undergraduate techies with no outside funding. It’s not available yet but you can join the waitlist or download the research model from GitHub.
- And in case you needed one more AI-powered voice option to consider, Rime Labs is a startup looking to give voice to your content. They claim to have created “the most realistic spoken language model you’ve ever heard” with their new Arcana model.
Website + App Builders
Following up from my recent blog post about vibe coding, the world of AI-powered website and app builders keeps expanding.
- WordPress launched an AI Website Builder that designs complete sites with images and text from a simple prompt.
- Google launched Firebase Studio which uses Gemini models to prototype, build, and deploy full-stack applications.
- Lovable 2.0 launched. It’s prettier and more expensive now.
- Cognition Labs, maker of the original AI agent-based coding tool Devin, has released a 2.0 version. It’s cheaper now, starting at just US$20/month.
- Create is yet another vibe-coding tool but this one offers non-coders a way to build apps and launch them directly on the Android and Apple app stores with the push of a button.
Interesting Articles
These were the most interesting articles I read over the past month:
- AI2027 is a thought-provoking essay that predicts the impact of superhuman AI over the next decade will exceed that of the Industrial Revolution.
- Stanford’s 2025 AI Index Report came out. While the full report offers comprehensive analysis, even reviewing the executive summary provides valuable insights into current AI trends.
- Anthropic took a peek under the hood to see how developers are using Claude for writing code. This seemed particularly timely given the previous two articles, as AI threatens to upend the IT industry.
- If people become more efficient and can do their jobs more quickly, what does that mean for businesses that charge by the hour and people who are paid to work eight hours a day? There are some good insights in this article about the end of time-based work.
Politics
And on April 23rd, U.S. President Donald Trump signed an AI-related executive order called “Advancing Artificial Intelligence Education for American Youth,” which mandates integration of AI skills into the nation’s K-12 curriculum over the next 120 days. What could go wrong?
That’s all for this month! Let me know what’s resonated with you lately, either in the comments, or send me an email. Thanks!
Cover image created with Midjourney v7.