News Drop #18 - May 29, 2025
It's been an eventful few weeks! I didn't get one of these out last week, so you all get to enjoy a double-length news drop!
Google I/O's been dominating the headlines, but other labs refuse to be outdone:
Starting off with the long-anticipated Google I/O developer conference:
There's a ton of new announcements across the board, but I'll try to cover them all!
Gemini 2.5:
A new version of 2.5 Flash is out, and it's even better than before! Google now has strong offerings at both the top and the mid range of performance, with 2.5 Flash offering serious value for its low cost. Both models respond extremely fast, can run with reasoning switched on or off (hybrid reasoning), and are pretty cheap! 2.5 Pro Deep Think was also announced, and it's even better than standard 2.5 Pro, at the cost of taking forever to respond because it explores multiple approaches to a problem before answering, so it's restricted to the new AI Ultra plan.
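If you're curious what that hybrid-reasoning toggle looks like in practice, here's a rough sketch using Google's google-genai Python SDK: setting the thinking budget to 0 skips reasoning for a fast, cheap answer, while a bigger budget lets the model think first. The exact model ID is my assumption, so double-check it against the current docs.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Ask 2.5 Flash with thinking turned off; a higher thinking_budget lets it reason first.
response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed alias -- check the docs for the current model name
    contents="Give me a one-line summary of Google I/O 2025.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)  # 0 = no thinking
    ),
)
print(response.text)
```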
Imagen 4 and Veo 3:
Image and video generation models are getting major upgrades. Imagen 4 is fast and pretty impressive, turning out images in just a few seconds. Veo 3 is class-leading video generation and now includes sound; its videos often don't even look AI-generated at first glance.
Gemma 3n:
- A new open-weight model from Google
- Performs almost as well as Claude 3.7 Sonnet
- Class-leading among open-weight models
- Anyone can download it and run it locally (see the quick sketch below)
- Extremely impressive and cheap
Alongside this were some other cool Gemma-powered models, such as SignGemma for understanding sign language, MedGemma for medical text comprehension, and DolphinGemma, because Google wants to talk to dolphins now, I guess...
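If you want to try Gemma 3n yourself, here's a rough sketch of running it locally with Hugging Face Transformers. The exact model ID is my assumption (and you may need to accept Google's license on Hugging Face first), so treat this as a starting point rather than gospel.

```python
from transformers import pipeline

# Load an (assumed) Gemma 3n checkpoint; device_map="auto" uses a GPU if you have one.
chat = pipeline(
    "text-generation",
    model="google/gemma-3n-E2B-it",  # assumed model ID -- check the official release
    device_map="auto",
)

messages = [{"role": "user", "content": "In one sentence, what does 'open weight' mean?"}]
out = chat(messages, max_new_tokens=64)

# With chat-style input, the pipeline returns the whole conversation; the last turn is the reply.
print(out[0]["generated_text"][-1]["content"])
```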
AI in search:
Google's main product is still a search engine, but they've been expanding what it can do in recent months!
- AI Mode is now rolling out to everyone, giving you a chat-style interface for more complex questions
- AI Mode can break a complex topic into a series of sub-questions, research each one, and then formulate an answer
- Deep Search is in preview; it takes longer but performs even more intensive research
- Agentic capabilities are coming via Project Mariner, which will be able to automatically perform tasks like booking tickets and making reservations
- AI Mode can also create graphics and visuals to help you understand complex topics
Other Google stuff:
Jules, the AI coding agent:
- Automatically performs tasks via GitHub integration
- Spends up to about an hour working on things, then creates a new GitHub branch with the changes
- Acts basically like a junior developer
Flow:
- AI filmmaking tool using Veo 3
- Allows specific control over characters, scenes, and styles
- Can be used to create entire films
There's also AI-powered shopping with automatic agentic checkout, plus a feature that lets you virtually try on clothes from just a picture of yourself.
Google AI Ultra:
- A new top-tier plan, along the lines of Anthropic's Max or OpenAI's Pro
- Costs $250 per month
- Includes the highest usage limits across Google's AI tools, early access to new features, and 30 TB of Google Drive storage, as well as all the regular Google One benefits
SynthID:
- A public detector is rolling out soon
- Detects content generated by Google's AI models via an invisible watermark, including text
Next up, Claude 4:
Claude 4 Sonnet and Opus are here! Sonnet is a particularly effective coding model, whereas Opus excels at more complex tasks that can take hours to complete. Anthropic is now back on top in the software engineering world; however, in real-world usage Gemini often performs tasks just as effectively while being much faster and cheaper. Anthropic continues to do lots of interesting safety research, which seems increasingly important these days: these models are capable enough that Anthropic has activated its stricter ASL-3 protections for Opus 4. You can read their safety report here: https://www.anthropic.com/news/activating-asl3-protections
My take: the vibes are strong with this one
Claude also now has a live voice mode, similar to what ChatGPT and Gemini Live offer.
DeepSeek has a new version of R1:
A new version of the groundbreaking open-weight reasoning model has been released, and it's state of the art by a huge margin when compared to other open-weight models. As a reminder, open weight means that anyone (including us) can download and run it locally without an internet connection for basically free.
Compared to closed models like Gemini 2.5 Pro or the full o3, it does fairly well, trading blows with both of them.
The open-weight community is keeping pace with the top of the pack right now, and China continues to innovate!
Finally, as I'm running out of time (and will to write), I'm going to rapid-fire through some less significant, but still cool, stories:
Chatterbox TTS is a new open-source text-to-speech model that beats the very best from ElevenLabs. You can clone someone's voice from just 5 seconds of audio, and it's REALLY impressive, especially because anyone can build it into their own projects.
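If you want to play with it, here's a rough sketch of cloning a voice from a short reference clip, based on my reading of the project's example usage; the exact API names are assumptions worth checking against the Chatterbox repo.

```python
import torchaudio

from chatterbox.tts import ChatterboxTTS  # assumed import path -- verify against the repo

# Load the pretrained model (assumed API); use device="cpu" if you don't have a GPU.
model = ChatterboxTTS.from_pretrained(device="cuda")

# Clone a voice from ~5 seconds of clean reference audio (hypothetical file path).
wav = model.generate(
    "Hey everyone, welcome back to the news drop!",
    audio_prompt_path="reference_voice.wav",
)
torchaudio.save("cloned_output.wav", wav, model.sr)
```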
Huawei's been making progress in the chip world and now has an H200 competitor. They're still quite a bit behind Nvidia, and nobody is really competing with Google's TPUs, but China is catching up in an effort to get around the bans and export controls.
GitHub Copilot is going open source: Microsoft announced it will open-source the Copilot Chat extension in VS Code, which is awesome because it lets basically anyone build their own take on a coding agent.
There are also several interesting fine-tunes of models aimed at coding, such as r1-coder and Mistral's Devstral (very good at coding for an open-source model, and very strong compared to other cheap models). More info here: https://mistral.ai/news/devstral
And xAI paid Telegram 300 million USD to put Grok into the app.
That's all for this week, and that's a wrap on the year! Don't panic though, I will continue to post major AI updates right here all summer, and I'm even considering starting a newsletter that you should all sign up for.
We're gonna cook up some crazy stuff this summer, and I hope you're all as excited as we are to hit the ground running with some groundbreaking tech next year!
This is Kellen signing off for the summer, it's been great, see you all next year!!