News Drop #17 - April 17, 2025
GPT-4.1 just dropped!
Limited info, so here's what I know:
- multimodal
- excels at coding
- 1M-token context
- possibly the cloaked models we saw earlier in the week (huge if true)
Pricing is competitive with other models in its class.
Claude has a competitor to Deep Research
It's called "Research"
Currently only available on the Max plan, because of course it is
o3 & o4-mini!
Here's what's known:
- o3 scored 20.3%, a new record on Humanity's Last Exam (kaileh.dev/hle)
- o3 & o4-mini are trading blows with each other and Gemini 2.5 Pro across the benchmark suite
- Google's models are no longer the only ones competent at math, and they actually lose to the OpenAI ones sometimes
- My best guess is that the released o3 is o3-medium, as there's no way they got the cost of the o3-high run from the ARC Prize down this low this quickly
- They also released Codex CLI, which is basically Claude Code but for OpenAI's models
Hey look, it's Gemini 2.5 Flash!
- Hybrid model: it can reason first or answer directly
- Overall a very solid uplift over 2.0 Flash. Keep in mind this is a model that responds nearly instantly