News

Google’s decay

Google’s AI team has been struggling lately. Its Gemini 3.1 Pro and Flash 3.5 models have failed to impress, taking the company out of the conversation as a frontier lab. The Chinese AI labs have orders of magnitude less compute, yet their models are competitive with, if not better than, the Gemini models. Things have not been looking good for Google, which seems unable to figure out the post-training of its models.

Google has even lost its spot as the multimodal king: OpenAI has been focusing more on these capabilities and has surpassed Gemini.

Because of this poor performance, Google is starting to lose its superstar researchers. The first to leave this week was Noam Shazeer.

Even if you don’t know Noam Shazeer, you know his work. He is arguably the most important figure in modern AI: he is credited with coming up with the modern attention mechanism used in transformers (the main breakthrough of the Attention Is All You Need paper) and is one of the creators of sparse mixture-of-experts models, which have allowed us to efficiently scale LLMs to their current size.

To call him the Kobe or Messi of AI undersells him. He is more like AI Jesus: without him, the field would not be where it is today.

He originally left Google back in 2022 to found a startup called Character.ai, but was acquihired back to Google in 2024 for 2.7 billion dollars.

Just 18 months later, he is jumping ship again, this time heading over to OpenAI, for what I can only assume is a ridiculous amount of money.

The other major departure this week was John Jumper, the project lead for the Nobel Prize-winning AlphaFold team, who announced he will be going to Anthropic.

AlphaFold will probably go down as Demis Hassabis’ and DeepMind’s greatest achievement (there’s a reason they won a Nobel Prize for it), so losing the person who led that project is another large blow to the organization.

Given these researchers’ achievements, I can imagine that Google offered them the world, but because of the direction or culture of the Google DeepMind org, they decided to jump ship. It’s also interesting that they went to different competitors. This shows that the best in the field consider both Anthropic and OpenAI to be top labs, and that which one is best depends on your own views; there is no definitive best that everyone is going to.

As for Google, I don’t see them coming back from this. Their models have been getting worse relative to the competition (and they know it), and losing two of their most well-regarded researchers will cause a further exodus, similar to Meta after the Llama 4 release, where they reportedly lost 80% of their AI team due to how poorly their model performed.

This is despite Google having the most compute of any of the major AI labs, which shows that your compute is only as good as the people that are using it, and that scale is not all that you need.

Releases

GLM 5.2

As one giant dies, another emerges: Z.ai has released GLM 5.2, which is competitive with GPT 5.5 and Opus 4.8.

GLM 5.2 benchmark scores

On the high-signal DeepSWE benchmark, where Chinese models have struggled, we see a major improvement.

Chinese models also tend to struggle on newer benchmarks, because the models are not very general and the labs have not had time to overfit on those benchmarks. GLM’s performance on benchmarks released this week tells a different story.

On shape-rotator bench, GLM 5.2 outperforms Opus 4.8. On Artificial Analysis’s new agentic long-context knowledge work benchmark, GLM 5.2 is the third-best model, falling behind only Fable 5 and Opus 4.8 and beating GPT 5.5. On KernelBench-Mega and hard, it gets better speedups than GPT 5.5 and every other open-source model, losing only to Opus 4.8.

We also see strong performance on some of the less concrete, soft-skills benchmarks. On Design Arena, GLM 5.2 takes first place, beating out Fable 5. On EQ bench (creative writing and emotional intelligence), it beats all other open-source models, including Kimi, which has historically been a very strong writing model.

It also passes the real-world vibe check. I have seen many people compare the model to GPT 5.5 and Opus 4.8 in terms of quality. They say it is no longer just a model to fall back on when you hit rate limits with Claude or GPT, but one you can daily drive with little intelligence penalty.

It does this while being 1/3 the price of GPT 5.5 and 1/5 the price of Opus, according to Artificial Analysis.

This is very exciting to see, and it is probably the smallest gap we have had between open- and closed-source models since DeepSeek R1. I expect the other Chinese labs to catch up as well in the coming months, which will only heat up the frontier LLM battle even more.

Finish

I hope you enjoyed the news this week. If you want to get the news every week, be sure to join our mailing list below.

Vibes of the week

From Akiitopiia on Twitter

GLM 5.2
