Kimi K2.5
Can Kimi topple the closed source giants? Do skills files actually work?
tl;dr
- Can Kimi K2.5 beat GPT 5.2 and Opus 4.5?
- Do skills files actually help your agents?
- Google’s world model Genie 3 gets released to the public
Releases
Kimi K2.5
Moonshot AI has released Kimi K2.5, an updated version of its 1 trillion parameter open source model. Unlike its predecessor (and most Chinese models in general), this version is multimodal, meaning it supports both text and image inputs.
Kimi has been known for its interesting personality and writing style, which set it apart from other LLMs. That personality has been degraded a bit (it sometimes says “You’re absolutely right!”), but in exchange it is more expressive in agentic tasks, which shows in its position at the top of the Design Arena leaderboard.

For coding tasks it still lags behind Opus 4.5 and GPT 5.2, the two top tier models right now. From what I have seen, the same holds for most other tasks. On benchmarks it is in the top tier, but in the real world it sits one tier below, alongside models like GLM 4.7, Gemini 3 Flash, and Sonnet 4.5.
| Model | Input ($ per million tokens) | Output ($ per million tokens) | Speed (tokens per second) |
|---|---|---|---|
| Kimi K2.5 Thinking | $0.60 | $3.00 | 30 |
| Gemini 3 Flash | $0.50 | $3.00 | 75 |
| GLM 4.7 | $0.60 | $2.20 | 90 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 57 |
| GPT 5.2 | $1.75 | $14.00 | 34 |
| Claude Opus 4.5 | $5.00 | $25.00 | 64 |
GPT 5.2 and Opus 4.5 are both top models, but for different reasons. GPT 5.2 is cold and very literal, but thorough and extremely smart. Opus, on the other hand, understands user intent very well and is great to talk to, but makes more mistakes.
I feel like the comparison is very similar for Kimi K2.5 and Gemini 3 Flash. Kimi is the cheaper version of Opus and Gemini is the cheaper version of GPT 5.2.
For cheap coding, I think I will still turn to GLM 4.7, but for all other tasks Kimi beats it out, which means it’s a top 5 model in the world right now. I highly recommend checking it out if you haven’t already.
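To make the pricing table above more concrete, here is a small sketch that estimates what a single job would cost on each model. The prices are copied from the table; the workload size (2 million input tokens, 500 thousand output tokens) and the function name `job_cost` are made up for illustration, and real bills will vary with caching, discounts, and how many thinking tokens a model emits.

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES = {
    "Kimi K2.5 Thinking": (0.60, 3.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "GLM 4.7": (0.60, 2.20),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GPT 5.2": (1.75, 14.00),
    "Claude Opus 4.5": (5.00, 25.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one job from per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 2_000_000, 500_000):.2f}")
```

For this assumed workload, Kimi K2.5 works out to about $2.70 against roughly $22.50 for Opus 4.5, which is the gap that makes the "cheaper version of Opus" framing attractive.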
Research
Skills are not enough
If you have been using any agentic coding tool (Claude Code, Cursor, etc) you have probably heard of skills. Skills are markdown files that contain instructions for LLMs on how to do specific tasks or use certain libraries that the model may not have been trained on.
What Vercel found is that just because you have these skills doesn’t mean the models will use them. By default, most frameworks just tell the LLM that the skills exist, and it’s up to the LLM to actually read them when needed.

They found that the models will not call skills on their own unless specifically told to, and even when you tell them directly in your AGENTS.md they still won’t use them when needed.
To get agents to use skills properly when needed, they had to add this to their AGENTS.md file:
IMPORTANT: Prefer retrieval-led reasoning over pre-training-led reasoning for any {your skill content} tasks. {List of paths to skills files here}
This bypasses the skill loading and calling tools that frameworks provide and instead gives the model direct paths to the files, which it handles much more reliably.
This is most likely because models are far more used to reading files, which is a general coding task they do all the time, than to using the custom skill calling tools in their harnesses. It goes to show the importance of relying on things the model has already seen a lot of, rather than inventing a new abstraction and hoping the model uses it.
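As a rough illustration of the approach described above, the sketch below builds that AGENTS.md directive from a folder of skill files. It is not Vercel’s tooling: the function names, the assumption that skills are `.md` files under one directory, and the choice to list each path as a bullet are all hypothetical.

```python
from pathlib import Path


def find_skill_files(skills_dir: str) -> list[Path]:
    """Collect markdown files under skills_dir (assumed layout, not a standard)."""
    return sorted(Path(skills_dir).rglob("*.md"))


def build_skills_directive(topic: str, skill_paths: list[Path]) -> str:
    """Return an AGENTS.md snippet that points the model at skill files directly."""
    lines = [
        "IMPORTANT: Prefer retrieval-led reasoning over pre-training-led "
        f"reasoning for any {topic} tasks.",
    ]
    # Listing plain file paths lets the agent use ordinary file reads
    # instead of a harness-specific skill-calling tool.
    lines += [f"- {path.as_posix()}" for path in skill_paths]
    return "\n".join(lines)


# Example usage (hypothetical topic and directory):
# print(build_skills_directive("Next.js", find_skill_files("skills")))
```

You would paste the returned text into your AGENTS.md and regenerate it whenever skill files are added or renamed.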
Quick Hits
Genie 3 Public Release
Google’s world model Genie 3 has been released publicly.
An AI world model behaves like a video game engine that generates each frame on the fly based on your inputs, except that there is no actual engine, code, or other additional state behind it: it is purely an AI model. You can give it a starting frame, or just a text description of the world you want, and then interact with that world while the model generates it in real time.
Note: to access the model you will need Google’s AI Ultra subscription, which is $125 a month for the first 3 months and then $250 a month after that.
Finish
I hope you enjoyed the news this week. If you want to get the news every week, be sure to join our mailing list below.
Stay Updated
Subscribe to get the latest AI news in your inbox every week!