Read in

News

OpenAI enters the browser game

Having agents control your browser for you has been big recently, with products like Browserbase and Perpexity Comet.

OpenAI has decided to dip their toes in the space as well, releasing their own web browser, ChatGPT Atlas.

Atlas operates just like any normal web browser would, except you have a chat sidebar where you can ask ChatGPT to do tasks for you. One of the big selling points is that it keeps track of your browsing history and habits, and is able to build a profile around you to continually improve the more that you use it.

OpenAI also says that they have done extensive red teaming to prevent it from following malicious “hidden” AI instructions on a page. It still is vulnerable to other attacks like clipboard injection since it can’t see the Javascript of the site that is being used.

In terms of quality, it is nothing that we haven’t seen before. It is good at “boring”, well defined, repetitive tasks and struggles in situations where it’s not immediately obvious what it needs to do or if the task requires any aesthetic taste.

Releases

OpenRouter Exacto

Previously Kimi had uncovered that many of the people hosting their open source Kimi K2 model did not have the same quality as their own “correct” implementation.

This lead to OpenRouter (an inference provider aggregator) to dig into this more, and for many of the major models, they have identified which of their providers are the best.

providers offered

They bundle the best inference providers into a group called the exacto providers. You can use the exacto providers by adding the :exacto keyword to the model name when using a supported model on OpenRouter.

Benchmarks

Performance increase by using only exacto providers on OpenRouter

Everyone releases an OCR model

All of the cool kids this week decided to release an open source OCR model. The types of models fall into 2 distinct categories: interesting, and good. We will start with the interesting ones first.

On the same day, both DeepSeek and Z.ai, two of the top labs in China, released OCR models that operate fully in pixel space bypassing the need to convert to tokens. By doing so, they are able to use 3x less input tokens to process the documents.

These models are both very strong, and would be state of the art if it weren’t for the other models also released this week. Architecturally, I think we will see most models going forward adopt a similar architecture to these two, since it is so much more efficient, and it does not cause any real hit to performance. It is still to be seen if we can adopt this to more general LLMs as well in the future.

On the good side of things, we have 3 new models that all exceed the previous state of the art level.

The first is Paddle OCR from the Chinese Paddle Paddle team. It was state of the art for a few hours, until Chandra OCR was released.

Chandra OCR is from datalab. The model was previously closed source, its release this week is just the open sourcing of it.

The final model is OlmOCR 2 from AllenAI.

Model scores

Scores from all the models mentioned — from AllenAI

If you are looking to use the models, Chandra OCR looks like the best based on scores, but it doesn’t tell the whole story. OlmOCR has comparable scores, and is made to run much faster. This can be seen by the pricing on the companies site for their hosted versions.

Chandra OCR is 10x more expensive per page than OlmOCR 2 ($2 vs $0.20 per thousand pages). So if you have a large number of documents, I would suggest OlmOCR 2, but if you need the very highest quality and don’t care about how much it costs, then use Chandra OCR.

All of these models are open source as well, so you can run them at home as well.

Quick hits

Claude Code comes to the browser

Similar to OpenAI’s Codex, which has both a web and terminal interface, Claude Code now has the same as well.

Finish

I hope you enjoyed the news this week. If you want to get the news every week, be sure to join our mailing list below.

Color video of a Tokamok reactor operating — from Tokamak Energy on Twitter

OpenAI is a browser company

News

OpenAI enters the browser game

Releases

OpenRouter Exacto

Everyone releases an OCR model

Quick hits

Claude Code comes to the browser

Finish

News

OpenAI enters the browser game

Releases

OpenRouter Exacto

Everyone releases an OCR model

Quick hits

Claude Code comes to the browser

Finish

Notícias

OpenAI entra no jogo dos navegadores

Lançamentos

OpenRouter Exacto

Todo mundo lança um modelo de OCR

Destaques rápidos

Claude Code chega ao navegador

Fim

resumen

Noticias

OpenAI entra en el juego de los navegadores

Lanzamientos

OpenRouter Exacto

Todo el mundo lanza un modelo de OCR

Notas rápidas

Claude Code llega al navegador

Final

Stay Updated