Another day, another model improvement. The latest visual model by OpenAI is now the gold standard for creating realistic images, beating Google’s Nano Banana (August 2025).

In an earlier post I prompted Nano Banana for a “911 in Hoogeveen” (left); today’s ChatGPT result is on the right. Nano Banana figured out that Hoogeveen is a town in the Netherlands and created a historical Dutch town as the backdrop. ChatGPT got the actual details of the town right (I recognize them very well), but created its own mashup version of the city.

Text rendering is now great. Look at the traffic sign: correct spelling, and places relevant to the town. The model is also incredibly good at making slides in a consistent, on-brand format. Below is the result of a request to transform a slide into a 1960s Swiss graphic design style. The catch: you get pixels, not a file you can edit…

It achieves these results not just by being a better pixel-generation model. The response to a prompt now involves reasoning about the request, sketching a few rough options, ‘seeing’ (an LLM cannot see) the intermediate results, picking the best one, and then producing the final result in pixels.
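For readers who think in code, the loop described above can be sketched roughly as follows. This is a toy illustration only, not OpenAI's implementation: every name in it (`propose_ideas`, `sketch`, `generate`, `prefers_street_level`) is hypothetical, and strings stand in for the actual images.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    idea: str          # text stand-in for a rough image
    score: float = 0.0


def propose_ideas(prompt: str) -> list[str]:
    # Hypothetical "reasoning" step: expand one prompt into a few options.
    angles = ["wide shot", "close-up", "street level"]
    return [f"{prompt}, {angle}" for angle in angles]


def sketch(idea: str) -> Draft:
    # Hypothetical rough draft; a real system would produce pixels here.
    return Draft(idea=idea)


def generate(prompt: str, critique: Callable[[Draft], float]) -> str:
    # 1. Sketch a few rough options.
    drafts = [sketch(idea) for idea in propose_ideas(prompt)]
    # 2. "Look" at each intermediate result and score it.
    for draft in drafts:
        draft.score = critique(draft)
    # 3. Pick the best one and produce the final result.
    best = max(drafts, key=lambda d: d.score)
    return f"final render of: {best.idea}"


def prefers_street_level(draft: Draft) -> float:
    # Toy critic (an assumption for the demo): favors street-level views.
    return 1.0 if "street level" in draft.idea else 0.0


print(generate("911 in Hoogeveen", prefers_street_level))
# prints: final render of: 911 in Hoogeveen, street level
```

The point of the sketch is the shape of the process: generate several candidates, evaluate them, and only then commit to a final render. That is what distinguishes it from a single one-shot generation.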

ChatGPT Images 2 is now the default model in ChatGPT; it will be used whenever you ask ChatGPT to create an image. Set the model effort to ‘thinking’ to add more reasoning effort to the processing.

To be continued.
