On December 16, 2025, OpenAI released GPT Image 1.5 — a significant upgrade to its image generation model that arrived on an accelerated timeline. The original plan was an early 2026 release. Instead, OpenAI moved it forward by weeks. The reason surfaced in a leaked internal memo from Sam Altman that circulated the same month: a "code red" declaration after Google's Nano Banana Pro topped multiple leaderboard benchmarks and began driving millions of new users into the Gemini ecosystem.
The result is a model that OpenAI describes as its most capable image generation release — 4x faster, with architecture-level changes to how editing works, and available to all ChatGPT users from day one.
What GPT Image 1.5 Actually Changed
The headline numbers — 4x faster, 20% lower API cost — matter. But the most significant change in GPT Image 1.5 is architectural: it is built directly into the GPT-5 language model, rather than running as a separate diffusion system connected to a language model.
This matters because it changes what the model understands when you ask it to do something. A separate diffusion system processes your text prompt and generates an image from noise, with limited continuity between generation passes. GPT Image 1.5, operating within the same neural network that processes language, maintains context about what the image contains, what you asked to change, and what you asked to preserve.
The practical result is what OpenAI calls "surgical editing." Previous image models would reinterpret large portions of a scene when you asked for a small change. Ask to adjust a jacket color, and the face might shift, the background might change, the composition might drift. GPT Image 1.5 identifies which pixels should change and which should stay constant. It understands the difference between "change this" and "leave everything else alone."
This is not a minor convenience improvement. For any workflow that involves iterative refinement — product imagery, brand asset creation, any visual work where you build on a previous generation — the ability to make specific changes without destroying the rest of the image changes what is practical.
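The iterative workflow described above can be sketched with the OpenAI Python SDK. This is a sketch under assumptions, not a confirmed integration: the model identifier "gpt-image-1.5", the filename product.png, and the helper name edit_request are all illustrative, and whether the model is exposed through the images.edit endpoint should be checked against the current API reference.

```python
# Sketch of an iterative "surgical editing" loop with the OpenAI Python SDK.
# ASSUMPTION: the model id "gpt-image-1.5" and its availability on the
# images.edit endpoint are inferred from this article, not confirmed.
import os

def edit_request(prompt: str, model: str = "gpt-image-1.5",
                 size: str = "1024x1024") -> dict:
    """Build the keyword arguments for one edit pass.

    Each pass states one specific change; the model is expected to
    preserve everything the prompt does not mention.
    """
    return {"model": model, "prompt": prompt, "size": size}

# A chain of narrow instructions instead of one sweeping re-prompt:
steps = [
    "Change the jacket color to navy blue. Keep everything else identical.",
    "Replace the background with plain studio gray. Do not alter the subject.",
]
requests = [edit_request(p) for p in steps]

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI
    client = OpenAI()
    for kwargs in requests:
        with open("product.png", "rb") as f:  # hypothetical starting image
            result = client.images.edit(image=f, **kwargs)
        # In a real loop you would save the returned image and feed it
        # back in as the input for the next pass.
```

The design point is that each request carries one narrow instruction plus an explicit preservation clause, which is the prompting pattern the surgical-editing behavior rewards.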
The Competitive Context
The timing of GPT Image 1.5 is inseparable from what Google had shipped in the weeks before it.
Google released Nano Banana Pro on November 20, 2025 — built on Gemini 3 Pro, with reasoning-driven composition, Google Search grounding, up to 14 reference images, and multilingual text rendering that genuinely outperformed everything available at the time. The model immediately dominated LMArena blind evaluation rankings. Downloads of Gemini surged. Google reported that Nano Banana (the original, released August 2025) attracted 10 million new users to Gemini and generated over 200 million image edits within weeks of launch.
OpenAI's internal memo leaked the same month: "code red." The company was watching a competitor gain meaningful market share on image generation — a capability OpenAI had pioneered with DALL-E — and accelerated its response.
GPT Image 1.5 arrived December 16, less than four weeks after Nano Banana Pro. The positioning is direct: where Nano Banana Pro leads on photorealism and Google ecosystem integration, GPT Image 1.5 leads on instruction following and precise editing. Early LMArena benchmarks after release showed GPT Image 1.5 taking the top position for instruction adherence — "doing exactly what I asked" — while Nano Banana Pro retained its photorealism edge.
Text Rendering and Infographic Generation
Text rendering in images has been a documented weakness across AI image models for years. Models would generate shapes that resembled letters without producing actual readable text, particularly for small sizes, complex layouts, or any language beyond basic English.
GPT Image 1.5 is meaningfully better at this. The native multimodal architecture gives the model actual language understanding when rendering text within images — it knows what words say, not just what they look like. This produces readable labels, coherent infographic copy, legible headlines, and correctly spelled text across multiple languages.
OpenAI demonstrated this with an extreme test: rendering the full phrase "How much wood would a woodchuck chuck if a woodchuck could chuck wood" made entirely out of wood. The rendered text in the demonstration was legible and correctly spelled — a test that would have produced garbled results in most previous image models.
For creators building any content that combines text and visuals — marketing materials, infographics, poster design, product labeling, multilingual assets — this is a meaningful capability step.
What Changed in Practice
Generation speed: Up to 4x faster than GPT Image 1, with typical completion times of 15-45 seconds depending on complexity and quality setting, versus 1-3 minutes for the previous generation.
API pricing: 20% lower than GPT Image 1. Current rates: $0.01 (low quality), $0.04 (medium), $0.17 (high) per square image. Image input tokens also dropped 20%.
Quality tiers: Three settings — Low, Medium, High — trade speed for detail. Low quality is sufficient for testing prompt direction and rough drafts; high quality is for final production assets.
Surgical editing: The most important workflow change. Sequential edits preserve unchanged elements, so you can build an image through multiple specific instructions without composition drift.
Text rendering: Significantly improved across small sizes, dense layouts, and multi-language content.
Sizes: 1024x1024 (square), 1024x1536 (portrait), 1536x1024 (landscape).
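The pricing and size figures above can be combined into a small cost estimator. The per-image rates come directly from this article's numbers ($0.01 low, $0.04 medium, $0.17 high for a square image); applying the same rate to portrait and landscape sizes is a simplifying assumption, and the helper name estimate_cost is illustrative.

```python
# Minimal cost estimator for GPT Image 1.5 generations, using the square-image
# rates quoted in this article. ASSUMPTION: non-square sizes are billed at the
# same rate here for simplicity; check current API pricing for exact figures.

RATES = {"low": 0.01, "medium": 0.04, "high": 0.17}   # USD per image
SIZES = {"1024x1024", "1024x1536", "1536x1024"}       # supported dimensions

def estimate_cost(n_images: int, quality: str, size: str = "1024x1024") -> float:
    """Return the estimated USD cost for a batch of generations."""
    if quality not in RATES:
        raise ValueError(f"quality must be one of {sorted(RATES)}")
    if size not in SIZES:
        raise ValueError(f"size must be one of {sorted(SIZES)}")
    return round(n_images * RATES[quality], 2)

# Drafting at low quality, then rendering finals at high quality:
draft_cost = estimate_cost(20, "low")    # 20 * $0.01 = $0.20
final_cost = estimate_cost(3, "high")    #  3 * $0.17 = $0.51
```

This mirrors the tiered workflow the quality settings suggest: iterate cheaply at low quality, then spend the higher rate only on the final asset.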
GPT Image 1.5 on Cliprise
GPT Image 1.5 is available on Cliprise as part of the AI Image Generator feature alongside Nano Banana Pro, Nano Banana 2, Flux 2, Midjourney, and 45+ other image models.
For the full guide to GPT Image 1.5's capabilities, workflows, and where it fits in the Cliprise model lineup, see GPT Image 1.5: Complete Guide →
