GENERATIVE · IMAGE · OPENAI

GPT Image 1.5.

Photorealism with reasoning. Frame-perfect.

OpenAI's flagship image model — deep world knowledge, exceptional prompt adherence on complex multi-subject scenes, and the kind of portrait realism that makes you double-take. The reach-for tool when the photograph has to feel real.

AVG LATENCY · ~5s
STARTING AT · $0.04 / IMG
TRY IT NOW
1969 Bethel NY photoreal crowd scene at golden hour, 100,000 people on a hillside.
Cinematic 1969 Bethel NY photoreal crowd scene at golden hour, hand-held 35mm Kodachrome look, layered atmospheric depth.
Studio product photo of luxury Italian leather oxford shoes on a polished black marble plinth.
Studio product photo of luxury hand-stitched Italian leather oxfords on polished black marble, single key light, dramatic chiaroscuro.
Environmental editorial portrait of a steelworker in fluorescent vest inside a working steel mill.
Environmental editorial portrait of a steelworker in a fluorescent yellow vest inside a working steel mill, sparks falling in slow-motion behind, anamorphic 2.39:1.
Editorial close-up portrait of a weathered Greek fisherman with deep wrinkles and salt-grey beard.
Editorial close-up portrait of a weathered Greek fisherman, age seventy, salt-grey beard, oatmeal cable-knit sweater, fishing nets over his shoulder, 85mm f/1.2.
Generated by GPT Image 1.5 · OpenAI

Where GPT Image 1.5 shines.

Photoreal portraits

Editorial-grade portrait generation with realistic skin micro-detail, accurate eye reflections, and natural lighting falloff.

EXAMPLE · Editorial close-up portrait of a weathered Greek fisherman, salt-grey beard, cable-knit sweater, 85mm f/1.2.

Complex multi-subject compositions

Crowded narrative scenes where prompt adherence matters more than per-pixel craft. Holds layout intent across 8+ elements.

EXAMPLE · 1969 Bethel NY photoreal crowd scene, 100,000 people on a hillside, hand-held 35mm grain.

Product staging

Product hero shots with coherent reflections, contact shadows, and surface chemistry. Can replace up to 70% of catalog photography.

EXAMPLE · Luxury Italian leather oxfords on a marble plinth in a modernist gallery, soft skylight, single key light.

Editorial photography

Newsroom-grade photo composition for stories that don't need real photographers. Lighting, blocking, and emotional read are all dialed in.

EXAMPLE · Editorial portrait of an industrial worker in a steel mill, sparks falling, fluorescent green safety vest, cinematic shadows.

Text-on-image marketing

A strong second to Ideogram on legible typography. Use it when you need both a photoreal background and clean text.

EXAMPLE · A magazine cover for 'FIELD' featuring a single ripe tomato on a marble counter, oversized condensed serif at the top.

Scientific diagrams

Lab-grade illustrations and labeled cross-sections. Prompt adherence makes labels and structure reliable in a single pass.

EXAMPLE · Cross-section of a beating human heart, anatomically labeled, isometric perspective, blueprint linework on cream paper.

Generated with GPT Image 1.5.

A live cross-section of the model's range — portraits, products, typography, illustration, fashion, cinematic. Hover any tile to pause and read its prompt.

Editorial portrait of a Mongolian eagle hunter in winter furs at dawn with golden eagle on his arm.
Editorial close-up portrait of a Mongolian eagle hunter in deep winter furs at dawn, golden eagle on his arm, snow-dusted Altai mountains, 85mm anamorphic.
Cinematic Times Square at night with neon billboards reflected in wet pavement.
Cinematic Times Square at night: dense crowd, neon billboards reflected in wet pavement, anamorphic 2.39:1, 35mm film grain.
Photoreal close-up of an iridescent dragonfly resting on a dewdrop on a fern leaf.
Photoreal close-up of an iridescent dragonfly resting on a single dewdrop on a fern leaf, ultra-shallow depth of field, prismatic wing.
Cinematic NYC subway car interior at midnight with a single passenger reading.
Cinematic NYC subway car interior at midnight, a single passenger reading a newspaper, fluorescent overhead light, deep cinematic shadows.
Editorial portrait of an elderly Japanese tea master cradling a delicate matcha chawan.
Editorial portrait of an elderly Japanese tea master cradling a delicate matcha chawan, soft north window light, deep blacks, 85mm f/1.4.
Studio product photo of a vintage Leica M3 camera on a weathered oak desk.
Studio product photo of a vintage Leica M3 on a weathered oak desk, single warm key light, ultra-detailed brass and chrome detailing.
Cinematic wide of a Patagonian glacier calving into a deep teal lake at golden hour.
Cinematic wide of a Patagonian glacier calving into a deep teal lake at golden hour, mist and ice spray frozen in motion, anamorphic 2.39:1.
Editorial portrait of a Tuareg desert nomad in indigo turban, golden-hour rim-light.
Editorial portrait of a Tuareg desert nomad in indigo turban, weathered face, golden-hour rim-light, sand-blasted texture.
Hyperreal close-up of fresh oysters on shaved ice with seaweed garnish.
Hyperreal close-up of fresh oysters on shaved ice with seaweed garnish, soft window light, water droplets on the shell, food editorial.
Cinematic Berlin café at twilight with lone barista wiping down the espresso machine.
Cinematic Berlin café at twilight, lone barista wiping down an espresso machine, ambient warm pendant lights, urban story photography.
Photoreal architectural shot of a brutalist concrete cathedral interior with raking sunlight.
Photoreal architectural shot of a brutalist concrete cathedral interior, raking afternoon sunlight, deep shadows, Tadao Ando aesthetic.
Editorial portrait of a London pearly king in full button suit on a vibrant East End street.
Editorial portrait of a London pearly king in full button suit, vibrant East End street backdrop, soft afternoon light.
Environmental portrait of a beekeeper at golden hour in a Provençal lavender field.
Environmental portrait of a beekeeper at golden hour in a Provençal lavender field, soft ambient bees, warm sun, magazine editorial.
Photoreal underwater shot of a freediver descending into a sun-shafted kelp forest.
Photoreal underwater shot of a freediver descending into a sun-shafted kelp forest, deep teal palette, atmospheric particles drifting.

By the numbers.

#2 · On Infer's image generation leaderboard (1271 Arena Elo)
Lead · Photorealism (internal eval)
+12% vs FLUX 1.1 [pro] · Complex scene adherence
Top 1 · Portrait fidelity
VS NANO BANANA 2 · GPT Image 1.5 has stronger photorealism on portraits and people; the trade-off is ~3× slower latency.
VS FLUX 1.1 [PRO] · GPT Image 1.5 is better at complex multi-subject scenes with embedded text.
VS IMAGEN 4 STANDARD · GPT Image 1.5 is slower (~5s vs 2s); both lead on photorealism in different categories.
$0.04 / image

Pay only for successful generations. No idle, no minimums, no per-seat. Volume discounts kick in at 10K req/mo.
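
As a rough budgeting aid, the per-image pricing above reduces to simple arithmetic. The sketch below assumes the flat $0.04 list price and that failed generations are never billed; the function name is illustrative, and volume discounts are left out because their rates are not stated here.

```python
from decimal import Decimal

# List price quoted on this page, in USD. Volume discounts are not modeled
# because the discount rates are not published in this text.
PRICE_PER_IMAGE = Decimal("0.04")


def estimate_cost(successful: int, failed: int = 0) -> Decimal:
    """Estimate spend: only successful generations are billed."""
    if successful < 0 or failed < 0:
        raise ValueError("counts must be non-negative")
    # `failed` is accepted only to make the billing rule explicit;
    # it does not contribute to the total.
    return PRICE_PER_IMAGE * successful


print(estimate_cost(2_500, failed=40))  # 100.00
```

Decimal is used instead of float so that totals stay exact to the cent.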

VS NATIVE · Same per-image price as OpenAI's direct API, but with one Infer key, batched billing, and a unified SDK across 100+ models.
VS SELF-HOST · Closed weights, so self-hosting isn't an option. Infer is the production path for GPT Image.

Things teams ask.

Q.01 How does prompt adherence compare to Nano Banana 2?
GPT Image 1.5 leads on complex multi-element scenes (8+ subjects, fine spatial relationships) and on photorealistic portraits. Nano Banana 2 leads on speed, multilingual text, and infographic factuality. Pick by use case rather than benchmark.
Q.02 Does it support image editing?
Yes. The same endpoint accepts an input image and an instruction. Editing is competitive with Nano Banana 2 on quality but slower; use FLUX.1 Kontext if turnaround matters.
Q.03 What are the pricing tiers?
Single tier at $0.04 per generated image. No resolution surcharge, no SDXL-style 'turbo' downsampling. The price you see is the price you pay.
Q.04 Are there safety filters?
Yes. OpenAI's standard safety stack (CSAM detection, public-figure restrictions, NSFW filters) runs on every request. Filtered requests return a structured error rather than a degraded image.
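
Because a filtered request comes back as a structured error rather than an image, client code can branch on the payload. The sketch below is only an illustration: the response shape (an `error` object with `code` and `message`, or a `data` list with a `url`) and every name in it are assumptions, not the documented schema.

```python
class GenerationRejected(Exception):
    """Raised when a response carries an error object instead of an image."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def first_image_url(payload: dict) -> str:
    """Return the first image URL, or raise if the request was rejected.

    Assumed (hypothetical) shapes:
      success: {"data": [{"url": "..."}]}
      failure: {"error": {"code": "...", "message": "..."}}
    """
    error = payload.get("error")
    if error:
        raise GenerationRejected(
            error.get("code", "unknown"), error.get("message", "")
        )
    return payload["data"][0]["url"]
```

Raising a dedicated exception keeps a rejected prompt from being mistaken for a transient failure that is worth retrying.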
Q.05 Can I use the outputs commercially?
Yes. Infer passes through OpenAI's commercial terms for GPT Image outputs.
Q.06 What are the rate limits?
Default tier is 60 requests per minute, with burst capacity to 120/min. Paid tiers scale to 1,000+ rpm.
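
To stay under the default limits on the client side, one option is a token bucket. The sketch below is one interpretation of the numbers above, not the server's actual algorithm: a bucket holding 60 tokens that refills at one token per second allows 60 requests per minute sustained and at most 120 in any single minute. The class and parameter names are illustrative.

```python
import time


class TokenBucket:
    """Client-side limiter: steady refill with a bounded burst."""

    def __init__(self, rate_per_min: float = 60, capacity: int = 60,
                 clock=time.monotonic):
        self.rate = rate_per_min / 60.0   # tokens added per second
        self.capacity = float(capacity)   # maximum stored tokens
        self.tokens = float(capacity)     # start with a full bucket
        self.clock = clock                # injectable for testing
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Call `try_acquire()` before each request and sleep briefly when it returns False.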
Q.07 How is this different from calling OpenAI directly?
One key, one bill, one SDK shape across 100+ models. Infer handles auth, retries, and per-model quirks, so you can drop it in by changing one URL.
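
To make the "one URL" idea concrete, here is a minimal sketch that builds, but does not send, an image request. The Infer base URL and the model ID `gpt-image-1.5` are placeholders invented for illustration, and it is an assumption that Infer mirrors the path of OpenAI's image generation endpoint (`/v1/images/generations`).

```python
import json
import urllib.request

DIRECT_BASE_URL = "https://api.openai.com/v1"        # OpenAI's own API base
INFER_BASE_URL = "https://infer.example.invalid/v1"  # placeholder, not the real URL


def build_image_request(base_url: str, api_key: str, prompt: str,
                        model: str = "gpt-image-1.5") -> urllib.request.Request:
    """Build a POST request for image generation without sending it.

    The model ID default is a hypothetical identifier.
    """
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/images/generations",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Switching providers changes only the base URL (and the key).
direct = build_image_request(DIRECT_BASE_URL, "sk-direct", "a ripe tomato on marble")
routed = build_image_request(INFER_BASE_URL, "infer-key", "a ripe tomato on marble")
print(direct.full_url)
print(routed.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; that step is omitted here because the endpoint details are assumed.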

Ship with GPT Image 1.5.

One key. One bill. One SDK shape — across 100+ models. Free credits on signup, no card required.