Key Takeaways
- GPT-5.4 is the new default: Launched March 5, 2026, it merges GPT-5.3 Codex's coding prowess with broader reasoning, a 1M-token context, and native computer use.
- Six models serve different needs: Flagship (5.4), balanced (5.4 Mini), fast (5.4 Nano), coding specialist (5.3 Codex), everyday (5.3 Instant), and legacy (5.2 Thinking).
- Prices span more than 8x: From ~$0.30/MTok (GPT-5.3 Instant) to $2.50/MTok (GPT-5.4) on input, so choosing the right model directly impacts your bill.
- GPT-5.2 Thinking retires June 5, 2026: Migrate now to avoid disruption.
The Complete Guide to OpenAI's GPT-5 Model Family
OpenAI's GPT-5 family has grown into six distinct models, each optimized for a different trade-off between capability, speed, and cost. Choosing the wrong model means either overpaying for simple tasks or under-powering complex ones.
This guide covers every model, with pricing, benchmarks, and a clear decision framework.
The Full Lineup (March 2026)
| Model | Role | Input Cost | Output Cost | Context | Released |
|---|---|---|---|---|---|
| GPT-5.4 | Flagship reasoning + coding | $2.50/MTok | $10.00/MTok | 1.05M | Mar 5, 2026 |
| GPT-5.4 Mini | Fast balanced | Lower | Lower | Smaller | Mar 2026 |
| GPT-5.4 Nano | Lowest latency | Lowest | Lowest | Smallest | Mar 2026 |
| GPT-5.3 Codex | Coding specialist | $1.75/MTok | $7.00/MTok | 400K | Mar 3, 2026 |
| GPT-5.3 Instant | Everyday tasks | ~$0.30/MTok | ~$1.20/MTok | Standard | Mar 3, 2026 |
| GPT-5.2 Thinking | Legacy flagship | Higher | Higher | Smaller | Previous |
Sources: OpenAI API Models, FelloAI Comparison
GPT-5.4: The New Flagship
GPT-5.4 is OpenAI's most capable model — the first to merge frontier reasoning with frontier coding in a single architecture.
What Makes It Special
- 1,050,000-token context window — Process entire large codebases, complete documentation sets, and long conversation histories without chunking
- Native computer use — Interact with desktop applications, browsers, and system tools programmatically
- 57.7% on SWE-Bench Pro — State-of-the-art coding benchmark performance
- 83% on GDPval knowledge tasks — Matches or exceeds industry professionals
- Token efficient — Fewer output tokens per task despite higher nominal pricing
When to Use GPT-5.4
- Complex coding tasks requiring deep reasoning
- Multi-step autonomous workflows (via Codex CLI)
- Long-context analysis (>400K tokens)
- Tasks requiring computer use capabilities
- Any new project where you need the best model available
Pricing
| Tier | Input | Output | Cached Input |
|---|---|---|---|
| Standard | $2.50/MTok | $10.00/MTok | $0.63/MTok |
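To see how these rates translate into per-request cost, here is a minimal sketch. The rates come from the table above; the token counts in the example are hypothetical:

```python
# Standard-tier GPT-5.4 rates from the pricing table (USD per million tokens).
INPUT_RATE = 2.50
OUTPUT_RATE = 10.00
CACHED_INPUT_RATE = 0.63

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request at GPT-5.4 standard-tier rates."""
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_RATE
            + cached_tokens * CACHED_INPUT_RATE
            + output_tokens * OUTPUT_RATE) / 1_000_000

# Hypothetical example: a 200K-token prompt, half served from cache,
# with a 5K-token reply.
cost = request_cost(200_000, 5_000, cached_tokens=100_000)
print(f"${cost:.4f}")  # $0.3630
```

Note how the cached-input discount does most of the work here: without caching, the same request would cost $0.55.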
GPT-5.4 Mini: The Balanced Choice
GPT-5.4 Mini inherits GPT-5.4's architecture at a lower cost and latency point. It's designed for applications that need good reasoning without paying flagship prices.
When to Use GPT-5.4 Mini
- Production APIs where cost per request matters
- Applications needing a balance of speed and quality
- Chatbot backends with moderate complexity
- Workflows where GPT-5.4 is overkill but GPT-5.3 Instant isn't enough
GPT-5.4 Nano: The Speed Demon
GPT-5.4 Nano is optimized for the lowest possible latency. It trades reasoning depth for raw speed.
When to Use GPT-5.4 Nano
- Real-time autocomplete and suggestions
- Latency-critical production endpoints
- High-volume, low-complexity classification tasks
- Mobile applications where response time is critical
GPT-5.3 Codex: The Coding Specialist
GPT-5.3 Codex remains the best model for cost-sensitive, input-heavy coding workflows. It runs 25% faster than GPT-5.2 Codex and costs less per input token than GPT-5.4.
When to Use GPT-5.3 Codex
- Terminal-heavy batch coding operations
- Workflows that repeatedly send large repository context
- Cost-optimized agentic coding pipelines
- Tasks where the 400K context window is sufficient
When to Upgrade to GPT-5.4
- You need >400K tokens of context
- You need computer use capabilities
- You need knowledge work beyond coding
- The 43% input cost premium is worth the broader capabilities
Pricing
| Tier | Input | Output | Cached Input |
|---|---|---|---|
| Standard | $1.75/MTok | $7.00/MTok | $0.44/MTok |
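For input-heavy workflows, that per-token gap compounds quickly. A rough comparison using the standard-tier rates from the two pricing tables (the monthly token volumes are hypothetical):

```python
# Standard-tier rates (USD per million tokens) from the pricing tables above.
RATES = {
    "gpt-5.4":       {"input": 2.50, "output": 10.00},
    "gpt-5.3-codex": {"input": 1.75, "output": 7.00},
}

def monthly_cost(model, input_mtok, output_mtok):
    """Cost in USD for a month's traffic, measured in millions of tokens."""
    r = RATES[model]
    return input_mtok * r["input"] + output_mtok * r["output"]

# Hypothetical input-heavy month: 500M input tokens, 20M output tokens.
codex = monthly_cost("gpt-5.3-codex", 500, 20)  # 1015.0
flagship = monthly_cost("gpt-5.4", 500, 20)     # 1450.0
print(codex, flagship)
```

On this workload shape, GPT-5.3 Codex saves about 30% per month, which is why the sections below frame the upgrade to GPT-5.4 as a deliberate trade rather than a default.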
GPT-5.3 Instant: The Everyday Workhorse
GPT-5.3 Instant is the cheapest GPT-5 model and the best choice for high-volume, everyday tasks.
Key Strengths
- 26.8% fewer hallucinations than predecessor models
- Excellent at: Q&A, how-tos, technical writing, translation
- Lowest cost: ~$0.30/$1.20 per million tokens
- High throughput: Optimized for speed
When to Use GPT-5.3 Instant
- Customer support chatbots
- Content generation at scale
- Translation and localization
- Simple Q&A systems
- Any high-volume application where cost per request matters most
Pricing
| Tier | Input | Output |
|---|---|---|
| Standard | ~$0.30/MTok | ~$1.20/MTok |
GPT-5.2 Thinking: Legacy (Retiring June 2026)
GPT-5.2 was the previous flagship model. It introduced a three-tier architecture (Instant, Thinking, and Pro) but has been superseded by GPT-5.4 across all benchmarks.
Migration Timeline
- Now → June 5, 2026: GPT-5.2 Thinking available under Legacy Models
- June 5, 2026: GPT-5.2 Thinking retired. API calls will fail.
- Action Required: Update the `model` parameter from `gpt-5.2-thinking` to `gpt-5.4`
```python
# Before (will stop working June 5, 2026)
model = "gpt-5.2-thinking"

# After
model = "gpt-5.4"
```
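If model names are scattered across a codebase, one defensive pattern is to route every call through a single lookup, so a retirement becomes a one-line change. This is a sketch, not an SDK feature; the mapping covers only the retirement named in this guide:

```python
# Map retired model names to their replacements; pass others through unchanged.
MODEL_MIGRATIONS = {
    "gpt-5.2-thinking": "gpt-5.4",  # retires June 5, 2026 per the timeline above
}

def resolve_model(name: str) -> str:
    """Return the replacement for a retired model name, or the name itself."""
    return MODEL_MIGRATIONS.get(name, name)

print(resolve_model("gpt-5.2-thinking"))  # gpt-5.4
print(resolve_model("gpt-5.3-codex"))     # gpt-5.3-codex
```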
Decision Framework: Which Model to Use
By Use Case
| Use Case | Recommended Model | Why |
|---|---|---|
| Complex coding + reasoning | GPT-5.4 | Best capability, 1M context |
| Daily coding (cost-sensitive) | GPT-5.3 Codex | Lower input cost, strong coding |
| General chatbot/Q&A | GPT-5.3 Instant | Cheapest, fast, low hallucination |
| Production API (balanced) | GPT-5.4 Mini | Good quality, reasonable cost |
| Real-time autocomplete | GPT-5.4 Nano | Lowest latency |
| Science/research | GPT-5.4 | Deepest reasoning |
By Budget
| Monthly Budget | Strategy |
|---|---|
| <$50 | GPT-5.3 Instant for everything |
| $50-200 | GPT-5.3 Instant + GPT-5.3 Codex for coding |
| $200-1,000 | GPT-5.4 as default, GPT-5.3 Instant for simple tasks |
| $1,000+ | GPT-5.4 for everything, or hybrid routing |
The Router Pattern
The most cost-effective approach for production applications:
```
Request → Classify Complexity
  ├── Simple (60%)  → GPT-5.3 Instant ($0.30/MTok)
  ├── Medium (25%)  → GPT-5.4 Mini
  ├── Complex (10%) → GPT-5.4 ($2.50/MTok)
  └── Coding (5%)   → GPT-5.3 Codex ($1.75/MTok)
```
This pattern can reduce costs by 70-80% compared to running GPT-5.4 for all requests, with minimal quality impact.
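A minimal version of this router in Python. The classifier here is a stub keyed on prompt length and a coding flag; in production you would replace it with a cheap classification step (for example, a call to GPT-5.4 Nano). The length thresholds are illustrative assumptions, not recommendations:

```python
def classify(prompt: str, is_coding: bool = False) -> str:
    """Stub complexity classifier -- swap in a real one for production."""
    if is_coding:
        return "coding"
    if len(prompt) < 500:
        return "simple"
    if len(prompt) < 5000:
        return "medium"
    return "complex"

# Route each complexity tier to the model this guide recommends.
ROUTES = {
    "simple":  "gpt-5.3-instant",
    "medium":  "gpt-5.4-mini",
    "complex": "gpt-5.4",
    "coding":  "gpt-5.3-codex",
}

def route(prompt: str, is_coding: bool = False) -> str:
    """Pick a model name for a request."""
    return ROUTES[classify(prompt, is_coding)]

print(route("What's your refund policy?"))         # gpt-5.3-instant
print(route("Fix this bug: ...", is_coding=True))  # gpt-5.3-codex
```

The design point is that the router itself is trivial; the savings come entirely from how accurately the classifier keeps simple traffic off the flagship model.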
GPT-5 vs the Competition
How does the GPT-5 family stack up against Claude and Gemini?
| Model | Input Cost | SWE-Bench | Context | Strength |
|---|---|---|---|---|
| GPT-5.4 | $2.50/MTok | 57.7% (Pro) | 1.05M | Broadest capability |
| Claude Opus 4.6 | $15/MTok | 80.8% (Verified) | 1M | Deepest reasoning |
| Claude Sonnet 4.6 | $3/MTok | 79.6% (Verified) | 1M | Best value reasoning |
| Gemini 3.1 Pro | Varies | Competitive | 2M | Largest context |
Each model family has strengths. GPT-5.4 offers the most balanced capability set; Claude posts the top coding scores, though note the table mixes SWE-Bench Pro and SWE-Bench Verified, which are not directly comparable; Gemini leads on context window size.
Beyond the API: Building Without Code
All GPT-5 models are developer tools: whether you call GPT-5.4 directly or through the Codex CLI, building an application still requires programming knowledge.
If you want to build an app without writing code, platforms like ZBuild let you describe your application in plain language and get a complete working product — powered by AI models like these behind the scenes.
Summary
OpenAI's GPT-5 family offers a model for every use case and budget:
| Model | One-Liner |
|---|---|
| GPT-5.4 | Best overall, use this if unsure |
| GPT-5.4 Mini | Good balance of speed and cost |
| GPT-5.4 Nano | Fastest, for latency-critical apps |
| GPT-5.3 Codex | Cheapest per-token for heavy coding |
| GPT-5.3 Instant | Cheapest overall, for everyday tasks |
| GPT-5.2 | Retiring June 5 — migrate now |
The right choice depends on your workload, budget, and latency requirements. When in doubt, start with GPT-5.4 and optimize down to cheaper models as you understand your traffic patterns.
Published by the ZBuild Team. Build apps without coding at zbuild.io.
Sources
- OpenAI: Introducing GPT-5.4
- OpenAI: Introducing GPT-5.2
- OpenAI API Models
- OpenAI: Using GPT-5.4
- OpenAI Help Center: GPT-5.3 and GPT-5.4
- OpenAI Codex Models
- FelloAI: Ultimate ChatGPT Model Comparison
- Zapier: OpenAI Models Guide
- Nathan Lambert: GPT-5.4 Analysis