GPT-5.3-Codex: OpenAI Unifies Its Training Stacks
OpenAI's GPT-5.3-Codex is the first model combining Codex and GPT-5 training, scoring 77.3% on Terminal-Bench 2.0 and 81.4% on SWE-Lancer.
GPTUni Team
OpenAI released GPT-5.3-Codex on February 5th, 2026, the same day as Anthropic's Claude Opus 4.6 release, in what the industry is calling an "AI Super Bowl." The model is the first to combine OpenAI's Codex coding stack with GPT-5 foundation model training.
The headline number is 77.3% on Terminal-Bench 2.0, a benchmark that tests end-to-end terminal workflow execution. That is a 13.3-point jump over the 64.0% GPT-5.2-Codex achieved just three weeks earlier. The model also scores 81.4% on SWE-Lancer IC Diamond and 64.7% on OSWorld-Verified.
Notably, GPT-5.3-Codex is the first model rated "High" for cybersecurity capability under OpenAI's Preparedness Framework, meaning it can identify and analyze software vulnerabilities at a level that previously required specialized security teams.
The 400K-token context window with a 128K-token maximum output makes it suitable for processing entire repositories and generating extensive code changes. API pricing has not been officially published, though early reports suggest it may be roughly one seventh the cost of Claude Opus 4.6.
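For developers planning to try the model once API access opens, a minimal sketch using the OpenAI Python SDK's Responses API might look like the following. The model identifier "gpt-5.3-codex" is an assumption based on OpenAI's naming convention, not a published name, and the output cap here is set well below the 128K ceiling.

```python
# Minimal sketch: calling GPT-5.3-Codex through the OpenAI Responses API.
# The model identifier "gpt-5.3-codex" is assumed from OpenAI's naming
# convention; confirm against the official model list once access ships.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5.3-codex",   # assumed identifier, not yet published
    input="Summarize the failing tests in this repo and propose a fix plan.",
    max_output_tokens=8192,  # far below the 128K output ceiling
)

print(response.output_text)
```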
GPT-5.2-Codex, released January 14th, served as a stepping stone. It introduced the agentic coding workflow with context compaction that GPT-5.3-Codex builds on, and it scored 56.4% on SWE-Bench Pro at launch.
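OpenAI has not published how Codex's context compaction works internally. As a rough illustration of the general idea only, the sketch below keeps the most recent turns verbatim and collapses older ones into a summary once an estimated token budget is exceeded; the `summarize` placeholder and the four-characters-per-token estimate are illustrative stand-ins, not OpenAI's method.

```python
# Rough illustration of context compaction in an agentic coding loop.
# This is NOT OpenAI's implementation: the token estimate and the
# summarizer below are placeholders for whatever the real agent uses.

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly four characters per token for English and code.
    return len(text) // 4

def summarize(turns: list[str]) -> str:
    # Placeholder: a production agent would ask a model for the summary.
    return "Summary of earlier turns: " + " | ".join(t[:60] for t in turns)

def compact(history: list[str], budget: int = 400_000, keep_recent: int = 8) -> list[str]:
    """Collapse older turns into one summary when the history exceeds the budget."""
    if len(history) <= keep_recent:
        return history  # too few turns to compact
    if sum(estimate_tokens(t) for t in history) <= budget:
        return history  # still within budget; nothing to do
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent
```

The default budget mirrors the 400K-token window GPT-5.3-Codex advertises; a real agent would tune both the budget and how many recent turns survive verbatim.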