OpenAI Launches GPT-5.2-Codex to Bolster AI Programming and Cybersecurity, Challenging Google

Deep News · 06:50

Just a week after releasing the GPT-5.2 series, OpenAI unveiled GPT-5.2-Codex, an advanced AI coding model based on GPT-5.2, on Thursday, December 18 (ET). The new model focuses on professional software engineering and defensive cybersecurity, further strengthening OpenAI’s competitive edge against Alphabet's (GOOGL) Gemini in AI programming.

OpenAI stated that GPT-5.2-Codex achieves breakthroughs in coding performance, cybersecurity capabilities, and long-term task handling. The model scored 56.4% accuracy in the SWE-Bench Pro test and 64.0% in Terminal-Bench 2.0, setting new records in both benchmarks. It is now available to paid ChatGPT users across all Codex interfaces, with API access rollout underway.

A key highlight of GPT-5.2-Codex is its enhanced cybersecurity capabilities. CEO Sam Altman noted that earlier this month, a security researcher using the previous-generation GPT-5.1-Codex-Max identified and responsibly disclosed a vulnerability in React that could expose source code. While OpenAI acknowledges that the new model has not yet reached "high" cybersecurity proficiency, the company is preparing future models to cross this threshold.

OpenAI plans a phased rollout of broader access, combining safeguards with close collaboration with the security community to maximize defensive impact while minimizing misuse risks.

This release continues OpenAI’s aggressive push in AI programming. Last week, the company touted GPT-5.2’s "state-of-the-art agent coding performance," citing high scores on SWE coding benchmarks—even surpassing human expert levels—a move widely read as a direct response to the praised coding and reasoning abilities of Google’s Gemini 3.

**Enhanced Coding Performance for Large-Scale Scenarios**

GPT-5.2-Codex is an optimized version of GPT-5.2, specifically fine-tuned for agent-based coding. OpenAI highlighted improvements in three areas: extended context compression for long-term tasks, stronger performance in project-level work like refactoring and migrations, and better efficiency in Windows environments.

In benchmarks, GPT-5.2-Codex outperformed its predecessors—56.4% in SWE-Bench Pro (vs. GPT-5.2’s 55.6% and GPT-5.1’s 50.8%) and 64.0% in Terminal-Bench 2.0 (vs. 62.2% and 58.1%, respectively). SWE-Bench Pro evaluates patch generation for real-world software tasks, while Terminal-Bench 2.0 tests AI agents in terminal-based workflows like compiling code and setting up servers.

The model also excels in long-context understanding, reliable tool usage, improved factual accuracy, and native compression, making it a dependable partner for prolonged coding tasks without sacrificing token efficiency. Enhanced visual interpretation allows GPT-5.2-Codex to accurately analyze screenshots, technical diagrams, and UI designs, accelerating functional prototyping.

OpenAI emphasized that these upgrades enable Codex to handle large codebases, maintain context during complex tasks (e.g., refactoring or feature development), and adapt seamlessly to plan changes or failed attempts.

**Cybersecurity Leap: Preparing for "High"-Level Proficiency**

Cybersecurity is another major focus for GPT-5.2-Codex. OpenAI observed sharp capability jumps starting with GPT-5-Codex, followed by GPT-5.1-Codex-Max, and now GPT-5.2-Codex.

In professional capture-the-flag assessments, the model demonstrated advanced multi-step cybersecurity problem-solving. While it hasn’t yet reached "high" proficiency per OpenAI’s framework, the company expects future models to progress further and is planning accordingly.

A real-world case underscores its defensive potential: On December 11, React disclosed three security flaws affecting apps built with React Server Components. Andrew MacPherson, Chief Security Engineer at Privy (owned by Stripe), discovered these previously unknown vulnerabilities while using GPT-5.1-Codex-Max to investigate another critical flaw (React2Shell), following standard defensive workflows.

Altman shared on social media: "Last week, a researcher using our prior Codex model found and disclosed a React vulnerability that could leak source code. These models will bring net benefits to cybersecurity, but as they improve, we’re entering the 'real impact phase.'"

**Trusted Access Program for Security Professionals**

To balance capability and risk, OpenAI added safeguards at both the model and product levels, including specialized safety training (e.g., refusal of harmful tasks and resistance to prompt injection), agent sandboxing, and configurable network access. The company is also piloting an invite-only Trusted Access Program.

Initially open to vetted security professionals and organizations with clear cybersecurity use cases, the program grants access to OpenAI’s most powerful models for defensive work—such as vulnerability research or authorized red-teaming—while lifting restrictions for threat simulation, malware analysis, or critical infrastructure testing.

Altman announced on X: "We’re exploring a Trusted Access Program for defensive cybersecurity work." He also promoted Codex hiring: "Codex is getting incredibly good and will improve fast. If you want to help make it 100x better next year, the team is hiring. Guaranteed wild ride with high odds of success."

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products, and any associated discussions, comments, or posts by the author or other users should not be considered as such either. It is solely for general information purposes and does not consider your own investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy or completeness of the information; investors should do their own research and may seek professional advice before investing.
