OpenAI launches new macOS app for agentic coding

AI is already having a seismic impact on how software is written, with much of the grunt work of programming now performed by swarms of agents and subagents. But as developers experiment with new interfaces and form factors for human-AI collaboration, it’s become hard for even the most advanced AI labs to keep up. 

The current trend is for agentic software development — systems where AI agents can work independently on coding tasks — epitomized by the Claude Code and Cowork apps. In the meantime, OpenAI has been gradually building out its Codex tool, which launched as a command line tool last April and expanded to a web interface one month later.

Now OpenAI is taking a major step toward catching up. On Monday, the company launched a new macOS app for Codex that incorporates many of the agentic practices that have become popular in the past year. The new app is designed to work with multiple agents in parallel and supports agent skills and other state-of-the-art workflows. It also arrives less than two months after the release of GPT-5.2-Codex, OpenAI’s most powerful coding model, which the company hopes will be enough to tempt over Claude Code users.

“If you really want to do sophisticated work on something complex, 5.2 is the strongest model by far,” CEO Sam Altman told reporters on a press call. “However, it’s been harder to use, so taking that level of model capability and putting it in a more flexible interface, we think is going to matter quite a bit.”

While Altman’s confidence in GPT-5.2 is understandable, coding benchmarks tell a more complicated story. GPT-5.2 does hold the top spot on TerminalBench (a test measuring how well AI handles command-line programming tasks), at least as of press time. But agents from Gemini 3 and Claude Opus have logged roughly equivalent scores — lower, but within the margin of error of the benchmark. Results from SWE-bench, another coding benchmark that tests AI’s ability to fix real-world software bugs, are similar, showing no clear advantage for GPT-5.2. However, agentic use cases have been difficult to benchmark effectively, and state-of-the-art models can vary significantly in user experience.

The Codex app also comes with a range of new features that OpenAI says will help it achieve parity with or, in some cases, outpace the various Claude apps. It will allow for automations that can be set to run in the background on an automatic schedule, with results placed in a queue to be reviewed when the user returns. Users can also select different personalities for the agent, from pragmatic to empathetic, depending on their working style.

But for the company, the biggest selling point is the sheer speed of development that’s made possible by AI. “You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours,” Altman said. “As fast as I can type in new ideas, that is the limit of what can get built.”
