[00:00] (0.08s)
Hey, my name is Greg and I've spent
[00:02] (2.00s)
hundreds of dollars hacking with Claude
[00:03] (3.52s)
Code over the last couple of months.
[00:05] (5.68s)
It's become sort of my default coding
[00:08] (8.40s)
agent whenever I'm starting new projects
[00:10] (10.56s)
or especially if I'm working on an older
[00:12] (12.72s)
project that has a larger code base.
[00:14] (14.96s)
Last week, OpenAI launched their Codex CLI,
[00:18] (18.40s)
and I was really excited to try it out.
[00:20] (20.32s)
So, I tried both Claude Code and OpenAI's
[00:24] (24.64s)
Codex out head-to-head on a couple of
[00:26] (26.56s)
different projects. I'll kind of go
[00:28] (28.80s)
against YouTube best practices here and
[00:30] (30.72s)
just spoil it: Codex fell pretty
[00:34] (34.16s)
short, and it was disappointing. And
[00:37] (37.84s)
so in the rest of the video, I'm going
[00:38] (38.80s)
to share my opinion on what OpenAI needs
[00:40] (40.64s)
to do to bring Codex up to the same
[00:43] (43.28s)
standard that Claude Code is at now.
[00:45] (45.44s)
And honestly, I think there are some
[00:47] (47.12s)
simple fixes having to do with the
[00:49] (49.20s)
developer
[00:50] (50.12s)
experience. Before we get into this, I
[00:52] (52.24s)
want to preface this by saying I'm
[00:53] (53.60s)
wearing my OpenAI DevDay hoodie. I've
[00:55] (55.92s)
been hacking on OpenAI's APIs starting
[00:58] (58.40s)
with the GPT-3 text completion endpoint.
[01:01] (61.28s)
OpenAI has shipped some of the best
[01:03] (63.76s)
developer experiences that I've ever
[01:05] (65.92s)
encountered in my career. I have a
[01:08] (68.32s)
number of former colleagues and friends
[01:09] (69.92s)
from Twilio who work there now. I
[01:12] (72.48s)
generally don't want this channel or
[01:14] (74.48s)
really anything I do to be about
[01:16] (76.08s)
criticizing other people, because
[01:18] (78.16s)
shipping stuff is really hard. I
[01:20] (80.56s)
only feel comfortable critiquing this
[01:22] (82.24s)
because I know that OpenAI's bar for
[01:24] (84.88s)
developer experiences has historically
[01:27] (87.12s)
been really high, and this one just sort
[01:30] (90.08s)
of fell surprisingly short. So, let's
[01:32] (92.48s)
talk a little bit about a few points
[01:35] (95.04s)
of the developer experience. When I
[01:37] (97.12s)
personally start up Codex, I'm given
[01:40] (100.16s)
a message that o4-mini is not available to
[01:42] (102.56s)
me. My account is tier 5. The
[01:45] (105.60s)
announcement said that o4-mini should be
[01:47] (107.04s)
rolled out to all tier 5 accounts. So,
[01:49] (109.04s)
they've set my expectations up here,
[01:51] (111.28s)
and they're delivering down here, with no
[01:53] (113.60s)
explanation of the gap. But Codex
[01:55] (115.76s)
does ship with o4-mini as the
[01:57] (117.68s)
default. And it does enough checking to
[01:59] (119.84s)
let me know that I don't have access to
[02:02] (122.32s)
that model. But why doesn't it just fall
[02:04] (124.56s)
back to a model that works then?
[02:06] (126.64s)
Instead, if I try to chat with o4-mini,
[02:08] (128.88s)
it just fails. Okay, fine. I guess I
[02:11] (131.84s)
need to change my model. It's not
[02:14] (134.08s)
obvious on the screen how I do that, but
[02:16] (136.64s)
it does say I can type /help here. So,
[02:18] (138.32s)
I'll type /help. All right, cool.
[02:19] (139.84s)
There's a /model command. Let's try
[02:23] (143.72s)
/model. Oh my gosh. Which model am I
[02:27] (147.68s)
supposed to choose from
[02:29] (149.40s)
here? Do I want Babbage? Do I want
[02:33] (153.80s)
GPT-3.5? I follow these things pretty
[02:36] (156.32s)
closely and I'm pretty sure that of this
[02:38] (158.96s)
list, o3-mini is the best option for me.
[02:42] (162.72s)
And I'm pretty sure that o3-mini is
[02:45] (165.44s)
better than this dated version of
[02:47] (167.36s)
o3-mini. I'm pretty sure o3-mini is
[02:48] (168.80s)
going to point to the latest one, but
[02:50] (170.00s)
I'm only like 90% sure. But hey,
[02:53] (173.44s)
just for shits and giggles, let's
[02:55] (175.12s)
choose one of the older ones. Let's try
[02:56] (176.80s)
Babbage. This is before my time. Let's
[02:59] (179.12s)
just try Babbage and play. Oh, man. And
[03:02] (182.40s)
it fails. All right. Well, let's try one
[03:04] (184.40s)
of these newer models. Oh, look. That
[03:06] (186.88s)
fails, too. Well, let's try one of these
[03:09] (189.04s)
other models. You know, it's
[03:11] (191.04s)
weird that DALL-E is in here, right? I
[03:12] (192.72s)
mean, that doesn't seem like we're going
[03:13] (193.84s)
to do image generation here. Let's try
[03:15] (195.20s)
that one. That fails as
[03:17] (197.56s)
well. Why are all of these models listed
[03:21] (201.84s)
here if they don't work with this
[03:24] (204.68s)
product? This just baffles me. This
[03:27] (207.76s)
is
[03:29] (209.56s)
like an if statement. You know,
[03:32] (212.72s)
this is 30 minutes of work. This is like
[03:34] (214.56s)
if you had a couple dozen people trying
[03:37] (217.24s)
this, this feedback would have been
[03:39] (219.36s)
surfaced. And so I think this sort of
[03:41] (221.84s)
thing is the thing that frustrated me
[03:43] (223.12s)
most coming from OpenAI because this
[03:45] (225.76s)
just sort of feels like wasting people's
[03:47] (227.76s)
time in order to get some headlines.
[03:50] (230.72s)
Maybe that's not it,
[03:52] (232.72s)
but that's what it feels like from my
[03:55] (235.04s)
perspective: I haven't even gotten
[03:57] (237.12s)
started yet and I'm having to make these
[03:58] (238.84s)
decisions. It's like trying to pick
[04:01] (241.28s)
which door has the death trap and which
[04:05] (245.28s)
one leads to the happy place. And I
[04:07] (247.84s)
feel like a developer should never have
[04:09] (249.36s)
to make that decision just simply to get
[04:11] (251.28s)
to hello world with your product. API
[04:13] (253.84s)
key management. When you boot up Claude
[04:15] (255.68s)
Code, it's going to ask you to auth into
[04:17] (257.76s)
your account and then it's going to set
[04:19] (259.84s)
up your API key and store it away into a
[04:22] (262.48s)
configuration file for you. Codex, on
[04:24] (264.88s)
the other hand, expects you to set the
[04:27] (267.92s)
API key as an environment variable,
[04:30] (270.80s)
specifically either as a global
[04:32] (272.88s)
environment variable or in a
[04:37] (277.08s)
.env file. This is very similar to how you
[04:40] (280.24s)
would set the API key if you were just
[04:42] (282.40s)
going to use the OpenAI API in one of
[04:45] (285.04s)
your projects. But that doesn't
[04:47] (287.84s)
actually make sense for this tool.
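For reference, the setup Codex expects looks roughly like this (the key value is a placeholder, and the .env detail is based on its documented behavior at launch):

export OPENAI_API_KEY="sk-..."   # global: anything running in your shell can read this
# or, per project, a .env file in the repo root containing:
# OPENAI_API_KEY=sk-...

Note that this is the exact same OPENAI_API_KEY variable your own application code would read.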
[04:50] (290.40s)
I really like how Anthropic expects you to
[04:52] (292.64s)
spin up a second API key or I think even
[04:55] (295.92s)
a separate project for Claude Code,
[05:00] (300.00s)
because the usage is different, right?
[05:02] (302.00s)
And so if I've set an OpenAI
[05:04] (304.88s)
API key in my project, that means, you
[05:07] (307.92s)
know, I have some code in that project
[05:09] (309.04s)
that's using the OpenAI API and it's
[05:12] (312.08s)
using that API key. And I don't want the
[05:15] (315.12s)
usage of the coding tool to spend
[05:19] (319.68s)
against that same API key. Also, I
[05:22] (322.56s)
really don't like the idea that you need
[05:24] (324.88s)
to set a global OpenAI API key if
[05:29] (329.92s)
you want to use this across all of your
[05:31] (331.28s)
projects. So, now there's this OpenAI
[05:33] (333.84s)
API key just hanging out in your
[05:35] (335.60s)
environment all the time that could be
[05:38] (338.00s)
snagged by any script that's running
[05:40] (340.56s)
locally or, I suspect, any app that's
[05:42] (342.64s)
running. And so it just seems
[05:45] (345.08s)
unnecessarily insecure. There's really
[05:48] (348.00s)
not a good option with Codex to create
[05:51] (351.68s)
an API key that's solely dedicated to
[05:54] (354.40s)
the usage of Codex and then to store it
[05:56] (356.88s)
in a persistent way so that you can use it
[05:59] (359.20s)
across all of the different projects you
[06:01] (361.20s)
might be working on. And Claude Code
[06:02] (362.96s)
makes this super easy. Cost management.
[06:05] (365.44s)
So, as I mentioned, if there's any
[06:07] (367.76s)
major complaint about Claude Code, it's
[06:09] (369.28s)
that it can get expensive compared to
[06:11] (371.36s)
what developers are used to paying in
[06:14] (374.16s)
order to write code. But still, you
[06:16] (376.08s)
know, it's probably pretty good value
[06:17] (377.36s)
compared to hiring a human to do it.
[06:19] (379.76s)
And they know this, and so they give you
[06:21] (381.68s)
a couple of options to manage the cost
[06:24] (384.40s)
over the course of your session. The
[06:26] (386.16s)
first is /cost. At any point in time,
[06:28] (388.56s)
you can run the /cost command and you
[06:30] (390.48s)
can see how much you've spent during
[06:31] (391.92s)
this coding session. This is really
[06:33] (393.36s)
really useful. Two, they give you the
[06:35] (395.68s)
/compact command, right? So, it's
[06:37] (397.92s)
just basically taking everything you've
[06:39] (399.28s)
done, bullet-pointing it, and then
[06:41] (401.68s)
significantly reducing or compacting the
[06:44] (404.40s)
context with a summary so that Claude
[06:47] (407.04s)
Code still knows what's happened, but
[06:49] (409.12s)
it's able to know that with much, much
[06:52] (412.16s)
smaller requests. So, each request is
[06:54] (414.16s)
going to be a lot less expensive.
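Concretely, both of those are just commands you type mid-session:

/cost      shows how much the current Claude Code session has spent so far
/compact   summarizes the conversation so far and carries on with the smaller context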
[06:55] (415.92s)
Codex does none of this. There's no way in any
[06:57] (417.68s)
given session to figure out how much
[06:58] (418.88s)
you've spent. And there is no concept
[07:01] (421.52s)
of compacting your history. Project
[07:03] (423.60s)
context. When you start up Claude Code,
[07:06] (426.96s)
it does a scan of your directory that
[07:10] (430.56s)
you're in, of your codebase, and it
[07:12] (432.56s)
tries to gain an understanding of your
[07:14] (434.80s)
codebase. Then you can run the
[07:17] (437.76s)
/init command to write what it's
[07:20] (440.56s)
learned about the codebase to a CLAUDE.md
[07:23] (443.36s)
file. So, it doesn't need to do that
[07:25] (445.04s)
analysis every single time. You can also
[07:27] (447.28s)
give it instructions on your
[07:29] (449.60s)
formatting preferences and on the
[07:32] (452.72s)
coding conventions that you're
[07:34] (454.16s)
going to use in the project so that
[07:35] (455.60s)
you're not starting from scratch every
[07:36] (456.96s)
time you start this up.
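To make that concrete, here's a hypothetical sketch of a CLAUDE.md; the project details are invented, but it's the kind of file /init generates and you then extend by hand with your own preferences:

# CLAUDE.md
## Overview
TypeScript monorepo: web app in apps/web, shared libraries in packages/.
## Conventions
- Run npm run lint and npm test before considering a task done.
- Two-space indentation; prefer named exports.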
[07:39] (459.52s)
Codex, when you start it up, does nothing. It just
[07:42] (462.56s)
basically gives you a text box to chat
[07:45] (465.12s)
with whatever model you have chosen. It
[07:47] (467.52s)
doesn't take any initiative to
[07:49] (469.20s)
understand your codebase to understand
[07:50] (470.96s)
what is going on. Now, once you
[07:53] (473.12s)
initiate the chat, it will. But I really
[07:56] (476.40s)
love the idea that since this is a
[07:58] (478.32s)
coding agent, and since you ran it in a
[07:59] (479.92s)
specific directory, it can assume that
[08:02] (482.16s)
you want it to understand your codebase
[08:03] (483.92s)
before you start working with it. So why
[08:05] (485.68s)
not go ahead and start doing that? And
[08:07] (487.36s)
then, once you have started doing that,
[08:09] (489.28s)
why not, in order to save costs and save
[08:11] (491.36s)
time, write what you've learned to a
[08:13] (493.60s)
standardized file so you don't have to
[08:15] (495.20s)
repeat that process every time? Fifth
[08:17] (497.28s)
is just the UI. I mean, Claude Code is a
[08:19] (499.36s)
really, really nicely designed CLI, and
[08:21] (501.84s)
there are a lot of subtleties in Claude
[08:24] (504.32s)
Code, like the way that they separate
[08:26] (506.56s)
the output from your section down
[08:28] (508.80s)
below where you're going to do the
[08:29] (509.92s)
input, the way they do the syntax
[08:32] (512.92s)
highlighting, just the colors
[08:35] (515.28s)
they've used. It just feels like
[08:37] (517.68s)
a very intentional product. By
[08:40] (520.88s)
contrast, Codex feels sort of like the
[08:42] (522.72s)
minimum viable design. You know, instead
[08:44] (524.88s)
of a single line that shows how long
[08:47] (527.92s)
you've been waiting, it just has
[08:50] (530.08s)
sort of a console log that outputs
[08:53] (533.52s)
every few seconds saying, "Okay, now
[08:55] (535.60s)
you waited for 24 seconds. Now you
[08:57] (537.04s)
waited for 26 seconds." And it just
[08:58] (538.60s)
scrolls there. There's no clear visual
[09:01] (541.04s)
delineation between where the user
[09:03] (543.68s)
does their input and the output of the
[09:05] (545.60s)
agent. And there's not a
[09:09] (549.04s)
lot of good color contrast. Claude Code
[09:11] (551.12s)
even has some nice considerations to
[09:12] (552.56s)
change the color scheme if you're color
[09:14] (554.40s)
blind. Codex does not have this. So
[09:17] (557.12s)
again, it just feels like this was the
[09:19] (559.60s)
minimum possible that could be done and
[09:21] (561.36s)
then it was pushed out. This thing
[09:22] (562.80s)
crashes. I've never had Claude
[09:25] (565.68s)
Code crash on me. Once I switched over
[09:28] (568.00s)
to o3-mini, it has crashed multiple
[09:30] (570.00s)
times on me. Just a hard crash, and then
[09:32] (572.40s)
to start it back up again, I have to go
[09:34] (574.40s)
through and set all the same preferences
[09:35] (575.68s)
again, and then I lose all the context
[09:37] (577.04s)
and everything. And it just, again,
[09:39] (579.76s)
sort of feels like table stakes.
[09:41] (581.36s)
It's super, super frustrating, and
[09:43] (583.36s)
it just feels like it's not ready. MCP
[09:45] (585.36s)
support. You can now add MCP servers
[09:49] (589.44s)
to Claude Code. One great use case
[09:51] (591.60s)
that I learned about on a Claude Code
[09:54] (594.08s)
webinar last week was using the
[09:56] (596.92s)
Puppeteer MCP server to control a
[09:59] (599.76s)
browser. So you can ask Claude Code to
[10:01] (601.44s)
make changes and then you can have it
[10:03] (603.28s)
control a browser to view those changes
[10:05] (605.28s)
and it sort of closes the loop on the
[10:07] (607.44s)
feedback cycle. Once you add MCP
[10:10] (610.40s)
servers to these tools, it just really
[10:12] (612.16s)
opens up the world of possibilities of
[10:14] (614.56s)
what you can do with them.
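For reference, wiring that up in Claude Code is roughly a one-liner; the package name here is my assumption of the community Puppeteer MCP server, so check the current docs for the exact command:

claude mcp add puppeteer -- npx -y @modelcontextprotocol/server-puppeteer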
[10:16] (616.08s)
Speaking of that webinar, I don't
[10:20] (620.16s)
remember the last time I sat in on a
[10:22] (622.08s)
webinar for a software product. But
[10:24] (624.64s)
I've used Claude Code so much that
[10:27] (627.44s)
I wanted to participate and learn how
[10:29] (629.68s)
to get more use out of it, and so the
[10:32] (632.64s)
webinar from Anthropic was called the
[10:34] (634.24s)
origin story of Claude Code and best
[10:36] (636.00s)
practices, so I'm like, I'm in. And
[10:38] (638.88s)
there was something they said that was
[10:39] (639.92s)
interesting there. They said Claude Code
[10:41] (641.20s)
started off as an internal tool, and then
[10:43] (643.36s)
when they released it internally, they
[10:45] (645.76s)
saw adoption just sort of go
[10:48] (648.40s)
up and to the right. And so they said,
[10:49] (649.76s)
okay, there must be something here, and
[10:51] (651.76s)
they polished it up, and once they got it
[10:53] (653.12s)
to the right place, they released it
[10:55] (655.12s)
out into the open. And even today,
[10:58] (658.24s)
it's how many, many developers
[11:00] (660.32s)
inside of Anthropic do their
[11:03] (663.12s)
daily coding tasks.
[11:05] (665.64s)
You can tell from the polish that
[11:09] (669.04s)
people internally had used Claude Code
[11:11] (671.20s)
before it made it to market. And you can
[11:14] (674.00s)
tell from the polish that very few
[11:16] (676.64s)
people used OpenAI's Codex CLI before
[11:19] (679.60s)
it came to market. With really just a
[11:21] (681.68s)
week or two of internal use and a little
[11:25] (685.24s)
intentionality, this could be so much
[11:28] (688.00s)
better. And notice here, I have said
[11:31] (691.16s)
nothing about this tool's ability to
[11:35] (695.20s)
solve coding problems. It is very
[11:38] (698.24s)
possible that o4-mini plus Codex would
[11:43] (703.20s)
solve my developer problems with far
[11:45] (705.04s)
more reliability
[11:47] (707.20s)
at a fraction of the cost of Claude
[11:49] (709.12s)
Code. But I never got there, because it
[11:52] (712.00s)
was so frustrating to use, because
[11:53] (713.92s)
it kept crashing on me, and because I
[11:55] (715.68s)
don't have access to the latest and
[11:56] (716.88s)
greatest models.