Hello everyone, I’m Sunday.
Recently, I’ve been asked the same question so many times in my inbox that it’s become overwhelming.
How exactly do you use Codex?
There are three entry points: CLI, App, and VS Code extension — all seemingly usable.
Then out pop plugins, skills, MCP, automations, AGENTS.md, sub-agents, permission modes…
You just want it to help you modify some code.
But first, it dumps a whole set of new concepts on you.
Seriously, man, I just wanted to fix a bug — how did I suddenly end up learning a new profession…
So yesterday, I spent an entire afternoon re-organizing how Codex works, and recorded a fairly comprehensive video along the way.
After recording, I realized there are actually many parts that are easy to miss if you only watch the video.
Especially things like prompt templates, permission boundaries, version management, context management, and when to let Codex be read-only versus when to let it actively modify code.
If you don’t write these down, they’re really easy to forget after watching.
So I went ahead and reorganized the script from the video into this article.
I won’t claim it’s highly authoritative.
But if you’re a programmer, or you’re using Codex, Claude Code, Cursor, etc., for projects, I think this article can help you avoid quite a few pitfalls.
This article will mainly cover:
Alright, let’s get started~~
Let’s start with Codex itself.
Currently, Codex has roughly three main entry points:
- Codex CLI
- Codex App
- Codex IDE extension (VS Code, Cursor, etc.)
Let’s go through them one by one.
First, Codex CLI, that is, the command-line version.
If you’re a programmer, you can install it directly via npm:
npm install -g @openai/codex
After installation, navigate to the project root directory and enter:
codex
to launch it.
To update, you can also use:
npm install -g @openai/codex@latest
The advantage of CLI is that it’s lightweight, direct, and great for terminal enthusiasts.
It works based on whichever project directory you open it in.
When we let Codex first encounter a project, it’s best to have it read the project first.
My usual prompt is:
text
This habit is very important.
Many people fail at Vibe Coding not because the model is bad, but because they rush to let it modify code.
No wonder things go wrong.
Next, Codex App.
I think it is the more suitable entry point for beginners, because the App feels more like a complete AI programming workstation than the CLI does.
You can download it directly from the official website; there are installers for macOS and Windows.
You can start new conversations, search history, manage plugins, install skills, configure automations.
Officially, Codex App is now positioned as a multi-agent workstation.
That means it’s not just letting you chat with one AI.
You can run multiple tasks simultaneously, with different agents working in different projects, branches, or worktrees.
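For readers who have not used worktrees before, here is a minimal sketch of giving each task its own branch and directory with plain git, so parallel agents don't step on each other's files. The function name, the `task/` branch prefix, and the sibling-directory layout are my own illustrative choices, not something the App requires.

```shell
# new_task_worktree: create a separate checkout of the current repo for one task.
# Assumes the repo already has at least one commit.
# The "task/<name>" branch and "<repo>-<name>" directory naming are hypothetical conventions.
new_task_worktree() {
  local name="$1"
  local repo_root
  repo_root="$(git rev-parse --show-toplevel)" || return 1
  # -b creates a new branch starting at the current HEAD and checks it out in the new directory.
  git worktree add -b "task/$name" "${repo_root}-${name}"
}

# Example: new_task_worktree login-fix
# creates a sibling directory "<repo>-login-fix" on branch "task/login-fix".
```

When the task is merged or abandoned, `git worktree remove <path>` cleans the extra checkout up again.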
If you’re using Codex for the first time, I recommend starting with the App.
CLI is nice, but less beginner-friendly.
Things like directories, permissions, commands, context, sandboxing, git status — mixed together, they can easily confuse newcomers.
At least the App gives you a clearer workstation view.
The third entry point is the IDE extension, for example, using Codex directly in VS Code or Cursor.
This is suitable when you’re already coding and want Codex to assist you with a specific file, bug, or component in the current project.
Let’s take VS Code as an example.
Simply search for Codex in the extension marketplace and install it.
Then open the corresponding project, and you’ll find the Codex icon in the top right corner.
Clicking it takes you straight into the Codex workspace.
Also note:
If you have both Codex App and the Codex IDE extension installed, their data is shared between them.
So, how should you start using Codex for the first time?
I suggest you don’t jump straight into practicing on a company’s core project.
Start with a demo project.
Or a small personal project that you’re not afraid to mess up.
Then make sure the project is under git management.
If not, initialize it first with:
git init
The purpose is to ensure that if Codex modifies your code incorrectly, you can revert.
Also, when doing Vibe Coding, it’s best to make a git commit after each code modification by Codex, just in case.
So, version management is essential.
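The commit-after-each-change habit can be wrapped in a small helper. This is only a sketch with plain git; the `checkpoint` name and the commit-message prefix are my own choices.

```shell
# checkpoint: commit the current state so a later Codex edit can be undone.
# The function name and the "checkpoint:" message prefix are illustrative.
checkpoint() {
  local msg="${1:-checkpoint}"
  # Initialize a repository only if we are not already inside one.
  git rev-parse --is-inside-work-tree >/dev/null 2>&1 || git init -q
  git add -A
  # `git diff --cached --quiet` exits 0 when nothing is staged, so skip the commit.
  if git diff --cached --quiet; then
    echo "nothing to checkpoint"
    return 0
  fi
  git commit -q -m "checkpoint: $msg"
}

# Example: checkpoint "before asking Codex to refactor the login form"
```

If a change goes wrong, `git reset --hard HEAD` throws away uncommitted edits to tracked files and returns to the last checkpoint, so only run it when you are sure you want to lose them.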
Additionally, don’t let Codex write code the first time you launch it.
Just have it read the project.
Here’s a prompt I often use:
text
After this step, check whether its understanding is accurate.
If it’s wrong, correct it.
Don’t rush to let it act.
You can continue asking:
text
This method has proven very effective — sharing it with you.
Next, let’s talk about permission modes.
Many overlook this, but I think it’s crucial, so I’ll discuss it separately.
We know Codex can read and modify your local code and files.
This involves local file security.
For example, if the AI accidentally deletes something important on your computer, that’s trouble.
To avoid this, Codex’s permission modes are generally divided into three types.
Default permission is the safest, or most conservative.
It can read files, suggest modifications, and propose shell commands to execute.
However, before actually modifying files or executing commands, it needs your explicit approval.
Advantage: safe.
Disadvantage: you can’t walk away.
After all, you must confirm every step.
Auto-review is slightly bolder than default permission.
In default mode, the AI must ask you before modifying files or running commands.
Auto-review means Codex first judges whether an operation is safe.
If it’s just modifying ordinary code in the current project, or running common commands like installing dependencies, running tests, or builds, it may auto-approve without requiring you to click “confirm” each time.
Benefit: higher efficiency.
For example, if you ask Codex to fix a bug, it might read code, modify it, run tests, and keep adjusting based on errors.
If every step requires confirmation, it interrupts the flow.
But auto-review isn’t completely unrestricted.
If Codex attempts high-risk actions — deleting files, accessing directories outside the project, reading sensitive files, downloading from the internet, or running dangerous commands — it will still pause for your confirmation.
So what scenarios suit auto-review?
When you’re familiar with the project and trust the current task.
You want Codex to carry out several consecutive steps without asking each time.
But you don’t want full open permissions.
Auto-review is a compromise:
More convenient than default permission, but safer than full access.
The third mode is full access. Use it cautiously.
Think of it as minimizing Codex’s restrictions.
In this mode, Codex can more freely read files, modify files, execute commands, and even access directories and networks outside the project.
Advantage: highest execution efficiency.
For example, large tasks like migrating a project, bulk refactoring, installing dependencies, running scripts, or modifying code across multiple directories — it can proceed continuously without waiting for your confirmation, acting more like a true development assistant.
But risk is also highest.
If it misunderstands your intent or executes an unintended command, it could genuinely affect your local files.
So I don’t recommend keeping full access on all the time.
Especially for unfamiliar projects, freshly cloned open-source projects, company core projects, or projects containing .env, keys, tokens, config files — don’t enable it casually.
A more stable approach: turn on full access only for the specific task that needs it.
Afterward, switch back to default or auto-review.
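Before enabling full access in a project, it can help to check whether secret-looking files are present at all. The sketch below uses only `find`; the list of file-name patterns is my own guess at what usually counts as sensitive and is certainly incomplete.

```shell
# list_sensitive_files: print files whose names suggest secrets.
# The name patterns are illustrative assumptions; extend them for your project.
list_sensitive_files() {
  local dir="${1:-.}"
  find "$dir" \
    \( -path '*/.git' -o -path '*/node_modules' \) -prune -o \
    -type f \( -name '.env' -o -name '.env.*' -o -name '*.pem' \
               -o -name '*.key' -o -name 'id_rsa' \) -print
}

# Example: list_sensitive_files .
```

An empty result does not prove the project is safe; it only means none of these name patterns matched.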
These three modes can be summarized simply:
Default permission: safest, but requires frequent confirmation.
Auto-review: more efficient, suited for continuous tasks in daily development.
Full access: most powerful, but highest risk, only for temporary use.
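For CLI users, the three modes can be roughly approximated with launch flags. This is a sketch under assumptions: the `--sandbox` and `--ask-for-approval` flags and their values are as I understand the Codex CLI, and the mapping onto the three mode names used here is my own approximation. Confirm both against `codex --help` for your version.

```shell
# codex_flags_for_mode: print CLI flags that roughly correspond to each mode.
# The mode names follow this article; the mapping itself is an assumption.
codex_flags_for_mode() {
  case "$1" in
    default)     echo "--sandbox read-only --ask-for-approval on-request" ;;
    auto-review) echo "--sandbox workspace-write --ask-for-approval on-request" ;;
    full-access) echo "--sandbox danger-full-access --ask-for-approval never" ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Example (word splitting of the printed flags is intentional):
#   codex $(codex_flags_for_mode auto-review)
```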
Now let’s look at best practices for Vibe Coding with Codex.
This breaks down into 4 steps: scope the task, state the boundaries, define acceptance criteria, and correct course early.
Let’s go through them one by one.
Don’t let Codex wander aimlessly through the entire project.
Especially large ones.
If you only want to change the login page, tell it the relevant login page files.
If you only want to inspect a hook, specify the hook file and its call sites.
In the App or IDE, you can use @ to specify files.
In CLI, you can also clearly write paths.
Example:
text
This is far more reliable than saying “help me see what’s wrong with login.”
Make it clear whether it’s read-only or allowed to modify.
Example:
text
Or:
text
Or:
text
See? These three sentences correspond to completely different working states.
Many problems arise with Codex because the boundaries aren’t clearly stated.
Example:
text
That’s called acceptance criteria.
Without them, it will just finish according to its own understanding.
And AI’s “I think it’s done” often doesn’t match engineering’s “really done.”
If you notice Codex heading in a strange direction, don’t wait for it to finish.
Interrupt immediately.
During AI execution, click the pause button and tell it:
text
It’s like mentoring an intern.
Correct the course early when the direction is off.
Next, let’s examine the relationship between prompts and token consumption.
Many think: should I keep instructions to Codex as short as possible?
Does shorter speech mean lower token usage?
Not necessarily.
Short instructions can sometimes consume more tokens.
Because if you’re too brief, Codex has to search, guess, and fill in context on its own.
You might think you’re saving effort.
In reality, it takes a long detour, and token usage goes up.
Example:
Say you tell it: “Optimize this page.”
Nice and short.
But Codex will struggle.
What exactly to optimize? UI, performance, or code maintainability?
It can only guess.
And guessing can consume a huge number of tokens.
A better approach is to specify concrete optimization items.
Example:
text
This instruction is much better.
Remember: In Vibe Coding, the worst thing is letting AI improvise freely.
Codex isn’t limited to text.
You can also feed it screenshots, design mockups, error screenshots, UI problem screenshots.
For instance, if a button is squeezed on the page, or mobile layout is misaligned.
You can directly paste or drag the screenshot in, then say:
text
This works much better than pure text description.
After all, a picture is worth a thousand words.
Next, version management and rollback.
Before Codex modifies code, check git status.
Git is a version control tool; you can download it from the official site.
Making version management part of your habit before each Vibe Coding session is strongly recommended.
You can do this via Git commands or using visual tools.
You can also have Codex check first:
text
If your working directory already has half-written changes of your own, be careful.
Codex might continue modifying those files.
Not that it can’t — just make sure you know what it changed.
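If you prefer to check this yourself rather than ask Codex, here is a minimal sketch with plain git. The function name is hypothetical.

```shell
# worktree_is_clean: succeed only when there are no uncommitted or untracked changes.
# `git status --porcelain` prints one line per changed file and nothing when clean.
worktree_is_clean() {
  [ -z "$(git status --porcelain)" ]
}

# Example:
#   worktree_is_clean || echo "Commit or stash your own changes before handing over the task."
```

Starting from a clean tree means every line in the later diff is something Codex changed, not a leftover of your own.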
After completing a small task, have it summarize the diff once.
Example:
text
Then review the diff yourself.
Don’t blindly commit.
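A quick way to do that review from the terminal, as a sketch. It assumes the repository already has at least one commit, and the function name is my own.

```shell
# review_changes: show which files changed and how much, compared with the last commit.
review_changes() {
  echo "== files touched (including untracked) =="
  git status --short
  echo "== size of tracked changes vs. last commit =="
  git diff --stat HEAD
}

# Example: review_changes, then `git diff HEAD -- path/to/file` for the files that matter.
```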
Next, context management.
This is especially critical when working with AI agents.
Each Codex conversation carries context.
Everything you said before, what it did, what files it read, what it modified — all gradually piles up.
Initially, this is great.
It remembers what you just did.
But over time, context gets heavier.
Responses slow down, and it may mix up old and new tasks.
That’s what people usually mean when they say: the AI seems dumber.
At this point, consciously manage context.
My advice: use one conversation per type of task.
E.g., one thread only fixes login bugs.
Another thread only handles component refactoring.
Don’t jump between writing pages, checking CI, summarizing meetings in the same thread.
That clutters the context.
When to continue an old conversation?
When the task is strongly tied to previous context.
E.g., you fixed half and tests haven’t passed yet — keep going.
When to start a new conversation?
When tackling a new problem, or when old context is too messy.
E.g., you just finished login, now moving to payments — start a new thread (new dialog).
Also, at task end, have it produce a closing summary.
Example:
text
Such summaries are extremely useful later.
When you search history or reopen the thread, you won’t need to sift through fragments — you’ll see the engineering record directly.
That’s the value of Codex App’s search feature.
Speaking of long-term memory, we come to AGENTS.md.
If you’ve used Claude Code, you may know CLAUDE.md.
In Codex, the more common equivalent is AGENTS.md.
Think of it as a project manual written for the AI Agent.
Codex reads these files and treats their rules as long-term context.
It can have multiple layers.
A global file holds your personal preferences, e.g., answer in Chinese, give a plan before modifying code, explain reasons for test failures.
A file in the project root holds project conventions, e.g., tech stack, startup commands, test commands, code style, directory conventions.
A file inside a module’s directory holds special rules for that module, e.g., this directory contains legacy code, so don’t refactor public interfaces, or all requests in this module must go through unified wrappers.
This is very useful.
You don’t have to repeat yourself each time.
For example, place an AGENTS.md in the project root like this:
text
From then on, Codex will follow these rules in the project.
That’s long-term memory.
Solidify your work standards.
Encode team code conventions, branching rules, testing rules, directory agreements.
Much more reliable than repeatedly reminding in group chats.
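To make the idea concrete, the sketch below writes a small starter AGENTS.md. Every value in it (tech stack, commands, rules) is a made-up placeholder drawn from the kinds of rules mentioned above, not the file from the video; replace them with your project's real conventions.

```shell
# write_sample_agents_md: create a starter AGENTS.md, refusing to overwrite an existing one.
# All contents are hypothetical example values.
write_sample_agents_md() {
  local target="${1:-AGENTS.md}"
  if [ -e "$target" ]; then
    echo "$target already exists; not overwriting" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
# AGENTS.md

## Project conventions
- Tech stack: React + TypeScript (example value)
- Start the dev server: npm run dev (example value)
- Run tests: npm test (example value)

## Working rules
- Give a short plan before modifying code.
- Do not refactor public interfaces in legacy directories.
- When tests fail, explain the cause before proposing a fix.
EOF
}

# Example: run `write_sample_agents_md` in the project root, then edit the file by hand.
```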
Then there are skills.
Their popularity needs no elaboration.
Nearly all major model products now offer skills functionality.
And I believe this is a very worthwhile area to explore in Codex.
Many confuse skills with plugins.
They’re not the same.
Skills are more like a way of doing things.
They tell Codex what process to follow for certain tasks, what risks to check, and what standards to use for output.
For example, you could have a frontend review skill.
It specifies that during frontend code reviews, focus on component boundaries, state management, interface error handling, mobile adaptation, performance issues, XSS risks.
Later, when you ask Codex to review, it won’t just glance — it’ll apply this standard.
You could also have a writing skill.
Specify your article structure, tone, banned words, case organization.
Next time you ask Codex to draft, it won’t need to adapt to your style each time.
Turning these into skills makes them reusable workflows.
Codex has a built-in Skill Creator.
Use it to create your own skills directly.
Example — ask Codex to create a skill:
text
You can also have it install others’ skills.
Find a skill on GitHub, then tell Codex:
text
It will handle installation for you.
Next, plugins.
Biggest difference between plugin and skill:
Skill is method; plugin is toolbox.
Plugins let Codex connect to more external tools and information sources.
E.g., GitHub, browser, documentation, email, calendar, desktop apps, various MCP services.
Once connected, Codex is no longer just a coding assistant.
It enters real workflows.
E.g., have it check GitHub PR review comments:
text
Or have it read a document, then check if related modules in the project are affected:
text
It’s like an AI colleague that can move between tools.
But I must emphasize again:
The greater the capability, the more the boundaries matter.
More plugins ≠ better.
Higher permissions ≠ better.
Especially regarding company repos, emails, chat logs, docs — pay attention to data boundaries.
Then there’s MCP.
This term is trending lately.
MCP can be simply understood as a protocol allowing AI to connect to external tools and data sources.
E.g., databases, browsers, knowledge bases, design tools, internal systems.
Previously, AI had to adapt individually to each tool.
Now, MCP provides a unified interface.
Once Codex connects to MCP, it gains access to more external capabilities.
E.g.:
Of course, supported MCPs depend on your configuration and permissions.
I don’t recommend diving into a bunch of MCPs as a beginner.
First master basic project read/write, permissions, context, AGENTS.md, git workflow.
Those are fundamentals.
MCP is an extension.
Without solid fundamentals, more extensions just mean more chaos.
Next, automation.
Many underestimate this feature.
Automation means giving Codex a task plus a time rule, so it runs periodically on its own.
Kinda like a scheduled task.
But it’s not an ordinary one.
Because the executor is an Agent.
E.g., have it summarize yesterday’s commits every morning:
text
Or check PR status every 30 minutes:
text
Or summarize today’s diff every evening:
text
Tasks like these suit Codex well.
They’re repetitive, clear, with fixed judgment criteria.
I suggest starting automation with read-only tasks.
First let it report, not auto-modify.
Once you confirm its judgments are stable, gradually expand permissions.
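As an illustration of a read-only task, the data behind a “summarize recent commits” report can be gathered with plain git, which changes nothing in the repository. The function name and output format are my own choices. Handing the output to Codex for a written summary (for example through its non-interactive mode) is left out, since the exact invocation depends on your setup.

```shell
# recent_commits: list commits from a recent time window, one per line.
# Read-only: it only queries history and writes nothing.
recent_commits() {
  local since="${1:-24 hours ago}"
  git log --since="$since" --pretty=format:'%h %an: %s'
}

# Example: recent_commits "yesterday"
```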
There are two categories.
The first returns to the current thread, e.g., while waiting for deployment results, have Codex come back to this thread in 10 minutes to check.
This is good for short-cycle, context-heavy tasks.
The second runs on its own schedule, e.g., check project commits every morning, generate weekly reports every Friday, or summarize error logs every night.
These run independently and do not rely on the current conversation.
Rule of thumb:
That makes it easier to grasp.
Now, let’s tie everything together with a concrete example.
Suppose I want to build a Pomodoro app.
First, see the final result
A full engineering practice can be broken into 6 steps.
Some may say: so troublesome?
Why not just say “Build me a Pomodoro app”?
Wouldn’t that work?
Yes.
But the result will likely be mediocre.
A more reliable approach is to turn Codex programming into a complete engineering practice.
text
You can use GPT or Codex here.
If you’re unsure, I recommend GPT first.
GPT is better for discussion and breakdown.
text
This step is critical.
Same Pomodoro feature can differ vastly across projects.
Some use React, some Vue, some Next.js, some have their own component libraries or state management.
Codex must read the project first to match its style.
text
Note: I didn’t ask it to do a pile of things at once.
No history stats, notifications, sound effects, complex themes.
Just core features.
Because Codex struggles with large tasks if overloaded.
text
This step is very valuable.
Having Codex write code is one thing.
Having it review from a reviewer’s perspective is another.
text
text
See? This full cycle yields much more stable Codex output quality.
Here’s a set of templates I frequently use.
text
text
text
text
text
You don’t need to memorize these.
Just remember the underlying logic:
That’s the simplest and most stable Codex workflow.
Finally, a look at the settings page.
It contains lots of Codex configuration options, e.g., personalization, MCP, and browser usage.
You can explore these later.
But there’s a fun little feature previously seen in Claude Code: pet.
Earlier Claude Code pets felt half-baked.
Codex’s, though not heavily promoted, feels quite complete.
Under “Appearance,” scroll to the bottom for the Pet module.
Lots of pets available, plus custom pets.
Click “Wake Pet” and your pet appears on the desktop.
Feels like the old PC Manager mini basketball.
If you think Codex’s default pets are ugly, there are now many Codex Pet Markets.
E.g., https://codex-pets.net/
Look much better than official ones, right?
Installation is simple.
E.g., install my favorite Brother Chicken:
Use the command on the right: npx codex-pets add ikun
Of course, this is just an easter egg.
Fun aside, what truly determines Codex’s usefulness is the earlier stuff.
New conversations, search, plugins, skills, automation.
These are the real workflow core of Codex App.
Today’s content is quite long — I checked, it’s about 16,000 characters. And the video is long too, 40 minutes after editing.
Final summary:
AI is not a god. It’s a very powerful tool.
But the stronger the tool, the more you need to know how to use it.
Beginners’ most common mistake: give a vague task and expect it done in one go.
That’s unreliable.
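A reliable approach is: have Codex read the project first, give it a scoped task with clear boundaries and acceptance criteria, review the diff, and record standing rules in AGENTS.md.
You remain responsible for direction, judgment, and acceptance.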
I believe that’s where Codex delivers the most value.
This article is reprinted from: Spent a bloody 10k words! Probably the most comprehensive Codex practical tutorial online