The difference between a coding assistant and an agentic IDE is not just a matter of capability — it’s architectural. A coding assistant responds to prompts. An agentic system operates in a closed loop: it reads the current state of the codebase, plans a sequence of changes, executes them, and verifies the result before reporting completion. That loop is what makes the tooling genuinely useful for non-trivial work.
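The closed loop described above can be sketched in a few lines. This is a minimal illustration, not how any particular tool implements it: the four callables (read_state, plan, execute, verify) and the max_attempts limit are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    done: bool
    attempts: int
    log: list[str] = field(default_factory=list)


def run_agent_loop(
    read_state: Callable[[], str],
    plan: Callable[[str], list[str]],
    execute: Callable[[str], None],
    verify: Callable[[], bool],
    max_attempts: int = 3,
) -> LoopResult:
    """Read, plan, execute, verify; repeat until verification passes or attempts run out."""
    log: list[str] = []
    for attempt in range(1, max_attempts + 1):
        state = read_state()           # re-read the state on every pass
        for step in plan(state):       # plan a sequence of changes
            execute(step)              # apply each change
            log.append(f"attempt {attempt}: {step}")
        if verify():                   # report completion only after verification
            return LoopResult(done=True, attempts=attempt, log=log)
    return LoopResult(done=False, attempts=max_attempts, log=log)
```

The point of the sketch is the position of verify(): completion is never reported from inside the execution step, only after the check passes.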
Agentic CLIs
Most of the conversation around agentic AI focuses on graphical IDEs, but the CLI tools are worth understanding separately. They integrate more naturally into existing scripts and automation pipelines, and in some cases offer capabilities the GUI tools don’t.
The main options currently available:
Claude Code (Anthropic) works with the Claude Sonnet and Opus model families. It handles multi-file reasoning well and tends to produce more explanation alongside its changes, which is useful when the reasoning behind a decision matters as much as the decision itself.
OpenAI Codex CLI tends to be the more predictable choice for tasks requiring strict adherence to a specification: business logic, security-sensitive code, anything where creative interpretation is a liability rather than an asset.
Gemini CLI is notable mainly for its context window, which reaches 1–2 million tokens depending on the model. That is large enough to load a substantial codebase without chunking, which changes what kinds of questions are practical to ask.
OpenCode is open-source and accepts third-party API keys, so providers can be mixed within one setup. This is relevant for environments with restrictions on approved vendors.
Configuration and Permission Levels
Configuration is stored in hidden directories under the user home folder — ~/.claude/ for Claude Code, ~/.codex/ for Codex. Claude uses JSON; Codex uses TOML. The parameter that actually matters day-to-day is the permission level.
By default, most tools ask for confirmation before destructive operations: file deletion, script execution, anything irreversible. There’s also typically a mode where the agent executes without asking. It’s faster, and it will occasionally remove something that shouldn’t have been removed. The appropriate context for that mode is throwaway branches and isolated environments where the cost of a mistake is low.
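The two permission modes can be pictured as a gate in front of every operation. The sketch below is an illustration only: the mode names, the set of destructive operation names, and the confirm callback are all hypothetical and do not correspond to any tool's actual configuration keys.

```python
from enum import Enum
from typing import Callable


class PermissionMode(Enum):
    ASK = "ask"    # confirm before destructive operations (the usual default)
    AUTO = "auto"  # execute without asking


# Hypothetical operation names, chosen for illustration.
DESTRUCTIVE = {"delete_file", "run_script", "force_push"}


def is_allowed(operation: str, mode: PermissionMode, confirm: Callable[[str], bool]) -> bool:
    """Decide whether an operation may proceed under the given permission mode."""
    if operation not in DESTRUCTIVE:
        return True                  # harmless operations never need confirmation
    if mode is PermissionMode.AUTO:
        return True                  # faster, but nothing stands between the agent and a mistake
    return confirm(operation)        # a human answers yes or no
```

In AUTO mode the confirm callback is never consulted, which is why that mode belongs on throwaway branches and isolated environments.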
Structuring a Development Session
Jumping straight to code generation tends to produce output that looks correct but requires significant rework. The agent didn’t have enough context to make the right decisions, so it made assumptions — and those assumptions have to be found and corrected manually.
Plan Mode
Before any code is written, the agent should decompose the task and surface ambiguities. This is usually called Plan Mode; it borrows from chain-of-thought prompting, where the model reasons through a problem before answering. The output is a list of verifiable subtasks and a set of clarifying questions, typically around:
- Tech stack and framework choices
- Persistence strategy (local storage, SQL, vector database)
- Scope boundaries — what’s in and what’s explicitly out
It feels like overhead. The time is recovered during implementation because the agent isn’t making assumptions that have to be corrected later.
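One way to picture the output of a planning pass is as a small data structure. The class and field names below are hypothetical; the sketch only assumes what the text states, namely that a plan consists of verifiable subtasks plus open clarifying questions.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    description: str
    check: str  # how completion is verified, e.g. a test command


@dataclass
class Plan:
    goal: str
    subtasks: list[Subtask] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def ready_to_implement(self) -> bool:
        """True only when subtasks exist, each has a check, and no questions remain."""
        return (
            bool(self.subtasks)
            and not self.open_questions
            and all(s.check for s in self.subtasks)
        )
```

A plan with an unanswered question about the persistence strategy, for instance, would report ready_to_implement() as False until the question is resolved.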
Repository Setup via GitHub CLI
The GitHub CLI (gh) integrates cleanly with agentic workflows. Repository initialization, .gitignore configuration, and GitHub issue creation with acceptance criteria and implementation checklists can all be handled by the agent. Having the backlog populated automatically keeps work visible without manual overhead.
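As a rough sketch of what the agent does when it populates the backlog, the function below assembles the arguments for gh issue create with an issue body containing acceptance criteria and a checklist. The function name and the body layout are assumptions made for illustration; the gh subcommand and its --title and --body flags are real.

```python
def issue_create_argv(title: str, acceptance: list[str], checklist: list[str]) -> list[str]:
    """Build the argv for `gh issue create` without running it."""
    body_lines = ["## Acceptance criteria"]
    body_lines += [f"- {item}" for item in acceptance]
    body_lines += ["", "## Implementation checklist"]
    body_lines += [f"- [ ] {item}" for item in checklist]  # GitHub renders these as checkboxes
    return ["gh", "issue", "create", "--title", title, "--body", "\n".join(body_lines)]
```

The returned list can be passed to subprocess.run(argv, check=True) inside a repository where gh is authenticated. Building the list separately makes it easy to review what would be created before anything is sent.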
Context Management
The context window is finite. How it’s used determines whether the agent stays coherent across a long session or starts producing inconsistent output. Three mechanisms matter here: rules, skills, and MCP.
Rule Hierarchy
Rules operate at three levels:
User-level rules are global preferences that apply across all projects — language requirements, style constraints, operator restrictions. Set once.
Project rules (.cursorrules, which newer Cursor versions replace with files under .cursor/rules/, or AGENTS.md) are repository-specific: naming conventions, architectural patterns, which shared components to reuse before creating new ones. In a team context, this file deserves the same review process as any other documentation. It tends to get neglected and then blamed when the agent produces inconsistent output.
Conditional rules activate only for specific file patterns. Testing rules that only load when editing .test.ts files, for example. This keeps the context lean when those rules aren’t relevant to the current task.
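The selection logic behind conditional rules can be sketched as glob matching against the file being edited. The rule names, rule texts, and the dictionary layout below are hypothetical, and real tools may match patterns differently; the sketch only shows why a testing rule costs no context while editing non-test files.

```python
from fnmatch import fnmatch

# Hypothetical rule set. An empty glob list means the rule is always loaded.
RULES = [
    {"name": "project-conventions", "globs": [], "text": "Reuse shared components."},
    {"name": "testing", "globs": ["*.test.ts"], "text": "Use the existing test helpers."},
]


def active_rules(path: str, rules: list[dict]) -> list[str]:
    """Return the names of rules that should be loaded for the file being edited."""
    filename = path.rsplit("/", 1)[-1]  # match on the file name only
    return [
        r["name"]
        for r in rules
        if not r["globs"] or any(fnmatch(filename, g) for g in r["globs"])
    ]
```

Editing src/cart.test.ts would load both rules; editing src/cart.ts would load only the always-on one.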
Skills
Skills are reusable logic packages that the agent loads on demand. Each skill lives in its own directory under .cursor/skills/ and consists of a SKILL.md file with frontmatter metadata, plus any executable scripts it needs (Python, Bash, or JavaScript). The agent can discover skills by matching the task against each skill's description, or they can be invoked explicitly.
The practical value is context efficiency — instead of re-explaining a pattern every session, the skill carries it and only loads when the task requires it.
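To make the "frontmatter metadata" part concrete, here is a minimal sketch of reading a skill file. It assumes the frontmatter is a block of flat key: value pairs between two --- lines, and the name and description fields in the example are assumptions; a real loader would use a proper YAML parser.

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (metadata, body). Handles only flat `key: value` pairs."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":                 # closing delimiter found
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the whole file as body


# Hypothetical skill file content.
EXAMPLE = """---
name: release-notes
description: Draft release notes from merged pull requests
---
Steps the agent should follow go here.
"""
```

Only the metadata needs to sit in context up front. The body, which carries the actual pattern, is read when the task calls for the skill.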
Model Context Protocol (MCP)
MCP is the standard for giving agents access to external systems. An MCP server exposes Tools (functions the agent can call) and Resources (data it can query). Configuration is added to the IDE’s config file, after which the agent can interact with connected systems directly.
Common integrations: Slack for notifications, Sentry for querying recent errors related to code being modified, Chrome DevTools for visual validation. The Figma MCP integration is particularly useful — design context can be pulled directly without manual translation of specs into implementation requirements.
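The configuration step usually amounts to a small JSON document naming each server and how to start or reach it. The sketch below generates one in the mcpServers shape that several clients use; treat the exact keys as an assumption to check against the specific tool's documentation. The server name and package name are placeholders, not real packages.

```python
import json


def mcp_config(servers: dict[str, dict]) -> str:
    """Render a config document with one entry per MCP server."""
    for name, spec in servers.items():
        # A local server needs a command to launch; a remote one needs a URL.
        if "command" not in spec and "url" not in spec:
            raise ValueError(f"server {name!r} needs a 'command' or a 'url'")
    return json.dumps({"mcpServers": servers}, indent=2)


# Hypothetical server entry: "example-mcp-server" is a placeholder package name.
print(mcp_config({"error-tracker": {"command": "npx", "args": ["-y", "example-mcp-server"]}}))
```

Once an entry like this is in place, the tools and resources the server exposes become available to the agent without further prompting.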
Validation
A task isn’t complete until there’s evidence it works. The validation sequence should cover four things:
Compilation and static analysis. The build runs, linters pass. Errors get fixed before the agent reports done.
Test suite. Unit and integration tests for the affected logic must pass. Existing tests must stay green. This sounds obvious and is frequently skipped.
Runtime verification. The agent launches the application in a background process and monitors console output. Runtime errors that don’t surface in tests are common enough that skipping this step is a real risk.
Visual validation. With a browser MCP server, the agent can take a screenshot and compare it against design requirements. Layout and styling issues are rarely caught by unit or integration tests.
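The first two steps of that sequence lend themselves to a simple gate: run each check in order and stop at the first failure. The sketch below assumes each check is an external command that signals failure through a non-zero exit code. The example commands are placeholders standing in for a project's real build, lint, and test commands, and runtime and visual verification are out of its scope.

```python
import subprocess
import sys


def run_checks(checks: list[tuple[str, list[str]]]) -> tuple[bool, list[str]]:
    """Run each (name, argv) check in order and stop at the first failure."""
    passed: list[str] = []
    for name, argv in checks:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            return False, passed   # later checks are skipped on purpose
        passed.append(name)
    return True, passed


# Placeholder commands; substitute the project's real build, lint, and test commands.
EXAMPLE_CHECKS = [
    ("syntax", [sys.executable, "-c", "compile('x = 1', '<demo>', 'exec')"]),
    ("tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]
```

Stopping early keeps the feedback focused: there is little value in reading test failures caused by code that does not build.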
Security Configuration
Two files, different purposes, frequently confused:
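.cursorignore blocks the agent from reading the files it lists. Use it for .env files, credentials, secrets: anything that shouldn't leave the local environment. Treat it as the first security layer rather than the only one, since commands the agent runs in a terminal or through MCP tools may still be able to reach those files.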
.cursorindexingignore excludes files from semantic indexing but still allows the agent to read them if explicitly requested. The appropriate use is performance optimization: node_modules, build outputs, generated files that would pollute the index without adding useful signal.
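The difference between the two files comes down to which of three states a path ends up in. The sketch below models that with simple glob matching; it is a simplification, since the real files follow gitignore-style syntax, and the function name and the three labels are made up for illustration.

```python
from fnmatch import fnmatch


def access_level(path: str, ignore: list[str], indexing_ignore: list[str]) -> str:
    """Classify a path as 'blocked', 'unindexed', or 'indexed'."""
    parts = path.split("/")

    def matches(patterns: list[str]) -> bool:
        # A pattern matches the whole path or any single path component.
        return any(
            fnmatch(path, p) or any(fnmatch(part, p) for part in parts)
            for p in patterns
        )

    if matches(ignore):
        return "blocked"      # never read by the agent
    if matches(indexing_ignore):
        return "unindexed"    # readable on request, absent from the semantic index
    return "indexed"
```

Checking the ignore list first mirrors the intent: a secret that happens to sit inside a build directory should still be blocked, not merely unindexed.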
For corporate environments, Privacy Mode should be explicitly verified as enabled rather than assumed. This prevents source code from being stored by the provider or used for model training. Most enterprise tiers include it; the default state varies by tool and version.
Hooks
Hooks are event-driven triggers that run custom scripts at specific points in the agent’s lifecycle. Not necessary for small projects, but worth the setup as the codebase grows.
beforeSubmitPrompt runs before a prompt is sent. Useful for injecting dynamic context — current branch name, recent error logs — or for auditing what’s about to be sent.
afterFileEdit fires immediately after the agent modifies a file. The natural use is triggering auto-formatting or running the test suite, catching regressions as they’re introduced.
pre-compact fires when the context window is about to be trimmed. It allows prioritization of what information should be retained. This is relevant for long sessions where important context has accumulated and the default trimming behavior would discard it.
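A hook script is typically a small program that receives a description of the event and acts on it. The sketch below shows the decision an afterFileEdit handler might make: pick a formatter for the edited file. It assumes the event arrives as JSON with a file_path field, which is an assumption to verify against the tool's hook documentation, and the extension-to-formatter mapping is hypothetical.

```python
import json
from typing import Optional

# Hypothetical mapping from file extension to a formatter command.
FORMATTERS = {
    ".py": ["black", "--quiet"],
    ".ts": ["npx", "prettier", "--write"],
}


def formatter_command(payload_json: str) -> Optional[list[str]]:
    """Return the formatter argv for the edited file, or None if no formatter applies."""
    payload = json.loads(payload_json)
    path = payload.get("file_path", "")  # assumed field name
    for ext, argv in FORMATTERS.items():
        if path.endswith(ext):
            return [*argv, path]
    return None
```

A real hook would read the payload from standard input and pass the resulting command to subprocess.run. Keeping the decision in a pure function makes the hook easy to test without triggering an edit.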
Parallel Development with Git Worktrees
Working sequentially on a single branch becomes a bottleneck when several tasks could proceed in parallel. Git worktrees allow different branches to exist as separate working directories simultaneously:
git worktree add ../wt-feature-name -b feature/branch-name
Each worktree should have its own .env with unique local ports (PORT=3001, PORT=3002) to prevent dev server collisions. The agent can handle rebases and straightforward merge conflicts autonomously. Complex conflicts still require human judgment, and the agent should be instructed to flag them rather than guess.
The model itself is less of a determining factor than it might seem. Rule configuration, context management, and validation coverage drive the actual quality of the output. A well-configured environment with a mid-tier model will often outperform a poorly configured one with a better model. The engineering work shifts toward writing the constraints and verification steps that govern how code gets produced, which is a different skill from writing the code directly, but the productivity difference once it's in place is significant.
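Setting up several worktrees by hand is where port collisions creep in, so it helps to derive everything from one list. The sketch below plans a worktree, a branch, and a port per feature, following the naming used in the command above. The function name, the dictionary keys, and the base port default are assumptions for illustration, and nothing is executed.

```python
def worktree_plan(features: list[str], base_port: int = 3001) -> list[dict]:
    """Pair each feature with the git command to create its worktree and a unique port."""
    plan = []
    for offset, name in enumerate(features):
        path = f"../wt-{name}"
        branch = f"feature/{name}"
        plan.append({
            "argv": ["git", "worktree", "add", path, "-b", branch],
            "env_file": f"{path}/.env",
            "env_line": f"PORT={base_port + offset}",  # one port per worktree
        })
    return plan
```

Each argv can be handed to subprocess.run, and each env_line written to the matching env_file afterwards. Deriving ports from the list position guarantees they never overlap within one plan.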
