
Softcery's Guide to Claude Code and Cursor: Agentic Coding Best Practices
AI coding assistants aren't magic autocomplete - they need structure. Our system of context files and workflows doubled our development velocity.
What Is Agentic Coding and Why It Matters
Agentic coding uses AI assistants as active participants in development under strict human oversight. Most teams treat coding agents like magic autocomplete; they're not. Treat them as supervised assistants operating under strict rules.
Why it matters:
- Speed – faster delivery by offloading boilerplate and routine tasks.
- Safety & Control – strict governance over what the agent can access and modify, with all changes reviewed before deployment.
- Predictability – standardized workflows produce consistent results.
This document is a distillation of the most effective principles and workflows our team has adopted for software development. These practices are the product of extensive experimentation and are actively used to enhance our engineering efficiency.
The Problem That Started Everything
The main problem was clear: the agent has no context about the project. It doesn’t know our architectural decisions, our coding rules, or the reasons why we made certain technical choices. Every time we start a session, the agent is like a new software engineer who has never seen our codebase.
That’s why context files are the solution. We needed to teach the agent how we work, document our decisions, and make sure it understood our constraints before it wrote a single line of code.
After building a proper context system, the same agent that went in circles now ships production code daily. It finally retains what we've taught it.
Seven Steps to Actually Productive AI Coding
After months of iteration, here's the system that actually ships production code. We'll use a real example - building a voice agent integration that took one day instead of three.
Step 1: Build Your Agent's Brain (System prompt)
We primarily use Claude Code and Cursor, but the concepts of context files, working memory, and structured workflows are universal, so you can apply them with whatever AI coding assistant you prefer. Before writing any prompts, create a `CLAUDE.md` file (or `.cursorrules` for Cursor) at your project root. This becomes the agent's persistent knowledge about your project – automatically included in every request.
Start with the essentials:
```markdown
## CRITICAL: Follow these personality guidelines strictly before responding:
- Exercise Quiet Confidence: Trust your abilities without needing to prove them. State what you know simply. Acknowledge uncertainty directly and explore options together.
- ...
- ...
- ...

## Further reading:
- Foundation document that shapes this project as a product: @.ai/project-brief.md
- Backend architecture: api/.ai-knowledge/backend-architecture.md
- Frontend architecture: web/.ai-knowledge/frontend-architecture.md
- Backend tech stack: api/.ai-knowledge/backend-tech-stack.md
- Frontend tech stack: web/.ai-knowledge/frontend-tech-stack.md
- Engineering instructions: @.ai/engineering.md

## Memory
- Follow the memory instructions in @.ai/memory.md

## Workflows (Tools)
- Code review workflow for backend: api/.ai-knowledge/backend-code-review-workflow.md
- Code review workflow for frontend: web/.ai-knowledge/frontend-code-review-workflow.md
  - Always follow these workflows when the user provides a git diff or other context and asks for a code review
- Task preparation workflow: @.ai/tools/task-preparation-workflow.md
  - Don't follow this workflow unless the user explicitly asks for it
- Task execution workflow: @.ai/tools/implementation-workflow.md
  - This workflow runs after task-preparation-workflow.md, so at this stage the task already has its own folder containing trd.md and implementation-strategy.md
  - Always follow this file when the user asks for it
```
This modular approach keeps the main file clean while providing comprehensive context. The agent now understands your constraints before writing a single line.
Step 2: Create Working Memory
Agents keep a history of their conversations, but they don't track the state of the project between sessions. If you start a new chat, the agent won't know that you refactored the authentication module yesterday, how you store JWTs, or where you're stuck in a complex implementation.
We achieve this by instructing the agent to log its sessions and update core documentation. The agent follows a strict protocol for documenting its work, ensuring that architectural changes, new dependencies, or key outcomes are captured permanently. This turns a series of stateless conversations into a continuously evolving knowledge base.
Create `.ai/memory.md` that tracks what actually matters:
```markdown
# Working Memory
- Document sessions in @.ai/sessions.
- Use the following file name format: "{yyyy-mm-dd}-{title}.md".
- Use the following memory log format: title, description, session log, session outcomes, lessons learned (if any).
- Keep memories concise; only document what is worth documenting.
- Document only when the user asks you to.
- Also update @.ai/architecture.md and @.ai/tech-stack.md if, for example, we have changed the project architecture or added a new dependency.
```
The instructions for this behavior are defined in a dedicated file (`@.ai/memory.md`), which is referenced in the main `CLAUDE.md` or rules file.
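For example, a session log written under these rules might land in `.ai/sessions/2025-01-15-refactor-auth.md` and look roughly like this. The contents below are illustrative placeholders; only the naming scheme and section order come from the memory instructions above:

```markdown
# 2025-01-15 – Refactor authentication module

Description: moved JWT handling out of individual request handlers into shared middleware.

## Session log
- Extracted token verification into api/src/middleware/auth.ts
- Switched JWT storage from localStorage to an httpOnly cookie

## Session outcomes
- All auth routes now pass through the shared middleware
- Updated @.ai/architecture.md (auth section) and @.ai/tech-stack.md (new cookie dependency)

## Lessons learned
- Token refresh needs its own endpoint; tracked as a follow-up task
```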
Now the agent builds on previous work instead of starting fresh each time.
Step 3: Define Reusable Workflows
Workflows are step-by-step instructions that tell the agent how to handle routine tasks. Instead of explaining how to review code or implement features repeatedly, we created reusable patterns the agent follows autonomously.
Create `.ai/tools/task-preparation-workflow.md`:
```markdown
When given a new feature, the agent will:
- Create a knowledge folder for the task (e.g., knowledge/billing)
- Add trd.md (Task Requirement Document) with:
  - Original requirements
  - Acceptance criteria
  - Dependencies
- Review existing code if specified
- Create implementation-strategy.md with:
  - Database changes / API modifications / UI components / Testing approach
- Create progress.md for tracking
- Await the next instruction
```
The critical part is creating the implementation strategy – the `implementation-strategy.md` file. By having the agent create this detailed plan upfront, we save thousands of tokens in subsequent sessions: instead of re-analyzing the entire codebase to understand what needs to be done, the agent simply reads its own strategy document and continues working.
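For the voice-agent example, the resulting strategy document might be organized roughly like this. The headings mirror the workflow above; the specifics are illustrative, not taken from a real project file:

```markdown
# Implementation Strategy: Voice Agent Integration

## Database changes
- Add a voice_sessions table (id, user_id, provider_session_id, status, created_at)

## API modifications
- POST /api/voice/sessions – create a session with the voice provider
- Webhook handler for provider callbacks (transcripts, call status)

## UI components
- Call widget with start/stop controls and a live transcript panel

## Testing approach
- Unit tests for the session service with the provider SDK mocked
- One end-to-end happy-path test behind a feature flag
```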
Create `.ai/tools/implementation-workflow.md`:

```markdown
- Read /knowledge/{feature}/implementation-strategy.md
- Check for progress.md, or create it
  - If new, break the strategy into a subtasks checklist
  - If existing, identify the next incomplete task
- Implement the next task
- Update progress.md
- Repeat until complete
```
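The `progress.md` this workflow maintains is just a running checklist, for example (again illustrative):

```markdown
# Progress: Voice Agent Integration

- [x] Create voice_sessions migration
- [x] Session service wrapping the provider SDK
- [ ] POST /api/voice/sessions endpoint
- [ ] Webhook handler for provider callbacks
- [ ] Call widget UI
- [ ] Unit tests for the session service
```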
In Claude Code, put these in `.claude/commands/` to use them as slash commands:

```
/project:task-preparation voice-agent-feature
/project:implementation voice-agent-feature
```
In Cursor, invoke a workflow by @-mentioning the file: `@.ai/tools/task-preparation-workflow.md implement voice agent`
This two-phase approach - plan then execute - catches problems early and keeps work organized.
Step 4: Connect to Live Systems (MCP)
The Model Context Protocol (MCP) is a standardized communication layer that bridges the gap between an isolated AI agent and live development tools. MCP transforms the agent from a passive text generator into an active partner capable of performing real actions in our environment.
We use a network of specialized MCP servers to grant our agents specific, controlled capabilities. Here are our primary use cases:
Browser MCP: We run an MCP server that provides the agent with access to browser dev tools. The agent can be instructed to analyze the DOM, inspect network requests, and check console logs to diagnose and suggest fixes for UI bugs directly.
Figma-Context-MCP: Allows agents to translate design into code. We can simply provide a link to a Figma frame, and the agent retrieves layout information, component properties, and styling. It then uses this data to generate the corresponding code, ensuring high fidelity to the original design.
Postgres-MCP: A dedicated MCP server exposes a secure interface to our PostgreSQL development databases. This grants the agent the ability to run queries, analyze schemas, and suggest optimizations. For example, we can task it with identifying missing indexes or rewriting inefficient queries based on an execution plan it requests through the server. Do NOT grant access to production databases.
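To make this concrete, here is roughly what a project-scoped MCP configuration can look like – Claude Code reads a `.mcp.json` at the project root and Cursor reads `.cursor/mcp.json`, both using an `mcpServers` map. The server package and connection string below are placeholders; check each MCP server's documentation for its exact invocation, and point it at a development database only:

```json
{
  "mcpServers": {
    "postgres-dev": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://localhost:5432/app_dev"
      ]
    }
  }
}
```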
Important: Always be mindful of the data you are transmitting to the LLM and be conservative with the permissions you grant for command execution to prevent unintended actions.
Step 5: Write Prompts That Get Results
Specificity beats brevity every time. The agent can infer intent, but it can't read minds. The difference between a prompt that works and one that leads to hours of revision is usually just missing details.
What We Learned About Specificity
Every vague instruction costs you iteration time. Watch the difference:
Waste of time: "add tests"
Actually works: "add unit tests for the payment retry logic in processPayment(), especially the exponential backoff when Stripe returns a 429"
Gets you nowhere: "make the API faster"
Gets results: "the /api/users endpoint times out with 1000+ results. Add pagination with 100 items per page, keep the response under 200ms"
Creates a mess: "add user management"
Ships to production: "create CRUD endpoints for user management at /api/admin/users following our existing REST patterns from /api/admin/teams. Include role checking - only admins can modify. Use the same error responses we have everywhere else"
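The specific prompts work because they name the exact behavior to verify. For reference, the behavior the payment-test prompt describes – retry with exponential backoff on a 429 – looks roughly like the generic sketch below. This is an illustration only; `processPayment()` itself is not code from this guide:

```typescript
// Generic sketch of retry-with-exponential-backoff on HTTP 429 responses.
// The real processPayment() referenced in the prompt above is not shown here.
async function retryOn429<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseDelayMs = 1000 } = {}
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const status = (err as { statusCode?: number })?.statusCode;
      const isRateLimit = status === 429;
      if (!isRateLimit || attempt >= attempts - 1) throw err;
      // Back off 1s, 2s, 4s, ... before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```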
Five Rules for Prompts That Actually Work
1. Give it something to copy
Don't describe patterns when you can point to them. "Build it like our TeamService class" beats a paragraph of architectural explanation.
2. Set boundaries explicitly
The agent will helpfully "improve" things you didn't ask it to touch. Always specify: "Only modify files in the user module. Don't touch the database schema."
3. Define success specifically
Replace "make it better" with measurable outcomes: "Refactor this part of code following SOLID principles" or "Use streaming to return messages as they are generated from llm"
4. Link to your source of truth
You wrote documentation for a reason. Use it: "Follow the error handling described in @.ai/standards/errors.md." This prevents the agent from inventing its own approach.
5. Demand a plan before code
Complex tasks need thinking first. End with: "List your implementation steps and potential breaking changes before starting."
Step 6: Build. Test. Improve.
Perfect setups don't exist; the system is never finished, and every project is unique. The key is continuous refinement.
When the agent makes mistakes, update your documentation:
- Agent used the wrong error format? Add it to your standards (a sketch of such a class appears after this list):

```markdown
## Error Handling
- Always use the AppError class: throw new AppError('message', 'ERROR_CODE', statusCode)
- Never use plain Error or console.error
```
- Agent missed a performance issue? Add it to memory.md:

```markdown
## 2025-01-16: Database Performance
- Learned: our users table has 2M rows
- Always use indexes for user_id lookups
- Never use LIKE queries on the email field
```

- Agent needs a new capability? Create a new workflow in .ai/tools/.

Every mistake becomes a prevention. Every success becomes a pattern.
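The AppError class referenced in that error-handling standard isn't defined anywhere in this guide; a minimal sketch of what such a class might look like (the names and defaults are assumptions to adapt to your codebase):

```typescript
// Hypothetical sketch of an AppError class matching the documented convention above.
export class AppError extends Error {
  constructor(
    message: string,
    public readonly code: string,     // machine-readable code, e.g. 'ERROR_CODE'
    public readonly statusCode = 500  // HTTP status to return to the client
  ) {
    super(message);
    this.name = "AppError";
  }
}

// Usage, following the rule from the standards snippet:
// throw new AppError('Payment declined', 'PAYMENT_DECLINED', 402);
```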
Step 7: Security Measures
Be attentive about what data you transmit to the LLM and restrictive with the permissions you grant.
For Claude Code: use the `permissions.deny` block in `~/.claude/settings.json` to forbid access to sensitive files (see the Claude Code settings documentation for details).
```json
{
  "permissions": {
    "deny": ["Read(./.env)"]
  }
}
```
For Cursor: use a `.cursorignore` file at the project root. It works like `.gitignore`: any listed file or directory becomes invisible to the agent.
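A minimal `.cursorignore` might look like this (adjust the patterns to whatever your project actually needs to hide):

```
.env
.env.*
secrets/
*.pem
```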
Real Results from Real Projects
- Faster Onboarding and Code Navigation: Instead of engineers spending days manually tracing logic through an unfamiliar codebase, the agent can instantly locate relevant code sections, explain architectural patterns, and identify dependencies. This has cut the effective onboarding time for new engineers from weeks to days, allowing them to contribute meaningful code much faster.
- Rapid Unit Test Generation: Create comprehensive test suites in minutes, covering standard success and failure cases. What used to take hours of hand-writing edge cases can now be tested in 5 minutes. This significantly reduces the risk of regressions without slowing down the development cycle.
- Automated Documentation Drafting: A well-defined prompt instructs the agent to generate high-level technical documentation for new features and endpoints. What previously took a developer 30 minutes to write is now a 5-minute review and editing process, ensuring the project remains maintainable over time.
- Automated Testing and Debugging: Manually testing numerous edge cases for an API endpoint is inefficient. By leveraging the Model Context Protocol (MCP) or standard CLI tools, the agent can execute a predefined test suite against an endpoint in minutes, covering dozens of scenarios and generating a concise report of the results for human review.
- Architectural Planning and Brainstorming: When faced with a complex architectural problem, developers use a Socratic dialogue with the agent to brainstorm solutions. This approach helps in quickly exploring trade-offs, evaluating different patterns, and solidifying a design, reducing research time and making individual developers more autonomous in their decision-making.
- Accelerated Infrastructure and Deployment: The agent creates initial Terraform or CloudFormation configurations based on high-level requirements. The developer's role has shifted from writing code from scratch to validating the deployment plan generated by the agent, enabling faster and more reliable infrastructure deployment.
- Drastic Reduction in UI Development Time: Building a standard UI component like a modal window, which used to take an hour, is now completed in minutes. Using a Figma-Context-MCP or by providing screenshots of a design, the agent generates the corresponding code with high fidelity, allowing developers to focus on complex state management and interaction logic.
- Efficient and Thorough Code Reviews: The agent acts as a first-pass reviewer on pull requests. It generates a summary of changes, highlights potential risks, identifies deviations from coding standards, and points out specific code blocks that require careful human attention. This ensures that even large PRs receive a thorough review without becoming a bottleneck or being approved blindly.
- Automation of Routine Coding Tasks: Repetitive tasks that previously required searching documentation or copying snippets from Stack Overflow are now delegated to the agent. Simple tasks like adding a new CRUD endpoint, writing a database migration, or setting up boilerplate for a new service are completed almost instantly, freeing up developer time for higher-value work.
Tips and Best Practices
This is a list of practices we have integrated into our daily workflow.
- Onboard with the Agent. When learning a new codebase or feature, use the agent as you would a human colleague in a pair programming session. Ask it about architecture, control flow, and purpose.
- Delegate Git Operations. Agents can handle many Git operations. We use them for searching commit history and writing detailed commit messages.
- Embrace Iteration. Perfect prompts do not exist. Treat prompt engineering as a continuous, iterative process. Always be refining your instructions to better fit your project's specific needs.
- Be Specific. The agent's success rate improves with precise instructions. Giving clear directions upfront reduces rework.
- Poor: “add tests for billing.py”
- Good: “write a new test case for billing.py, covering the edge case where the user is logged out. avoid mocks”
- Provide Visual Context. Use screenshots and images, especially when working from design mocks for UI development or analyzing charts. If you cannot provide an image, explicitly state the importance of visual appeal in the prompt.
- Give the Agent URLs. For tools, libraries, or versions released after the model's knowledge cut-off, include links to official documentation in your prompt to ensure accuracy.
- Leverage Terminal Access. Since the agent has terminal access, instruct it to make API requests with curl for debugging or to use other CLI tools as part of its workflow (see the example after this list).
- Be an Active Collaborator. While autonomous modes exist, you get superior results by guiding the agent's approach. Either create a detailed technical plan before implementation or course-correct the agent as it works.
- Keep Context Focused. During long sessions, the LLM's context window fills up with irrelevant conversation, file contents, and commands. Use the /clear command in Claude Code, or start a new conversation in Cursor, between tasks to reset the context window.
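As an example of the terminal-access tip above, a debugging instruction can hand the agent an exact command to run; the endpoint, token variable, and payload here are placeholders:

```
curl -s -X POST http://localhost:3000/api/admin/users \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEV_TOKEN" \
  -d '{"email": "test@example.com", "role": "viewer"}'
```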
What Still Needs Humans
Complex cross-service refactoring: Agents can't hold enough context for changes affecting multiple services, message queues, and databases simultaneously.
Business-critical architecture decisions: "Should we shard the database?" requires understanding growth projections, cost constraints, and business priorities.
Design sense: "Make this dashboard more intuitive" produces functional but soulless interfaces. No understanding of visual hierarchy or user psychology.
Production debugging: When everything's on fire, you need pattern recognition and intuition that agents lack.
The Reality Check
Agents do not replace software engineers. They remove the low-value work – boilerplate, routine refactors, test scaffolding, code search – but they do not provide product judgment or architectural insight. They will not challenge unclear requirements.
Use them as supervised assistants with a clear task, scope, and definition of done. Keep guardrails in place: least-privilege permissions, required code review, CI checks, and audit logs. Measure impact instead of making claims – track lead time per PR, first-pass CI rate, review time, and defect rate. The result is higher throughput on repeatable work while humans stay accountable for design, trade-offs, and quality.