Introduction
This section brings together everything we've built in this chapter into a complete, production-ready coding agent. We'll integrate the file system tools, code execution sandbox, Git integration, test-driven development support, and debugging loop into a unified agent that can tackle real coding tasks.
What We're Building: A coding agent that can understand natural language requests, navigate codebases, write and edit code, run tests, debug failures, and commit changes, all while maintaining safety and providing clear feedback.
Complete Agent Architecture
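The configuration in this section pairs an operating mode with a list of actions that need sign-off. How those two settings could combine into a single decision is sketched here, assuming (as the agent's approval logic later in this section does) that autonomous mode skips approval. `Mode` and `needs_approval` are illustrative names, not part of the chapter's code.

```python
from enum import Enum
from typing import Iterable


class Mode(Enum):
    """Illustrative stand-in for the chapter's operating modes."""
    INTERACTIVE = "interactive"
    AUTONOMOUS = "autonomous"
    SUPERVISED = "supervised"


def needs_approval(action: str, mode: Mode, gated_actions: Iterable[str]) -> bool:
    """Return True when a human should confirm `action` before it runs."""
    if mode is Mode.AUTONOMOUS:
        # Assumption: autonomous mode never pauses for sign-off.
        return False
    return action in set(gated_actions)


# Same action names as the default approval list in the configuration.
gated = ["git_push", "delete_file", "run_unsafe_command"]

print(needs_approval("git_push", Mode.SUPERVISED, gated))
print(needs_approval("read_file", Mode.SUPERVISED, gated))
print(needs_approval("git_push", Mode.AUTONOMOUS, gated))
```

Keeping this decision in one small function makes it straightforward to unit-test the safety policy separately from the agent loop.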
Our complete coding agent consists of several integrated components:

"""
Complete Coding Agent Implementation
====================================

A production-ready coding agent with:
- File system operations
- Code execution sandbox
- Git integration
- Test-driven development
- Automatic debugging
"""

from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, AsyncGenerator
from pathlib import Path
from enum import Enum
from datetime import datetime
import asyncio
import json
import os


class AgentMode(Enum):
    """Operating modes for the coding agent."""
    INTERACTIVE = "interactive"  # Chat-based interaction
    AUTONOMOUS = "autonomous"    # Execute tasks automatically
    SUPERVISED = "supervised"    # Require approval for changes


@dataclass
class AgentConfig:
    """Configuration for the coding agent."""
    # Core settings
    workspace: Path
    mode: AgentMode = AgentMode.SUPERVISED

    # Model settings
    model_name: str = "claude-3-5-sonnet-20241022"
    max_tokens: int = 4096
    temperature: float = 0.1

    # Safety settings
    isolation_level: str = "container"  # none, restricted, container
    require_approval_for: List[str] = field(default_factory=lambda: [
        "git_push", "delete_file", "run_unsafe_command"
    ])

    # Limits
    max_iterations: int = 20
    max_file_edits_per_task: int = 10
    timeout_seconds: int = 300

    # Features
    enable_git: bool = True
    enable_tests: bool = True
    enable_debugging: bool = True
    auto_commit: bool = False

    # Paths
    log_path: Optional[Path] = None
    cache_path: Optional[Path] = None


@dataclass
class TaskContext:
    """Context for a coding task."""
    task_id: str
    description: str
    started_at: str = field(default_factory=lambda: datetime.now().isoformat())
    files_read: List[str] = field(default_factory=list)
    files_modified: List[str] = field(default_factory=list)
    commands_executed: List[Dict] = field(default_factory=list)
    errors_encountered: List[Dict] = field(default_factory=list)
    tests_run: List[Dict] = field(default_factory=list)
    git_operations: List[Dict] = field(default_factory=list)


@dataclass
class AgentState:
    """Current state of the agent."""
    status: str = "idle"  # idle, thinking, acting, waiting, complete, error
    current_task: Optional[TaskContext] = None
    iteration: int = 0
    last_action: Optional[str] = None
    last_observation: Optional[str] = None
    pending_approvals: List[Dict] = field(default_factory=list)

Unified Tool Registry
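At its core, a tool registry is a lookup table from tool names to handlers, with a JSON Schema describing each tool's input. The sketch below isolates that idea under simplifying assumptions: `MiniTool`, `MiniRegistry`, `dispatch`, and the `echo` handler are illustrative names, not part of the chapter's code, and only the schema's `required` list is checked.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict


@dataclass
class MiniTool:
    """Hypothetical, trimmed-down tool record: a name, a schema, a handler."""
    name: str
    parameters: Dict[str, Any]
    handler: Callable[..., Awaitable[Any]]


class MiniRegistry:
    """Hypothetical registry mapping tool names to MiniTool records."""

    def __init__(self) -> None:
        self.tools: Dict[str, MiniTool] = {}

    def register(self, tool: MiniTool) -> None:
        self.tools[tool.name] = tool

    async def dispatch(self, name: str, params: Dict[str, Any]) -> Dict[str, Any]:
        tool = self.tools.get(name)
        if tool is None:
            return {"success": False, "error": f"Unknown tool: {name}"}
        # Only the "required" list is checked; full schema validation is omitted.
        missing = [key for key in tool.parameters.get("required", []) if key not in params]
        if missing:
            return {"success": False, "error": f"Missing parameters: {missing}"}
        return {"success": True, "result": await tool.handler(**params)}


async def echo(text: str) -> str:
    """Toy handler used only for this example."""
    return text.upper()


registry = MiniRegistry()
registry.register(MiniTool(
    name="echo",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
    handler=echo,
))

ok = asyncio.run(registry.dispatch("echo", {"text": "hi"}))
bad = asyncio.run(registry.dispatch("echo", {}))
unknown = asyncio.run(registry.dispatch("nope", {}))
print(ok, bad, unknown, sep="\n")
```

Returning an error dictionary instead of raising lets the model see the failure as an observation and correct its next call.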
We create a unified registry that manages all tools the agent can use:
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# AgentConfig is the dataclass defined in the previous listing


@dataclass
class ToolDefinition:
    """Definition of a tool for the agent."""
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON Schema
    handler: Callable
    requires_approval: bool = False
    category: str = "general"


class ToolRegistry:
    """
    Registry for all agent tools.
    """

    def __init__(self, config: AgentConfig):
        self.config = config
        self.tools: Dict[str, ToolDefinition] = {}
        self._register_default_tools()

    def _register_default_tools(self):
        """Register all default tools."""
        # File tools
        self.register(ToolDefinition(
            name="read_file",
            description="Read the contents of a file",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path relative to workspace"},
                    "start_line": {"type": "integer", "description": "Start line (optional)"},
                    "end_line": {"type": "integer", "description": "End line (optional)"}
                },
                "required": ["path"]
            },
            handler=self._read_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="write_file",
            description="Write content to a file (creates if doesn't exist)",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"},
                    "content": {"type": "string", "description": "Content to write"}
                },
                "required": ["path", "content"]
            },
            handler=self._write_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="edit_file",
            description="Replace specific content in a file",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"},
                    "old_content": {"type": "string", "description": "Content to replace"},
                    "new_content": {"type": "string", "description": "New content"}
                },
                "required": ["path", "old_content", "new_content"]
            },
            handler=self._edit_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="list_files",
            description="List files and directories",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Directory path"},
                    "pattern": {"type": "string", "description": "Glob pattern filter"},
                    "recursive": {"type": "boolean", "description": "List recursively"}
                }
            },
            handler=self._list_files,
            category="files"
        ))

        self.register(ToolDefinition(
            name="search_code",
            description="Search for patterns in code files",
            parameters={
                "type": "object",
                "properties": {
                    "pattern": {"type": "string", "description": "Search pattern (regex)"},
                    "path": {"type": "string", "description": "Directory to search"},
                    "file_pattern": {"type": "string", "description": "File glob pattern"}
                },
                "required": ["pattern"]
            },
            handler=self._search_code,
            category="files"
        ))

        # Execution tools
        self.register(ToolDefinition(
            name="run_command",
            description="Execute a shell command in the sandbox",
            parameters={
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "Command to execute"},
                    "timeout": {"type": "integer", "description": "Timeout in seconds"}
                },
                "required": ["command"]
            },
            handler=self._run_command,
            category="execution"
        ))

        self.register(ToolDefinition(
            name="run_tests",
            description="Run tests and return results",
            parameters={
                "type": "object",
                "properties": {
                    "test_path": {"type": "string", "description": "Specific test file or directory"},
                    "test_name": {"type": "string", "description": "Specific test name pattern"}
                }
            },
            handler=self._run_tests,
            category="execution"
        ))

        # Git tools
        if self.config.enable_git:
            self.register(ToolDefinition(
                name="git_status",
                description="Get git status of the repository",
                parameters={"type": "object", "properties": {}},
                handler=self._git_status,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_diff",
                description="Get diff of changes",
                parameters={
                    "type": "object",
                    "properties": {
                        "staged": {"type": "boolean", "description": "Show staged changes only"},
                        "file": {"type": "string", "description": "Specific file to diff"}
                    }
                },
                handler=self._git_diff,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_commit",
                description="Create a git commit",
                parameters={
                    "type": "object",
                    "properties": {
                        "message": {"type": "string", "description": "Commit message"},
                        "files": {"type": "array", "items": {"type": "string"}, "description": "Files to commit"}
                    }
                },
                handler=self._git_commit,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_create_branch",
                description="Create a new git branch",
                parameters={
                    "type": "object",
                    "properties": {
                        "name": {"type": "string", "description": "Branch name"},
                        "checkout": {"type": "boolean", "description": "Checkout after creating"}
                    },
                    "required": ["name"]
                },
                handler=self._git_create_branch,
                category="git"
            ))

        # Task management
        self.register(ToolDefinition(
            name="task_complete",
            description="Mark the current task as complete",
            parameters={
                "type": "object",
                "properties": {
                    "summary": {"type": "string", "description": "Summary of what was done"},
                    "files_changed": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["summary"]
            },
            handler=self._task_complete,
            category="control"
        ))

        self.register(ToolDefinition(
            name="ask_user",
            description="Ask the user a question when clarification is needed",
            parameters={
                "type": "object",
                "properties": {
                    "question": {"type": "string", "description": "Question to ask"}
                },
                "required": ["question"]
            },
            handler=self._ask_user,
            category="control"
        ))

    def register(self, tool: ToolDefinition):
        """Register a tool."""
        self.tools[tool.name] = tool

    def get_tool(self, name: str) -> Optional[ToolDefinition]:
        """Get a tool by name."""
        return self.tools.get(name)

    def get_tools_for_llm(self) -> List[Dict[str, Any]]:
        """Get tool definitions formatted for LLM."""
        return [
            {
                "name": tool.name,
                "description": tool.description,
                "input_schema": tool.parameters
            }
            for tool in self.tools.values()
        ]

    def get_tools_by_category(self) -> Dict[str, List[ToolDefinition]]:
        """Get tools grouped by category."""
        by_category: Dict[str, List[ToolDefinition]] = {}
        for tool in self.tools.values():
            if tool.category not in by_category:
                by_category[tool.category] = []
            by_category[tool.category].append(tool)
        return by_category

    # Tool implementations (stubs - these will use the components we built earlier)
    async def _read_file(self, path: str, start_line: Optional[int] = None, end_line: Optional[int] = None):
        pass  # Implemented using FileSystemTools from Section 2

    async def _write_file(self, path: str, content: str):
        pass  # Implemented using FileSystemTools from Section 2

    async def _edit_file(self, path: str, old_content: str, new_content: str):
        pass  # Implemented using FileSystemTools from Section 2

    async def _list_files(self, path: str = ".", pattern: Optional[str] = None, recursive: bool = False):
        pass  # Implemented using FileSystemTools from Section 2

    async def _search_code(self, pattern: str, path: str = ".", file_pattern: Optional[str] = None):
        pass  # Implemented using FileSystemTools from Section 2

    async def _run_command(self, command: str, timeout: int = 30):
        pass  # Implemented using Sandbox from Section 3

    async def _run_tests(self, test_path: Optional[str] = None, test_name: Optional[str] = None):
        pass  # Implemented using TestRunner from Section 5

    async def _git_status(self):
        pass  # Implemented using GitTools from Section 4

    async def _git_diff(self, staged: bool = False, file: Optional[str] = None):
        pass  # Implemented using GitTools from Section 4

    async def _git_commit(self, message: Optional[str] = None, files: Optional[List[str]] = None):
        pass  # Implemented using GitTools from Section 4

    async def _git_create_branch(self, name: str, checkout: bool = True):
        pass  # Implemented using GitTools from Section 4

    async def _task_complete(self, summary: str, files_changed: Optional[List[str]] = None):
        pass  # Control flow

    async def _ask_user(self, question: str):
        pass  # Control flow

The Main Agent Class
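The agent's control flow reduces to a bounded loop: ask the model for the next action, run the tool, feed the observation back, and stop on completion or when the iteration budget runs out. A minimal sketch of that loop follows, with a scripted `fake_model` and a no-op `fake_tool` standing in for the LLM client and the tool registry. Both are assumptions made so the example runs without network access.

```python
import asyncio
from typing import Any, Dict, List

# Scripted decisions that replace a real model for this illustration.
SCRIPT: List[Dict[str, Any]] = [
    {"tool": "read_file", "input": {"path": "src/utils.py"}},
    {"tool": "write_file", "input": {"path": "src/utils.py", "content": "..."}},
    {"tool": "task_complete", "input": {"summary": "done"}},
]


async def fake_model(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Pick the next scripted action based on how many steps have run."""
    return SCRIPT[min(len(history), len(SCRIPT) - 1)]


async def fake_tool(name: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Pretend every tool call succeeds."""
    return {"success": True, "tool": name}


async def run_loop(max_iterations: int = 20) -> List[str]:
    history: List[Dict[str, Any]] = []
    events: List[str] = []
    for _ in range(max_iterations):
        action = await fake_model(history)                  # think
        events.append(f"tool_call:{action['tool']}")
        if action["tool"] == "task_complete":               # stop condition
            events.append("task_complete")
            break
        observation = await fake_tool(action["tool"], action["input"])  # act
        history.append({"action": action, "observation": observation})  # observe
    return events


events = asyncio.run(run_loop())
truncated = asyncio.run(run_loop(max_iterations=2))
print(events)
print(truncated)
```

The iteration cap is what keeps a confused model from looping forever; the second call shows the loop ending without a completion event once the budget is spent.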
The main agent class orchestrates all components:
class CodingAgent:
    """
    Complete coding agent implementation.
    """

    def __init__(self, config: AgentConfig):
        self.config = config
        self.state = AgentState()

        # Initialize components
        self.llm = self._init_llm()
        self.tools = ToolRegistry(config)
        self.sandbox = self._init_sandbox()
        self.file_tools = self._init_file_tools()
        self.git = self._init_git() if config.enable_git else None
        self.test_runner = self._init_test_runner() if config.enable_tests else None
        self.debug_loop = self._init_debug_loop() if config.enable_debugging else None

        # Wire up tool implementations
        self._connect_tools()

    def _init_llm(self):
        """Initialize LLM client."""
        from anthropic import AsyncAnthropic
        return AsyncAnthropic()

    def _init_sandbox(self):
        """Initialize execution sandbox."""
        from .sandbox import SandboxManager, SandboxConfig, IsolationLevel

        isolation = {
            "none": IsolationLevel.NONE,
            "restricted": IsolationLevel.RESTRICTED,
            "container": IsolationLevel.CONTAINER,
        }.get(self.config.isolation_level, IsolationLevel.RESTRICTED)

        sandbox_config = SandboxConfig(
            isolation_level=isolation,
            max_cpu_time=self.config.timeout_seconds,
            allow_network=False
        )
        return SandboxManager(self.config.workspace, sandbox_config)

    def _init_file_tools(self):
        """Initialize file system tools."""
        from .file_tools import FileSystemTools
        return FileSystemTools(self.config.workspace)

    def _init_git(self):
        """Initialize git tools."""
        from .git_tools import GitTools, GitSafetyGuard
        git = GitTools(self.config.workspace)
        return GitSafetyGuard(git, on_approval_needed=self._request_approval)

    def _init_test_runner(self):
        """Initialize test runner."""
        from .test_runner import TestRunner
        return TestRunner(self.config.workspace, self.sandbox)

    def _init_debug_loop(self):
        """Initialize debugging loop."""
        from .debug_loop import DebugIterationLoop
        return DebugIterationLoop(
            self.llm,
            self.config.workspace,
            self.sandbox,
            self.file_tools,
            self.test_runner
        )

    def _connect_tools(self):
        """Connect tool handlers to their implementations."""
        # File tools
        self.tools.tools["read_file"].handler = self.file_tools.read
        self.tools.tools["write_file"].handler = self.file_tools.write
        self.tools.tools["edit_file"].handler = self.file_tools.edit
        self.tools.tools["list_files"].handler = self.file_tools.list_files
        self.tools.tools["search_code"].handler = self.file_tools.search

        # Execution
        self.tools.tools["run_command"].handler = self._run_command
        self.tools.tools["run_tests"].handler = self._run_tests

        # Git
        if self.git:
            self.tools.tools["git_status"].handler = self.git.git.status
            self.tools.tools["git_diff"].handler = self.git.git.diff
            self.tools.tools["git_commit"].handler = self.git.safe_commit
            self.tools.tools["git_create_branch"].handler = self.git.git.create_branch

        # Control
        self.tools.tools["task_complete"].handler = self._mark_complete
        self.tools.tools["ask_user"].handler = self._ask_user

    async def run(self, task: str) -> AsyncGenerator[Dict[str, Any], None]:
        """
        Run the agent on a task.

        Yields status updates as the agent works.
        """
        # Initialize task context
        import uuid
        self.state.current_task = TaskContext(
            task_id=str(uuid.uuid4())[:8],
            description=task
        )
        self.state.status = "thinking"
        self.state.iteration = 0

        yield {
            "event": "task_started",
            "task_id": self.state.current_task.task_id,
            "description": task
        }

        # Build initial context
        context = await self._gather_initial_context()

        yield {
            "event": "context_gathered",
            "files_indexed": context.get("file_count", 0)
        }

        # Main agent loop
        messages = [{"role": "user", "content": task}]

        while self.state.iteration < self.config.max_iterations:
            self.state.iteration += 1
            self.state.status = "thinking"

            yield {
                "event": "iteration_start",
                "iteration": self.state.iteration
            }

            # Get LLM response
            response = await self._get_completion(messages, context)

            # Check for text response
            if response.stop_reason == "end_turn":
                # Join all text blocks rather than assuming content[0] is text
                text = "".join(
                    block.text for block in response.content
                    if block.type == "text"
                )
                yield {
                    "event": "response",
                    "content": text
                }

                # Check if task is complete
                if self._is_complete_response(text):
                    break

                # Wait for user input in interactive mode
                if self.config.mode == AgentMode.INTERACTIVE:
                    yield {
                        "event": "awaiting_input",
                        "prompt": "Continue or provide feedback:"
                    }
                    break

                # Otherwise record the turn and nudge the model onward, so the
                # next iteration does not resend an identical request
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": "Continue with the task, or call task_complete when finished."
                })
                continue

            # Handle tool use
            if response.stop_reason == "tool_use":
                # Record the assistant turn once, then answer every tool_use
                # block in a single user message
                messages.append({"role": "assistant", "content": response.content})
                tool_results = []

                for block in response.content:
                    if block.type != "tool_use":
                        continue

                    yield {
                        "event": "tool_call",
                        "tool": block.name,
                        "input": block.input
                    }

                    # Execute tool
                    result = await self._execute_tool(block.name, block.input)

                    yield {
                        "event": "tool_result",
                        "tool": block.name,
                        "success": result.get("success", True),
                        "summary": str(result)[:200]
                    }

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        # default=str keeps non-JSON values (paths, dataclasses) serializable
                        "content": json.dumps(result, default=str)
                    })

                messages.append({"role": "user", "content": tool_results})

            # Check for completion
            if self.state.status == "complete":
                break

        # Run verification if tests are enabled
        if self.config.enable_tests and self.state.current_task.files_modified:
            yield {"event": "running_verification"}
            test_result = await self.test_runner.run_tests()

            if not test_result.success and self.config.enable_debugging:
                yield {
                    "event": "debugging",
                    "failed_tests": test_result.failed
                }

                async for debug_update in self.debug_loop.debug_loop():
                    yield {"event": "debug_update", **debug_update}

        # Final summary
        yield {
            "event": "task_complete",
            "task_id": self.state.current_task.task_id,
            "iterations": self.state.iteration,
            "files_modified": self.state.current_task.files_modified,
            "tests_run": len(self.state.current_task.tests_run)
        }

    async def _get_completion(
        self,
        messages: List[Dict],
        context: Dict
    ):
        """Get completion from LLM."""
        system_prompt = self._build_system_prompt(context)

        response = await self.llm.messages.create(
            model=self.config.model_name,
            max_tokens=self.config.max_tokens,
            system=system_prompt,
            messages=messages,
            tools=self.tools.get_tools_for_llm()
        )

        return response

    async def _execute_tool(self, name: str, params: Dict) -> Dict[str, Any]:
        """Execute a tool and return results."""
        tool = self.tools.get_tool(name)

        if not tool:
            return {"success": False, "error": f"Unknown tool: {name}"}

        # Check if approval is needed
        if tool.requires_approval or name in self.config.require_approval_for:
            approved = await self._request_approval(name, params)
            if not approved:
                return {"success": False, "error": "Action not approved"}

        try:
            result = await tool.handler(**params)

            # Track changes
            path = params.get("path")
            if name in ("write_file", "edit_file"):
                if path not in self.state.current_task.files_modified:
                    self.state.current_task.files_modified.append(path)
            elif name == "read_file":
                self.state.current_task.files_read.append(path)
            elif name == "run_command":
                # The sandbox result type is not fixed here, so only read
                # "success" when the result is a dict
                succeeded = result.get("success", True) if isinstance(result, dict) else True
                self.state.current_task.commands_executed.append({
                    "command": params.get("command"),
                    "success": succeeded
                })
            elif name == "run_tests":
                # Recorded so the final summary's test count is meaningful
                self.state.current_task.tests_run.append({
                    "test_path": params.get("test_path")
                })

            return {"success": True, "result": result}

        except Exception as e:
            self.state.current_task.errors_encountered.append({
                "tool": name,
                "error": str(e)
            })
            return {"success": False, "error": str(e)}

    async def _gather_initial_context(self) -> Dict[str, Any]:
        """Gather initial context about the codebase."""
        context = {
            "workspace": str(self.config.workspace),
            "files": [],
            "file_count": 0,
        }

        # List files
        files = await self.file_tools.list_files(".", recursive=True)
        context["files"] = files[:100]  # Limit for context
        context["file_count"] = len(files)

        # Get git status if available
        if self.git:
            try:
                status = await self.git.git.status()
                context["git_status"] = {
                    "branch": status.branch,
                    "has_changes": status.has_changes
                }
            except Exception:
                # Not a repository, or git unavailable: continue without git context
                pass

        # Detect project type
        context["project_type"] = await self._detect_project_type()

        return context

    async def _detect_project_type(self) -> str:
        """Detect the type of project."""
        workspace = self.config.workspace

        if (workspace / "package.json").exists():
            return "javascript/node"
        elif (workspace / "pyproject.toml").exists():
            return "python"
        elif (workspace / "Cargo.toml").exists():
            return "rust"
        elif (workspace / "go.mod").exists():
            return "go"
        else:
            return "unknown"

    def _is_complete_response(self, text: str) -> bool:
        """Check if the response indicates task completion."""
        completion_phrases = [
            "task is complete",
            "i've completed",
            "the task has been completed",
            "successfully completed",
        ]
        text_lower = text.lower()
        return any(phrase in text_lower for phrase in completion_phrases)

    async def _request_approval(self, action: str, details: Any) -> bool:
        """Request approval for a sensitive action."""
        if self.config.mode == AgentMode.AUTONOMOUS:
            return True

        self.state.pending_approvals.append({
            "action": action,
            "details": details,
            "timestamp": datetime.now().isoformat()
        })

        # In real implementation, this would wait for user input
        return True

    async def _run_command(self, command: str, timeout: int = 30):
        """Run a command in the sandbox."""
        await self.sandbox.initialize()
        return await self.sandbox.execute(command, timeout=timeout)

    async def _run_tests(self, test_path: str = None, test_name: str = None):
        """Run tests."""
        if self.test_runner is None:
            # The run_tests tool is registered even when enable_tests is False
            return {"success": False, "error": "Test running is disabled"}
        # test_name is accepted to match the tool schema but is not forwarded here
        return await self.test_runner.run_tests(test_path=test_path)

    async def _mark_complete(self, summary: str, files_changed: List[str] = None):
        """Mark the current task as complete."""
        self.state.status = "complete"
        return {"status": "complete", "summary": summary}

    async def _ask_user(self, question: str):
        """Ask the user a question."""
        self.state.status = "waiting"
        return {"type": "question", "content": question}

Prompt Engineering
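The tool list and context bullets in a system prompt can be generated from data rather than written by hand, which keeps the prompt in sync with the registered tools. A small sketch of that rendering step, assuming tools arrive as (name, description) pairs grouped by category; `render_tool_docs` and `render_context` are hypothetical helpers, not the chapter's methods.

```python
from typing import Dict, List, Optional, Tuple


def render_tool_docs(tools_by_category: Dict[str, List[Tuple[str, str]]]) -> str:
    """Render (name, description) pairs as a Markdown list grouped by category."""
    lines: List[str] = []
    for category, tools in tools_by_category.items():
        lines.append(f"### {category.title()} Tools")
        for name, description in tools:
            lines.append(f"- **{name}**: {description}")
    return "\n".join(lines)


def render_context(workspace: str, file_count: int, branch: Optional[str]) -> str:
    """Render the context bullets; the branch line is skipped when git is off."""
    lines = [
        f"- **Workspace**: {workspace}",
        f"- **Files**: {file_count} files in workspace",
    ]
    if branch is not None:
        lines.append(f"- **Git Branch**: {branch}")
    return "\n".join(lines)


tool_docs = render_tool_docs({
    "files": [("read_file", "Read the contents of a file")],
    "git": [("git_status", "Get git status of the repository")],
})
with_git = render_context("/repo", 42, "main")
without_git = render_context("/repo", 42, None)

print(tool_docs)
print(with_git)
```

Dropping the branch bullet entirely when git is disabled avoids leaving an empty line or an "unknown" value in the prompt.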
The system prompt is crucial for effective agent behavior:
class CodingAgent:
    # ... (continued)

    def _build_system_prompt(self, context: Dict) -> str:
        """Build the system prompt for the agent."""
        tools_by_category = self.tools.get_tools_by_category()

        tool_docs = []
        for category, tools in tools_by_category.items():
            tool_docs.append(f"### {category.title()} Tools")
            for tool in tools:
                tool_docs.append(f"- **{tool.name}**: {tool.description}")
        tool_list = "\n".join(tool_docs)

        # Only mention the branch when git integration is enabled
        git_line = ""
        if self.git:
            branch = context.get("git_status", {}).get("branch", "unknown")
            git_line = f"- **Git Branch**: {branch}"

        return f"""You are an expert coding agent that helps developers with programming tasks.

## Your Capabilities

You can read, write, and edit code files, run commands, execute tests, and manage git operations.

## Available Tools

{tool_list}

## Current Context

- **Workspace**: {context.get('workspace')}
- **Project Type**: {context.get('project_type', 'unknown')}
- **Files**: {context.get('file_count', 0)} files in workspace
{git_line}

## Guidelines

### Understanding the Task
1. Read existing code before making changes
2. Understand the project structure and conventions
3. Ask clarifying questions if the task is ambiguous

### Making Changes
1. Make minimal, focused changes
2. Preserve existing code style and conventions
3. Add appropriate comments for complex logic
4. Consider edge cases and error handling

### Testing and Verification
1. Run existing tests after making changes
2. Add tests for new functionality
3. Verify the code works as expected

### Git Workflow
1. Create a feature branch for significant changes
2. Make atomic commits with clear messages
3. Don't push without explicit permission

### Safety
1. Never execute destructive commands without confirmation
2. Back up files before major changes
3. Stop and ask if something seems wrong

## Response Format

For each step:
1. Explain what you're about to do and why
2. Use the appropriate tools to accomplish it
3. Report the results
4. Continue to the next step or mark complete

When the task is complete, summarize what was done.

## Error Handling

If you encounter an error:
1. Analyze the error message
2. Determine the root cause
3. Attempt a fix
4. Re-run to verify

If you cannot fix an error after multiple attempts, explain the issue and ask for help."""

CLI Interface
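The command-line layer is mostly argument parsing: global options plus one subcommand per way of using the agent. A stripped-down sketch using only `argparse` from the standard library; the `build_parser` name and the reduced option set are assumptions for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced version of the CLI: two global options, two subcommands."""
    parser = argparse.ArgumentParser(prog="coding-agent")
    parser.add_argument("--workspace", "-w", type=Path, default=Path.cwd())
    parser.add_argument(
        "--mode", "-m",
        choices=["interactive", "autonomous", "supervised"],
        default="supervised",
    )

    subparsers = parser.add_subparsers(dest="command")
    run_parser = subparsers.add_parser("run")
    run_parser.add_argument("task")
    subparsers.add_parser("interactive")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--mode", "autonomous", "run", "Add a fibonacci function"])
no_command = build_parser().parse_args([])

print(args.command, args.mode, args.task)
print(no_command.command)
```

With this layout, global options such as `--mode` belong before the subcommand, because the `run` subparser does not define them. When no subcommand is given, `command` is `None`, which is the case the full CLI handles by printing help.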
A CLI interface makes the agent easy to use:
#!/usr/bin/env python3
"""
Coding Agent CLI
================

A command-line interface for the coding agent.

Usage:
    coding-agent run "Add a new function to calculate fibonacci"
    coding-agent interactive
    coding-agent --help
"""

import asyncio
import argparse
import sys
from pathlib import Path
from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel
from rich.progress import Progress, SpinnerColumn, TextColumn
from rich.syntax import Syntax

# Assumes the package layout shown under "Project Structure"
from coding_agent.agent import CodingAgent
from coding_agent.config import AgentConfig, AgentMode


console = Console()


def create_parser():
    """Create the argument parser."""
    parser = argparse.ArgumentParser(
        description="AI-powered coding agent",
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument(
        "--workspace", "-w",
        type=Path,
        default=Path.cwd(),
        help="Workspace directory (default: current directory)"
    )

    parser.add_argument(
        "--mode", "-m",
        choices=["interactive", "autonomous", "supervised"],
        default="supervised",
        help="Agent mode"
    )

    parser.add_argument(
        "--model",
        default="claude-3-5-sonnet-20241022",
        help="Model to use"
    )

    parser.add_argument(
        "--no-git",
        action="store_true",
        help="Disable git integration"
    )

    parser.add_argument(
        "--no-tests",
        action="store_true",
        help="Disable test running"
    )

    subparsers = parser.add_subparsers(dest="command", help="Commands")

    # Run command
    run_parser = subparsers.add_parser("run", help="Run a single task")
    run_parser.add_argument("task", help="Task description")

    # Interactive mode
    subparsers.add_parser("interactive", help="Start interactive session")

    # Status command
    subparsers.add_parser("status", help="Show agent status")

    return parser


async def run_task(agent, task: str):
    """Run a single task and display progress."""
    with Progress(
        SpinnerColumn(),
        TextColumn("[progress.description]{task.description}"),
        console=console
    ) as progress:
        task_id = progress.add_task("Starting agent...", total=None)

        async for event in agent.run(task):
            event_type = event.get("event")

            if event_type == "task_started":
                progress.update(task_id, description="Analyzing task...")

            elif event_type == "context_gathered":
                progress.update(task_id, description=f"Found {event['files_indexed']} files")

            elif event_type == "iteration_start":
                progress.update(task_id, description=f"Iteration {event['iteration']}")

            elif event_type == "tool_call":
                progress.update(task_id, description=f"Using {event['tool']}...")

            elif event_type == "tool_result":
                status = "✓" if event["success"] else "✗"
                console.print(f"  {status} {event['tool']}: {event['summary']}")

            elif event_type == "response":
                # Pause the spinner so the panel renders cleanly
                progress.stop()
                console.print(Panel(
                    Markdown(event["content"]),
                    title="Agent Response",
                    border_style="green"
                ))
                progress.start()

            elif event_type == "running_verification":
                progress.update(task_id, description="Running tests...")

            elif event_type == "debugging":
                progress.update(task_id, description="Debugging...")
                console.print(f"  [yellow]Debugging {len(event['failed_tests'])} failed tests[/]")

            elif event_type == "task_complete":
                progress.update(task_id, description="Complete!")
                progress.stop()

                console.print("\n[bold green]Task Complete![/]\n")
                console.print(f"Iterations: {event['iterations']}")
                if event["files_modified"]:
                    console.print(f"Files modified: {', '.join(event['files_modified'])}")


async def interactive_mode(agent):
    """Run in interactive mode."""
    console.print(Panel(
        "Coding Agent Interactive Mode\n\n"
        "Type your requests, or 'quit' to exit.",
        title="Welcome",
        border_style="blue"
    ))

    while True:
        try:
            user_input = console.input("\n[bold cyan]You:[/] ")

            if user_input.lower() in ["quit", "exit", "q"]:
                console.print("[yellow]Goodbye![/]")
                break

            if not user_input.strip():
                continue

            await run_task(agent, user_input)

        except KeyboardInterrupt:
            console.print("\n[yellow]Interrupted. Type 'quit' to exit.[/]")
        except Exception as e:
            console.print(f"[red]Error: {e}[/]")


async def main():
    """Main entry point."""
    parser = create_parser()
    args = parser.parse_args()

    # Validate workspace
    if not args.workspace.is_dir():
        console.print(f"[red]Error: {args.workspace} is not a directory[/]")
        sys.exit(1)

    # Create config
    config = AgentConfig(
        workspace=args.workspace,
        mode=AgentMode(args.mode),
        model_name=args.model,
        enable_git=not args.no_git,
        enable_tests=not args.no_tests
    )

    # Create agent
    agent = CodingAgent(config)

    # Run appropriate command
    if args.command == "run":
        await run_task(agent, args.task)
    elif args.command == "interactive":
        await interactive_mode(agent)
    elif args.command == "status":
        console.print(f"Workspace: {args.workspace}")
        console.print(f"Mode: {args.mode}")
        console.print(f"Git: {'enabled' if not args.no_git else 'disabled'}")
        console.print(f"Tests: {'enabled' if not args.no_tests else 'disabled'}")
    else:
        parser.print_help()


if __name__ == "__main__":
    asyncio.run(main())

Putting It All Together
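The walkthrough in this section asks the agent to add a `validate_email` helper with tests. For orientation, here is one plausible shape of that result. It is a sketch, not the agent's actual output, and the regular expression is a deliberately simple assumption rather than a full RFC 5322 validator.

```python
import re

# Deliberately simple pattern (an assumption for illustration): exactly one "@",
# no whitespace, and a dot inside the domain part.
_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_email(email: str) -> bool:
    """Return True if `email` looks like an address, False otherwise."""
    return isinstance(email, str) and _EMAIL_RE.fullmatch(email) is not None


def test_validate_email() -> None:
    """The kind of test cases the agent would be expected to add."""
    assert validate_email("dev@example.com")
    assert validate_email("first.last@sub.example.org")
    assert not validate_email("no-at-sign.example.com")
    assert not validate_email("two@@example.com")
    assert not validate_email("trailing@dot.")
    assert not validate_email("")


test_validate_email()
print("all validate_email checks passed")
```

A function this small is a useful first task for a new agent setup because success is easy to verify: the tests either pass or they do not.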
Here's how the complete agent handles a typical task:

"""
Example: Complete workflow of the coding agent
"""

import asyncio
from pathlib import Path

# Assumes the package layout shown under "Project Structure"
from coding_agent.agent import CodingAgent
from coding_agent.config import AgentConfig, AgentMode


async def example_workflow():
    """Demonstrate a complete coding agent workflow."""

    # 1. Initialize the agent
    config = AgentConfig(
        workspace=Path("./my-project"),
        mode=AgentMode.SUPERVISED,
        enable_git=True,
        enable_tests=True,
        enable_debugging=True
    )

    agent = CodingAgent(config)

    # 2. Run a task
    task = """
    Add a new utility function called 'validate_email' that:
    - Takes an email string as input
    - Returns True if the email is valid, False otherwise
    - Uses regex for validation
    - Add appropriate tests for the function
    """

    async for event in agent.run(task):
        print(f"Event: {event['event']}")

    # The agent will:
    # 1. Analyze the codebase structure
    # 2. Find existing utility functions
    # 3. Create the new function following project conventions
    # 4. Write tests for the function
    # 5. Run tests to verify
    # 6. Debug any failures
    # 7. Commit changes (with approval)


# Example session output:
"""
Event: task_started
Event: context_gathered
Event: iteration_start
Event: tool_call (read_file: src/utils.py)
Event: tool_result
Event: tool_call (write_file: src/utils.py)
Event: tool_result
Event: tool_call (write_file: tests/test_utils.py)
Event: tool_result
Event: tool_call (run_tests)
Event: tool_result
Event: response

Agent Response:
I've added the validate_email function to src/utils.py and created
comprehensive tests in tests/test_utils.py. All tests are passing.

Files modified:
- src/utils.py: Added validate_email function
- tests/test_utils.py: Added 6 test cases

Would you like me to commit these changes?

Event: task_complete
"""

Project Structure
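A package layout like the one in this section can be scaffolded with a few lines of `pathlib`. The sketch below creates a subset of the tree in a temporary directory; the `LAYOUT` mapping and the `scaffold` helper are hypothetical conveniences, not part of the chapter's code.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Subset of the layout, for illustration; file names follow the tree in this section.
LAYOUT: Dict[str, List[str]] = {
    "": ["__init__.py", "agent.py", "config.py"],
    "tools": ["__init__.py", "registry.py", "file_tools.py", "git_tools.py", "test_tools.py"],
    "sandbox": ["__init__.py", "docker.py", "subprocess.py", "manager.py"],
    "cli": ["__init__.py", "main.py"],
}


def scaffold(root: Path, package: str = "coding_agent") -> List[str]:
    """Create empty package files under root/package and return their relative paths."""
    created: List[str] = []
    for subdir, files in LAYOUT.items():
        directory = root / package / subdir
        directory.mkdir(parents=True, exist_ok=True)
        for name in files:
            path = directory / name
            path.touch()
            created.append(path.relative_to(root).as_posix())
    return sorted(created)


# A temporary directory keeps the example from touching the real workspace.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp))

print(len(created), "files created")
```

Starting from empty modules makes it easy to move each listing from this chapter into its own file as it is implemented.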
The complete coding agent should be organized as follows:
coding_agent/
├── __init__.py
├── agent.py              # Main CodingAgent class
├── config.py             # Configuration classes
├── tools/
│   ├── __init__.py
│   ├── registry.py       # ToolRegistry
│   ├── file_tools.py     # File system operations
│   ├── git_tools.py      # Git integration
│   └── test_tools.py     # Test running
├── sandbox/
│   ├── __init__.py
│   ├── docker.py         # Docker sandbox
│   ├── subprocess.py     # Subprocess sandbox
│   └── manager.py        # SandboxManager
├── debug/
│   ├── __init__.py
│   ├── parser.py         # Error parsing
│   ├── analyzer.py       # Error analysis
│   ├── strategies.py     # Fix strategies
│   └── loop.py           # Debug iteration loop
├── prompts/
│   ├── __init__.py
│   ├── system.py         # System prompts
│   └── templates.py      # Prompt templates
├── cli/
│   ├── __init__.py
│   └── main.py           # CLI entry point
└── utils/
    ├── __init__.py
    └── patterns.py       # Fix pattern learning

Summary
In this chapter, we built a complete coding agent from scratch:
- Architecture: A modular design with distinct components for file operations, execution, git, testing, and debugging
- File System Tools: Intelligent reading, writing, editing, and searching with safety guards
- Execution Sandbox: Docker and subprocess-based isolation with resource limits
- Git Integration: Branch management, commit generation, diff analysis, and safety guards
- Test-Driven Development: Framework detection, test running, parsing, and generation
- Debugging Loop: Error categorization, analysis, fix strategies, and pattern learning
- Complete Agent: Unified tool registry, main agent class, prompt engineering, and CLI interface
Next Steps: The coding agent we've built is a solid foundation. In the next chapter, we'll build a research agent that can search the web, read documents, and synthesize information, a different but equally powerful type of agent.
The patterns and techniques from this chapter (tool design, sandbox execution, iterative debugging) apply broadly to many types of agents. Master these fundamentals, and you'll be able to build agents for almost any domain.