AI Book - Master Artificial Intelligence by Building from Scratch

Introduction

This section brings together everything we've built in this chapter into a complete, production-ready coding agent. We'll integrate the file system tools, code execution sandbox, Git integration, test-driven development support, and debugging loop into a unified agent that can tackle real coding tasks.

What We're Building: A coding agent that can understand natural language requests, navigate codebases, write and edit code, run tests, debug failures, and commit changes, all while maintaining safety and providing clear feedback.

Complete Agent Architecture

Our complete coding agent consists of several integrated components:

🐍python

"""
Complete Coding Agent Implementation
====================================

A production-ready coding agent with:
- File system operations
- Code execution sandbox
- Git integration
- Test-driven development
- Automatic debugging
"""

from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, AsyncGenerator
from pathlib import Path
from enum import Enum
from datetime import datetime
import asyncio
import json
import os


class AgentMode(Enum):
    """Operating modes for the coding agent."""
    INTERACTIVE = "interactive"   # Chat-based interaction
    AUTONOMOUS = "autonomous"     # Execute tasks automatically
    SUPERVISED = "supervised"     # Require approval for changes


@dataclass
class AgentConfig:
    """Configuration for the coding agent."""
    # Core settings
    workspace: Path
    mode: AgentMode = AgentMode.SUPERVISED

    # Model settings
    model_name: str = "claude-3-5-sonnet-20241022"
    max_tokens: int = 4096
    temperature: float = 0.1

    # Safety settings
    isolation_level: str = "container"  # none, restricted, container
    require_approval_for: List[str] = field(default_factory=lambda: [
        "git_push", "delete_file", "run_unsafe_command"
    ])

    # Limits
    max_iterations: int = 20
    max_file_edits_per_task: int = 10
    timeout_seconds: int = 300

    # Features
    enable_git: bool = True
    enable_tests: bool = True
    enable_debugging: bool = True
    auto_commit: bool = False

    # Paths
    log_path: Optional[Path] = None
    cache_path: Optional[Path] = None


@dataclass
class TaskContext:
    """Context for a coding task."""
    task_id: str
    description: str
    started_at: str = field(default_factory=lambda: datetime.now().isoformat())
    files_read: List[str] = field(default_factory=list)
    files_modified: List[str] = field(default_factory=list)
    commands_executed: List[Dict] = field(default_factory=list)
    errors_encountered: List[Dict] = field(default_factory=list)
    tests_run: List[Dict] = field(default_factory=list)
    git_operations: List[Dict] = field(default_factory=list)


@dataclass
class AgentState:
    """Current state of the agent."""
    status: str = "idle"  # idle, thinking, acting, waiting, complete, error
    current_task: Optional[TaskContext] = None
    iteration: int = 0
    last_action: Optional[str] = None
    last_observation: Optional[str] = None
    pending_approvals: List[Dict] = field(default_factory=list)
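
Because `isolation_level` is a free-form string, a typo would only surface later, when the sandbox is built. The sketch below shows one way to check a configuration up front. `MiniConfig` and `validate_config` are hypothetical names introduced for illustration; `MiniConfig` is a cut-down stand-in for `AgentConfig` that carries only the fields being checked.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class MiniConfig:
    """Hypothetical stand-in for AgentConfig (only the validated fields)."""
    workspace: Path
    isolation_level: str = "container"
    max_iterations: int = 20
    timeout_seconds: int = 300


# The three levels named in the AgentConfig comment above.
VALID_ISOLATION_LEVELS = {"none", "restricted", "container"}


def validate_config(config: MiniConfig) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    if not config.workspace.is_dir():
        problems.append(f"workspace is not a directory: {config.workspace}")
    if config.isolation_level not in VALID_ISOLATION_LEVELS:
        problems.append(f"unknown isolation_level: {config.isolation_level!r}")
    if config.max_iterations < 1:
        problems.append("max_iterations must be at least 1")
    if config.timeout_seconds <= 0:
        problems.append("timeout_seconds must be positive")
    return problems
```

Calling a check like this before constructing the agent turns a misconfiguration into an immediate, readable error instead of a silent fallback.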

Unified Tool Registry

We create a unified registry that manages all tools the agent can use:

🐍python

from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ToolDefinition:
    """Definition of a tool for the agent."""
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON Schema
    handler: Callable
    requires_approval: bool = False
    category: str = "general"


class ToolRegistry:
    """
    Registry for all agent tools.
    """

    def __init__(self, config: AgentConfig):
        self.config = config
        self.tools: Dict[str, ToolDefinition] = {}
        self._register_default_tools()

    def _register_default_tools(self):
        """Register all default tools."""
        # File tools
        self.register(ToolDefinition(
            name="read_file",
            description="Read the contents of a file",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path relative to workspace"},
                    "start_line": {"type": "integer", "description": "Start line (optional)"},
                    "end_line": {"type": "integer", "description": "End line (optional)"}
                },
                "required": ["path"]
            },
            handler=self._read_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="write_file",
            description="Write content to a file (creates if doesn't exist)",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"},
                    "content": {"type": "string", "description": "Content to write"}
                },
                "required": ["path", "content"]
            },
            handler=self._write_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="edit_file",
            description="Replace specific content in a file",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"},
                    "old_content": {"type": "string", "description": "Content to replace"},
                    "new_content": {"type": "string", "description": "New content"}
                },
                "required": ["path", "old_content", "new_content"]
            },
            handler=self._edit_file,
            category="files"
        ))

        self.register(ToolDefinition(
            name="list_files",
            description="List files and directories",
            parameters={
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Directory path"},
                    "pattern": {"type": "string", "description": "Glob pattern filter"},
                    "recursive": {"type": "boolean", "description": "List recursively"}
                }
            },
            handler=self._list_files,
            category="files"
        ))

        self.register(ToolDefinition(
            name="search_code",
            description="Search for patterns in code files",
            parameters={
                "type": "object",
                "properties": {
                    "pattern": {"type": "string", "description": "Search pattern (regex)"},
                    "path": {"type": "string", "description": "Directory to search"},
                    "file_pattern": {"type": "string", "description": "File glob pattern"}
                },
                "required": ["pattern"]
            },
            handler=self._search_code,
            category="files"
        ))

        # Execution tools
        self.register(ToolDefinition(
            name="run_command",
            description="Execute a shell command in the sandbox",
            parameters={
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "Command to execute"},
                    "timeout": {"type": "integer", "description": "Timeout in seconds"}
                },
                "required": ["command"]
            },
            handler=self._run_command,
            category="execution"
        ))

        self.register(ToolDefinition(
            name="run_tests",
            description="Run tests and return results",
            parameters={
                "type": "object",
                "properties": {
                    "test_path": {"type": "string", "description": "Specific test file or directory"},
                    "test_name": {"type": "string", "description": "Specific test name pattern"}
                }
            },
            handler=self._run_tests,
            category="execution"
        ))

        # Git tools (registered only when git integration is enabled)
        if self.config.enable_git:
            self.register(ToolDefinition(
                name="git_status",
                description="Get git status of the repository",
                parameters={"type": "object", "properties": {}},
                handler=self._git_status,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_diff",
                description="Get diff of changes",
                parameters={
                    "type": "object",
                    "properties": {
                        "staged": {"type": "boolean", "description": "Show staged changes only"},
                        "file": {"type": "string", "description": "Specific file to diff"}
                    }
                },
                handler=self._git_diff,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_commit",
                description="Create a git commit",
                parameters={
                    "type": "object",
                    "properties": {
                        "message": {"type": "string", "description": "Commit message"},
                        "files": {"type": "array", "items": {"type": "string"}, "description": "Files to commit"}
                    },
                    # A commit without a message is not useful, so require one
                    "required": ["message"]
                },
                handler=self._git_commit,
                category="git"
            ))

            self.register(ToolDefinition(
                name="git_create_branch",
                description="Create a new git branch",
                parameters={
                    "type": "object",
                    "properties": {
                        "name": {"type": "string", "description": "Branch name"},
                        "checkout": {"type": "boolean", "description": "Checkout after creating"}
                    },
                    "required": ["name"]
                },
                handler=self._git_create_branch,
                category="git"
            ))

        # Task management
        self.register(ToolDefinition(
            name="task_complete",
            description="Mark the current task as complete",
            parameters={
                "type": "object",
                "properties": {
                    "summary": {"type": "string", "description": "Summary of what was done"},
                    "files_changed": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["summary"]
            },
            handler=self._task_complete,
            category="control"
        ))

        self.register(ToolDefinition(
            name="ask_user",
            description="Ask the user a question when clarification is needed",
            parameters={
                "type": "object",
                "properties": {
                    "question": {"type": "string", "description": "Question to ask"}
                },
                "required": ["question"]
            },
            handler=self._ask_user,
            category="control"
        ))

    def register(self, tool: ToolDefinition):
        """Register a tool (replaces any existing tool with the same name)."""
        self.tools[tool.name] = tool

    def get_tool(self, name: str) -> Optional[ToolDefinition]:
        """Get a tool by name."""
        return self.tools.get(name)

    def get_tools_for_llm(self) -> List[Dict[str, Any]]:
        """Get tool definitions formatted for LLM."""
        return [
            {
                "name": tool.name,
                "description": tool.description,
                "input_schema": tool.parameters
            }
            for tool in self.tools.values()
        ]

    def get_tools_by_category(self) -> Dict[str, List[ToolDefinition]]:
        """Get tools grouped by category."""
        by_category: Dict[str, List[ToolDefinition]] = {}
        for tool in self.tools.values():
            by_category.setdefault(tool.category, []).append(tool)
        return by_category

    # Tool implementations (stubs - these will use the components we built earlier).
    # CodingAgent._connect_tools() replaces each handler with the real implementation.
    async def _read_file(self, path: str, start_line: Optional[int] = None, end_line: Optional[int] = None):
        pass  # Implemented using FileSystemTools from Section 2

    async def _write_file(self, path: str, content: str):
        pass  # Implemented using FileSystemTools from Section 2

    async def _edit_file(self, path: str, old_content: str, new_content: str):
        pass  # Implemented using FileSystemTools from Section 2

    async def _list_files(self, path: str = ".", pattern: Optional[str] = None, recursive: bool = False):
        pass  # Implemented using FileSystemTools from Section 2

    async def _search_code(self, pattern: str, path: str = ".", file_pattern: Optional[str] = None):
        pass  # Implemented using FileSystemTools from Section 2

    async def _run_command(self, command: str, timeout: int = 30):
        pass  # Implemented using Sandbox from Section 3

    async def _run_tests(self, test_path: Optional[str] = None, test_name: Optional[str] = None):
        pass  # Implemented using TestRunner from Section 5

    async def _git_status(self):
        pass  # Implemented using GitTools from Section 4

    async def _git_diff(self, staged: bool = False, file: Optional[str] = None):
        pass  # Implemented using GitTools from Section 4

    async def _git_commit(self, message: Optional[str] = None, files: Optional[List[str]] = None):
        pass  # Implemented using GitTools from Section 4

    async def _git_create_branch(self, name: str, checkout: bool = True):
        pass  # Implemented using GitTools from Section 4

    async def _task_complete(self, summary: str, files_changed: Optional[List[str]] = None):
        pass  # Control flow

    async def _ask_user(self, question: str):
        pass  # Control flow
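
The registry stores a JSON Schema for each tool, but nothing above checks the model's arguments against it before the handler is called. As a minimal sketch, the hypothetical helper `validate_params` below covers only the parts of JSON Schema these tool definitions use (`required`, `properties`, and a handful of primitive `type` values). It is not a full validator; a dedicated schema library would be the more complete choice.

```python
from typing import Any, Dict, List

# Maps the JSON Schema type names used by the tool definitions to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate_params(schema: Dict[str, Any], params: Dict[str, Any]) -> List[str]:
    """Return a list of problems with params; an empty list means they look valid."""
    errors = []
    properties = schema.get("properties", {})

    # Every required parameter must be present.
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")

    # Every supplied parameter must be declared and have the declared type.
    for name, value in params.items():
        if name not in properties:
            errors.append(f"unexpected parameter: {name}")
            continue
        type_name = properties[name].get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # Type not covered by this sketch; skip the check.
        # bool is a subclass of int in Python, so reject it explicitly for "integer".
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            errors.append(f"parameter {name} should be of type {type_name}")
    return errors
```

Running a check like this before dispatch lets the agent return a precise error to the model instead of a `TypeError` raised from inside the handler.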

The Main Agent Class

The main agent class orchestrates all components:

🐍python

class CodingAgent:
    """
    Complete coding agent implementation.
    """

    def __init__(self, config: AgentConfig):
        self.config = config
        self.state = AgentState()

        # Initialize components
        self.llm = self._init_llm()
        self.tools = ToolRegistry(config)
        self.sandbox = self._init_sandbox()
        self.file_tools = self._init_file_tools()
        self.git = self._init_git() if config.enable_git else None
        self.test_runner = self._init_test_runner() if config.enable_tests else None
        self.debug_loop = self._init_debug_loop() if config.enable_debugging else None

        # Wire up tool implementations
        self._connect_tools()

    def _init_llm(self):
        """Initialize LLM client."""
        from anthropic import AsyncAnthropic
        return AsyncAnthropic()

    def _init_sandbox(self):
        """Initialize execution sandbox."""
        from .sandbox import SandboxManager, SandboxConfig, IsolationLevel

        isolation = {
            "none": IsolationLevel.NONE,
            "restricted": IsolationLevel.RESTRICTED,
            "container": IsolationLevel.CONTAINER,
        }.get(self.config.isolation_level, IsolationLevel.RESTRICTED)

        sandbox_config = SandboxConfig(
            isolation_level=isolation,
            max_cpu_time=self.config.timeout_seconds,
            allow_network=False
        )
        return SandboxManager(self.config.workspace, sandbox_config)

    def _init_file_tools(self):
        """Initialize file system tools."""
        from .file_tools import FileSystemTools
        return FileSystemTools(self.config.workspace)

    def _init_git(self):
        """Initialize git tools."""
        from .git_tools import GitTools, GitSafetyGuard
        git = GitTools(self.config.workspace)
        return GitSafetyGuard(git, on_approval_needed=self._request_approval)

    def _init_test_runner(self):
        """Initialize test runner."""
        from .test_runner import TestRunner
        return TestRunner(self.config.workspace, self.sandbox)

    def _init_debug_loop(self):
        """Initialize debugging loop."""
        from .debug_loop import DebugIterationLoop
        return DebugIterationLoop(
            self.llm,
            self.config.workspace,
            self.sandbox,
            self.file_tools,
            self.test_runner
        )

    def _connect_tools(self):
        """Connect tool handlers to their implementations."""
        # File tools
        self.tools.tools["read_file"].handler = self.file_tools.read
        self.tools.tools["write_file"].handler = self.file_tools.write
        self.tools.tools["edit_file"].handler = self.file_tools.edit
        self.tools.tools["list_files"].handler = self.file_tools.list_files
        self.tools.tools["search_code"].handler = self.file_tools.search

        # Execution
        self.tools.tools["run_command"].handler = self._run_command
        self.tools.tools["run_tests"].handler = self._run_tests

        # Git
        if self.git:
            self.tools.tools["git_status"].handler = self.git.git.status
            self.tools.tools["git_diff"].handler = self.git.git.diff
            self.tools.tools["git_commit"].handler = self.git.safe_commit
            self.tools.tools["git_create_branch"].handler = self.git.git.create_branch

        # Control
        self.tools.tools["task_complete"].handler = self._mark_complete
        self.tools.tools["ask_user"].handler = self._ask_user
94
    async def run(self, task: str) -> AsyncGenerator[Dict[str, Any], None]:
        """
        Run the agent on a task.

        Yields status updates as the agent works.
        """
        # Initialize task context
        import uuid
        self.state.current_task = TaskContext(
            task_id=str(uuid.uuid4())[:8],
            description=task
        )
        self.state.status = "thinking"
        self.state.iteration = 0

        yield {
            "event": "task_started",
            "task_id": self.state.current_task.task_id,
            "description": task
        }

        # Build initial context
        context = await self._gather_initial_context()

        yield {
            "event": "context_gathered",
            "files_indexed": context.get("file_count", 0)
        }

        # Main agent loop
        messages = [{"role": "user", "content": task}]

        while self.state.iteration < self.config.max_iterations:
            self.state.iteration += 1
            self.state.status = "thinking"

            yield {
                "event": "iteration_start",
                "iteration": self.state.iteration
            }

            # Get LLM response
            response = await self._get_completion(messages, context)

            # Check for text response
            if response.stop_reason == "end_turn":
                # Join all text blocks; the first block is not guaranteed to be text
                text = "".join(
                    block.text for block in response.content if block.type == "text"
                )
                yield {
                    "event": "response",
                    "content": text
                }

                # Check if task is complete
                if self._is_complete_response(text):
                    break

                # Wait for user input in interactive mode
                if self.config.mode == AgentMode.INTERACTIVE:
                    yield {
                        "event": "awaiting_input",
                        "prompt": "Continue or provide feedback:"
                    }
                    break

                # Otherwise record the turn and nudge the model, so the next
                # iteration does not resend an identical request
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": "Continue with the task, or call task_complete if you are done."
                })

            # Handle tool use
            if response.stop_reason == "tool_use":
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        yield {
                            "event": "tool_call",
                            "tool": block.name,
                            "input": block.input
                        }

                        # Execute tool
                        result = await self._execute_tool(block.name, block.input)

                        yield {
                            "event": "tool_result",
                            "tool": block.name,
                            "success": result.get("success", True),
                            "summary": str(result)[:200]
                        }

                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            # default=str keeps non-JSON values (paths, result objects) serializable
                            "content": json.dumps(result, default=str)
                        })

                # Add to messages: the assistant turn once, then every tool
                # result together in a single user message
                messages.append({
                    "role": "assistant",
                    "content": response.content
                })
                messages.append({
                    "role": "user",
                    "content": tool_results
                })

            # Check for completion
            if self.state.status == "complete":
                break

        # Run verification if tests are enabled
        if self.config.enable_tests and self.state.current_task.files_modified:
            yield {"event": "running_verification"}
            test_result = await self.test_runner.run_tests()

            if not test_result.success and self.config.enable_debugging:
                yield {
                    "event": "debugging",
                    "failed_tests": test_result.failed
                }

                async for debug_update in self.debug_loop.debug_loop():
                    yield {"event": "debug_update", **debug_update}

        # Final summary
        yield {
            "event": "task_complete",
            "task_id": self.state.current_task.task_id,
            "iterations": self.state.iteration,
            "files_modified": self.state.current_task.files_modified,
            "tests_run": len(self.state.current_task.tests_run)
        }
218
    async def _get_completion(
        self,
        messages: List[Dict],
        context: Dict
    ):
        """Get completion from LLM."""
        system_prompt = self._build_system_prompt(context)

        response = await self.llm.messages.create(
            model=self.config.model_name,
            max_tokens=self.config.max_tokens,
            temperature=self.config.temperature,
            system=system_prompt,
            messages=messages,
            tools=self.tools.get_tools_for_llm()
        )

        return response

    async def _execute_tool(self, name: str, params: Dict) -> Dict[str, Any]:
        """Execute a tool and return results."""
        tool = self.tools.get_tool(name)

        if not tool:
            return {"success": False, "error": f"Unknown tool: {name}"}

        # Check if approval is needed
        if tool.requires_approval or name in self.config.require_approval_for:
            approved = await self._request_approval(name, params)
            if not approved:
                return {"success": False, "error": "Action not approved"}

        try:
            result = await tool.handler(**params)

            # Track changes
            task = self.state.current_task
            if name == "write_file" or name == "edit_file":
                # Record each modified path once, even if it is edited repeatedly
                if params.get("path") not in task.files_modified:
                    task.files_modified.append(params.get("path"))
            elif name == "read_file":
                task.files_read.append(params.get("path"))
            elif name == "run_command":
                task.commands_executed.append({
                    "command": params.get("command"),
                    "success": result.get("success", True)
                })
            elif name == "run_tests":
                # Without this, the final summary always reports zero test runs
                task.tests_run.append({
                    "test_path": params.get("test_path"),
                    "test_name": params.get("test_name")
                })

            return {"success": True, "result": result}

        except Exception as e:
            self.state.current_task.errors_encountered.append({
                "tool": name,
                "error": str(e)
            })
            return {"success": False, "error": str(e)}

    async def _gather_initial_context(self) -> Dict[str, Any]:
        """Gather initial context about the codebase."""
        context = {
            "workspace": str(self.config.workspace),
            "files": [],
            "file_count": 0,
        }

        # List files
        files = await self.file_tools.list_files(".", recursive=True)
        context["files"] = files[:100]  # Limit for context
        context["file_count"] = len(files)

        # Get git status if available
        if self.git:
            try:
                status = await self.git.git.status()
                context["git_status"] = {
                    "branch": status.branch,
                    "has_changes": status.has_changes
                }
            except Exception:
                # The workspace may not be a git repository; continue without git context
                pass

        # Detect project type
        context["project_type"] = await self._detect_project_type()

        return context

    async def _detect_project_type(self) -> str:
        """Detect the type of project."""
        workspace = self.config.workspace

        if (workspace / "package.json").exists():
            return "javascript/node"
        elif (workspace / "pyproject.toml").exists():
            return "python"
        elif (workspace / "Cargo.toml").exists():
            return "rust"
        elif (workspace / "go.mod").exists():
            return "go"
        else:
            return "unknown"

    def _is_complete_response(self, text: str) -> bool:
        """Check if the response indicates task completion."""
        completion_phrases = [
            "task is complete",
            "i've completed",
            "the task has been completed",
            "successfully completed",
        ]
        text_lower = text.lower()
        return any(phrase in text_lower for phrase in completion_phrases)

    async def _request_approval(self, action: str, details: Any) -> bool:
        """Request approval for a sensitive action."""
        if self.config.mode == AgentMode.AUTONOMOUS:
            return True

        self.state.pending_approvals.append({
            "action": action,
            "details": details,
            "timestamp": datetime.now().isoformat()
        })

        # Placeholder: the request is recorded but auto-approved.
        # In a real implementation, this would wait for user input.
        return True

    async def _run_command(self, command: str, timeout: int = 30):
        """Run a command in the sandbox."""
        await self.sandbox.initialize()
        return await self.sandbox.execute(command, timeout=timeout)

    async def _run_tests(self, test_path: str = None, test_name: str = None):
        """Run tests."""
        # test_name is accepted to match the tool schema but is not forwarded here
        return await self.test_runner.run_tests(test_path=test_path)

    async def _mark_complete(self, summary: str, files_changed: List[str] = None):
        """Mark the current task as complete."""
        self.state.status = "complete"
        return {"status": "complete", "summary": summary}

    async def _ask_user(self, question: str):
        """Ask the user a question."""
        self.state.status = "waiting"
        return {"type": "question", "content": question}
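
As written, `_request_approval` records the request and then approves it unconditionally, so supervised mode does not yet pause for a human. The sketch below shows one way a terminal-based approval step could look. `prompt_for_approval` and its `ask` parameter are hypothetical names introduced here, and `asyncio.to_thread` assumes Python 3.9 or later.

```python
import asyncio
import json
from typing import Any, Callable


async def prompt_for_approval(
    action: str,
    details: Any,
    ask: Callable[[str], str] = input,
) -> bool:
    """Show the pending action and return True only on an explicit yes."""
    # default=str keeps values such as Path objects printable.
    summary = json.dumps(details, indent=2, default=str)
    prompt = f"Agent wants to run '{action}' with:\n{summary}\nApprove? [y/N] "

    # input() blocks, so run it in a worker thread to keep the event loop free.
    answer = await asyncio.to_thread(ask, prompt)

    # Anything other than an explicit yes counts as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to refusal is the safer choice: an empty line or a typo blocks the action instead of allowing it. Passing `ask` as a parameter also makes the function easy to exercise in tests without a terminal.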

Prompt Engineering

The system prompt is crucial for effective agent behavior:

🐍python

class CodingAgent:
    # ... (continued)

    def _build_system_prompt(self, context: Dict) -> str:
        """Build the system prompt for the agent."""
        tools_by_category = self.tools.get_tools_by_category()

        tool_docs = []
        for category, tools in tools_by_category.items():
            tool_docs.append(f"### {category.title()} Tools")
            for tool in tools:
                tool_docs.append(f"- **{tool.name}**: {tool.description}")

        # Build multi-line pieces up front so the template below stays readable
        tool_section = "\n".join(tool_docs)
        branch = context.get('git_status', {}).get('branch', 'unknown')
        git_line = f"- **Git Branch**: {branch}" if self.git else ""

        return f"""You are an expert coding agent that helps developers with programming tasks.

## Your Capabilities

You can read, write, and edit code files, run commands, execute tests, and manage git operations.

## Available Tools

{tool_section}

## Current Context

- **Workspace**: {context.get('workspace')}
- **Project Type**: {context.get('project_type', 'unknown')}
- **Files**: {context.get('file_count', 0)} files in workspace
{git_line}

## Guidelines

### Understanding the Task
1. Read existing code before making changes
2. Understand the project structure and conventions
3. Ask clarifying questions if the task is ambiguous

### Making Changes
1. Make minimal, focused changes
2. Preserve existing code style and conventions
3. Add appropriate comments for complex logic
4. Consider edge cases and error handling

### Testing and Verification
1. Run existing tests after making changes
2. Add tests for new functionality
3. Verify the code works as expected

### Git Workflow
1. Create a feature branch for significant changes
2. Make atomic commits with clear messages
3. Don't push without explicit permission

### Safety
1. Never execute destructive commands without confirmation
2. Back up files before major changes
3. Stop and ask if something seems wrong

## Response Format

For each step:
1. Explain what you're about to do and why
2. Use the appropriate tools to accomplish it
3. Report the results
4. Continue to the next step or mark complete

When the task is complete, summarize what was done.

## Error Handling

If you encounter an error:
1. Analyze the error message
2. Determine the root cause
3. Attempt a fix
4. Re-run to verify

If you cannot fix an error after multiple attempts, explain the issue and ask for help."""

The system prompt establishes the agent's personality and behavior. Be explicit about expectations for code quality, testing, and safety.
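
When the prompt template is edited by hand, it is easy to drop a tool from the documentation the model sees. As a small sanity check, the hypothetical helper `tools_missing_from_prompt` below reports registered tool names that never appear in the rendered prompt. It assumes the `- **name**: description` bullet format produced by `_build_system_prompt`.

```python
from typing import Iterable, List


def tools_missing_from_prompt(prompt: str, tool_names: Iterable[str]) -> List[str]:
    """Return the tool names that are not documented in the prompt text."""
    # Match the bold marker so a name mentioned in passing does not count.
    return [name for name in tool_names if f"**{name}**" not in prompt]
```

A check like this fits naturally in a unit test that renders the prompt for a sample context and asserts that the returned list is empty.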

CLI Interface

A command-line interface makes the agent easy to use:

🐍python

#!/usr/bin/env python3
"""
Coding Agent CLI
================

A command-line interface for the coding agent.

Usage:
    coding-agent run "Add a new function to calculate fibonacci"
    coding-agent interactive
    coding-agent --help

Global options (--workspace, --mode, ...) go before the command:
    coding-agent --mode autonomous run "Fix the failing tests"
"""

import asyncio
import argparse
import sys
from pathlib import Path
from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel
from rich.progress import Progress, SpinnerColumn, TextColumn

# Assumed import paths, based on the project layout shown later in this section.
from coding_agent.agent import CodingAgent
from coding_agent.config import AgentConfig, AgentMode


console = Console()


def create_parser():
    """Create the argument parser."""
    parser = argparse.ArgumentParser(
        description="AI-powered coding agent",
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument(
        "--workspace", "-w",
        type=Path,
        default=Path.cwd(),
        help="Workspace directory (default: current directory)"
    )

    parser.add_argument(
        "--mode", "-m",
        choices=["interactive", "autonomous", "supervised"],
        default="supervised",
        help="Agent mode"
    )

    parser.add_argument(
        "--model",
        default="claude-3-5-sonnet-20241022",
        help="Model to use"
    )

    parser.add_argument(
        "--no-git",
        action="store_true",
        help="Disable git integration"
    )

    parser.add_argument(
        "--no-tests",
        action="store_true",
        help="Disable test running"
    )

    subparsers = parser.add_subparsers(dest="command", help="Commands")

    # Run command
    run_parser = subparsers.add_parser("run", help="Run a single task")
    run_parser.add_argument("task", help="Task description")

    # Interactive mode
    subparsers.add_parser("interactive", help="Start interactive session")

    # Status command
    subparsers.add_parser("status", help="Show agent status")

    return parser


async def run_task(agent, task: str):
    """Run a single task and display progress."""
    with Progress(
        SpinnerColumn(),
        TextColumn("[progress.description]{task.description}"),
        console=console
    ) as progress:
        task_id = progress.add_task("Starting agent...", total=None)

        # Each event is a dict; its "event" key selects how it is displayed.
        async for event in agent.run(task):
            event_type = event.get("event")

            if event_type == "task_started":
                progress.update(task_id, description="Analyzing task...")

            elif event_type == "context_gathered":
                progress.update(task_id, description=f"Found {event['files_indexed']} files")

            elif event_type == "iteration_start":
                progress.update(task_id, description=f"Iteration {event['iteration']}")

            elif event_type == "tool_call":
                progress.update(task_id, description=f"Using {event['tool']}...")

            elif event_type == "tool_result":
                status = "✓" if event["success"] else "✗"
                console.print(f"  {status} {event['tool']}: {event['summary']}")

            elif event_type == "response":
                # Pause the spinner so it does not overwrite the panel.
                progress.stop()
                console.print(Panel(
                    Markdown(event["content"]),
                    title="Agent Response",
                    border_style="green"
                ))
                progress.start()

            elif event_type == "running_verification":
                progress.update(task_id, description="Running tests...")

            elif event_type == "debugging":
                progress.update(task_id, description="Debugging...")
                console.print(f"  [yellow]Debugging {len(event['failed_tests'])} failed tests[/]")

            elif event_type == "task_complete":
                progress.update(task_id, description="Complete!")
                progress.stop()

                console.print("\n[bold green]Task Complete![/]\n")
                console.print(f"Iterations: {event['iterations']}")
                if event["files_modified"]:
                    console.print(f"Files modified: {', '.join(event['files_modified'])}")


async def interactive_mode(agent):
    """Run in interactive mode."""
    console.print(Panel(
        "Coding Agent Interactive Mode\n\n"
        "Type your requests, or 'quit' to exit.",
        title="Welcome",
        border_style="blue"
    ))

    while True:
        try:
            user_input = console.input("\n[bold cyan]You:[/] ").strip()

            if user_input.lower() in ["quit", "exit", "q"]:
                console.print("[yellow]Goodbye![/]")
                break

            if not user_input:
                continue

            await run_task(agent, user_input)

        except KeyboardInterrupt:
            console.print("\n[yellow]Interrupted. Type 'quit' to exit.[/]")
        except EOFError:
            # Ctrl-D or closed stdin: exit instead of looping on the error.
            console.print("\n[yellow]Goodbye![/]")
            break
        except Exception as e:
            console.print(f"[red]Error: {e}[/]")


async def main():
    """Main entry point."""
    parser = create_parser()
    args = parser.parse_args()

    # Validate workspace
    if not args.workspace.is_dir():
        console.print(f"[red]Error: {args.workspace} is not a directory[/]")
        sys.exit(1)

    # Create config
    config = AgentConfig(
        workspace=args.workspace,
        mode=AgentMode(args.mode),
        model_name=args.model,
        enable_git=not args.no_git,
        enable_tests=not args.no_tests
    )

    # Create agent
    agent = CodingAgent(config)

    # Run appropriate command
    if args.command == "run":
        await run_task(agent, args.task)
    elif args.command == "interactive":
        await interactive_mode(agent)
    elif args.command == "status":
        console.print(f"Workspace: {args.workspace}")
        console.print(f"Mode: {args.mode}")
        console.print(f"Git: {'enabled' if not args.no_git else 'disabled'}")
        console.print(f"Tests: {'enabled' if not args.no_tests else 'disabled'}")
    else:
        # No subcommand given
        parser.print_help()


if __name__ == "__main__":
    asyncio.run(main())
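Because `--workspace`, `--mode`, and the other flags are defined on the top-level parser, argparse expects them before the subcommand name. The sketch below rebuilds a reduced version of the parser (two of the options and two of the commands, under the hypothetical name `build_demo_parser`) and parses explicit argument lists. Passing a list to `parse_args` is also a convenient way to unit-test a CLI without spawning a process.

```python
import argparse
from pathlib import Path
from typing import List


def build_demo_parser() -> argparse.ArgumentParser:
    """Reduced, hypothetical version of create_parser() for illustration."""
    parser = argparse.ArgumentParser(prog="coding-agent")
    parser.add_argument("--workspace", "-w", type=Path, default=Path.cwd())
    parser.add_argument(
        "--mode", "-m",
        choices=["interactive", "autonomous", "supervised"],
        default="supervised",
    )
    subparsers = parser.add_subparsers(dest="command")
    run_parser = subparsers.add_parser("run")
    run_parser.add_argument("task")
    subparsers.add_parser("status")
    return parser


def parse(argv: List[str]) -> argparse.Namespace:
    """Parse an explicit argument list instead of sys.argv."""
    return build_demo_parser().parse_args(argv)


if __name__ == "__main__":
    # Global options come first, then the subcommand and its arguments.
    args = parse(["--mode", "autonomous", "run", "Add a fibonacci function"])
    print(args.command, args.mode, args.task)

    # With no subcommand, args.command is None; the CLI's final
    # `else` branch (print_help) covers this case.
    print(parse([]).command)
```

With this parser, placing `--mode` after `run` is rejected as an unrecognized argument, so the usage text should show global options first.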

Putting It All Together

Here's how the complete agent handles a typical task:

🐍python

"""
Example: Complete workflow of the coding agent
"""

import asyncio
from pathlib import Path

# Assumed import paths, based on the project layout in this section.
from coding_agent.agent import CodingAgent
from coding_agent.config import AgentConfig, AgentMode


async def example_workflow():
    """Demonstrate a complete coding agent workflow."""

    # 1. Initialize the agent
    config = AgentConfig(
        workspace=Path("./my-project"),
        mode=AgentMode.SUPERVISED,
        enable_git=True,
        enable_tests=True,
        enable_debugging=True
    )

    agent = CodingAgent(config)

    # 2. Run a task
    task = """
    Add a new utility function called 'validate_email' that:
    - Takes an email string as input
    - Returns True if the email is valid, False otherwise
    - Uses regex for validation
    - Add appropriate tests for the function
    """

    # The agent will:
    # 1. Analyze the codebase structure
    # 2. Find existing utility functions
    # 3. Create the new function following project conventions
    # 4. Write tests for the function
    # 5. Run tests to verify
    # 6. Debug any failures
    # 7. Commit changes (with approval)
    async for event in agent.run(task):
        print(f"Event: {event['event']}")


if __name__ == "__main__":
    asyncio.run(example_workflow())


# Illustrative session output (annotated with tool details and the agent's
# response; the loop above prints only the event names):
"""
Event: task_started
Event: context_gathered
Event: iteration_start
Event: tool_call (read_file: src/utils.py)
Event: tool_result
Event: tool_call (write_file: src/utils.py)
Event: tool_result
Event: tool_call (write_file: tests/test_utils.py)
Event: tool_result
Event: tool_call (run_tests)
Event: tool_result
Event: response

Agent Response:
I've added the validate_email function to src/utils.py and created
comprehensive tests in tests/test_utils.py. All tests are passing.

Files modified:
- src/utils.py: Added validate_email function
- tests/test_utils.py: Added 6 test cases

Would you like me to commit these changes?

Event: task_complete
"""
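To exercise an event loop like the one above without calling a model, a stub that yields the same kind of event dictionaries can stand in for the agent. The sketch below is an illustration under assumptions: `StubAgent` and `collect_events` are hypothetical names, and the payloads only imitate the fields the CLI reads (`event`, `tool`, `success`, `summary`, `iterations`, `files_modified`).

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List


class StubAgent:
    """Hypothetical stand-in for CodingAgent that replays scripted events."""

    def __init__(self, script: List[Dict[str, Any]]):
        self.script = script

    async def run(self, task: str) -> AsyncGenerator[Dict[str, Any], None]:
        yield {"event": "task_started", "task": task}
        for event in self.script:
            await asyncio.sleep(0)  # hand control back to the event loop
            yield event
        yield {
            "event": "task_complete",
            "iterations": 1,
            "files_modified": ["src/utils.py"],
        }


async def collect_events(agent: StubAgent, task: str) -> List[str]:
    """Consume the stream with `async for`, as the workflow above does."""
    names: List[str] = []
    async for event in agent.run(task):
        names.append(event["event"])
    return names


if __name__ == "__main__":
    script = [
        {"event": "tool_call", "tool": "read_file"},
        {"event": "tool_result", "tool": "read_file",
         "success": True, "summary": "read src/utils.py"},
    ]
    print(asyncio.run(collect_events(StubAgent(script), "Add validate_email")))
```

Because the stub exposes the same `run` async generator shape, it could be passed to a display function such as `run_task` to check the progress output without network access.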

Project Structure

The complete coding agent should be organized as follows:

⚡bash

coding_agent/
├── __init__.py
├── agent.py              # Main CodingAgent class
├── config.py             # Configuration classes
├── tools/
│   ├── __init__.py
│   ├── registry.py       # ToolRegistry
│   ├── file_tools.py     # File system operations
│   ├── git_tools.py      # Git integration
│   └── test_tools.py     # Test running
├── sandbox/
│   ├── __init__.py
│   ├── docker.py         # Docker sandbox
│   ├── subprocess.py     # Subprocess sandbox
│   └── manager.py        # SandboxManager
├── debug/
│   ├── __init__.py
│   ├── parser.py         # Error parsing
│   ├── analyzer.py       # Error analysis
│   ├── strategies.py     # Fix strategies
│   └── loop.py           # Debug iteration loop
├── prompts/
│   ├── __init__.py
│   ├── system.py         # System prompts
│   └── templates.py      # Prompt templates
├── cli/
│   ├── __init__.py
│   └── main.py           # CLI entry point
└── utils/
    ├── __init__.py
    └── patterns.py       # Fix pattern learning
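If you are starting from an empty directory, the layout can be created with a few lines of `pathlib`. The sketch below is a convenience script rather than part of the agent; `scaffold` and `LAYOUT` are hypothetical names, and the file list simply mirrors the tree above.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Sub-package name -> modules, copied from the tree above ("" is the top level).
LAYOUT: Dict[str, List[str]] = {
    "": ["agent.py", "config.py"],
    "tools": ["registry.py", "file_tools.py", "git_tools.py", "test_tools.py"],
    "sandbox": ["docker.py", "subprocess.py", "manager.py"],
    "debug": ["parser.py", "analyzer.py", "strategies.py", "loop.py"],
    "prompts": ["system.py", "templates.py"],
    "cli": ["main.py"],
    "utils": ["patterns.py"],
}


def scaffold(root: Path) -> List[Path]:
    """Create empty package files under root/coding_agent; return new paths."""
    created: List[Path] = []
    base = root / "coding_agent"
    for package, modules in LAYOUT.items():
        package_dir = base / package if package else base
        package_dir.mkdir(parents=True, exist_ok=True)
        for name in ["__init__.py", *modules]:
            path = package_dir / name
            if not path.exists():  # never overwrite existing work
                path.touch()
                created.append(path)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = scaffold(Path(tmp))
        print(f"created {len(paths)} files")
```

Running it a second time against the same directory creates nothing, since existing files are left untouched.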

Summary

In this chapter, we built a complete coding agent from scratch:

Architecture: A modular design with distinct components for file operations, execution, git, testing, and debugging
File System Tools: Intelligent reading, writing, editing, and searching with safety guards
Execution Sandbox: Docker and subprocess-based isolation with resource limits
Git Integration: Branch management, commit generation, diff analysis, and safety guards
Test-Driven Development: Framework detection, test running, parsing, and generation
Debugging Loop: Error categorization, analysis, fix strategies, and pattern learning
Complete Agent: Unified tool registry, main agent class, prompt engineering, and CLI interface

Next Steps: The coding agent we've built is a solid foundation. In the next chapter, we'll build a research agent that can search the web, read documents, and synthesize information—a different but equally powerful type of agent.

The patterns and techniques from this chapter—tool design, sandbox execution, iterative debugging—apply broadly to many types of agents. Master these fundamentals, and you'll be able to build agents for almost any domain.