Introduction
The term "AI agent" gets thrown around a lot, often inconsistently. Some call any LLM with tools an agent. Others reserve the term for fully autonomous systems. In this section, we'll establish a clear, practical definition and explore the key components that transform an LLM into a true agent.
Our Working Definition: An AI agent is a system that uses an LLM as its reasoning engine to perceive its environment, make decisions, and take actions to achieve specific goals, adapting its behavior based on feedback.
Defining AI Agents
The Essential Properties
For an AI system to be considered an agent, it must exhibit these core properties:
| Property | Description | Example |
|---|---|---|
| Goal-Directed | Works toward achieving specific objectives | "Deploy this application to production" |
| Autonomous | Makes decisions without constant human input | Chooses which files to edit, which tools to use |
| Adaptive | Adjusts behavior based on feedback and results | Retries with different approach when tests fail |
| Interactive | Perceives and acts on its environment | Reads files, executes code, makes API calls |
| Persistent | Continues working across multiple steps | Keeps working until the goal is achieved or abandoned |
What an Agent is NOT
To sharpen our definition, let's clarify what doesn't qualify as an agent:
- Not just a chatbot - Agents take actions, not just produce text
- Not just RAG - Retrieval is a tool agents can use, not what makes them agents
- Not just function calling - A single tool call isn't an agent; the loop is essential
- Not just a workflow - Agents make decisions; workflows follow fixed paths
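The function-calling point is worth making concrete. In this toy sketch (no real LLM is involved; `fake_llm` is a deterministic stand-in, and the goal is just a counter reaching 3), a single function call stops after one step, while the loop keeps acting on feedback until the goal is met:

```python
# Contrast: one-shot function calling vs. an agent loop.
# Everything here is a toy stand-in -- no real LLM is involved.

def fake_llm(state: int) -> str:
    """Stand-in for an LLM: asks to increment until the counter reaches 3."""
    return "finish" if state >= 3 else "increment"

def increment(state: int) -> int:
    return state + 1

# Plain function calling: one decision, one call, done.
def one_shot(state: int) -> int:
    if fake_llm(state) == "increment":
        state = increment(state)
    return state  # stops after a single step, goal or not

# Agent loop: decide, act, observe the new state, repeat.
def agent_loop(state: int, max_iterations: int = 10) -> int:
    for _ in range(max_iterations):
        if fake_llm(state) == "finish":
            break
        state = increment(state)
    return state

print(one_shot(0))    # 1 -- a single call rarely reaches the goal
print(agent_loop(0))  # 3 -- the loop runs until the goal is met
```

The only difference between the two functions is the loop, yet only the second one reliably reaches the goal.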
The Litmus Test
A simple litmus test: does the system choose its own next step based on the result of its previous one? If the sequence of steps is fixed in advance, it's a workflow; if it closes the perceive-act-observe loop on its own, it's an agent.
The Core Components
Every agent architecture includes these fundamental components:
1. The Reasoning Engine (LLM)
The LLM is the "brain" of the agent. It:
- Interprets the current state and goal
- Decides which action to take next
- Generates plans for complex tasks
- Reflects on results and adjusts strategy
```python
# The LLM as reasoning engine
def decide_action(context: str, goal: str, available_tools: list) -> Action:
    """Use the LLM to decide the next action."""
    prompt = f"""
    Current context: {context}
    Goal: {goal}
    Available tools: {[t.name for t in available_tools]}

    Based on the current context and goal, what action should I take next?
    Choose a tool and specify the parameters.
    """
    response = llm.generate(prompt)
    return parse_action(response)  # Returns a structured Action object
```

2. Tools (Actions)
Tools are the agent's interface to the world. They allow the agent to:
- Read - Access files, databases, APIs
- Write - Create and modify files
- Execute - Run code, shell commands
- Search - Find information on the web or in codebases
- Communicate - Send messages, make API calls
```python
from typing import Protocol
import subprocess

class Tool(Protocol):
    """Interface that all tools must implement."""
    name: str
    description: str

    def execute(self, **kwargs) -> str:
        """Execute the tool and return results."""
        ...

# Example tools
class ReadFileTool:
    name = "read_file"
    description = "Read the contents of a file at the given path"

    def execute(self, file_path: str) -> str:
        with open(file_path, 'r') as f:
            return f.read()

class RunCommandTool:
    name = "run_command"
    description = "Execute a shell command and return output"

    def execute(self, command: str) -> str:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout + result.stderr

class WebSearchTool:
    name = "web_search"
    description = "Search the web and return results"

    def execute(self, query: str) -> str:
        # Implementation using a search API
        ...
```

3. Memory
Memory allows agents to maintain context across interactions:
| Memory Type | Duration | Use Case |
|---|---|---|
| Working Memory | Single task | Current conversation, immediate context |
| Short-term Memory | Session | Recent actions, intermediate results |
| Long-term Memory | Persistent | Learned patterns, user preferences, past solutions |
```python
class AgentMemory:
    def __init__(self):
        self.working_memory = []        # Current task context
        self.short_term = []            # Recent actions (last ~100)
        self.long_term = VectorStore()  # Persistent embeddings

    def add(self, entry: str, persist: bool = False):
        self.working_memory.append(entry)
        self.short_term.append(entry)
        self.short_term = self.short_term[-100:]  # Keep only recent entries

        if persist:
            embedding = embed(entry)
            self.long_term.add(embedding, entry)

    def recall(self, query: str, k: int = 5) -> list[str]:
        """Retrieve relevant memories."""
        # Search long-term memory using semantic similarity
        relevant = self.long_term.search(query, k=k)

        # Add recent short-term entries
        recent = self.short_term[-10:]

        return relevant + recent
```

4. Planning System
Complex goals require breaking down into achievable steps:
```python
def create_plan(goal: str, context: str) -> list[Step]:
    """Break down a complex goal into steps."""
    prompt = f"""
    Goal: {goal}
    Context: {context}

    Break this goal into concrete, achievable steps.
    For each step, specify:
    - What needs to be done
    - What tools might be needed
    - What success looks like
    """
    response = llm.generate(prompt)
    return parse_steps(response)

# Example output for goal: "Add user authentication to the app"
# [
#     Step("Research existing auth patterns in codebase", tools=["grep", "read_file"]),
#     Step("Choose authentication strategy", tools=["web_search"]),
#     Step("Implement login endpoint", tools=["edit_file"]),
#     Step("Implement registration endpoint", tools=["edit_file"]),
#     Step("Add session management", tools=["edit_file"]),
#     Step("Write tests", tools=["edit_file", "run_command"]),
#     Step("Run tests and fix issues", tools=["run_command", "edit_file"]),
# ]
```

5. Observation and Feedback
Agents must process the results of their actions:
```python
def process_observation(action: Action, result: str) -> dict:
    """Process the result of an action."""
    observation = {
        "action_taken": action.name,
        "parameters": action.params,
        "result": result,
        "success": determine_success(result),
        "insights": extract_insights(result),
    }

    # Did we encounter an error?
    if "error" in result.lower() or "failed" in result.lower():
        observation["needs_retry"] = True
        observation["suggested_fix"] = analyze_error(result)

    return observation
```

The Agent Loop in Detail
The agent loop is the heartbeat of any agentic system. Here's a complete implementation:
```python
class Agent:
    def __init__(self, tools: list[Tool], llm: LLM):
        self.tools = {t.name: t for t in tools}
        self.llm = llm
        self.memory = AgentMemory()
        self.max_iterations = 50

    def run(self, goal: str) -> AgentResult:
        """Main agent loop."""
        # Initialize state
        state = AgentState(goal=goal)
        self.memory.add(f"Goal: {goal}")

        for iteration in range(self.max_iterations):
            # 1. PERCEIVE: Gather current context
            context = self.gather_context(state)

            # 2. REASON: Decide next action
            action = self.decide_action(context, goal)

            # Check for completion
            if action.type == "finish":
                return AgentResult(
                    success=True,
                    output=action.params.get("result"),
                    iterations=iteration,
                )

            # 3. ACT: Execute the action
            self.memory.add(f"Action: {action.name}({action.params})")

            try:
                result = self.execute_action(action)
                self.memory.add(f"Result: {result[:500]}")  # Truncate long results
            except Exception as e:
                result = f"Error: {e}"
                self.memory.add(f"Error: {e}")

            # 4. OBSERVE: Update state with results
            observation = self.process_observation(action, result)
            state.update(observation)

            # 5. REFLECT: Check if we need to replan
            if observation.get("needs_replan"):
                new_plan = self.create_plan(goal, state.summary())
                state.update_plan(new_plan)

        # Max iterations reached
        return AgentResult(
            success=False,
            output="Max iterations reached without completing goal",
            iterations=self.max_iterations,
        )

    def gather_context(self, state: AgentState) -> str:
        """Compile relevant context for decision making."""
        return f"""
        Goal: {state.goal}
        Progress: {state.progress_summary()}
        Recent actions: {state.recent_actions(5)}
        Current plan: {state.current_plan}
        Relevant memories: {self.memory.recall(state.goal)}
        """

    def decide_action(self, context: str, goal: str) -> Action:
        """Use the LLM to decide the next action."""
        tool_descriptions = [
            f"{name}: {tool.description}"
            for name, tool in self.tools.items()
        ]

        prompt = f"""
        {context}

        Available tools:
        {chr(10).join(tool_descriptions)}

        Special action:
        - finish: Complete the task with a result

        What should I do next to achieve the goal?
        Respond with the tool name and parameters.
        """

        response = self.llm.generate(prompt)
        return parse_action(response)

    def execute_action(self, action: Action) -> str:
        """Execute the chosen action."""
        tool = self.tools.get(action.name)
        if tool is None:
            raise ValueError(f"Unknown tool: {action.name}")
        return tool.execute(**action.params)
```

The Loop is Everything
Every component covered above (memory, planning, observation) only does useful work inside this loop; a single tool call, however sophisticated, isn't an agent.
The Autonomy Spectrum
Agents exist on a spectrum of autonomy:
| Level | Description | Human Involvement | Example |
|---|---|---|---|
| 1: Copilot | Suggests actions, human approves | Every action | GitHub Copilot suggestions |
| 2: Supervised | Executes safe actions, asks for risky ones | Important decisions | Claude Code with permission prompts |
| 3: Autonomous | Executes all actions, reports results | Review after completion | Devin on standard tasks |
| 4: Self-Directed | Sets own goals based on high-level objectives | Periodic check-ins | AutoGPT with a mission |
Most Production Agents are Level 2-3
Full autonomy is rarely the goal in practice: most production systems execute routine actions on their own but keep a human in the loop for risky or irreversible ones.
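To make Level 2 concrete, here's a minimal sketch of supervised execution: safe tools run directly, while risky ones are gated behind an approval callback. The tool classes, the `RISKY_TOOLS` set, and the `approve` hook are all illustrative, not any particular framework's API.

```python
# Level 2 autonomy sketch: gate risky tools behind human approval.
# Tool names and the approval callback are illustrative stand-ins.

class EchoTool:
    name = "echo"
    def execute(self, text: str) -> str:
        return text

class DeployTool:
    name = "deploy"
    def execute(self, target: str) -> str:
        return f"deployed to {target}"

# Tools whose effects are hard to undo require a human sign-off.
RISKY_TOOLS = {"run_command", "delete_file", "deploy"}

def execute_with_supervision(tools: dict, name: str, params: dict, approve) -> str:
    """Run safe tools directly; gate risky ones behind an approval callback."""
    if name in RISKY_TOOLS and not approve(name, params):
        return f"Skipped {name}: not approved"
    return tools[name].execute(**params)

tools = {"echo": EchoTool(), "deploy": DeployTool()}
deny = lambda name, params: False  # stand-in for a user saying "no"

print(execute_with_supervision(tools, "echo", {"text": "hi"}, deny))        # hi
print(execute_with_supervision(tools, "deploy", {"target": "prod"}, deny))  # Skipped deploy: not approved
```

In a real agent, `approve` would prompt the user in the terminal or UI; the point is that the gating logic lives outside the tools themselves, so the same tool set works at any autonomy level.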
Types of Agents
By Application Domain
| Type | Primary Task | Key Tools | Examples |
|---|---|---|---|
| Coding Agent | Write and modify code | File ops, code execution, Git | Claude Code, Cursor, Codex |
| Research Agent | Find and synthesize information | Web search, document reading | Perplexity, custom research bots |
| Data Agent | Analyze and transform data | SQL, Python, visualization | Data analysis assistants |
| Browser Agent | Navigate and interact with websites | Browser control, screenshots | Browserbase agents |
| Assistant Agent | Handle diverse tasks | Multiple tool categories | General-purpose assistants |
By Architecture
- Single Agent - One LLM handling the entire task
- Multi-Agent - Multiple specialized agents collaborating
- Hierarchical - Manager agent coordinating worker agents
- Swarm - Many simple agents working in parallel
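The hierarchical pattern can be sketched in a few lines: a manager routes subtasks to specialized workers. In this toy version the decomposition is hard-coded and the workers just echo their work; a real manager would use an LLM to plan, delegate, and merge results.

```python
# Toy sketch of the hierarchical pattern: a manager decomposes work
# and routes each subtask to a specialized worker agent.

class Worker:
    def __init__(self, specialty: str):
        self.specialty = specialty

    def handle(self, subtask: str) -> str:
        # A real worker would run its own agent loop here.
        return f"[{self.specialty}] done: {subtask}"

class Manager:
    def __init__(self, workers: dict[str, Worker]):
        self.workers = workers

    def run(self, subtasks: list[tuple[str, str]]) -> list[str]:
        # Route each (specialty, subtask) pair to the matching worker.
        return [self.workers[spec].handle(task) for spec, task in subtasks]

manager = Manager({"code": Worker("code"), "test": Worker("test")})
results = manager.run([("code", "implement login"), ("test", "write login tests")])
print(results)
# ['[code] done: implement login', '[test] done: write login tests']
```

The same skeleton covers the other architectures: a single agent is one worker with no manager, and a swarm is many workers run in parallel over the same task list.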
Summary
An AI agent is more than just an LLM with tools. It's a complete system with:
- Reasoning Engine - The LLM that makes decisions
- Tools - Capabilities to interact with the world
- Memory - Context and learning across interactions
- Planning - Ability to break down complex goals
- Feedback Loops - Observation and adaptation
Remember: The agent loop - perceive, reason, act, observe, reflect - is the pattern that transforms an LLM from a text generator into an autonomous problem solver.
In the next section, we'll survey the current landscape of AI agents, from research prototypes to production systems.