Chapter 1

What Makes an AI Agent

The Agentic AI Revolution

Introduction

The term "AI agent" gets thrown around a lot, often inconsistently. Some call any LLM with tools an agent. Others reserve the term for fully autonomous systems. In this section, we'll establish a clear, practical definition and explore the key components that transform an LLM into a true agent.

Our Working Definition: An AI agent is a system that uses an LLM as its reasoning engine to perceive its environment, make decisions, and take actions to achieve specific goals, adapting its behavior based on feedback.

Defining AI Agents

The Essential Properties

For an AI system to be considered an agent, it must exhibit these core properties:

| Property | Description | Example |
|---|---|---|
| Goal-Directed | Works toward achieving specific objectives | "Deploy this application to production" |
| Autonomous | Makes decisions without constant human input | Chooses which files to edit, which tools to use |
| Adaptive | Adjusts behavior based on feedback and results | Retries with a different approach when tests fail |
| Interactive | Perceives and acts on its environment | Reads files, executes code, makes API calls |
| Persistent | Continues working across multiple steps | Keeps working until the goal is achieved or abandoned |

What an Agent is NOT

To sharpen our definition, let's clarify what doesn't qualify as an agent:

  • Not just a chatbot - Agents take actions, not just produce text
  • Not just RAG - Retrieval is a tool agents can use, not what makes them agents
  • Not just function calling - A single tool call isn't an agent; the loop is essential
  • Not just a workflow - Agents make decisions; workflows follow fixed paths
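The workflow distinction is worth making concrete. Here is a toy sketch, with invented step names and a hard-coded stand-in for the LLM's reasoning: the workflow runs the same fixed sequence every time, while the agent chooses its next step based on what the previous one returned.

```python
# A workflow: a fixed path with no decisions. Runs the same steps every time.
def workflow(task: str) -> list[str]:
    results = []
    for step in ["lint", "test", "deploy"]:  # hard-coded sequence
        results.append(f"{step}: ok")
    return results

# A (toy) agent loop: the next step depends on the previous result.
def agent(task: str, max_steps: int = 5) -> list[str]:
    history = []
    state = "start"
    for _ in range(max_steps):
        # This if/elif stands in for an LLM call that picks an action.
        if state == "start":
            action = "run_tests"
        elif state == "tests_failed":
            action = "fix_and_retry"
        else:
            action = "finish"
        history.append(action)
        if action == "finish":
            break
        # Simulate environment feedback: the first test run fails.
        state = "tests_failed" if action == "run_tests" else "tests_passed"
    return history
```

The workflow always produces the same three results; the agent's trace (`run_tests`, `fix_and_retry`, `finish`) only exists because the failed test run fed back into its next decision.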

The Litmus Test

Ask yourself: "If I give this system a goal, can it autonomously work toward that goal, adapt when things go wrong, and tell me when it's done?" If yes, you have an agent.

The Core Components

Every agent architecture includes these fundamental components:

1. The Reasoning Engine (LLM)

The LLM is the "brain" of the agent. It:

  • Interprets the current state and goal
  • Decides which action to take next
  • Generates plans for complex tasks
  • Reflects on results and adjusts strategy
🐍reasoning_engine.py
# The LLM as reasoning engine
def decide_action(context: str, goal: str, available_tools: list) -> Action:
    """Use LLM to decide the next action."""

    prompt = f"""
    Current context: {context}
    Goal: {goal}
    Available tools: {[t.name for t in available_tools]}

    Based on the current context and goal, what action should I take next?
    Choose a tool and specify the parameters.
    """

    response = llm.generate(prompt)
    return parse_action(response)  # Returns structured Action object
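The `parse_action` helper is left abstract above. One common approach, sketched here under the assumption that the LLM is prompted to reply in JSON, is a small dataclass plus a parser; the `Action` fields mirror how actions are used elsewhere in this section.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str
    params: dict = field(default_factory=dict)
    type: str = "tool"  # "tool" or "finish"

def parse_action(response: str) -> Action:
    """Parse an LLM reply like '{"tool": "read_file", "params": {...}}'."""
    data = json.loads(response)
    name = data["tool"]
    return Action(
        name=name,
        params=data.get("params", {}),
        type="finish" if name == "finish" else "tool",
    )
```

Real systems typically get the same structure more reliably from the provider's native function-calling or structured-output API rather than parsing free-form text.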

2. Tools (Actions)

Tools are the agent's interface to the world. They allow the agent to:

  • Read - Access files, databases, APIs
  • Write - Create and modify files
  • Execute - Run code, shell commands
  • Search - Find information on the web or in codebases
  • Communicate - Send messages, make API calls
🐍tool_example.py
import subprocess
from typing import Protocol

class Tool(Protocol):
    """Interface that all tools must implement."""
    name: str
    description: str

    def execute(self, **kwargs) -> str:
        """Execute the tool and return results."""
        ...

# Example tools
class ReadFileTool:
    name = "read_file"
    description = "Read the contents of a file at the given path"

    def execute(self, file_path: str) -> str:
        with open(file_path, 'r') as f:
            return f.read()

class RunCommandTool:
    name = "run_command"
    description = "Execute a shell command and return output"

    def execute(self, command: str) -> str:
        result = subprocess.run(command, shell=True, capture_output=True)
        return result.stdout.decode() + result.stderr.decode()

class WebSearchTool:
    name = "web_search"
    description = "Search the web and return results"

    def execute(self, query: str) -> str:
        # Implementation using search API
        ...

3. Memory

Memory allows agents to maintain context across interactions:

| Memory Type | Duration | Use Case |
|---|---|---|
| Working Memory | Single task | Current conversation, immediate context |
| Short-term Memory | Session | Recent actions, intermediate results |
| Long-term Memory | Persistent | Learned patterns, user preferences, past solutions |
🐍memory_system.py
class AgentMemory:
    def __init__(self):
        self.working_memory = []       # Current task context
        self.short_term = []           # Recent actions (last ~100)
        self.long_term = VectorStore() # Persistent embeddings

    def add(self, entry: str, persist: bool = False):
        self.working_memory.append(entry)
        self.short_term.append(entry)
        self.short_term = self.short_term[-100:]  # Enforce the ~100-entry cap

        if persist:
            embedding = embed(entry)
            self.long_term.add(embedding, entry)

    def recall(self, query: str, k: int = 5) -> list[str]:
        """Retrieve relevant memories."""
        # Search long-term memory using semantic similarity
        relevant = self.long_term.search(query, k=k)

        # Add recent short-term entries
        recent = self.short_term[-10:]

        return relevant + recent

4. Planning System

Complex goals must be broken down into achievable steps:

🐍planning_system.py
def create_plan(goal: str, context: str) -> list[Step]:
    """Break down a complex goal into steps."""

    prompt = f"""
    Goal: {goal}
    Context: {context}

    Break this goal into concrete, achievable steps.
    For each step, specify:
    - What needs to be done
    - What tools might be needed
    - What success looks like
    """

    response = llm.generate(prompt)
    return parse_steps(response)

# Example output for goal: "Add user authentication to the app"
# [
#     Step("Research existing auth patterns in codebase", tools=["grep", "read_file"]),
#     Step("Choose authentication strategy", tools=["web_search"]),
#     Step("Implement login endpoint", tools=["edit_file"]),
#     Step("Implement registration endpoint", tools=["edit_file"]),
#     Step("Add session management", tools=["edit_file"]),
#     Step("Write tests", tools=["edit_file", "run_command"]),
#     Step("Run tests and fix issues", tools=["run_command", "edit_file"]),
# ]

5. Observation and Feedback

Agents must process the results of their actions:

🐍observation.py
def process_observation(action: Action, result: str) -> Observation:
    """Process the result of an action."""

    observation = {
        "action_taken": action.name,
        "parameters": action.params,
        "result": result,
        "success": determine_success(result),
        "insights": extract_insights(result),
    }

    # Did we encounter an error?
    if "error" in result.lower() or "failed" in result.lower():
        observation["needs_retry"] = True
        observation["suggested_fix"] = analyze_error(result)

    return observation
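The helpers `determine_success` and `analyze_error` are assumed above. A deliberately naive keyword heuristic for the former might look like the sketch below; production agents usually lean on exit codes, structured tool results, or an LLM judgment instead of string matching.

```python
def determine_success(result: str) -> bool:
    """Naive heuristic: treat error-ish keywords anywhere in the output as failure."""
    failure_markers = ("error", "failed", "exception", "traceback")
    lowered = result.lower()
    return not any(marker in lowered for marker in failure_markers)
```

Note the obvious failure mode: a result like "0 errors found" would be misclassified, which is exactly why exit codes are the better signal when available.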

The Agent Loop in Detail

The agent loop is the heartbeat of any agentic system. Here's a complete implementation:

🐍complete_agent_loop.py
class Agent:
    def __init__(self, tools: list[Tool], llm: LLM):
        self.tools = {t.name: t for t in tools}
        self.llm = llm
        self.memory = AgentMemory()
        self.max_iterations = 50

    def run(self, goal: str) -> AgentResult:
        """Main agent loop."""

        # Initialize state
        state = AgentState(goal=goal)
        self.memory.add(f"Goal: {goal}")

        for iteration in range(self.max_iterations):
            # 1. PERCEIVE: Gather current context
            context = self.gather_context(state)

            # 2. REASON: Decide next action
            action = self.decide_action(context, goal)

            # Check for completion
            if action.type == "finish":
                return AgentResult(
                    success=True,
                    output=action.params.get("result"),
                    iterations=iteration
                )

            # 3. ACT: Execute the action
            self.memory.add(f"Action: {action.name}({action.params})")

            try:
                result = self.execute_action(action)
                self.memory.add(f"Result: {result[:500]}")  # Truncate long results
            except Exception as e:
                result = f"Error: {str(e)}"
                self.memory.add(f"Error: {str(e)}")

            # 4. OBSERVE: Update state with results
            observation = self.process_observation(action, result)
            state.update(observation)

            # 5. REFLECT: Check if we need to replan
            if observation.get("needs_replan"):
                new_plan = self.create_plan(goal, state.summary())
                state.update_plan(new_plan)

        # Max iterations reached
        return AgentResult(
            success=False,
            output="Max iterations reached without completing goal",
            iterations=self.max_iterations
        )

    def gather_context(self, state: AgentState) -> str:
        """Compile relevant context for decision making."""
        return f"""
        Goal: {state.goal}
        Progress: {state.progress_summary()}
        Recent actions: {state.recent_actions(5)}
        Current plan: {state.current_plan}
        Relevant memories: {self.memory.recall(state.goal)}
        """

    def decide_action(self, context: str, goal: str) -> Action:
        """Use LLM to decide the next action."""
        tool_descriptions = [
            f"{name}: {tool.description}"
            for name, tool in self.tools.items()
        ]

        prompt = f"""
        {context}

        Available tools:
        {chr(10).join(tool_descriptions)}

        Special action:
        - finish: Complete the task with a result

        What should I do next to achieve the goal?
        Respond with the tool name and parameters.
        """

        response = self.llm.generate(prompt)
        return parse_action(response)

    def execute_action(self, action: Action) -> str:
        """Execute the chosen action."""
        tool = self.tools.get(action.name)
        if tool is None:
            raise ValueError(f"Unknown tool: {action.name}")
        return tool.execute(**action.params)

The Loop is Everything

This loop - perceive, reason, act, observe, reflect - is the fundamental pattern of all agents. Every agent framework (LangGraph, CrewAI, AutoGPT) implements some variation of this loop.

The Autonomy Spectrum

Agents exist on a spectrum of autonomy:

| Level | Description | Human Involvement | Example |
|---|---|---|---|
| 1: Copilot | Suggests actions, human approves | Every action | GitHub Copilot suggestions |
| 2: Supervised | Executes safe actions, asks for risky ones | Important decisions | Claude Code with permission prompts |
| 3: Autonomous | Executes all actions, reports results | Review after completion | Devin on standard tasks |
| 4: Self-Directed | Sets own goals based on high-level objectives | Periodic check-ins | AutoGPT with a mission |

Most Production Agents are Level 2-3

Self-directed agents (Level 4) are risky in production. Most practical deployments use supervised or autonomous modes (Levels 2-3) with human oversight for critical actions.
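A Level 2 agent can be sketched as a thin permission layer wrapped around tool execution. Everything here is illustrative: the allowlist, the `execute` and `ask_human` callbacks, and the denial message are assumptions, not any framework's actual API.

```python
# Assumed allowlist: tools considered read-only / low-risk.
SAFE_TOOLS = {"read_file", "web_search", "grep"}

def supervised_execute(tool_name: str, params: dict, execute, ask_human) -> str:
    """Run safe tools directly; ask a human before anything risky.

    execute(tool_name, params) -> str  performs the real tool call.
    ask_human(question) -> bool        prompts the user for approval.
    """
    if tool_name in SAFE_TOOLS:
        return execute(tool_name, params)
    if ask_human(f"Allow {tool_name}({params})?"):
        return execute(tool_name, params)
    return f"Denied by user: {tool_name}"
```

The denial string is returned rather than raised so it flows back through the observation step, letting the agent pick a different approach instead of crashing.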

Types of Agents

By Application Domain

| Type | Primary Task | Key Tools | Examples |
|---|---|---|---|
| Coding Agent | Write and modify code | File ops, code execution, Git | Claude Code, Cursor, Codex |
| Research Agent | Find and synthesize information | Web search, document reading | Perplexity, custom research bots |
| Data Agent | Analyze and transform data | SQL, Python, visualization | Data analysis assistants |
| Browser Agent | Navigate and interact with websites | Browser control, screenshots | Browserbase agents |
| Assistant Agent | Handle diverse tasks | Multiple tool categories | General-purpose assistants |

By Architecture

  • Single Agent - One LLM handling the entire task
  • Multi-Agent - Multiple specialized agents collaborating
  • Hierarchical - Manager agent coordinating worker agents
  • Swarm - Many simple agents working in parallel
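To make the hierarchical pattern above concrete, here is a minimal sketch. The keyword-matching routing rule stands in for an LLM decision, and the worker names and skills are invented for illustration.

```python
class Worker:
    """A specialized agent identified by the one skill keyword it handles."""
    def __init__(self, name: str, skill: str):
        self.name = name
        self.skill = skill

    def handle(self, task: str) -> str:
        # A real worker would run its own agent loop here.
        return f"{self.name} completed: {task}"

class Manager:
    """Coordinates workers by routing each subtask to a matching skill."""
    def __init__(self, workers: list[Worker]):
        self.workers = workers

    def run(self, subtasks: list[str]) -> list[str]:
        results = []
        for task in subtasks:
            worker = next(
                (w for w in self.workers if w.skill in task),
                self.workers[0],  # fall back to the first worker
            )
            results.append(worker.handle(task))
        return results
```

Usage: `Manager([Worker("coder", "code"), Worker("tester", "test")]).run(["write code for login", "test login"])` routes the first subtask to the coder and the second to the tester. In a real system, the manager would also be an agent, decomposing the goal and judging the workers' results.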

Summary

An AI agent is more than just an LLM with tools. It's a complete system with:

  1. Reasoning Engine - The LLM that makes decisions
  2. Tools - Capabilities to interact with the world
  3. Memory - Context and learning across interactions
  4. Planning - Ability to break down complex goals
  5. Feedback Loops - Observation and adaptation
Remember: The agent loop - perceive, reason, act, observe, reflect - is the pattern that transforms an LLM from a text generator into an autonomous problem solver.

In the next section, we'll survey the current landscape of AI agents, from research prototypes to production systems.