Introduction
The Thought-Action-Observation (TAO) cycle is the heartbeat of ReAct. Understanding each component and how they interact is essential for building effective agents.
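The Cycle: Think about what to do → Do it → See what happened → Think about what to do next. Because each observation feeds the next thought, the agent can change course mid-task, which is how this simple loop produces complex, adaptive behavior.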
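To make the loop concrete before looking at each component, here is a toy sketch with no LLM involved. The "agent" guesses a hidden number, and each observation ("too low" or "too high") narrows the range used by the next thought. The function name `run_tao_loop` and the guessing task are illustrative assumptions, not part of ReAct itself.

```python
def run_tao_loop(secret: int, low: int = 1, high: int = 100, max_steps: int = 10) -> list[dict]:
    """Toy Thought-Action-Observation loop: find `secret` from feedback."""
    history = []
    for _ in range(max_steps):
        # THOUGHT: pick the midpoint of the range still consistent with
        # every observation seen so far.
        guess = (low + high) // 2
        thought = f"The number is between {low} and {high}; try {guess}."

        # ACTION + OBSERVATION: the "environment" compares the guess
        # with the secret and reports the outcome.
        if guess == secret:
            observation = "correct"
        elif guess < secret:
            observation = "too low"
        else:
            observation = "too high"

        history.append({
            "thought": thought,
            "action": f"guess({guess})",
            "observation": observation,
        })

        # The observation shapes the next thought by shrinking the range.
        if observation == "correct":
            break
        if observation == "too low":
            low = guess + 1
        else:
            high = guess - 1
    return history


for step in run_tao_loop(secret=37):
    print(step["thought"], "->", step["action"], "->", step["observation"])
```

A real agent replaces the midpoint rule with an LLM call and the comparison with tool execution, but the control flow stays the same.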
The Thought Component
Thoughts are the reasoning traces that guide agent behavior. They serve multiple purposes:
Types of Thoughts
| Type | Purpose | Example |
|---|---|---|
| Planning | Decide next steps | I need to find the file first, then edit it |
| Analysis | Understand observations | This error means the file doesn't exist |
| Synthesis | Combine information | Based on A and B, I should try C |
| Verification | Check work | Let me verify this change works correctly |
| Conclusion | Finalize results | I have all the info needed to answer |
🐍thought_generation.py
from typing import Protocol


class LLM(Protocol):
    """Minimal interface assumed for the language model client."""

    def generate(self, prompt: str) -> str: ...


def generate_thought(
    task: str,
    history: list[dict],
    llm: LLM,
) -> str:
    """Generate a thought based on current context."""

    prompt = f"""
You are working on the following task:
{task}

Previous steps:
{format_history(history)}

Think about:
1. What have you learned so far?
2. What do you still need to do?
3. What is the next logical step?

Provide your thought process:
Thought: """

    response = llm.generate(prompt)
    return response.strip()


def format_history(history: list[dict]) -> str:
    """Format history for prompt."""
    if not history:
        # Make the first step explicit instead of leaving a blank section
        return "(no steps yet)"
    lines = []
    for i, step in enumerate(history, 1):
        lines.append(f"Thought {i}: {step['thought']}")
        lines.append(f"Action {i}: {step['action']}")
        lines.append(f"Observation {i}: {step['observation']}")
    return "\n".join(lines)

Effective Thoughts
📝good_vs_bad_thoughts.txt
GOOD THOUGHTS (Specific, Actionable):

"The error says 'module not found'. I should check if the package
 is installed by running pip list."

"I found 3 files matching the pattern. The most recent one
 (config.json) is likely the one I need."

"The test passed but I should also check edge cases with
 empty input and null values."

---

BAD THOUGHTS (Vague, Repetitive):

"I need to do something about this error."
→ Doesn't specify what action to take

"Let me try again."
→ No analysis of what went wrong

"I'm not sure what to do."
→ Gives up instead of reasoning through options

Thought Quality Matters
The quality of thoughts directly affects agent performance. Encourage specific, analytical thoughts through good prompting and examples.
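One way to encourage specific thoughts through examples is to place a few good observation/thought pairs in the prompt before asking for the next thought. The sketch below is a minimal illustration: the function `build_thought_prompt`, the constant `FEW_SHOT_EXAMPLES`, and the instruction wording are all assumptions made for this example.

```python
# Hypothetical few-shot pairs (observation, thought); wording is illustrative.
FEW_SHOT_EXAMPLES = [
    (
        "ModuleNotFoundError: No module named 'requests'",
        "The error says the module was not found. I should check whether "
        "the package is installed by running pip list.",
    ),
    (
        "Found 3 files matching the pattern",
        "Three files match. The most recent one (config.json) is likely "
        "the one I need, so I will read it next.",
    ),
]


def build_thought_prompt(task: str, last_observation: str) -> str:
    """Build a prompt that shows specific thoughts before asking for one."""
    lines = [
        "State what the observation tells you and name one concrete next step.",
        "",
    ]
    # Each example pairs an observation with a specific, actionable thought.
    for observation, thought in FEW_SHOT_EXAMPLES:
        lines.append(f"Observation: {observation}")
        lines.append(f"Thought: {thought}")
        lines.append("")
    # End on an open "Thought:" so the model completes it in the same style.
    lines.append(f"Task: {task}")
    lines.append(f"Observation: {last_observation}")
    lines.append("Thought:")
    return "\n".join(lines)


print(build_thought_prompt("Fix the failing import", "ImportError: cannot import name 'x'"))
```

Ending the prompt on an open `Thought:` nudges the model to continue in the style of the examples rather than fall back to vague statements.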
The Action Component
Actions are the interface between the agent's reasoning and the external world:
Action Types
🐍action_types.py
import ast
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ActionType(Enum):
    TOOL = "tool"          # Call a tool
    SEARCH = "search"      # Search for information
    EXECUTE = "execute"    # Run code/commands
    FINISH = "finish"      # Complete the task
    DELEGATE = "delegate"  # Hand off to another agent


@dataclass
class Action:
    type: ActionType
    name: str
    parameters: dict[str, Any]

    @classmethod
    def from_text(cls, text: str) -> "Action":
        """Parse action from LLM output."""
        # Example: "Action: search(query='python async')"
        call_text = text.strip().removeprefix("Action:").strip()

        # Parse the call with ast instead of splitting on "," and "=",
        # which breaks on values such as 'a, b' or 'x=1'.
        try:
            node = ast.parse(call_text, mode="eval").body
        except SyntaxError as exc:
            raise ValueError(f"Invalid action format: {text}") from exc
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError(f"Invalid action format: {text}")
        if node.args or any(kw.arg is None for kw in node.keywords):
            raise ValueError(f"Use keyword arguments only: {text}")

        # literal_eval accepts only Python literals, unlike eval(),
        # so model output cannot run arbitrary code here.
        try:
            params = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        except (ValueError, TypeError) as exc:
            raise ValueError(f"Invalid action arguments: {text}") from exc

        name = node.func.id
        return cls(
            type=cls._infer_type(name),
            name=name,
            parameters=params,
        )

    @staticmethod
    def _infer_type(name: str) -> ActionType:
        if name == "finish":
            return ActionType.FINISH
        elif name == "search":
            return ActionType.SEARCH
        elif name in ["bash", "python", "execute"]:
            return ActionType.EXECUTE
        else:
            return ActionType.TOOL

Action Execution
🐍action_execution.py
import subprocess
import sys
from typing import Callable

# Assumes the Action definitions above are saved as action_types.py.
from action_types import Action, ActionType


class ActionExecutor:
    """Execute actions and return observations."""

    def __init__(self, tools: dict[str, Callable]):
        self.tools = tools

    def execute(self, action: Action) -> str:
        """Execute an action and return the observation."""

        try:
            if action.type == ActionType.FINISH:
                return action.parameters.get("result", "Task completed")

            # SEARCH is dispatched like any other tool, so a "search"
            # callable must be registered in self.tools.
            if action.type in (ActionType.TOOL, ActionType.SEARCH):
                tool = self.tools.get(action.name)
                if not tool:
                    return f"Error: Unknown tool '{action.name}'"
                return str(tool(**action.parameters))

            if action.type == ActionType.EXECUTE:
                return self._execute_code(
                    action.parameters.get("code", ""),
                    action.parameters.get("language", "python"),
                )

            return f"Unknown action type: {action.type}"

        except Exception as e:
            # Includes subprocess.TimeoutExpired from _execute_code
            return f"Error executing {action.name}: {str(e)}"

    def _execute_code(self, code: str, language: str) -> str:
        """Execute code in a subprocess with a timeout.

        Note: a subprocess is not a real sandbox. The code runs with the
        same permissions as the agent process.
        """
        if language == "python":
            result = subprocess.run(
                [sys.executable, "-c", code],  # same interpreter as the agent
                capture_output=True,
                text=True,
                timeout=30,
            )
            return result.stdout + result.stderr

        return f"Unsupported language: {language}"

Action Validation
🐍action_validation.py
# Assumes the Action definitions above are saved as action_types.py.
from action_types import Action, ActionType


class ActionValidator:
    """Validate actions before execution."""

    MAX_STRING_LENGTH = 10_000

    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = allowed_tools
        # A substring blocklist is easy to bypass. Treat it as a last
        # line of defense, not as a replacement for real isolation.
        self.dangerous_patterns = [
            "rm -rf",
            "DROP TABLE",
            "sudo",
            "chmod 777",
        ]

    def validate(self, action: Action) -> tuple[bool, str]:
        """Validate an action. Returns (is_valid, reason)."""

        # Check if tool is allowed
        if action.type == ActionType.TOOL:
            if action.name not in self.allowed_tools:
                return False, f"Tool '{action.name}' is not allowed"

        # Check for dangerous patterns (case-insensitive)
        action_str = str(action.parameters).lower()
        for pattern in self.dangerous_patterns:
            if pattern.lower() in action_str:
                return False, f"Dangerous pattern detected: {pattern}"

        # Check parameter values
        if not self._validate_parameters(action):
            return False, "Invalid parameters: string value too long"

        return True, "Valid"

    def _validate_parameters(self, action: Action) -> bool:
        """Reject oversized string values."""
        for key, value in action.parameters.items():
            # Very long strings are a common sign of a runaway or
            # malicious payload
            if isinstance(value, str) and len(value) > self.MAX_STRING_LENGTH:
                return False
        return True

The Observation Component
Observations are the feedback from actions that inform the next thought:
Observation Types
| Type | Content | Example |
|---|---|---|
| Success | Action result | File created at /path/to/file.txt |
| Error | Error message | Permission denied: /etc/passwd |
| Data | Retrieved information | Current temperature: 72°F |
| Partial | Incomplete result | Found 100 results, showing first 10 |
| Empty | No result | No matches found |
🐍observation_processing.py
import re
from dataclasses import dataclass, field
from enum import Enum

# Matches ANSI escape sequences such as terminal color codes
ANSI_ESCAPE = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')


class ObservationType(Enum):
    SUCCESS = "success"
    ERROR = "error"
    DATA = "data"
    PARTIAL = "partial"
    EMPTY = "empty"


@dataclass
class Observation:
    type: ObservationType
    content: str
    # default_factory gives each observation its own dict instead of None
    metadata: dict = field(default_factory=dict)

    @classmethod
    def from_result(cls, result: str) -> "Observation":
        """Create observation from action result."""

        # Classify observation type with simple keyword heuristics.
        # DATA is never inferred here; set it explicitly when needed.
        if not result or not result.strip():
            return cls(ObservationType.EMPTY, "No output")

        lowered = result.lower()
        if "error" in lowered or "exception" in lowered:
            return cls(ObservationType.ERROR, result)

        if "found" in lowered and "showing" in lowered:
            return cls(ObservationType.PARTIAL, result)

        return cls(ObservationType.SUCCESS, result)

    def summarize(self, max_length: int = 500) -> str:
        """Summarize observation for context."""
        if len(self.content) <= max_length:
            return self.content

        # Truncate with indicator
        return self.content[:max_length] + "\n... (truncated)"


class ObservationProcessor:
    """Process and format observations."""

    def process(self, raw_result: str) -> Observation:
        """Process raw action result into observation."""

        # Clean up result
        cleaned = self._clean_result(raw_result)

        # Classify and create observation
        observation = Observation.from_result(cleaned)

        # Extract metadata if present
        observation.metadata = self._extract_metadata(cleaned)

        return observation

    def _clean_result(self, result: str) -> str:
        """Clean up raw result."""
        # Remove ANSI codes
        return ANSI_ESCAPE.sub('', result)

    def _extract_metadata(self, result: str) -> dict:
        """Extract metadata from result."""
        metadata = {}

        # Count lines
        lines = result.split("\n")
        metadata["line_count"] = len(lines)

        # Check for common patterns
        lowered = result.lower()
        if "created" in lowered:
            metadata["action"] = "create"
        elif "updated" in lowered:
            metadata["action"] = "update"
        elif "deleted" in lowered:
            metadata["action"] = "delete"

        return metadata

Observation Handling
📝observation_handling.txt
SUCCESS OBSERVATION:
Thought: The file was created successfully. Let me verify its contents.
Action: read_file(path="/path/to/file.txt")

ERROR OBSERVATION:
Thought: Permission denied. I should try a different approach
 or ask for elevated permissions.
Action: ask_user(question="Need permission to access /etc/passwd. Proceed?")

PARTIAL OBSERVATION:
Thought: There are 100 results. Let me refine my search to find
 the most relevant one.
Action: search(query="config.json in src/", limit=10)

EMPTY OBSERVATION:
Thought: No results found. Let me try a different search term
 or check if the file exists.
Action: list_files(directory="./")

DATA OBSERVATION:
Thought: Got the temperature (72°F). Now I can compare it
 to the threshold and decide the action.
Action: finish(result="Temperature 72°F is below threshold 75°F")

The Complete Cycle
🐍complete_cycle.py
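from __future__ import annotations

# Assumes the earlier listings are saved under the file names shown above.
# LLM is not imported: any object with a generate(prompt) -> str method
# works, and the annotations below are not evaluated at runtime.
from action_execution import ActionExecutor
from action_types import Action, ActionType
from action_validation import ActionValidator
from observation_processing import Observation, ObservationProcessor, ObservationType


class ReActCycle:
    """Complete Thought-Action-Observation cycle."""

    def __init__(self, llm: LLM, executor: ActionExecutor, validator: ActionValidator):
        self.llm = llm
        self.executor = executor
        self.validator = validator
        self.processor = ObservationProcessor()

    def run(self, task: str, max_steps: int = 10) -> str:
        """Run the complete ReAct cycle."""

        history = []
        step = 0

        while step < max_steps:
            step += 1

            # 1. THOUGHT
            thought = self.generate_thought(task, history)
            print(f"Thought {step}: {thought}")

            # Check if done
            if self.is_finished(thought):
                return self.extract_answer(thought)

            # 2. ACTION
            try:
                action = self.generate_action(task, history, thought)
            except ValueError as exc:
                # Unparseable action: feed the error back so the model can retry
                history.append({
                    "thought": thought,
                    "action": "(unparseable)",
                    "observation": f"Error: {exc}",
                })
                continue
            print(f"Action {step}: {action.name}({action.parameters})")

            # Validate action
            valid, reason = self.validator.validate(action)
            if not valid:
                observation = Observation(
                    ObservationType.ERROR,
                    f"Invalid action: {reason}"
                )
            else:
                # 3. EXECUTE
                raw_result = self.executor.execute(action)
                observation = self.processor.process(raw_result)

            print(f"Observation {step}: {observation.summarize(200)}")

            # Update history
            history.append({
                "thought": thought,
                "action": f"{action.name}({action.parameters})",
                "observation": observation.content,
            })

            # Check for terminal action
            if valid and action.type == ActionType.FINISH:
                return observation.content

        return "Max steps reached without completing task"

    def generate_thought(self, task: str, history: list) -> str:
        """Generate next thought."""
        prompt = self._build_prompt(task, history, "thought")
        response = self.llm.generate(prompt)
        return self._extract_thought(response)

    def generate_action(
        self,
        task: str,
        history: list,
        thought: str,
    ) -> Action:
        """Generate next action based on thought."""
        prompt = self._build_prompt(
            task, history, "action",
            current_thought=thought,
        )
        response = self.llm.generate(prompt)
        return Action.from_text(self._extract_action(response))

    def is_finished(self, thought: str) -> bool:
        """Check if thought indicates completion."""
        # Indicators are lowercase because the thought is lowercased below
        finish_indicators = [
            "i have enough information",
            "task is complete",
            "i can now provide the answer",
            "i'm done",
        ]
        thought_lower = thought.lower()
        return any(ind in thought_lower for ind in finish_indicators)

    # The helpers below are minimal placeholders; adapt them to your
    # prompt format and model.

    def _build_prompt(self, task: str, history: list, mode: str,
                      current_thought: str = "") -> str:
        """Build the prompt for the next thought or action."""
        parts = [f"Task: {task}"]
        for s in history:
            parts.append(f"Thought: {s['thought']}")
            parts.append(f"Action: {s['action']}")
            parts.append(f"Observation: {s['observation']}")
        if mode == "action":
            parts.append(f"Thought: {current_thought}")
            parts.append("Action:")
        else:
            parts.append("Thought:")
        return "\n".join(parts)

    def _extract_thought(self, response: str) -> str:
        """Keep the text before any "Action:" the model may have added."""
        text = response.split("Action:")[0].strip()
        return text.removeprefix("Thought:").strip()

    def _extract_action(self, response: str) -> str:
        """Return the first non-empty line without its "Action:" prefix."""
        for line in response.splitlines():
            if line.strip():
                return line.strip().removeprefix("Action:").strip()
        return ""

    def extract_answer(self, thought: str) -> str:
        """Use the final thought as the answer."""
        return thought

Cycle Termination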
Always include clear termination conditions: explicit finish actions, max step limits, and detection of completion in thoughts. Without these, agents can loop forever.
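The three conditions above can be gathered into one check that also reports why the loop stopped. The sketch below is a minimal illustration: the function `stop_reason`, the phrase list, and the `finish(` prefix convention are assumptions for this example. The repeated-action check goes beyond the three conditions named above and is included only as one possible extra safeguard against loops.

```python
from typing import Optional

# Illustrative completion phrases, stored lowercase for comparison.
FINISH_PHRASES = ("task is complete", "i have enough information")


def stop_reason(
    step: int,
    max_steps: int,
    thought: str,
    actions: list[str],
    repeat_limit: int = 3,
) -> Optional[str]:
    """Return why the loop should stop, or None to keep going.

    `actions` holds every action string so far, newest last,
    for example "search(query='x')".
    """
    # 1. Explicit finish action
    if actions and actions[-1].startswith("finish("):
        return "finish_action"

    # 2. Completion detected in the thought
    lowered = thought.lower()
    if any(phrase in lowered for phrase in FINISH_PHRASES):
        return "completion_in_thought"

    # Extra safeguard (an assumption, not one of the three conditions):
    # the same action issued repeat_limit times in a row suggests a loop.
    recent = actions[-repeat_limit:]
    if len(recent) == repeat_limit and len(set(recent)) == 1:
        return "repeated_action"

    # 3. Step limit
    if step >= max_steps:
        return "max_steps"

    return None


print(stop_reason(4, 10, "Let me try again.", ["ls()", "ls()", "ls()"]))
```

Returning a reason instead of a bare boolean makes it easier to log why a run ended and to tell a successful finish from a forced stop.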
Summary
The Thought-Action-Observation cycle:
- Thought: Reasoning about situation and next steps
- Action: Interface to external tools and world
- Observation: Feedback that informs next thought
- Cycle: Iterates until task complete or limit reached
- Key: Each component informs and improves the others
Next: Let's implement a complete ReAct agent from scratch to see these concepts in action.