Boo-AI — Master Artificial Intelligence by Building from Scratch

Introduction

The Thought-Action-Observation (TAO) cycle is the heartbeat of ReAct. Understanding each component and how they interact is essential for building effective agents.

The Cycle: Think about what to do → Do it → See what happened → Think about what to do next. This simple loop enables complex, adaptive behavior.
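
The loop above can be sketched in a few lines before any LLM is involved. This is a toy illustration, not the lesson's implementation: `run_cycle`, `think`, and `act` are hypothetical names, and the "thinking" is a hard-coded rule rather than a model call.

```python
from typing import Callable


def run_cycle(
    think: Callable[[list[str]], str],
    act: Callable[[str], str],
    max_steps: int = 5,
) -> list[str]:
    """Alternate think -> act -> observe until `think` returns 'done'."""
    observations: list[str] = []
    for _ in range(max_steps):
        decision = think(observations)       # Thought: decide what to do next
        if decision == "done":
            break
        observations.append(act(decision))   # Action, then record the Observation
    return observations


# Toy stand-ins: plan three steps, then stop.
def think(observations: list[str]) -> str:
    return "done" if len(observations) >= 3 else f"step {len(observations) + 1}"


def act(decision: str) -> str:
    return f"did {decision}"


print(run_cycle(think, act))
# ['did step 1', 'did step 2', 'did step 3']
```

Each call to `think` sees every observation so far, which is what lets the next decision adapt to what actually happened.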

The Thought Component

Thoughts are the reasoning traces that guide agent behavior. They serve multiple purposes:

Types of Thoughts

Type	Purpose	Example
Planning	Decide next steps	I need to find the file first, then edit it
Analysis	Understand observations	This error means the file doesn't exist
Synthesis	Combine information	Based on A and B, I should try C
Verification	Check work	Let me verify this change works correctly
Conclusion	Finalize results	I have all the info needed to answer

🐍thought_generation.py

from typing import Protocol


class LLM(Protocol):
    """Minimal interface assumed for the model client used in this lesson."""

    def generate(self, prompt: str) -> str: ...


def generate_thought(
    task: str,
    history: list[dict],
    llm: LLM,
) -> str:
    """Generate a thought based on current context."""

    # The prompt body is flush-left so the model sees no stray indentation.
    prompt = f"""
You are working on the following task:
{task}

Previous steps:
{format_history(history) or '(none yet)'}

Think about:
1. What have you learned so far?
2. What do you still need to do?
3. What is the next logical step?

Provide your thought process:
Thought: """

    response = llm.generate(prompt)
    return response.strip()


def format_history(history: list[dict]) -> str:
    """Format history for prompt."""
    lines = []
    for i, step in enumerate(history, 1):
        lines.append(f"Thought {i}: {step['thought']}")
        lines.append(f"Action {i}: {step['action']}")
        lines.append(f"Observation {i}: {step['observation']}")
    return "\n".join(lines)
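
Because the whole history is pasted into every prompt, the prompt grows with each step. One possible way to bound it is to format only the most recent steps. The sketch below is an assumption layered on the lesson's history format; `format_recent_history` and its `max_steps` parameter are hypothetical names.

```python
def format_recent_history(history: list[dict], max_steps: int = 3) -> str:
    """Format only the last `max_steps` steps, keeping original step numbers."""
    start = max(len(history) - max_steps, 0)
    lines = []
    if start > 0:
        # Tell the model that earlier context exists but was dropped.
        lines.append(f"({start} earlier steps omitted)")
    for i, step in enumerate(history[start:], start + 1):
        lines.append(f"Thought {i}: {step['thought']}")
        lines.append(f"Action {i}: {step['action']}")
        lines.append(f"Observation {i}: {step['observation']}")
    return "\n".join(lines)


history = [
    {"thought": f"t{n}", "action": f"a{n}", "observation": f"o{n}"}
    for n in range(1, 5)
]
print(format_recent_history(history, max_steps=2))
```

Dropping old steps trades memory for prompt size, so an agent that needs early observations later would need a summary of them instead.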

Effective Thoughts

📝good_vs_bad_thoughts.txt

GOOD THOUGHTS (Specific, Actionable):

"The error says 'module not found'. I should check if the package
 is installed by running pip list."

"I found 3 files matching the pattern. The most recent one
 (config.json) is likely the one I need."

"The test passed but I should also check edge cases with
 empty input and null values."

---

BAD THOUGHTS (Vague, Repetitive):

"I need to do something about this error."
→ Doesn't specify what action to take

"Let me try again."
→ No analysis of what went wrong

"I'm not sure what to do."
→ Gives up instead of reasoning through options

Thought Quality Matters

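The quality of thoughts directly affects agent performance: a vague thought gives the action step nothing concrete to act on. Encourage specific, analytical thoughts by asking in the prompt for the evidence observed and the next step to take, and by including examples of good thoughts.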
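
One way to apply this is to put a few good thoughts directly into the prompt as examples. The sketch below reuses the good thoughts listed earlier; `build_thought_prompt` and the instruction wording are assumptions for illustration, not a prescribed template.

```python
GOOD_EXAMPLES = [
    "The error says 'module not found'. I should check if the package "
    "is installed by running pip list.",
    "I found 3 files matching the pattern. The most recent one "
    "(config.json) is likely the one I need.",
]


def build_thought_prompt(task: str, examples: list[str]) -> str:
    """Build a prompt that shows example thoughts before asking for a new one."""
    shots = "\n".join(f"- {example}" for example in examples)
    return (
        f"Task: {task}\n\n"
        "Write one specific, actionable thought. Name the evidence you are "
        "reacting to and the next step you will take.\n\n"
        f"Examples of good thoughts:\n{shots}\n\n"
        "Thought: "
    )


print(build_thought_prompt("Fix the failing import in app.py", GOOD_EXAMPLES))
```

Ending the prompt with `Thought: ` matches the format used in `generate_thought`, so the model's completion can be used as the thought text directly.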

The Action Component

Actions are the interface between the agent's reasoning and the external world:

Action Types

🐍action_types.py

import ast
import re
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ActionType(Enum):
    TOOL = "tool"           # Call a tool
    SEARCH = "search"       # Search for information
    EXECUTE = "execute"     # Run code/commands
    FINISH = "finish"       # Complete the task
    DELEGATE = "delegate"   # Hand off to another agent


@dataclass
class Action:
    type: ActionType
    name: str
    parameters: dict[str, Any]

    @classmethod
    def from_text(cls, text: str) -> "Action":
        """Parse action from LLM output."""
        # Example: "Action: search(query='python async')"
        # Drop an optional "Action:" or "Action 2:" prefix before matching.
        text = re.sub(r"^Action(?:\s+\d+)?:\s*", "", text.strip())

        match = re.match(r"(\w+)\((.*)\)", text)
        if not match:
            raise ValueError(f"Invalid action format: {text}")

        name = match.group(1)
        args_str = match.group(2)

        # Parse keyword arguments with ast instead of eval(): only literal
        # values are accepted, and commas or "=" inside strings are safe.
        params: dict[str, Any] = {}
        if args_str.strip():
            try:
                call = ast.parse(f"_({args_str})", mode="eval").body
                if not isinstance(call, ast.Call) or call.args:
                    raise ValueError("expected keyword arguments only")
                params = {
                    kw.arg: ast.literal_eval(kw.value) for kw in call.keywords
                }
            except (SyntaxError, ValueError, TypeError) as exc:
                raise ValueError(f"Invalid action arguments: {args_str}") from exc

        return cls(
            type=cls._infer_type(name),
            name=name,
            parameters=params,
        )

    @staticmethod
    def _infer_type(name: str) -> ActionType:
        if name == "finish":
            return ActionType.FINISH
        elif name == "search":
            return ActionType.SEARCH
        elif name in ["bash", "python", "execute"]:
            return ActionType.EXECUTE
        else:
            return ActionType.TOOL
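
Parsing an argument string is the fragile part of this step: splitting on commas breaks as soon as a value contains a comma, and evaluating model output as code is unsafe. A minimal standalone sketch of one alternative is to let Python's `ast` module parse the call and accept only literal values. `parse_call` is a hypothetical helper name, and it assumes the model emits keyword arguments only.

```python
import ast


def parse_call(text: str) -> tuple[str, dict]:
    """Parse "name(key=literal, ...)" into (name, kwargs) without eval()."""
    node = ast.parse(text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"Not a simple call: {text!r}")
    if node.args:
        raise ValueError("Only keyword arguments are supported")
    # literal_eval accepts strings, numbers, lists, dicts, and similar
    # literals, and raises ValueError for anything else (such as calls).
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


name, kwargs = parse_call("search(query='async, await', limit=5)")
print(name, kwargs)
# search {'query': 'async, await', 'limit': 5}
```

An argument such as `code=__import__('os')` is rejected with `ValueError` instead of being executed, which is the main reason to prefer this over `eval`.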

Action Execution

🐍action_execution.py

import subprocess
import sys
from typing import Callable

# Assumes the Action classes above are saved as action_types.py.
from action_types import Action, ActionType


class ActionExecutor:
    """Execute actions and return observations."""

    def __init__(self, tools: dict[str, Callable[..., str]]):
        self.tools = tools

    def execute(self, action: Action) -> str:
        """Execute an action and return the observation."""

        try:
            if action.type == ActionType.FINISH:
                return action.parameters.get("result", "Task completed")

            # SEARCH actions are looked up in the tool registry too, so
            # "search" works once a tool with that name is registered.
            if action.type in (ActionType.TOOL, ActionType.SEARCH):
                tool = self.tools.get(action.name)
                if not tool:
                    return f"Error: Unknown tool '{action.name}'"
                return str(tool(**action.parameters))

            if action.type == ActionType.EXECUTE:
                return self._execute_code(
                    action.parameters.get("code", ""),
                    action.parameters.get("language", "python"),
                )

            return f"Unknown action type: {action.type}"

        except Exception as e:
            # Failures (including subprocess timeouts) become observations
            # so the agent can react to them in its next thought.
            return f"Error executing {action.name}: {e}"

    def _execute_code(self, code: str, language: str) -> str:
        """Execute code in a subprocess with a timeout.

        Note: a subprocess is not a real sandbox. The code runs with the
        same permissions as the agent process.
        """
        if language == "python":
            result = subprocess.run(
                [sys.executable, "-c", code],  # same interpreter as the agent
                capture_output=True,
                text=True,
                timeout=30,
            )
            return result.stdout + result.stderr

        return f"Unsupported language: {language}"
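
The executor reports failures as strings rather than raising, so a failed action still produces an observation the next thought can reason about. A small standalone sketch of that pattern with a tool registry follows; `word_count`, `TOOLS`, and `run_tool` are hypothetical names used only for this example.

```python
from typing import Callable


def word_count(text: str) -> str:
    """Example tool: count whitespace-separated words."""
    return str(len(text.split()))


# Registry mapping tool names to callables, as in ActionExecutor.tools.
TOOLS: dict[str, Callable[..., str]] = {"word_count": word_count}


def run_tool(name: str, **params) -> str:
    """Call a registered tool and turn any failure into an error string."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: Unknown tool '{name}'"
    try:
        return str(tool(**params))
    except Exception as exc:
        return f"Error executing {name}: {exc}"


print(run_tool("word_count", text="think act observe"))  # 3
print(run_tool("word_count", txt="wrong parameter name"))
print(run_tool("delete_everything"))
```

The second call fails inside the tool because of the misspelled parameter, and the third names a tool that is not registered. Both come back as readable error text instead of crashing the loop.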

Action Validation

🐍action_validation.py

# Assumes the Action classes above are saved as action_types.py.
from action_types import Action, ActionType


class ActionValidator:
    """Validate actions before execution."""

    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = allowed_tools
        # Substring blocklist: a coarse safety net, not a security boundary.
        self.dangerous_patterns = [
            "rm -rf",
            "DROP TABLE",
            "sudo",
            "chmod 777",
        ]

    def validate(self, action: Action) -> tuple[bool, str]:
        """Validate an action. Returns (is_valid, reason)."""

        # Check if tool is allowed
        if action.type == ActionType.TOOL:
            if action.name not in self.allowed_tools:
                return False, f"Tool '{action.name}' is not allowed"

        # Check for dangerous patterns (case-insensitive, so "drop table"
        # is caught as well as "DROP TABLE")
        action_str = str(action.parameters).lower()
        for pattern in self.dangerous_patterns:
            if pattern.lower() in action_str:
                return False, f"Dangerous pattern detected: {pattern}"

        # Check parameter values
        if not self._validate_parameters(action):
            return False, "Invalid parameter types"

        return True, "Valid"

    def _validate_parameters(self, action: Action) -> bool:
        """Validate parameter types and values."""
        for key, value in action.parameters.items():
            # Reject oversized string values (more than 10,000 characters)
            if isinstance(value, str) and len(value) > 10000:
                return False
        return True
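
Substring checks like the ones above are easy to sidestep with different casing or extra whitespace. The sketch below normalizes text before matching. It is an illustrative assumption, not part of the lesson's validator: `normalize` and `find_dangerous_pattern` are hypothetical names, and a blocklist of this kind remains a heuristic rather than a security boundary.

```python
import re
from typing import Optional

DANGEROUS_PATTERNS = ["rm -rf", "drop table", "sudo", "chmod 777"]


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_dangerous_pattern(text: str) -> Optional[str]:
    """Return the first blocklisted pattern found in `text`, or None."""
    normalized = normalize(text)
    for pattern in DANGEROUS_PATTERNS:
        if pattern in normalized:
            return pattern
    return None


print(find_dangerous_pattern("RM   -RF /tmp/build"))   # rm -rf
print(find_dangerous_pattern("drop\ttable users"))     # drop table
print(find_dangerous_pattern("ls -la"))                # None
```

For anything that can cause real damage, an allowlist of permitted tools and arguments is generally the more dependable control, with pattern matching as a second layer.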

The Observation Component

Observations are the feedback from actions that inform the next thought:

Observation Types

Type	Content	Example
Success	Action result	File created at /path/to/file.txt
Error	Error message	Permission denied: /etc/passwd
Data	Retrieved information	Current temperature: 72°F
Partial	Incomplete result	Found 100 results, showing first 10
Empty	No result	No matches found

🐍observation_processing.py

import re
from dataclasses import dataclass, field
from enum import Enum

# Matches ANSI escape sequences (colors, cursor movement)
ANSI_ESCAPE = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')


class ObservationType(Enum):
    SUCCESS = "success"
    ERROR = "error"
    DATA = "data"
    PARTIAL = "partial"
    EMPTY = "empty"


@dataclass
class Observation:
    type: ObservationType
    content: str
    # default_factory gives each observation its own dict instead of None
    metadata: dict = field(default_factory=dict)

    @classmethod
    def from_result(cls, result: str) -> "Observation":
        """Create observation from action result."""

        # Classify observation type by keyword. This is a heuristic: a
        # success message that mentions "error" (such as "0 errors") is
        # still classified as ERROR, and nothing here produces DATA.
        if not result or not result.strip():
            return cls(ObservationType.EMPTY, "No output")

        lowered = result.lower()

        if "error" in lowered or "exception" in lowered:
            return cls(ObservationType.ERROR, result)

        if "found" in lowered and "showing" in lowered:
            return cls(ObservationType.PARTIAL, result)

        return cls(ObservationType.SUCCESS, result)

    def summarize(self, max_length: int = 500) -> str:
        """Summarize observation for context."""
        if len(self.content) <= max_length:
            return self.content

        # Truncate with indicator
        return self.content[:max_length] + "\n... (truncated)"


class ObservationProcessor:
    """Process and format observations."""

    def process(self, raw_result: str) -> Observation:
        """Process raw action result into observation."""

        # Clean up result
        cleaned = self._clean_result(raw_result)

        # Classify and create observation
        observation = Observation.from_result(cleaned)

        # Extract metadata if present
        observation.metadata = self._extract_metadata(cleaned)

        return observation

    def _clean_result(self, result: str) -> str:
        """Clean up raw result."""
        # Remove ANSI codes
        return ANSI_ESCAPE.sub('', result)

    def _extract_metadata(self, result: str) -> dict:
        """Extract metadata from result."""
        metadata = {}

        # Count lines
        lines = result.split("\n")
        metadata["line_count"] = len(lines)

        # Check for common patterns
        lowered = result.lower()
        if "created" in lowered:
            metadata["action"] = "create"
        elif "updated" in lowered:
            metadata["action"] = "update"
        elif "deleted" in lowered:
            metadata["action"] = "delete"

        return metadata
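
Truncating from the front, as `summarize` does, can cut off the most useful part of an output: a Python traceback, for example, puts the exception message on its last line. One possible variation keeps both the head and the tail. This is a sketch under that assumption; `summarize_head_tail` is a hypothetical name and not part of the lesson's `Observation` class.

```python
def summarize_head_tail(content: str, max_length: int = 500) -> str:
    """Shorten long text by keeping its start and end, dropping the middle."""
    if len(content) <= max_length:
        return content

    marker = "\n... (truncated) ...\n"
    keep = max(max_length - len(marker), 0)   # characters left for real content
    head = keep // 2
    tail = keep - head
    # Guard tail == 0: content[-0:] would return the whole string.
    return content[:head] + marker + (content[-tail:] if tail else "")


log = "line of output\n" * 200 + "ValueError: bad input"
short = summarize_head_tail(log, max_length=120)
print(short)
```

The shortened text still ends with `ValueError: bad input`, so the error remains visible to the next thought.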

Observation Handling

📝observation_handling.txt

SUCCESS OBSERVATION:
Thought: The file was created successfully. Let me verify its contents.
Action: read_file(path="/path/to/file.txt")

ERROR OBSERVATION:
Thought: Permission denied. I should try a different approach
         or ask for elevated permissions.
Action: ask_user("Need permission to access /etc/passwd. Proceed?")

PARTIAL OBSERVATION:
Thought: There are 100 results. Let me refine my search to find
         the most relevant one.
Action: search(query="config.json in src/", limit=10)

EMPTY OBSERVATION:
Thought: No results found. Let me try a different search term
         or check if the file exists.
Action: list_files(directory="./")

DATA OBSERVATION:
Thought: Got the temperature (72°F). Now I can compare it
         to the threshold and decide the action.
Action: finish(result="Temperature 72°F is below threshold 75°F")

The Complete Cycle

🐍complete_cycle.py

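from __future__ import annotations

# Assumes the earlier blocks are saved under the filenames shown above them.
from action_types import Action, ActionType
from action_execution import ActionExecutor
from action_validation import ActionValidator
from observation_processing import (
    Observation,
    ObservationProcessor,
    ObservationType,
)
from thought_generation import format_history


class ReActCycle:
    """Complete Thought-Action-Observation cycle."""

    # `llm` is any client object with a generate(prompt: str) -> str method.
    def __init__(
        self,
        llm: LLM,
        executor: ActionExecutor,
        validator: ActionValidator | None = None,
    ):
        self.llm = llm
        self.executor = executor
        self.validator = validator
        self.processor = ObservationProcessor()

    def run(self, task: str, max_steps: int = 10) -> str:
        """Run the complete ReAct cycle."""

        history: list[dict] = []

        for step in range(1, max_steps + 1):
            # 1. THOUGHT
            thought = self.generate_thought(task, history)
            print(f"Thought {step}: {thought}")

            # Check if done
            if self.is_finished(thought):
                return self.extract_answer(thought)

            # 2. ACTION
            action = None
            try:
                action = self.generate_action(task, history, thought)
                action_text = f"{action.name}({action.parameters})"
                valid, reason = self.validate_action(action)
            except ValueError as exc:
                # Unparseable output becomes feedback instead of a crash
                action_text, valid, reason = "(unparseable)", False, str(exc)
            print(f"Action {step}: {action_text}")

            if not valid:
                observation = Observation(
                    ObservationType.ERROR,
                    f"Invalid action: {reason}",
                )
            else:
                # 3. EXECUTE
                raw_result = self.executor.execute(action)
                observation = self.processor.process(raw_result)

            print(f"Observation {step}: {observation.content[:200]}")

            # Update history
            history.append({
                "thought": thought,
                "action": action_text,
                "observation": observation.content,
            })

            # Check for terminal action
            if action is not None and valid and action.type == ActionType.FINISH:
                return observation.content

        return "Max steps reached without completing task"

    def generate_thought(self, task: str, history: list) -> str:
        """Generate next thought."""
        prompt = self._build_prompt(task, history, "thought")
        response = self.llm.generate(prompt)
        return self._extract_thought(response)

    def generate_action(
        self,
        task: str,
        history: list,
        thought: str,
    ) -> Action:
        """Generate next action based on thought."""
        prompt = self._build_prompt(
            task, history, "action",
            current_thought=thought,
        )
        response = self.llm.generate(prompt)
        return Action.from_text(self._extract_action(response))

    def is_finished(self, thought: str) -> bool:
        """Check if thought indicates completion."""
        # Indicators are lowercase because they are compared against the
        # lowercased thought.
        finish_indicators = [
            "i have enough information",
            "task is complete",
            "i can now provide the answer",
            "i'm done",
        ]
        thought_lower = thought.lower()
        return any(ind in thought_lower for ind in finish_indicators)

    # The helpers below are minimal sketches. The lesson calls them but does
    # not define them, so treat their behavior as an assumption.

    def validate_action(self, action: Action) -> tuple[bool, str]:
        """Use the optional validator; accept every action without one."""
        if self.validator is None:
            return True, "Valid"
        return self.validator.validate(action)

    def extract_answer(self, thought: str) -> str:
        """Return the final thought as the answer."""
        return thought

    def _build_prompt(
        self,
        task: str,
        history: list,
        kind: str,
        current_thought: str = "",
    ) -> str:
        """Build a prompt that ends where the model should continue."""
        parts = [f"Task: {task}", "", "Previous steps:", format_history(history), ""]
        if kind == "action":
            parts += [f"Thought: {current_thought}", "Action: "]
        else:
            parts.append("Thought: ")
        return "\n".join(parts)

    def _extract_thought(self, response: str) -> str:
        """Keep the text before any 'Action:' and drop a 'Thought:' label."""
        text = response.split("Action:")[0].strip()
        return text.removeprefix("Thought:").strip()

    def _extract_action(self, response: str) -> str:
        """Return the first 'Action:' line, or the whole response."""
        for line in response.splitlines():
            if line.strip().startswith("Action:"):
                return line.strip()[len("Action:"):].strip()
        return response.strip()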

Cycle Termination

Always include clear termination conditions: explicit finish actions, max step limits, and detection of completion in thoughts. Without these, agents can loop forever.
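
The three conditions can be collected into one check that the loop calls after every step. The sketch below is one possible arrangement, not the lesson's code: `stop_reason` is a hypothetical helper, and the repeated-action check at the end is an extra safeguard added here as an assumption.

```python
from typing import Optional


def stop_reason(
    step: int,
    max_steps: int,
    thought: str,
    actions: list[str],
    max_repeats: int = 3,
) -> Optional[str]:
    """Return why the loop should stop, or None to keep going.

    `actions` holds the action strings so far, e.g. "search(query='x')".
    """
    if actions and actions[-1].startswith("finish("):
        return "explicit finish action"
    if "task is complete" in thought.lower():
        return "completion detected in thought"
    if step >= max_steps:
        return "max step limit reached"
    # Extra safeguard (an assumption, not from the lesson): stop when the
    # same action was chosen max_repeats times in a row.
    recent = actions[-max_repeats:]
    if len(recent) == max_repeats and len(set(recent)) == 1:
        return "same action repeated"
    return None


print(stop_reason(2, 10, "Let me search again", ["search(query='a')"]))  # None
print(stop_reason(3, 10, "Trying once more", ["ls()", "ls()", "ls()"]))  # same action repeated
```

Returning the reason rather than a bare boolean makes it easier to log why a run ended, which helps when tuning `max_steps`.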

Summary

The Thought-Action-Observation cycle:

Thought: Reasoning about situation and next steps
Action: Interface to external tools and world
Observation: Feedback that informs next thought
Cycle: Iterates until task complete or limit reached
Key: Each component informs and improves the others

Next: Let's implement a complete ReAct agent from scratch to see these concepts in action.