Introduction
The large language model is the brain of your agent. It interprets context, makes decisions, generates tool calls, and synthesizes results. Understanding how to effectively use LLMs for agentic tasks is crucial for building capable agents.
The LLM Paradox: LLMs are remarkably capable yet fundamentally limited. They can reason about complex problems but can't execute code. They can plan strategies but can't verify results. Agents bridge this gap by giving LLMs tools to act.
The LLM's Role in Agents
In an agentic system, the LLM serves several key functions:
| Function | Description | Example |
|---|---|---|
| Decision Making | Choose which action to take next | Decide to read a file vs search the web |
| Parameter Generation | Create inputs for tool calls | Generate the file path to read |
| Synthesis | Combine information into responses | Summarize search results |
| Reflection | Analyze results and adjust strategy | Recognize an error and try a different approach |
| Planning | Break down complex goals | Create a step-by-step plan for a feature |
The Decision Loop
🐍llm_decision_loop.py
import anthropic

client = anthropic.Anthropic()

def llm_decide(context: str, tools: list[dict]) -> dict:
    """Use LLM to decide the next action."""

    system_prompt = """
You are an AI agent working to accomplish goals.
Analyze the context and decide your next action.

You can:
1. Call a tool to gather information or take action
2. Finish if the goal is complete
3. Ask for clarification if needed

Always explain your reasoning before acting.
"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        system=system_prompt,
        messages=[{"role": "user", "content": context}],
        tools=tools,
    )

    # parse_response extracts the text and tool calls from the response;
    # call_claude_with_tools below shows the same pattern in full
    return parse_response(response)
Provider Comparison
Different LLM providers have different strengths for agentic tasks:
Claude (Anthropic)
- Strengths: Long context (200K), careful reasoning, safety-focused
- Best for: Complex multi-step tasks, code understanding, nuanced decisions
- Tool calling: Native with structured outputs
🐍claude_agent.py
import anthropic

client = anthropic.Anthropic()

def call_claude_with_tools(prompt: str, tools: list[dict]) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
        tools=tools,
    )

    result = {"text": "", "tool_calls": []}

    for block in response.content:
        if block.type == "text":
            result["text"] = block.text
        elif block.type == "tool_use":
            result["tool_calls"].append({
                "id": block.id,
                "name": block.name,
                "input": block.input,
            })

    return result
GPT-4 / o3 (OpenAI)
- Strengths: Strong multimodal support, fast responses, extensive function calling
- o3 special: Extended thinking for complex reasoning tasks
- Best for: Rapid iteration, vision tasks, parallel function calls
🐍openai_agent.py
import json

from openai import OpenAI

client = OpenAI()

def call_openai_with_tools(prompt: str, tools: list[dict]) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        tools=[{"type": "function", "function": t} for t in tools],
    )

    message = response.choices[0].message

    result = {"text": message.content or "", "tool_calls": []}

    if message.tool_calls:
        for tc in message.tool_calls:
            result["tool_calls"].append({
                "id": tc.id,
                "name": tc.function.name,
                "input": json.loads(tc.function.arguments),
            })

    return result
Gemini (Google)
- Strengths: Massive context (2M tokens), native multimodal, Google integration
- Best for: Document processing, research tasks, long-context applications
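A 2M-token window often means an entire document set fits in a single call. A sketch of deciding how to batch documents, assuming the common rough heuristic of ~4 characters per token (the packing logic is illustrative):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return len(text) // 4

def plan_calls(documents: list[str], context_limit: int) -> list[list[str]]:
    """Greedily pack documents into batches that fit the context window."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for doc in documents:
        cost = estimate_tokens(doc)
        if current and used + cost > context_limit:
            batches.append(current)   # flush the full batch
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches

docs = ["a" * 4000, "b" * 4000, "c" * 4000]  # ~1,000 tokens each
# With a 2M-token window, all three fit in one call; a hypothetical
# 1,500-token window would force one call per document.
```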
| Provider | Context Window | Tool Calling | Best For |
|---|---|---|---|
| Claude Opus 4 | 200K | Native | Complex reasoning |
| Claude Sonnet 4 | 200K | Native | Balanced performance |
| GPT-4o | 128K | Native | Multimodal, speed |
| o3 | 200K | Native | Extended reasoning |
| Gemini 1.5 Pro | 2M | Native | Long documents |
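The table above can drive a simple routing policy. A hypothetical selector (model names taken from the table; the routing rules are illustrative, not a recommendation):

```python
def pick_model(task: str, context_tokens: int) -> str:
    """Route a task to a model family by task type and context size (illustrative)."""
    if context_tokens > 200_000:
        return "gemini-1.5-pro"      # only the 2M window fits above 200K
    if task == "extended_reasoning":
        return "o3"
    if task == "multimodal":
        return "gpt-4o"
    if task == "complex_reasoning":
        return "claude-opus-4"
    return "claude-sonnet-4"         # balanced default

pick_model("multimodal", 50_000)     # routes to the fast multimodal model
```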
Agent Prompt Engineering
Agent prompts differ from chatbot prompts. They must guide decision-making, not just generate responses:
System Prompt Structure
🐍agent_system_prompt.py
AGENT_SYSTEM_PROMPT = """
You are an AI agent that accomplishes goals by taking actions.

## Your Capabilities
You have access to these tools:
{tool_descriptions}

## Decision Framework
When deciding your next action, consider:
1. What information do I need?
2. What actions can move me toward the goal?
3. Have I verified my assumptions?
4. Is the goal complete?

## Guidelines
- Always explain your reasoning before acting
- Use tools to verify information, don't guess
- If stuck, try a different approach
- Ask for clarification when requirements are ambiguous
- Finish when the goal is definitively complete

## Error Handling
- If a tool fails, analyze the error and try an alternative
- If stuck after 3 attempts, explain the blocker and ask for help
- Never make destructive changes without explicit confirmation
"""
Context Prompt Structure
🐍context_prompt.py
# AgentState and the format_* helpers are defined elsewhere in your agent
def build_context_prompt(state: AgentState) -> str:
    return f"""
## Goal
{state.goal}

## Progress
Steps completed: {len(state.completed_steps)}/{len(state.plan.steps)}
Current step: {state.current_step.description if state.current_step else "None"}

## Recent Actions
{format_recent_actions(state.recent_actions[-5:])}

## Relevant Context
{format_memories(state.memories)}

## Current Situation
{state.environment_summary}

## Your Task
Based on the above context, decide your next action.
First explain your reasoning, then specify the tool to use.
"""
Prompt Engineering for Agents
Agent prompts should be more structured than conversational prompts. Use clear sections, explicit guidelines, and consistent formatting to help the LLM make better decisions.
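The `{tool_descriptions}` placeholder in the AGENT_SYSTEM_PROMPT above has to be filled from your tool definitions. One way to render them, assuming the tool schema format used later in this chapter:

```python
def render_tool_descriptions(tools: list[dict]) -> str:
    """Render tool schemas as a bulleted list for the system prompt."""
    lines = []
    for tool in tools:
        # List parameter names from the schema's properties
        params = ", ".join(tool["input_schema"]["properties"])
        lines.append(f"- {tool['name']}({params}): {tool['description']}")
    return "\n".join(lines)

tools = [{
    "name": "read_file",
    "description": "Read the contents of a file at the given path",
    "input_schema": {
        "type": "object",
        "properties": {"file_path": {"type": "string"}},
        "required": ["file_path"],
    },
}]
# render_tool_descriptions(tools)
# → "- read_file(file_path): Read the contents of a file at the given path"
```

The rendered string can then be injected with `AGENT_SYSTEM_PROMPT.format(tool_descriptions=...)`.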
Native Tool Calling
Modern LLMs support native tool/function calling, eliminating the need for fragile prompt-based parsing:
🐍tool_schema.py
# Tool schema for LLM function calling
read_file_tool = {
    "name": "read_file",
    "description": "Read the contents of a file at the given path",
    "input_schema": {
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": "The absolute or relative path to the file",
            },
        },
        "required": ["file_path"],
    },
}

search_web_tool = {
    "name": "search_web",
    "description": "Search the web for information",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query",
            },
            "num_results": {
                "type": "integer",
                "description": "Number of results to return (default: 5)",
                "default": 5,
            },
        },
        "required": ["query"],
    },
}
Processing Tool Calls
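Before executing a call, it's worth checking the model's input against the schema — models occasionally omit required fields or invent extra ones. A minimal validator for the schema shape above (a real project might use a library like `jsonschema` instead):

```python
def validate_tool_input(schema: dict, tool_input: dict) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    props = schema["input_schema"]["properties"]
    # Check every required parameter is present
    for name in schema["input_schema"].get("required", []):
        if name not in tool_input:
            errors.append(f"missing required parameter: {name}")
    # Reject parameters the schema doesn't declare
    for name in tool_input:
        if name not in props:
            errors.append(f"unexpected parameter: {name}")
    return errors

# validate_tool_input(read_file_tool, {}) → ["missing required parameter: file_path"]
# validate_tool_input(read_file_tool, {"file_path": "a.txt"}) → []
```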
🐍process_tool_calls.py
def process_tool_response(response: dict, tools: ToolRegistry) -> list[dict]:
    """Process tool calls from LLM response."""
    results = []

    for tool_call in response.get("tool_calls", []):
        tool_name = tool_call["name"]
        tool_input = tool_call["input"]
        tool_id = tool_call["id"]

        # Execute the tool
        tool_result = tools.execute(tool_name, **tool_input)

        results.append({
            "tool_use_id": tool_id,
            "type": "tool_result",
            "content": tool_result.output if tool_result.success else f"Error: {tool_result.error}",
        })

    return results


def continue_with_results(
    messages: list[dict],
    tool_results: list[dict],
    tools: list[dict],
) -> dict:
    """Continue conversation with tool results."""

    # Tool results go back to the model as a user message
    messages.append({"role": "user", "content": tool_results})

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=messages,
        tools=tools,
    )

    return parse_response(response)
Context Window Management
Agents quickly accumulate context. Managing what fits in the window is critical:
Strategies for Context Management
- Summarization: Compress old context into summaries
- Truncation: Remove least relevant older entries
- Retrieval: Only include relevant memories via RAG
- Sliding window: Keep only the most recent N interactions
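The summarization strategy is implemented below; the cheaper sliding-window and truncation strategies can be as simple as (character counts used as a rough proxy for tokens):

```python
def sliding_window(messages: list[dict], keep: int = 10) -> list[dict]:
    """Keep only the most recent `keep` messages."""
    return messages[-keep:]

def truncate_to_budget(messages: list[dict], max_chars: int) -> list[dict]:
    """Drop the oldest messages until the remainder fits a character budget."""
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):       # walk newest-first
        cost = len(str(msg.get("content", "")))
        if used + cost > max_chars:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order
```

Both lose information unconditionally, which is why summarization or retrieval is usually layered on top.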
🐍context_management.py
class ContextManager:
    """Manages context window for agent."""

    def __init__(
        self,
        max_tokens: int = 100000,
        summary_threshold: int = 50000,
    ):
        self.max_tokens = max_tokens
        self.summary_threshold = summary_threshold

    def estimate_tokens(self, messages: list[dict]) -> int:
        """Rough estimate: ~4 characters per token."""
        return sum(len(str(m.get("content", ""))) for m in messages) // 4

    def prepare_context(
        self,
        messages: list[dict],
        memories: list[str],
        current_task: str,
    ) -> list[dict]:
        """Prepare context that fits within token limits."""

        # Estimate current token count
        current_tokens = self.estimate_tokens(messages)

        # If under threshold, use all context
        if current_tokens < self.summary_threshold:
            return self._build_full_context(messages, memories, current_task)

        # Otherwise, summarize older messages
        return self._build_summarized_context(
            messages, memories, current_task
        )

    def _build_full_context(
        self,
        messages: list[dict],
        memories: list[str],
        current_task: str,
    ) -> list[dict]:
        """Build context with the full message history."""
        memory_block = "\n".join(memories)
        return [
            {"role": "user", "content": f"## Relevant Context\n{memory_block}"},
            *messages,
            {"role": "user", "content": f"## Current Task\n{current_task}"},
        ]

    def _build_summarized_context(
        self,
        messages: list[dict],
        memories: list[str],
        current_task: str,
    ) -> list[dict]:
        """Build context with summarized history."""

        # Keep recent messages
        recent = messages[-10:]

        # Summarize older messages
        older = messages[:-10]
        summary = self._summarize_messages(older)

        # Build new context
        return [
            {"role": "user", "content": f"## Previous Context\n{summary}"},
            *recent,
            {"role": "user", "content": f"## Current Task\n{current_task}"},
        ]

    def _summarize_messages(self, messages: list[dict]) -> str:
        """Create a summary of messages."""
        # Use an LLM to summarize (summarizer_llm is your own wrapper
        # around a cheap model)
        prompt = f"Summarize these agent actions concisely:\n{messages}"
        response = summarizer_llm.generate(prompt)
        return response.text
Token Costs Add Up
Long agent sessions can consume millions of tokens. Implement summarization early and monitor your token usage to avoid surprise bills.
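A small usage tracker makes that spend visible per session. A sketch with illustrative per-million-token prices (check your provider's current pricing; with Anthropic's SDK the counts come from `response.usage.input_tokens` and `response.usage.output_tokens`):

```python
class UsageTracker:
    """Accumulate token usage across turns and estimate spend."""

    def __init__(self, input_price_per_mtok: float, output_price_per_mtok: float):
        self.input_price = input_price_per_mtok    # USD per million input tokens
        self.output_price = output_price_per_mtok  # USD per million output tokens
        self.input_tokens = 0
        self.output_tokens = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def cost_usd(self) -> float:
        return (self.input_tokens * self.input_price
                + self.output_tokens * self.output_price) / 1_000_000

# Illustrative prices, not a quote
tracker = UsageTracker(input_price_per_mtok=3.0, output_price_per_mtok=15.0)
tracker.record(input_tokens=200_000, output_tokens=10_000)  # one long turn
# tracker.cost_usd → 0.75
```

Logging `cost_usd` after every turn surfaces runaway loops long before the invoice does.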
Summary
The LLM as reasoning engine:
- Role: Decision-making, parameter generation, synthesis, reflection
- Providers: Claude, GPT-4/o3, Gemini - each with different strengths
- Prompts: Structure for decisions, not just responses
- Tool Calling: Native structured outputs for reliable execution
- Context: Manage carefully to stay within limits
Next Up: With the reasoning engine understood, let's explore how to build the tools that give your agent hands to act on the world.