Introduction
When humans solve complex problems, we don't jump straight to answers: we think through steps, consider options, and reason our way to conclusions. Chain-of-Thought (CoT) prompting encourages language models to do the same, explicitly showing their reasoning process rather than producing answers directly.
This section explores how CoT dramatically improves reasoning quality, different techniques for eliciting chain-of-thought reasoning, and how to integrate CoT into agentic systems for more reliable planning and decision-making.
Core Insight: Making reasoning explicit doesn't just help us understand the model—it actually improves the model's reasoning quality. The act of "thinking out loud" enables better problem-solving.
What is Chain-of-Thought?
Chain-of-Thought prompting asks models to show intermediate reasoning steps before providing final answers. This simple technique can dramatically improve performance on reasoning tasks.
Standard vs. Chain-of-Thought
# Standard prompting - direct answer
prompt_standard = """
Q: A store has 45 apples. They sell 23 and receive a new shipment of 17.
How many apples do they have?

A:
"""
# Model might answer: "39" (correct, but how did it get there?)

# Chain-of-Thought prompting - explicit reasoning
prompt_cot = """
Q: A store has 45 apples. They sell 23 and receive a new shipment of 17.
How many apples do they have?

A: Let me work through this step by step:
1. Starting apples: 45
2. After selling 23: 45 - 23 = 22 apples
3. After receiving shipment of 17: 22 + 17 = 39 apples

The store has 39 apples.
"""
# Model shows reasoning, making it verifiable and more reliable
Why CoT Works
| Mechanism | Explanation | Benefit |
|---|---|---|
| Decomposition | Breaks problem into smaller steps | Each step is easier to solve correctly |
| Intermediate states | Creates checkpoints in reasoning | Errors are caught and corrected earlier |
| Explicit context | Reasoning is visible to the model | Can reference earlier conclusions |
| Structured thinking | Imposes logical order | Reduces random jumps or omissions |
| Self-consistency | Steps must align | Contradictions become apparent |
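A practical consequence of explicit intermediate states is that they can be checked mechanically. As a small sketch (the helper name and regex are ours, not from any library), here is a verifier that re-computes every arithmetic step it finds in a reasoning chain:

```python
import re

def verify_arithmetic_steps(chain: str) -> list[tuple[str, bool]]:
    """Re-compute each 'a op b = c' step found in a reasoning chain."""
    checks = []
    for a, op, b, c in re.findall(r"(\d+)\s*([+\-*/])\s*(\d+)\s*=\s*(\d+)", chain):
        a, b, c = int(a), int(b), int(c)
        expected = {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[op]
        checks.append((f"{a} {op} {b} = {c}", expected == c))
    return checks

# The apple example from above: both steps check out
steps = "1. 45 - 23 = 22\n2. 22 + 17 = 39"
print(verify_arithmetic_steps(steps))
```

With a direct answer there is nothing to check; with a chain, a bad step surfaces as a `False` entry before it propagates to the final answer.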
When to Use CoT
- Multi-step reasoning: Math, logic, planning
- Complex analysis: Comparing options, weighing trade-offs
- Debugging: Tracing through code or logic
- Decision-making: Evaluating criteria systematically
- Explanation tasks: Teaching or documenting
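The checklist above can drive a simple router that decides whether to attach a CoT trigger at all. This is a rough heuristic sketch; the keyword list and function names are illustrative, not a standard API:

```python
def needs_cot(task: str) -> bool:
    """Crude keyword check for tasks that benefit from step-by-step reasoning."""
    triggers = ("how many", "calculate", "compare", "plan", "debug",
                "trade-off", "decide", "evaluate", "why", "explain")
    return any(t in task.lower() for t in triggers)

def build_prompt(task: str) -> str:
    """Append a step-by-step trigger only when the task seems to need it."""
    if needs_cot(task):
        return f"{task}\n\nLet's think step by step:"
    return task

print(build_prompt("How many apples does the store have?"))
```

In practice you would tune the trigger list to your domain, or let a cheap classifier make the call; the point is that CoT costs tokens, so simple lookups should skip it.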
CoT Prompting Techniques
There are several techniques for eliciting chain-of-thought reasoning:
1. Zero-Shot CoT
Simply ask the model to think step by step without examples:
def zero_shot_cot(question: str) -> str:
    """Elicit reasoning with a simple instruction."""

    prompt = f"""{question}

Let's think through this step by step:"""

    response = llm.generate(prompt)
    return response

# Alternative trigger phrases:
triggers = [
    "Let's think step by step.",
    "Let me work through this carefully.",
    "Breaking this down:",
    "Let's reason about this:",
    "I'll solve this systematically:",
]
2. Few-Shot CoT
Provide examples that demonstrate the reasoning format you want:
def few_shot_cot(question: str, examples: list[dict]) -> str:
    """Elicit reasoning by demonstrating with examples."""

    prompt_parts = []

    for ex in examples:
        prompt_parts.append(f"""Q: {ex['question']}

A: Let me think through this:
{ex['reasoning']}

Therefore, the answer is: {ex['answer']}
---""")

    prompt_parts.append(f"""Q: {question}

A: Let me think through this:""")

    prompt = "\n\n".join(prompt_parts)
    response = llm.generate(prompt)
    return response

# Example usage
examples = [
    {
        "question": "If a train travels 60 mph for 2.5 hours, how far does it go?",
        "reasoning": """
1. Distance = Speed × Time
2. Speed = 60 mph
3. Time = 2.5 hours
4. Distance = 60 × 2.5 = 150 miles""",
        "answer": "150 miles"
    },
    {
        "question": "A recipe calls for 3 cups of flour for 24 cookies. How much for 36 cookies?",
        "reasoning": """
1. Find flour per cookie: 3 cups / 24 cookies = 0.125 cups/cookie
2. For 36 cookies: 0.125 × 36 = 4.5 cups""",
        "answer": "4.5 cups of flour"
    }
]
3. Self-Consistency
Generate multiple reasoning chains and aggregate results:
import asyncio
import re
from collections import Counter

async def self_consistent_cot(
    question: str,
    num_samples: int = 5,
    temperature: float = 0.7
) -> dict:
    """Generate multiple reasoning chains and find consensus."""

    prompt = f"""{question}

Let me think through this step by step:"""

    # Generate multiple samples
    tasks = [
        llm.generate_async(prompt, temperature=temperature)
        for _ in range(num_samples)
    ]

    responses = await asyncio.gather(*tasks)

    # Extract final answers from each chain
    answers = []
    reasoning_chains = []

    for response in responses:
        answer = extract_final_answer(response)
        answers.append(answer)
        reasoning_chains.append(response)

    # Find most common answer
    answer_counts = Counter(answers)
    most_common = answer_counts.most_common(1)[0]

    return {
        "answer": most_common[0],
        "confidence": most_common[1] / num_samples,
        "all_answers": answer_counts,
        "reasoning_chains": reasoning_chains
    }

def extract_final_answer(response: str) -> str:
    """Extract the final answer from a reasoning chain."""
    # Look for patterns like "the answer is", "therefore", etc.
    patterns = [
        r"the answer is[:\s]+(.+)",
        r"therefore[,:\s]+(.+)",
        r"so[,:\s]+(.+)$",
        r"= (\d+)",
    ]
    for pattern in patterns:
        match = re.search(pattern, response, re.IGNORECASE)
        if match:
            return match.group(1).strip()
    return response.split("\n")[-1]  # Simple fallback
4. Least-to-Most Prompting
Decompose the problem explicitly, then solve subproblems in order:
import re

async def least_to_most(question: str) -> str:
    """Decompose into subproblems, solve in order."""

    # Step 1: Decompose
    decompose_prompt = f"""Break this question into simpler subquestions that,
when answered in order, will lead to the final answer.

Question: {question}

List the subquestions in order:"""

    subquestions = await llm.generate_async(decompose_prompt)
    subq_list = parse_numbered_list(subquestions)

    # Step 2: Solve each subquestion
    context = []
    for sq in subq_list:
        solve_prompt = f"""Given what we know:
{chr(10).join(context) if context else 'Starting fresh.'}

Answer this subquestion: {sq}

Think step by step:"""

        answer = await llm.generate_async(solve_prompt)
        context.append(f"Q: {sq}\nA: {answer}")

    # Step 3: Synthesize final answer
    final_prompt = f"""Based on the following subquestion answers:

{chr(10).join(context)}

Now answer the original question: {question}"""

    final_answer = await llm.generate_async(final_prompt)
    return final_answer

def parse_numbered_list(text: str) -> list[str]:
    """Simple parser for '1. ...' style lists; adjust to your model's output."""
    return [m.strip() for m in re.findall(r"^\s*\d+[.)]\s*(.+)$", text, re.M)]
Structured Reasoning Formats
Beyond free-form reasoning, we can impose structure on the chain-of-thought to make it more systematic:
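Each of the templates that follow works the same way: fill the placeholders, then generate. A tiny generic runner makes that mechanic explicit (the `llm_generate` callable here is a stand-in for whatever client the surrounding snippets call `llm.generate`):

```python
from typing import Callable

def structured_cot(template: str, llm_generate: Callable[[str], str], **fields) -> str:
    """Fill a reasoning template's placeholders and pass it to the model."""
    prompt = template.format(**fields)
    return llm_generate(prompt)

# Works with any template in this section; a stub model for demonstration:
demo = structured_cot(
    "Problem: {problem}\n\nSolution:\nStep 1:",
    lambda p: "Step 1: halve 10 -> 5",
    problem="What is half of 10?",
)
print(demo)
```

Passing the model call in as a parameter keeps the templates testable without a live API.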
Numbered Steps
NUMBERED_STEPS_TEMPLATE = """
Solve this problem using numbered steps:

Problem: {problem}

Solution:
Step 1: [First step]
Step 2: [Second step]
...
Step N: [Final step with answer]

Final Answer: [Clear statement of answer]
"""

def numbered_steps_cot(problem: str) -> str:
    prompt = NUMBERED_STEPS_TEMPLATE.format(problem=problem)
    return llm.generate(prompt)
Pros and Cons Analysis
PROS_CONS_TEMPLATE = """
Analyze this decision:

Decision: {decision}

## Option A: {option_a}
Pros:
- [List advantages]

Cons:
- [List disadvantages]

## Option B: {option_b}
Pros:
- [List advantages]

Cons:
- [List disadvantages]

## Analysis
[Compare options systematically]

## Recommendation
[Final recommendation with justification]
"""

def pros_cons_analysis(decision: str, options: list[str]) -> str:
    prompt = PROS_CONS_TEMPLATE.format(
        decision=decision,
        option_a=options[0],
        option_b=options[1]
    )
    return llm.generate(prompt)
Hypothesis Testing
HYPOTHESIS_TEMPLATE = """
Investigate this question:

Question: {question}

## Hypothesis 1
Statement: [Proposed answer]
Evidence For: [Supporting facts]
Evidence Against: [Contradicting facts]
Likelihood: [High/Medium/Low]

## Hypothesis 2
Statement: [Alternative answer]
Evidence For: [Supporting facts]
Evidence Against: [Contradicting facts]
Likelihood: [High/Medium/Low]

## Conclusion
[Most likely answer with reasoning]
"""

def hypothesis_testing(question: str) -> str:
    prompt = HYPOTHESIS_TEMPLATE.format(question=question)
    return llm.generate(prompt)
Decision Matrix
DECISION_MATRIX_TEMPLATE = """
Evaluate these options against criteria:

Decision: {decision}
Options: {options}
Criteria: {criteria}

## Scoring Matrix
| Criterion | Weight | {option_headers} |
|-----------|--------|{option_columns}|
{criteria_rows}

## Weighted Scores
{score_calculations}

## Recommendation
[Best option with explanation]
"""

def decision_matrix(
    decision: str,
    options: list[str],
    criteria: list[dict]  # [{"name": "...", "weight": 0.3}, ...]
) -> str:
    # Build the matrix skeleton for the template
    option_headers = " | ".join(options)
    option_columns = " | ".join(["------"] * len(options))
    criteria_rows = "\n".join(
        f"| {c['name']} | {c['weight']} | "
        + " | ".join("[score]" for _ in options) + " |"
        for c in criteria
    )

    prompt = DECISION_MATRIX_TEMPLATE.format(
        decision=decision,
        options=", ".join(options),
        criteria=criteria,
        option_headers=option_headers,
        option_columns=option_columns,
        criteria_rows=criteria_rows,
        score_calculations="[Calculate weighted scores for each option]",
    ) + """
For each criterion, score each option from 1-10.
Then calculate weighted scores and recommend the best option.

Think through each score carefully with brief justification."""

    return llm.generate(prompt)
CoT in Agentic Systems
Chain-of-thought reasoning is particularly valuable in agentic systems. Here's how to integrate CoT into agent workflows:
ReAct with Explicit Reasoning
import re
from anthropic import Anthropic

class ReActWithCoT:
    """ReAct agent with enhanced chain-of-thought."""

    def __init__(self, tools: list):
        self.tools = tools
        self.client = Anthropic()

    async def step(self, task: str, history: list[dict]) -> dict:
        """Single ReAct step with explicit reasoning."""

        history_text = self._format_history(history)

        prompt = f"""You are solving a task step by step.

Task: {task}

Previous steps:
{history_text}

Available tools: {[t.name for t in self.tools]}

Now reason about the next step:

## Current Understanding
[What do I know so far? What have I learned from previous steps?]

## Goal Analysis
[What am I trying to achieve? What's the gap between current state and goal?]

## Options Consideration
[What are my options for the next action? What are the trade-offs?]

## Decision
Thought: [Explicit reasoning about what to do next]
Action: [tool_name]
Action Input: [input for the tool]

OR if the task is complete:

## Decision
Thought: [Reasoning about why the task is complete]
Final Answer: [The complete answer]"""

        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )

        return self._parse_response(response.content[0].text)

    def _parse_response(self, text: str) -> dict:
        """Minimal parser for the Decision block; adapt to your output format."""
        parsed = {}
        for key, label in [
            ("thought", "Thought:"),
            ("action", "Action:"),
            ("action_input", "Action Input:"),
            ("final_answer", "Final Answer:"),
        ]:
            match = re.search(rf"{label}\s*(.+)", text)
            if match:
                parsed[key] = match.group(1).strip()
        return parsed

    def _format_history(self, history: list[dict]) -> str:
        """Format previous steps for context."""
        if not history:
            return "No previous steps yet."

        lines = []
        for i, step in enumerate(history):
            lines.append(f"Step {i+1}:")
            lines.append(f"  Thought: {step.get('thought', 'N/A')}")
            if 'action' in step:
                lines.append(f"  Action: {step['action']}")
            lines.append(f"  Result: {step.get('result', 'N/A')}")
        return "\n".join(lines)
Planning with CoT
import json
import re

async def plan_with_cot(goal: str, context: dict) -> list[dict]:
    """Create a plan using chain-of-thought reasoning."""

    prompt = f"""Create a plan to achieve this goal.

Goal: {goal}
Context: {json.dumps(context)}

## Understanding the Goal
[What exactly needs to be accomplished? What does success look like?]

## Analyzing Constraints
[What limitations or requirements must be respected?]

## Identifying Key Challenges
[What are the main obstacles or complexities?]

## Exploring Approaches
[What are different ways to achieve this goal?]

## Selecting Approach
[Which approach is best and why?]

## Detailed Plan
[Step-by-step plan with rationale for each step]

Return the plan as a JSON array:
[
  {{"step": 1, "action": "...", "rationale": "...", "dependencies": []}},
  ...
]"""

    response = await llm.generate_async(prompt)

    # Extract JSON from response
    plan = extract_json_from_reasoning(response)
    return plan

def extract_json_from_reasoning(response: str) -> list[dict]:
    """Grab the first JSON array in the response; naive but usually sufficient."""
    match = re.search(r"\[.*\]", response, re.S)
    return json.loads(match.group(0)) if match else []
Tool Selection with CoT
import re

async def select_tool_with_cot(
    task: str,
    available_tools: list[dict],
    context: str
) -> dict:
    """Select the best tool using explicit reasoning."""

    tools_desc = "\n".join(
        f"- {t['name']}: {t['description']}"
        for t in available_tools
    )

    prompt = f"""Select the best tool for this task.

Task: {task}
Context: {context}

Available tools:
{tools_desc}

## Task Analysis
[What does this task require? What kind of operation is needed?]

## Tool Evaluation
[For each potentially relevant tool, assess:
- Does it have the required capability?
- What are the limitations?
- How well does it fit?]

## Selection
Tool: [chosen tool name]
Rationale: [why this tool is the best choice]
Input: [what input to provide]
Expected Output: [what we expect to get back]"""

    response = await llm.generate_async(prompt)

    # Parse selection
    return parse_tool_selection(response)

def parse_tool_selection(response: str) -> dict:
    """Minimal parser for the Selection block; adapt to your output format."""
    parsed = {}
    for label in ("Tool", "Rationale", "Input", "Expected Output"):
        match = re.search(rf"{label}:\s*(.+)", response)
        if match:
            parsed[label.lower().replace(" ", "_")] = match.group(1).strip()
    return parsed
Implementing CoT Reasoning
Let's build a complete CoT reasoning system for agents:
from anthropic import Anthropic
from dataclasses import dataclass
from enum import Enum
from typing import Optional
import re

class ReasoningStyle(Enum):
    BRIEF = "brief"        # Quick reasoning for simple tasks
    STANDARD = "standard"  # Normal step-by-step
    THOROUGH = "thorough"  # Deep analysis
    SOCRATIC = "socratic"  # Self-questioning

@dataclass
class ReasoningStep:
    """A single step in the reasoning chain."""
    step_number: int
    thought: str
    conclusion: Optional[str] = None
    confidence: float = 1.0

@dataclass
class ReasoningChain:
    """Complete chain of reasoning."""
    question: str
    steps: list[ReasoningStep]
    final_answer: str
    total_confidence: float
    style_used: ReasoningStyle

class ChainOfThoughtReasoner:
    """
    Implements various chain-of-thought reasoning strategies.
    """

    def __init__(self, model: str = "claude-sonnet-4-20250514"):
        self.client = Anthropic()
        self.model = model

    def reason(
        self,
        question: str,
        style: ReasoningStyle = ReasoningStyle.STANDARD,
        context: str = ""
    ) -> ReasoningChain:
        """Perform chain-of-thought reasoning."""

        if style == ReasoningStyle.BRIEF:
            return self._brief_reasoning(question, context)
        elif style == ReasoningStyle.STANDARD:
            return self._standard_reasoning(question, context)
        elif style == ReasoningStyle.THOROUGH:
            return self._thorough_reasoning(question, context)
        elif style == ReasoningStyle.SOCRATIC:
            return self._socratic_reasoning(question, context)
        raise ValueError(f"Unknown reasoning style: {style}")

    def _brief_reasoning(
        self,
        question: str,
        context: str
    ) -> ReasoningChain:
        """Quick reasoning for simple questions."""

        prompt = f"""Answer this question with brief reasoning.

{f"Context: {context}" if context else ""}
Question: {question}

Think: [1-2 sentences of reasoning]
Answer: [concise answer]"""

        response = self.client.messages.create(
            model=self.model,
            max_tokens=512,
            messages=[{"role": "user", "content": prompt}]
        )

        text = response.content[0].text
        thought, answer = self._parse_brief(text)

        return ReasoningChain(
            question=question,
            steps=[ReasoningStep(1, thought, answer)],
            final_answer=answer,
            total_confidence=0.9,
            style_used=ReasoningStyle.BRIEF
        )

    def _standard_reasoning(
        self,
        question: str,
        context: str
    ) -> ReasoningChain:
        """Standard step-by-step reasoning."""

        prompt = f"""Answer this question with step-by-step reasoning.

{f"Context: {context}" if context else ""}
Question: {question}

Let me think through this step by step:

Step 1: [First consideration]
Step 2: [Second consideration]
...

Therefore, the answer is: [final answer]"""

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )

        text = response.content[0].text
        steps, answer = self._parse_standard(text)

        return ReasoningChain(
            question=question,
            steps=steps,
            final_answer=answer,
            total_confidence=0.85,
            style_used=ReasoningStyle.STANDARD
        )

    def _thorough_reasoning(
        self,
        question: str,
        context: str
    ) -> ReasoningChain:
        """Deep, comprehensive reasoning."""

        prompt = f"""Perform thorough analysis of this question.

{f"Context: {context}" if context else ""}
Question: {question}

## Understanding
[What is being asked? Any ambiguities to resolve?]

## Relevant Knowledge
[What facts, principles, or frameworks apply?]

## Analysis
[Systematic examination of the problem]

## Considerations
[Alternative perspectives, edge cases, limitations]

## Synthesis
[Bringing together insights]

## Conclusion
[Final answer with confidence level]"""

        response = self.client.messages.create(
            model=self.model,
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}]
        )

        text = response.content[0].text
        steps, answer = self._parse_thorough(text)

        return ReasoningChain(
            question=question,
            steps=steps,
            final_answer=answer,
            total_confidence=0.9,
            style_used=ReasoningStyle.THOROUGH
        )

    def _socratic_reasoning(
        self,
        question: str,
        context: str
    ) -> ReasoningChain:
        """Reasoning through self-questioning."""

        prompt = f"""Answer by asking and answering questions.

{f"Context: {context}" if context else ""}
Main Question: {question}

Q1: [First clarifying question]
A1: [Answer to Q1]

Q2: [Follow-up question based on A1]
A2: [Answer to Q2]

Q3: [Deeper question]
A3: [Answer to Q3]

...

Final Answer: [Answer to the main question]"""

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )

        text = response.content[0].text
        steps, answer = self._parse_socratic(text)

        return ReasoningChain(
            question=question,
            steps=steps,
            final_answer=answer,
            total_confidence=0.88,
            style_used=ReasoningStyle.SOCRATIC
        )

    def _parse_brief(self, text: str) -> tuple[str, str]:
        """Parse brief reasoning format."""
        thought_match = re.search(r"Think:\s*(.+?)(?=Answer:|$)", text, re.S)
        answer_match = re.search(r"Answer:\s*(.+?)$", text, re.S)

        thought = thought_match.group(1).strip() if thought_match else text
        answer = answer_match.group(1).strip() if answer_match else text

        return thought, answer

    def _parse_standard(self, text: str) -> tuple[list[ReasoningStep], str]:
        """Parse standard step-by-step format."""
        steps = []
        step_pattern = r"Step (\d+):\s*(.+?)(?=Step \d+:|Therefore|$)"
        matches = re.findall(step_pattern, text, re.S)

        for num, content in matches:
            steps.append(ReasoningStep(
                step_number=int(num),
                thought=content.strip()
            ))

        answer_match = re.search(r"(?:Therefore|Thus|So),?\s*(?:the answer is:?)?\s*(.+?)$", text, re.S)
        answer = answer_match.group(1).strip() if answer_match else ""

        return steps, answer

    def _parse_thorough(self, text: str) -> tuple[list[ReasoningStep], str]:
        """Parse thorough analysis format."""
        sections = ["Understanding", "Relevant Knowledge", "Analysis",
                    "Considerations", "Synthesis", "Conclusion"]
        steps = []

        for i, section in enumerate(sections):
            pattern = rf"## {section}\n(.+?)(?=## |$)"
            match = re.search(pattern, text, re.S)
            if match:
                steps.append(ReasoningStep(
                    step_number=i + 1,
                    thought=match.group(1).strip()
                ))

        answer = steps[-1].thought if steps else text
        return steps, answer

    def _parse_socratic(self, text: str) -> tuple[list[ReasoningStep], str]:
        """Parse Socratic Q&A format."""
        steps = []
        qa_pattern = r"Q(\d+):\s*(.+?)\nA\d+:\s*(.+?)(?=Q\d+:|Final|$)"
        matches = re.findall(qa_pattern, text, re.S)

        for num, question, answer in matches:
            steps.append(ReasoningStep(
                step_number=int(num),
                thought=f"Q: {question.strip()}\nA: {answer.strip()}"
            ))

        answer_match = re.search(r"Final Answer:\s*(.+?)$", text, re.S)
        answer = answer_match.group(1).strip() if answer_match else ""

        return steps, answer
Usage Example
# Create reasoner
reasoner = ChainOfThoughtReasoner()

# Brief reasoning for simple question
brief_result = reasoner.reason(
    "What's 15% of 80?",
    style=ReasoningStyle.BRIEF
)
print(f"Brief: {brief_result.final_answer}")

# Standard reasoning for moderate question
standard_result = reasoner.reason(
    "Should I use SQL or NoSQL for a social media app?",
    style=ReasoningStyle.STANDARD,
    context="Expected 1M users, need real-time feeds"
)
print(f"\nStandard reasoning ({len(standard_result.steps)} steps):")
for step in standard_result.steps:
    print(f"  Step {step.step_number}: {step.thought[:100]}...")
print(f"Answer: {standard_result.final_answer}")

# Thorough reasoning for complex question
thorough_result = reasoner.reason(
    "What's the best architecture for a multi-tenant SaaS platform?",
    style=ReasoningStyle.THOROUGH,
    context="Enterprise customers, strict data isolation requirements"
)
print(f"\nThorough analysis ({len(thorough_result.steps)} sections):")
print(f"Confidence: {thorough_result.total_confidence:.1%}")
Summary
Chain-of-Thought prompting makes reasoning explicit, improving both quality and interpretability. We covered:
- What is CoT: Showing intermediate reasoning steps rather than jumping to answers
- Prompting techniques: Zero-shot, few-shot, self-consistency, and least-to-most approaches
- Structured formats: Numbered steps, pros/cons, hypothesis testing, and decision matrices
- Agentic integration: Using CoT in ReAct loops, planning, and tool selection
- Implementation: A flexible reasoner supporting multiple styles from brief to thorough
In the next section, we'll explore Tree of Thoughts—extending chain-of-thought to explore multiple reasoning paths in parallel.