Back to Blog

ReAct Pattern: Combining Reasoning and Acting in AI Agents

AI AgentsAlin Dobra15 min read

ReAct Pattern: Combining Reasoning and Acting in AI Agents

Most AI agents fail silently. They take an action, it doesn't work, and they flounder—or worse, they confidently do the wrong thing without realizing it.

ReAct fixes this by making agents think out loud.

ReAct (Reasoning + Acting) is a pattern where agents explicitly verbalize their reasoning before each action, observe the results, and reason again. It's simple, powerful, and dramatically improves agent reliability.

This guide shows you how to implement ReAct from scratch and when to use it over other patterns.

What Is ReAct?

ReAct interleaves thinking and acting in a continuous loop:

text
1
2
                     ReAct Loop                              
3
4
                                                             
5
                       
6
     THOUGHT    ACTION  OBSERVATION           
7
                                                      
8
     "I need  │    │ search(  │    │ "Results:            
9
      to find       'weather      London              
10
      the            London'      15°C,               
11
      weather"│    │ )        │    │  cloudy"             
12
                       
13
                                                           
14
                                                           
15
                            
16
                    (repeat)                                 
17
                                                             
18
    Until: Task complete OR max steps reached                
19
                                                             
20
21
 

Each iteration has three parts:

  1. Thought: The agent reasons about what to do next
  2. Action: The agent executes a tool or action
  3. Observation: The agent sees the result

This continues until the agent decides it has enough information to answer.

Why ReAct Works

1. Explicit Reasoning Catches Errors

When agents reason silently, errors compound. When they reason out loud, problems surface early:

text
1
# Silent agent (problematic)
2
Action: search("Python web frameworks")
3
Action: search("Django vs Flask")  
4
Action: write_report(...)  # Report may be wrong, hard to debug
5
 
6
# ReAct agent (transparent)
7
Thought: I need to find popular Python web frameworks
8
Action: search("Python web frameworks 2024")
9
Observation: Django, Flask, FastAPI are the top 3...
10
 
11
Thought: I should compare these frameworks. Let me focus on FastAPI 
12
        since it wasn't in my training data and is newest.
13
Action: search("FastAPI features performance")
14
Observation: FastAPI is async-first, automatic OpenAPI docs...
15
 
16
Thought: Now I have enough information to write a comparison.
17
Action: write_report(...)  # Each step is traceable
18
 

2. Grounded in Observations

Pure reasoning (Chain-of-Thought) can hallucinate. ReAct grounds reasoning in real observations:

text
1
# Chain-of-Thought (can hallucinate)
2
Thought: The capital of Australia is Sydney because it's the largest city.
3
Answer: Sydney  # WRONG
4
 
5
# ReAct (grounded in facts)
6
Thought: I should verify the capital of Australia
7
Action: search("capital of Australia")
8
Observation: The capital of Australia is Canberra, not Sydney...
9
Thought: The observation confirms it's Canberra, not Sydney as I initially thought.
10
Answer: Canberra  # CORRECT
11
 

3. Flexible Problem Solving

The agent can adapt its approach based on observations:

text
1
Thought: I'll try to get the weather from the primary API
2
Action: api_call("weather.primary.com/london")
3
Observation: Error 503 - Service unavailable
4
 
5
Thought: Primary API is down. I'll try the backup API instead.
6
Action: api_call("weather.backup.com/london")
7
Observation: {"temp": 15, "condition": "cloudy"}
8
 
9
Thought: Got the weather from backup. I can now answer the question.
10
 

Basic ReAct Implementation

Here's a complete, minimal ReAct agent:

python
1
import openai
2
import json
3
import re
4
 
5
class ReActAgent:
6
    def __init__(self, tools: dict):
7
        self.client = openai.OpenAI()
8
        self.tools = tools
9
        self.max_steps = 10
10
    
11
    def run(self, task: str) -> str:
12
        """Execute task using ReAct loop"""
13
        
14
        prompt = self._build_initial_prompt(task)
15
        
16
        for step in range(self.max_steps):
17
            # Get next thought and action from LLM
18
            response = self.client.chat.completions.create(
19
                model="gpt-4o",
20
                messages=[{"role": "user", "content": prompt}],
21
                stop=["Observation:"]  # Stop before observation
22
            )
23
            
24
            output = response.choices[0].message.content
25
            prompt += output
26
            
27
            # Check if agent wants to finish
28
            if "Final Answer:" in output:
29
                return self._extract_final_answer(output)
30
            
31
            # Extract and execute action
32
            action, action_input = self._parse_action(output)
33
            
34
            if action is None:
35
                prompt += "\nObservation: Could not parse action. Please use format 'Action: tool_name[input]'\n"
36
                continue
37
            
38
            # Execute the action
39
            observation = self._execute_action(action, action_input)
40
            prompt += f"\nObservation: {observation}\n"
41
        
42
        return "Max steps reached without finding answer."
43
    
44
    def _build_initial_prompt(self, task: str) -> str:
45
        tool_descriptions = "\n".join([
46
            f"- {name}: {func.__doc__ or 'No description'}"
47
            for name, func in self.tools.items()
48
        ])
49
        
50
        return f"""Answer the following question using the available tools.
51
 
52
Available tools:
53
{tool_descriptions}
54
 
55
Use this format:
56
 
57
Thought: [Your reasoning about what to do next]
58
Action: tool_name[input]
59
Observation: [Result of the action - will be provided]
60
... (repeat Thought/Action/Observation as needed)
61
Thought: I now have enough information to answer.
62
Final Answer: [Your final answer]
63
 
64
Question: {task}
65
 
66
"""
67
    
68
    def _parse_action(self, text: str) -> tuple:
69
        """Extract action and input from LLM output"""
70
        # Match pattern: Action: tool_name[input]
71
        match = re.search(r'Action:\s*(\w+)\[([^\]]*)\]', text)
72
        if match:
73
            return match.group(1), match.group(2)
74
        
75
        # Alternative format: Action: tool_name("input")
76
        match = re.search(r'Action:\s*(\w+)\("([^"]*)"\)', text)
77
        if match:
78
            return match.group(1), match.group(2)
79
        
80
        return None, None
81
    
82
    def _execute_action(self, action: str, action_input: str) -> str:
83
        """Execute the specified action"""
84
        if action not in self.tools:
85
            return f"Error: Unknown tool '{action}'. Available: {list(self.tools.keys())}"
86
        
87
        try:
88
            result = self.tools[action](action_input)
89
            return str(result)
90
        except Exception as e:
91
            return f"Error executing {action}: {str(e)}"
92
    
93
    def _extract_final_answer(self, text: str) -> str:
94
        """Extract the final answer from output"""
95
        match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
96
        if match:
97
            return match.group(1).strip()
98
        return text
99
 
100
 
101
# Define tools
102
def search(query: str) -> str:
103
    """Search the web for information"""
104
    # In production, use a real search API
105
    return f"Search results for '{query}': ..."
106
 
107
def calculate(expression: str) -> str:
108
    """Evaluate a mathematical expression"""
109
    try:
110
        return str(eval(expression))
111
    except:
112
        return "Error: Could not evaluate expression"
113
 
114
def lookup(term: str) -> str:
115
    """Look up a term in the knowledge base"""
116
    knowledge = {
117
        "python": "A high-level programming language",
118
        "react": "A JavaScript library for building UIs",
119
    }
120
    return knowledge.get(term.lower(), f"No entry found for '{term}'")
121
 
122
 
123
# Usage
124
agent = ReActAgent(tools={
125
    "search": search,
126
    "calculate": calculate,
127
    "lookup": lookup
128
})
129
 
130
result = agent.run("What is 25% of the population of France?")
131
print(result)
132
 

Example trace:

text
1
Thought: I need to find the population of France first, then calculate 25% of it.
2
Action: search[population of France 2024]
3
Observation: The population of France is approximately 68 million people.
4
 
5
Thought: Now I can calculate 25% of 68 million.
6
Action: calculate[68000000 * 0.25]
7
Observation: 17000000.0
8
 
9
Thought: I now have the answer.
10
Final Answer: 25% of France's population is 17 million people.
11
 

ReAct with Code Execution

For agents that can run code, ReAct is particularly powerful:

python
1
from hopx import Sandbox
2
import openai
3
import re
4
 
5
class CodeReActAgent:
6
    def __init__(self):
7
        self.client = openai.OpenAI()
8
        self.sandbox = None
9
        self.max_steps = 15
10
    
11
    def run(self, task: str) -> str:
12
        # Create sandbox for the session
13
        self.sandbox = Sandbox.create(template="code-interpreter")
14
        
15
        try:
16
            return self._react_loop(task)
17
        finally:
18
            self.sandbox.kill()
19
    
20
    def _react_loop(self, task: str) -> str:
21
        prompt = self._build_prompt(task)
22
        
23
        for step in range(self.max_steps):
24
            response = self.client.chat.completions.create(
25
                model="gpt-4o",
26
                messages=[{"role": "user", "content": prompt}],
27
                stop=["Observation:"]
28
            )
29
            
30
            output = response.choices[0].message.content
31
            prompt += output
32
            
33
            print(f"\n--- Step {step + 1} ---")
34
            print(output)
35
            
36
            if "Final Answer:" in output:
37
                return self._extract_answer(output)
38
            
39
            # Parse action
40
            action_type, action_content = self._parse_action(output)
41
            
42
            if action_type == "python":
43
                observation = self._run_code(action_content)
44
            elif action_type == "bash":
45
                observation = self._run_bash(action_content)
46
            elif action_type == "read_file":
47
                observation = self._read_file(action_content)
48
            elif action_type == "write_file":
49
                path, content = action_content.split("|||", 1)
50
                observation = self._write_file(path.strip(), content.strip())
51
            else:
52
                observation = f"Unknown action type: {action_type}"
53
            
54
            prompt += f"\nObservation: {observation}\n"
55
            print(f"Observation: {observation[:500]}...")
56
        
57
        return "Max steps reached"
58
    
59
    def _build_prompt(self, task: str) -> str:
60
        return f"""You are an AI assistant that solves tasks by writing and executing code.
61
 
62
Available actions:
63
- python[code]: Execute Python code
64
- bash[command]: Run a bash command
65
- read_file[path]: Read a file
66
- write_file[path|||content]: Write content to a file
67
 
68
Format:
69
Thought: [Your reasoning]
70
Action: action_type[content]
71
Observation: [Will be provided]
72
 
73
Rules:
74
- Always think before acting
75
- Check observations carefully for errors
76
- Use print() to see outputs in Python
77
- When done, provide Final Answer:
78
 
79
Task: {task}
80
 
81
"""
82
    
83
    def _parse_action(self, text: str) -> tuple:
84
        # Match: Action: python[...]
85
        match = re.search(r'Action:\s*(\w+)\[(.*?)\](?:\s*$|\n)', text, re.DOTALL)
86
        if match:
87
            return match.group(1), match.group(2)
88
        
89
        # Match multi-line code blocks
90
        match = re.search(r'Action:\s*(\w+)\[(.*)\]', text, re.DOTALL)
91
        if match:
92
            return match.group(1), match.group(2)
93
        
94
        return None, None
95
    
96
    def _run_code(self, code: str) -> str:
97
        self.sandbox.files.write("/app/script.py", code)
98
        result = self.sandbox.commands.run("python /app/script.py", timeout=30)
99
        
100
        if result.exit_code == 0:
101
            return result.stdout if result.stdout else "(No output)"
102
        else:
103
            return f"Error (exit code {result.exit_code}):\n{result.stderr}"
104
    
105
    def _run_bash(self, command: str) -> str:
106
        result = self.sandbox.commands.run(command, timeout=30)
107
        output = result.stdout + result.stderr
108
        return output if output else "(No output)"
109
    
110
    def _read_file(self, path: str) -> str:
111
        try:
112
            return self.sandbox.files.read(path)
113
        except Exception as e:
114
            return f"Error reading file: {e}"
115
    
116
    def _write_file(self, path: str, content: str) -> str:
117
        try:
118
            self.sandbox.files.write(path, content)
119
            return f"Successfully wrote to {path}"
120
        except Exception as e:
121
            return f"Error writing file: {e}"
122
    
123
    def _extract_answer(self, text: str) -> str:
124
        match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
125
        return match.group(1).strip() if match else text
126
 
127
 
128
# Usage
129
agent = CodeReActAgent()
130
 
131
result = agent.run("""
132
Download the iris dataset, train a simple classifier, 
133
and report the accuracy.
134
""")
135
 
136
print(f"\nFinal Result:\n{result}")
137
 

Example trace:

text
1
--- Step 1 ---
2
Thought: I need to download the iris dataset and train a classifier. 
3
I'll use scikit-learn which has the iris dataset built-in.
4
Action: python[
5
from sklearn.datasets import load_iris
6
from sklearn.model_selection import train_test_split
7
from sklearn.ensemble import RandomForestClassifier
8
from sklearn.metrics import accuracy_score
9
 
10
# Load data
11
iris = load_iris()
12
X, y = iris.data, iris.target
13
 
14
# Split
15
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
16
 
17
# Train
18
clf = RandomForestClassifier(n_estimators=100)
19
clf.fit(X_train, y_train)
20
 
21
# Evaluate
22
predictions = clf.predict(X_test)
23
accuracy = accuracy_score(y_test, predictions)
24
print(f"Accuracy: {accuracy:.2%}")
25
]
26
Observation: Accuracy: 96.67%
27
 
28
--- Step 2 ---
29
Thought: The classifier trained successfully with 96.67% accuracy. 
30
I have enough information to provide the final answer.
31
Final Answer: I trained a Random Forest classifier on the Iris dataset 
32
and achieved 96.67% accuracy on the test set.
33
 

ReAct Variations

1. ReAct with Self-Reflection

Add a reflection step after observations:

python
1
def _build_prompt_with_reflection(self, task: str) -> str:
2
    return f"""...
3
Format:
4
Thought: [Your reasoning about what to do]
5
Action: action_type[content]
6
Observation: [Result - will be provided]
7
Reflection: [What did I learn? Did it work? What should I do next?]
8
 
9
..."""
10
 

2. ReAct with Critique

Add an inner critic to catch mistakes:

python
1
class CriticalReActAgent(ReActAgent):
2
    def _react_step(self, thought: str, action: str) -> str:
3
        # First, critique the proposed action
4
        critique = self._critique_action(thought, action)
5
        
6
        if "PROBLEM:" in critique:
7
            # Revise action based on critique
8
            revised = self._revise_action(thought, action, critique)
9
            return self._execute_action(*self._parse_action(revised))
10
        
11
        return self._execute_action(*self._parse_action(action))
12
    
13
    def _critique_action(self, thought: str, action: str) -> str:
14
        response = self.client.chat.completions.create(
15
            model="gpt-4o",
16
            messages=[{
17
                "role": "user",
18
                "content": f"""Critique this action. Is it correct and safe?
19
 
20
Thought: {thought}
21
Action: {action}
22
 
23
If there's a problem, start with "PROBLEM:" and explain.
24
If it looks good, say "APPROVED"."""
25
            }]
26
        )
27
        return response.choices[0].message.content
28
 

3. Parallel ReAct

Run multiple ReAct chains and merge:

python
1
import asyncio
2
 
3
async def parallel_react(task: str, perspectives: list[str]) -> str:
4
    """Run multiple ReAct agents with different perspectives"""
5
    
6
    async def run_with_perspective(perspective: str):
7
        prompt = f"You are a {perspective}. {task}"
8
        return await react_agent.run_async(prompt)
9
    
10
    results = await asyncio.gather(*[
11
        run_with_perspective(p) for p in perspectives
12
    ])
13
    
14
    # Merge results
15
    return merge_answers(results)
16
 
17
# Usage
18
answer = await parallel_react(
19
    "What are the pros and cons of microservices?",
20
    perspectives=["software architect", "DevOps engineer", "developer"]
21
)
22
 

ReAct vs Other Patterns

ReAct vs Chain-of-Thought

AspectChain-of-ThoughtReAct
ActionsNoneYes
GroundedNo (can hallucinate)Yes (observations)
DebuggablePartiallyVery
Token usageLowerHigher
Best forReasoning tasksAction tasks

ReAct vs Plan-and-Execute

AspectReActPlan-and-Execute
PlanningStep by stepUpfront
AdaptabilityHighLower
PredictabilityLowerHigher
OverheadLowerHigher
Best forDynamic tasksKnown workflows

When to Use ReAct

Use ReAct when:

  • Tasks require both reasoning and action
  • You need to debug agent behavior
  • The path to solution is unclear
  • Real-time adaptation is needed

Avoid ReAct when:

  • Task is simple (one action)
  • You need maximum speed
  • Token budget is very limited
  • Task is purely reasoning (no actions)

Production Considerations

1. Structured Output for Parsing

Use JSON for more reliable parsing:

python
1
def _build_prompt_structured(self, task: str) -> str:
2
    return f"""...
3
Respond in JSON format:
4
{{
5
    "thought": "your reasoning",
6
    "action": {{
7
        "tool": "tool_name",
8
        "input": "tool input"
9
    }}
10
}}
11
 
12
Or when finished:
13
{{
14
    "thought": "final reasoning",
15
    "final_answer": "your answer"
16
}}
17
..."""
18
 

2. Token Management

ReAct can consume many tokens. Manage context:

python
1
class TokenAwareReActAgent:
2
    def __init__(self, max_context_tokens: int = 8000):
3
        self.max_tokens = max_context_tokens
4
        self.history = []
5
    
6
    def _manage_context(self, prompt: str) -> str:
7
        """Trim history if context too long"""
8
        estimated_tokens = len(prompt) // 4
9
        
10
        if estimated_tokens > self.max_tokens:
11
            # Keep first (task) and last N steps
12
            self.history = self.history[:1] + self.history[-5:]
13
            prompt = self._rebuild_prompt()
14
        
15
        return prompt
16
 

3. Error Recovery

Handle failures gracefully:

python
1
def _execute_with_recovery(self, action: str, input: str, max_retries: int = 3):
2
    for attempt in range(max_retries):
3
        try:
4
            result = self.tools[action](input)
5
            return result
6
        except Exception as e:
7
            if attempt == max_retries - 1:
8
                return f"Failed after {max_retries} attempts: {e}"
9
            # Let agent know about failure
10
            return f"Attempt {attempt + 1} failed: {e}. You can retry."
11
 

4. Observation Limits

Truncate long observations:

python
1
def _truncate_observation(self, obs: str, max_length: int = 2000) -> str:
2
    if len(obs) <= max_length:
3
        return obs
4
    
5
    return obs[:max_length] + f"\n... (truncated, {len(obs) - max_length} chars omitted)"
6
 

Complete Production Example

python
1
from hopx import Sandbox
2
import openai
3
import json
4
from datetime import datetime
5
 
6
class ProductionReActAgent:
7
    def __init__(self):
8
        self.client = openai.OpenAI()
9
        self.max_steps = 20
10
        self.trace = []
11
    
12
    def run(self, task: str, tools: dict) -> dict:
13
        """Run ReAct loop and return structured result"""
14
        
15
        self.trace = []
16
        start_time = datetime.now()
17
        
18
        # Create sandbox if code execution needed
19
        sandbox = None
20
        if "run_code" in tools:
21
            sandbox = Sandbox.create(template="code-interpreter")
22
            tools["run_code"] = lambda code: self._safe_execute(sandbox, code)
23
        
24
        try:
25
            messages = [
26
                {"role": "system", "content": self._system_prompt(tools)},
27
                {"role": "user", "content": task}
28
            ]
29
            
30
            for step in range(self.max_steps):
31
                # Get next action
32
                response = self.client.chat.completions.create(
33
                    model="gpt-4o",
34
                    messages=messages,
35
                    response_format={"type": "json_object"}
36
                )
37
                
38
                output = json.loads(response.choices[0].message.content)
39
                
40
                self.trace.append({
41
                    "step": step + 1,
42
                    "thought": output.get("thought"),
43
                    "action": output.get("action"),
44
                    "timestamp": datetime.now().isoformat()
45
                })
46
                
47
                # Check for completion
48
                if "final_answer" in output:
49
                    return {
50
                        "success": True,
51
                        "answer": output["final_answer"],
52
                        "steps": step + 1,
53
                        "duration": (datetime.now() - start_time).seconds,
54
                        "trace": self.trace
55
                    }
56
                
57
                # Execute action
58
                action = output.get("action", {})
59
                tool_name = action.get("tool")
60
                tool_input = action.get("input")
61
                
62
                if tool_name not in tools:
63
                    observation = f"Error: Unknown tool '{tool_name}'"
64
                else:
65
                    try:
66
                        observation = str(tools[tool_name](tool_input))
67
                    except Exception as e:
68
                        observation = f"Error: {e}"
69
                
70
                # Truncate long observations
71
                if len(observation) > 3000:
72
                    observation = observation[:3000] + "... (truncated)"
73
                
74
                self.trace[-1]["observation"] = observation
75
                
76
                # Add to messages
77
                messages.append({
78
                    "role": "assistant",
79
                    "content": json.dumps(output)
80
                })
81
                messages.append({
82
                    "role": "user", 
83
                    "content": f"Observation: {observation}"
84
                })
85
            
86
            return {
87
                "success": False,
88
                "error": "Max steps reached",
89
                "steps": self.max_steps,
90
                "trace": self.trace
91
            }
92
        
93
        finally:
94
            if sandbox:
95
                sandbox.kill()
96
    
97
    def _system_prompt(self, tools: dict) -> str:
98
        tool_desc = "\n".join([
99
            f"- {name}: {func.__doc__ or 'No description'}"
100
            for name, func in tools.items()
101
        ])
102
        
103
        return f"""You are a ReAct agent. Think step by step, take actions, observe results.
104
 
105
Available tools:
106
{tool_desc}
107
 
108
Always respond with JSON:
109
{{
110
    "thought": "your reasoning about what to do next",
111
    "action": {{
112
        "tool": "tool_name",
113
        "input": "tool input"
114
    }}
115
}}
116
 
117
When you have the final answer:
118
{{
119
    "thought": "I now have enough information",
120
    "final_answer": "your complete answer"
121
}}
122
 
123
Be thorough but efficient. Verify important facts."""
124
    
125
    def _safe_execute(self, sandbox: Sandbox, code: str) -> str:
126
        """Safely execute code in sandbox"""
127
        sandbox.files.write("/app/code.py", code)
128
        result = sandbox.commands.run("timeout 30 python /app/code.py")
129
        
130
        if result.exit_code == 0:
131
            return result.stdout or "(No output)"
132
        elif result.exit_code == 124:
133
            return "Error: Execution timed out after 30 seconds"
134
        else:
135
            return f"Error:\n{result.stderr}"
136
 
137
 
138
# Usage
139
agent = ProductionReActAgent()
140
 
141
def web_search(query: str) -> str:
142
    """Search the web for information"""
143
    # Implement with your search API
144
    return f"Results for '{query}': ..."
145
 
146
def calculator(expression: str) -> str:
147
    """Evaluate a math expression"""
148
    return str(eval(expression))
149
 
150
def run_code(code: str) -> str:
151
    """Execute Python code safely"""
152
    pass  # Handled by agent
153
 
154
result = agent.run(
155
    task="What is the GDP per capita of the top 3 economies?",
156
    tools={
157
        "search": web_search,
158
        "calculate": calculator,
159
        "run_code": run_code
160
    }
161
)
162
 
163
print(f"Answer: {result['answer']}")
164
print(f"Steps: {result['steps']}")
165
print(f"Duration: {result['duration']}s")
166
 
167
# Inspect trace for debugging
168
for step in result['trace']:
169
    print(f"\nStep {step['step']}:")
170
    print(f"  Thought: {step['thought']}")
171
    print(f"  Action: {step.get('action')}")
172
    print(f"  Observation: {step.get('observation', '')[:100]}...")
173
 

Conclusion

ReAct is one of the most practical agentic patterns:

  • Transparent reasoning — See exactly what the agent is thinking
  • Grounded actions — Decisions based on real observations
  • Adaptive execution — Adjusts approach based on results
  • Easy debugging — Full trace of thought-action-observation

Start with basic ReAct for any task requiring tools. Add reflection for complex reasoning. Use structured output for production reliability.

The agent that thinks before acting outperforms the agent that acts blindly. Every time.


Ready to build ReAct agents with code execution? Get started with HopX — sandboxes that let your agents think, act, and observe safely.

Further Reading