ReAct Pattern: Combining Reasoning and Acting in AI Agents

Most AI agents fail silently. They take an action, it doesn't work, and they flounder—or worse, they confidently do the wrong thing without realizing it.

ReAct fixes this by making agents think out loud.

ReAct (Reasoning + Acting) is a pattern where agents explicitly verbalize their reasoning before each action, observe the results, and reason again. It's simple, powerful, and dramatically improves agent reliability.

This guide shows you how to implement ReAct from scratch and when to use it over other patterns.

What Is ReAct?

ReAct interleaves thinking and acting in a continuous loop:

text

1	┌─────────────────────────────────────────────────────────────┐
2	│ ReAct Loop │
3	├─────────────────────────────────────────────────────────────┤
4	│ │
5	│ ┌──────────┐ ┌──────────┐ ┌───────────┐ │
6	│ │ THOUGHT │───▶│ ACTION │───▶│OBSERVATION│ │
7	│ │ │ │ │ │ │ │
8	│ │ "I need │ │ search( │ │ "Results: │ │
9	│ │ to find │ │ 'weather│ │ London │ │
10	│ │ the │ │ London'│ │ 15°C, │ │
11	│ │ weather"│ │ ) │ │ cloudy" │ │
12	│ └──────────┘ └──────────┘ └─────┬─────┘ │
13	│ ▲ │ │
14	│ │ │ │
15	│ └───────────────────────────────┘ │
16	│ (repeat) │
17	│ │
18	│ Until: Task complete OR max steps reached │
19	│ │
20	└─────────────────────────────────────────────────────────────┘
21

Each iteration has three parts:

Thought: The agent reasons about what to do next
Action: The agent executes a tool or action
Observation: The agent sees the result

This continues until the agent decides it has enough information to answer.

Why ReAct Works

1. Explicit Reasoning Catches Errors

When agents reason silently, errors compound. When they reason out loud, problems surface early:

text

# Silent agent (problematic)
Action: search("Python web frameworks")
Action: search("Django vs Flask")  
Action: write_report(...)  # Report may be wrong, hard to debug
 
# ReAct agent (transparent)
Thought: I need to find popular Python web frameworks
Action: search("Python web frameworks 2024")
Observation: Django, Flask, FastAPI are the top 3...
 
Thought: I should compare these frameworks. Let me focus on FastAPI 
        since it wasn't in my training data and is newest.
Action: search("FastAPI features performance")
Observation: FastAPI is async-first, automatic OpenAPI docs...
 
Thought: Now I have enough information to write a comparison.
Action: write_report(...)  # Each step is traceable
 

2. Grounded in Observations

Pure reasoning (Chain-of-Thought) can hallucinate. ReAct grounds reasoning in real observations:

text

# Chain-of-Thought (can hallucinate)
Thought: The capital of Australia is Sydney because it's the largest city.
Answer: Sydney  # WRONG
 
# ReAct (grounded in facts)
Thought: I should verify the capital of Australia
Action: search("capital of Australia")
Observation: The capital of Australia is Canberra, not Sydney...
Thought: The observation confirms it's Canberra, not Sydney as I initially thought.
Answer: Canberra  # CORRECT
 

3. Flexible Problem Solving

The agent can adapt its approach based on observations:

text

Thought: I'll try to get the weather from the primary API
Action: api_call("weather.primary.com/london")
Observation: Error 503 - Service unavailable
 
Thought: Primary API is down. I'll try the backup API instead.
Action: api_call("weather.backup.com/london")
Observation: {"temp": 15, "condition": "cloudy"}
 
Thought: Got the weather from backup. I can now answer the question.
 

Basic ReAct Implementation

Here's a complete, minimal ReAct agent:

python

import openai
import json
import re
 
class ReActAgent:
    def __init__(self, tools: dict):
        self.client = openai.OpenAI()
        self.tools = tools
        self.max_steps = 10
    
    def run(self, task: str) -> str:
        """Execute task using ReAct loop"""
        
        prompt = self._build_initial_prompt(task)
        
        for step in range(self.max_steps):
            # Get next thought and action from LLM
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": prompt}],
                stop=["Observation:"]  # Stop before observation
            )
            
            output = response.choices[0].message.content
            prompt += output
            
            # Check if agent wants to finish
            if "Final Answer:" in output:
                return self._extract_final_answer(output)
            
            # Extract and execute action
            action, action_input = self._parse_action(output)
            
            if action is None:
                prompt += "\nObservation: Could not parse action. Please use format 'Action: tool_name[input]'\n"
                continue
            
            # Execute the action
            observation = self._execute_action(action, action_input)
            prompt += f"\nObservation: {observation}\n"
        
        return "Max steps reached without finding answer."
    
    def _build_initial_prompt(self, task: str) -> str:
        tool_descriptions = "\n".join([
            f"- {name}: {func.__doc__ or 'No description'}"
            for name, func in self.tools.items()
        ])
        
        return f"""Answer the following question using the available tools.
 
Available tools:
{tool_descriptions}
 
Use this format:
 
Thought: [Your reasoning about what to do next]
Action: tool_name[input]
Observation: [Result of the action - will be provided]
... (repeat Thought/Action/Observation as needed)
Thought: I now have enough information to answer.
Final Answer: [Your final answer]
 
Question: {task}
 
"""
    
    def _parse_action(self, text: str) -> tuple:
        """Extract action and input from LLM output"""
        # Match pattern: Action: tool_name[input]
        match = re.search(r'Action:\s*(\w+)\[([^\]]*)\]', text)
        if match:
            return match.group(1), match.group(2)
        
        # Alternative format: Action: tool_name("input")
        match = re.search(r'Action:\s*(\w+)\("([^"]*)"\)', text)
        if match:
            return match.group(1), match.group(2)
        
        return None, None
    
    def _execute_action(self, action: str, action_input: str) -> str:
        """Execute the specified action"""
        if action not in self.tools:
            return f"Error: Unknown tool '{action}'. Available: {list(self.tools.keys())}"
        
        try:
            result = self.tools[action](action_input)
            return str(result)
        except Exception as e:
            return f"Error executing {action}: {str(e)}"
    
    def _extract_final_answer(self, text: str) -> str:
        """Extract the final answer from output"""
        match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
        if match:
            return match.group(1).strip()
        return text
 
 
# Define tools
def search(query: str) -> str:
    """Search the web for information"""
    # In production, use a real search API
    return f"Search results for '{query}': ..."
 
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression"""
    try:
        return str(eval(expression))
    except:
        return "Error: Could not evaluate expression"
 
def lookup(term: str) -> str:
    """Look up a term in the knowledge base"""
    knowledge = {
        "python": "A high-level programming language",
        "react": "A JavaScript library for building UIs",
    }
    return knowledge.get(term.lower(), f"No entry found for '{term}'")
 
 
# Usage
agent = ReActAgent(tools={
    "search": search,
    "calculate": calculate,
    "lookup": lookup
})
 
result = agent.run("What is 25% of the population of France?")
print(result)
 

Example trace:

text

Thought: I need to find the population of France first, then calculate 25% of it.
Action: search[population of France 2024]
Observation: The population of France is approximately 68 million people.
 
Thought: Now I can calculate 25% of 68 million.
Action: calculate[68000000 * 0.25]
Observation: 17000000.0
 
Thought: I now have the answer.
Final Answer: 25% of France's population is 17 million people.
 

ReAct with Code Execution

For agents that can run code, ReAct is particularly powerful:

python

from hopx import Sandbox
import openai
import re
 
class CodeReActAgent:
    def __init__(self):
        self.client = openai.OpenAI()
        self.sandbox = None
        self.max_steps = 15
    
    def run(self, task: str) -> str:
        # Create sandbox for the session
        self.sandbox = Sandbox.create(template="code-interpreter")
        
        try:
            return self._react_loop(task)
        finally:
            self.sandbox.kill()
    
    def _react_loop(self, task: str) -> str:
        prompt = self._build_prompt(task)
        
        for step in range(self.max_steps):
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": prompt}],
                stop=["Observation:"]
            )
            
            output = response.choices[0].message.content
            prompt += output
            
            print(f"\n--- Step {step + 1} ---")
            print(output)
            
            if "Final Answer:" in output:
                return self._extract_answer(output)
            
            # Parse action
            action_type, action_content = self._parse_action(output)
            
            if action_type == "python":
                observation = self._run_code(action_content)
            elif action_type == "bash":
                observation = self._run_bash(action_content)
            elif action_type == "read_file":
                observation = self._read_file(action_content)
            elif action_type == "write_file":
                path, content = action_content.split("|||", 1)
                observation = self._write_file(path.strip(), content.strip())
            else:
                observation = f"Unknown action type: {action_type}"
            
            prompt += f"\nObservation: {observation}\n"
            print(f"Observation: {observation[:500]}...")
        
        return "Max steps reached"
    
    def _build_prompt(self, task: str) -> str:
        return f"""You are an AI assistant that solves tasks by writing and executing code.
 
Available actions:
- python[code]: Execute Python code
- bash[command]: Run a bash command
- read_file[path]: Read a file
- write_file[path|||content]: Write content to a file
 
Format:
Thought: [Your reasoning]
Action: action_type[content]
Observation: [Will be provided]
 
Rules:
- Always think before acting
- Check observations carefully for errors
- Use print() to see outputs in Python
- When done, provide Final Answer:
 
Task: {task}
 
"""
    
    def _parse_action(self, text: str) -> tuple:
        # Match: Action: python[...]
        match = re.search(r'Action:\s*(\w+)\[(.*?)\](?:\s*$|\n)', text, re.DOTALL)
        if match:
            return match.group(1), match.group(2)
        
        # Match multi-line code blocks
        match = re.search(r'Action:\s*(\w+)\[(.*)\]', text, re.DOTALL)
        if match:
            return match.group(1), match.group(2)
        
        return None, None
    
    def _run_code(self, code: str) -> str:
        self.sandbox.files.write("/app/script.py", code)
        result = self.sandbox.commands.run("python /app/script.py", timeout=30)
        
        if result.exit_code == 0:
            return result.stdout if result.stdout else "(No output)"
        else:
            return f"Error (exit code {result.exit_code}):\n{result.stderr}"
    
    def _run_bash(self, command: str) -> str:
        result = self.sandbox.commands.run(command, timeout=30)
        output = result.stdout + result.stderr
        return output if output else "(No output)"
    
    def _read_file(self, path: str) -> str:
        try:
            return self.sandbox.files.read(path)
        except Exception as e:
            return f"Error reading file: {e}"
    
    def _write_file(self, path: str, content: str) -> str:
        try:
            self.sandbox.files.write(path, content)
            return f"Successfully wrote to {path}"
        except Exception as e:
            return f"Error writing file: {e}"
    
    def _extract_answer(self, text: str) -> str:
        match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
        return match.group(1).strip() if match else text
 
 
# Usage
agent = CodeReActAgent()
 
result = agent.run("""
Download the iris dataset, train a simple classifier, 
and report the accuracy.
""")
 
print(f"\nFinal Result:\n{result}")
 

Example trace:

text

--- Step 1 ---
Thought: I need to download the iris dataset and train a classifier. 
I'll use scikit-learn which has the iris dataset built-in.
Action: python[
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
 
# Load data
iris = load_iris()
X, y = iris.data, iris.target
 
# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
 
# Train
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
 
# Evaluate
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2%}")
]
Observation: Accuracy: 96.67%
 
--- Step 2 ---
Thought: The classifier trained successfully with 96.67% accuracy. 
I have enough information to provide the final answer.
Final Answer: I trained a Random Forest classifier on the Iris dataset 
and achieved 96.67% accuracy on the test set.
 

ReAct Variations

1. ReAct with Self-Reflection

Add a reflection step after observations:

python

def _build_prompt_with_reflection(self, task: str) -> str:
    return f"""...
Format:
Thought: [Your reasoning about what to do]
Action: action_type[content]
Observation: [Result - will be provided]
Reflection: [What did I learn? Did it work? What should I do next?]
 
..."""
 

2. ReAct with Critique

Add an inner critic to catch mistakes:

python

class CriticalReActAgent(ReActAgent):
    def _react_step(self, thought: str, action: str) -> str:
        # First, critique the proposed action
        critique = self._critique_action(thought, action)
        
        if "PROBLEM:" in critique:
            # Revise action based on critique
            revised = self._revise_action(thought, action, critique)
            return self._execute_action(*self._parse_action(revised))
        
        return self._execute_action(*self._parse_action(action))
    
    def _critique_action(self, thought: str, action: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"""Critique this action. Is it correct and safe?
 
Thought: {thought}
Action: {action}
 
If there's a problem, start with "PROBLEM:" and explain.
If it looks good, say "APPROVED"."""
            }]
        )
        return response.choices[0].message.content
 

3. Parallel ReAct

Run multiple ReAct chains and merge:

python

import asyncio
 
async def parallel_react(task: str, perspectives: list[str]) -> str:
    """Run multiple ReAct agents with different perspectives"""
    
    async def run_with_perspective(perspective: str):
        prompt = f"You are a {perspective}. {task}"
        return await react_agent.run_async(prompt)
    
    results = await asyncio.gather(*[
        run_with_perspective(p) for p in perspectives
    ])
    
    # Merge results
    return merge_answers(results)
 
# Usage
answer = await parallel_react(
    "What are the pros and cons of microservices?",
    perspectives=["software architect", "DevOps engineer", "developer"]
)
 

ReAct vs Other Patterns

ReAct vs Chain-of-Thought

Aspect	Chain-of-Thought	ReAct
Actions	None	Yes
Grounded	No (can hallucinate)	Yes (observations)
Debuggable	Partially	Very
Token usage	Lower	Higher
Best for	Reasoning tasks	Action tasks

ReAct vs Plan-and-Execute

Aspect	ReAct	Plan-and-Execute
Planning	Step by step	Upfront
Adaptability	High	Lower
Predictability	Lower	Higher
Overhead	Lower	Higher
Best for	Dynamic tasks	Known workflows

When to Use ReAct

✅ Use ReAct when:

Tasks require both reasoning and action
You need to debug agent behavior
The path to solution is unclear
Real-time adaptation is needed

❌ Avoid ReAct when:

Task is simple (one action)
You need maximum speed
Token budget is very limited
Task is purely reasoning (no actions)

Production Considerations

1. Structured Output for Parsing

Use JSON for more reliable parsing:

python

def _build_prompt_structured(self, task: str) -> str:
    return f"""...
Respond in JSON format:
{{
    "thought": "your reasoning",
    "action": {{
        "tool": "tool_name",
        "input": "tool input"
    }}
}}
 
Or when finished:
{{
    "thought": "final reasoning",
    "final_answer": "your answer"
}}
..."""
 

2. Token Management

ReAct can consume many tokens. Manage context:

python

class TokenAwareReActAgent:
    def __init__(self, max_context_tokens: int = 8000):
        self.max_tokens = max_context_tokens
        self.history = []
    
    def _manage_context(self, prompt: str) -> str:
        """Trim history if context too long"""
        estimated_tokens = len(prompt) // 4
        
        if estimated_tokens > self.max_tokens:
            # Keep first (task) and last N steps
            self.history = self.history[:1] + self.history[-5:]
            prompt = self._rebuild_prompt()
        
        return prompt
 

3. Error Recovery

Handle failures gracefully:

python

def _execute_with_recovery(self, action: str, input: str, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            result = self.tools[action](input)
            return result
        except Exception as e:
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            # Let agent know about failure
            return f"Attempt {attempt + 1} failed: {e}. You can retry."
 

4. Observation Limits

Truncate long observations:

python

def _truncate_observation(self, obs: str, max_length: int = 2000) -> str:
    if len(obs) <= max_length:
        return obs
    
    return obs[:max_length] + f"\n... (truncated, {len(obs) - max_length} chars omitted)"
 

Complete Production Example

python

from hopx import Sandbox
import openai
import json
from datetime import datetime
 
class ProductionReActAgent:
    def __init__(self):
        self.client = openai.OpenAI()
        self.max_steps = 20
        self.trace = []
    
    def run(self, task: str, tools: dict) -> dict:
        """Run ReAct loop and return structured result"""
        
        self.trace = []
        start_time = datetime.now()
        
        # Create sandbox if code execution needed
        sandbox = None
        if "run_code" in tools:
            sandbox = Sandbox.create(template="code-interpreter")
            tools["run_code"] = lambda code: self._safe_execute(sandbox, code)
        
        try:
            messages = [
                {"role": "system", "content": self._system_prompt(tools)},
                {"role": "user", "content": task}
            ]
            
            for step in range(self.max_steps):
                # Get next action
                response = self.client.chat.completions.create(
                    model="gpt-4o",
                    messages=messages,
                    response_format={"type": "json_object"}
                )
                
                output = json.loads(response.choices[0].message.content)
                
                self.trace.append({
                    "step": step + 1,
                    "thought": output.get("thought"),
                    "action": output.get("action"),
                    "timestamp": datetime.now().isoformat()
                })
                
                # Check for completion
                if "final_answer" in output:
                    return {
                        "success": True,
                        "answer": output["final_answer"],
                        "steps": step + 1,
                        "duration": (datetime.now() - start_time).seconds,
                        "trace": self.trace
                    }
                
                # Execute action
                action = output.get("action", {})
                tool_name = action.get("tool")
                tool_input = action.get("input")
                
                if tool_name not in tools:
                    observation = f"Error: Unknown tool '{tool_name}'"
                else:
                    try:
                        observation = str(tools[tool_name](tool_input))
                    except Exception as e:
                        observation = f"Error: {e}"
                
                # Truncate long observations
                if len(observation) > 3000:
                    observation = observation[:3000] + "... (truncated)"
                
                self.trace[-1]["observation"] = observation
                
                # Add to messages
                messages.append({
                    "role": "assistant",
                    "content": json.dumps(output)
                })
                messages.append({
                    "role": "user", 
                    "content": f"Observation: {observation}"
                })
            
            return {
                "success": False,
                "error": "Max steps reached",
                "steps": self.max_steps,
                "trace": self.trace
            }
        
        finally:
            if sandbox:
                sandbox.kill()
    
    def _system_prompt(self, tools: dict) -> str:
        tool_desc = "\n".join([
            f"- {name}: {func.__doc__ or 'No description'}"
            for name, func in tools.items()
        ])
        
        return f"""You are a ReAct agent. Think step by step, take actions, observe results.
 
Available tools:
{tool_desc}
 
Always respond with JSON:
{{
    "thought": "your reasoning about what to do next",
    "action": {{
        "tool": "tool_name",
        "input": "tool input"
    }}
}}
 
When you have the final answer:
{{
    "thought": "I now have enough information",
    "final_answer": "your complete answer"
}}
 
Be thorough but efficient. Verify important facts."""
    
    def _safe_execute(self, sandbox: Sandbox, code: str) -> str:
        """Safely execute code in sandbox"""
        sandbox.files.write("/app/code.py", code)
        result = sandbox.commands.run("timeout 30 python /app/code.py")
        
        if result.exit_code == 0:
            return result.stdout or "(No output)"
        elif result.exit_code == 124:
            return "Error: Execution timed out after 30 seconds"
        else:
            return f"Error:\n{result.stderr}"
 
 
# Usage
agent = ProductionReActAgent()
 
def web_search(query: str) -> str:
    """Search the web for information"""
    # Implement with your search API
    return f"Results for '{query}': ..."
 
def calculator(expression: str) -> str:
    """Evaluate a math expression"""
    return str(eval(expression))
 
def run_code(code: str) -> str:
    """Execute Python code safely"""
    pass  # Handled by agent
 
result = agent.run(
    task="What is the GDP per capita of the top 3 economies?",
    tools={
        "search": web_search,
        "calculate": calculator,
        "run_code": run_code
    }
)
 
print(f"Answer: {result['answer']}")
print(f"Steps: {result['steps']}")
print(f"Duration: {result['duration']}s")
 
# Inspect trace for debugging
for step in result['trace']:
    print(f"\nStep {step['step']}:")
    print(f"  Thought: {step['thought']}")
    print(f"  Action: {step.get('action')}")
    print(f"  Observation: {step.get('observation', '')[:100]}...")
 

Conclusion

ReAct is one of the most practical agentic patterns:

Transparent reasoning — See exactly what the agent is thinking
Grounded actions — Decisions based on real observations
Adaptive execution — Adjusts approach based on results
Easy debugging — Full trace of thought-action-observation

Start with basic ReAct for any task requiring tools. Add reflection for complex reasoning. Use structured output for production reliability.

The agent that thinks before acting outperforms the agent that acts blindly. Every time.

Ready to build ReAct agents with code execution? Get started with HopX — sandboxes that let your agents think, act, and observe safely.

ReAct Pattern: Combining Reasoning and Acting in AI Agents

ReAct Pattern: Combining Reasoning and Acting in AI Agents

What Is ReAct?

Why ReAct Works

1. Explicit Reasoning Catches Errors

2. Grounded in Observations

3. Flexible Problem Solving

Basic ReAct Implementation

ReAct with Code Execution

ReAct Variations

1. ReAct with Self-Reflection

2. ReAct with Critique

3. Parallel ReAct

ReAct vs Other Patterns

ReAct vs Chain-of-Thought

ReAct vs Plan-and-Execute

When to Use ReAct

Production Considerations

1. Structured Output for Parsing

2. Token Management

3. Error Recovery

4. Observation Limits

Complete Production Example

Conclusion

Further Reading

Related articles

Evaluator-Optimizer Loop: Continuous AI Agent Improvement

Human-in-the-Loop: Balancing AI Autonomy and Human Control

Memory for AI Agents: Short-term, Long-term, and RAG

1	# Silent agent (problematic)
2	Action: search("Python web frameworks")
3	Action: search("Django vs Flask")
4	Action: write_report(...) # Report may be wrong, hard to debug
5
6	# ReAct agent (transparent)
7	Thought: I need to find popular Python web frameworks
8	Action: search("Python web frameworks 2024")
9	Observation: Django, Flask, FastAPI are the top 3...
10
11	Thought: I should compare these frameworks. Let me focus on FastAPI
12	since it wasn't in my training data and is newest.
13	Action: search("FastAPI features performance")
14	Observation: FastAPI is async-first, automatic OpenAPI docs...
15
16	Thought: Now I have enough information to write a comparison.
17	Action: write_report(...) # Each step is traceable
18

1	# Chain-of-Thought (can hallucinate)
2	Thought: The capital of Australia is Sydney because it's the largest city.
3	Answer: Sydney # WRONG
4
5	# ReAct (grounded in facts)
6	Thought: I should verify the capital of Australia
7	Action: search("capital of Australia")
8	Observation: The capital of Australia is Canberra, not Sydney...
9	Thought: The observation confirms it's Canberra, not Sydney as I initially thought.
10	Answer: Canberra # CORRECT
11

1	Thought: I'll try to get the weather from the primary API
2	Action: api_call("weather.primary.com/london")
3	Observation: Error 503 - Service unavailable
4
5	Thought: Primary API is down. I'll try the backup API instead.
6	Action: api_call("weather.backup.com/london")
7	Observation: {"temp": 15, "condition": "cloudy"}
8
9	Thought: Got the weather from backup. I can now answer the question.
10

1	import openai
2	import json
3	import re
4
5	class ReActAgent:
6	def __init__(self, tools: dict):
7	self.client = openai.OpenAI()
8	self.tools = tools
9	self.max_steps = 10
10
11	def run(self, task: str) -> str:
12	"""Execute task using ReAct loop"""
13
14	prompt = self._build_initial_prompt(task)
15
16	for step in range(self.max_steps):
17	# Get next thought and action from LLM
18	response = self.client.chat.completions.create(
19	model="gpt-4o",
20	messages=[{"role": "user", "content": prompt}],
21	stop=["Observation:"] # Stop before observation
22	)
23
24	output = response.choices[0].message.content
25	prompt += output
26
27	# Check if agent wants to finish
28	if "Final Answer:" in output:
29	return self._extract_final_answer(output)
30
31	# Extract and execute action
32	action, action_input = self._parse_action(output)
33
34	if action is None:
35	prompt += "\nObservation: Could not parse action. Please use format 'Action: tool_name[input]'\n"
36	continue
37
38	# Execute the action
39	observation = self._execute_action(action, action_input)
40	prompt += f"\nObservation: {observation}\n"
41
42	return "Max steps reached without finding answer."
43
44	def _build_initial_prompt(self, task: str) -> str:
45	tool_descriptions = "\n".join([
46	f"- {name}: {func.__doc__ or 'No description'}"
47	for name, func in self.tools.items()
48	])
49
50	return f"""Answer the following question using the available tools.
51
52	Available tools:
53	{tool_descriptions}
54
55	Use this format:
56
57	Thought: [Your reasoning about what to do next]
58	Action: tool_name[input]
59	Observation: [Result of the action - will be provided]
60	... (repeat Thought/Action/Observation as needed)
61	Thought: I now have enough information to answer.
62	Final Answer: [Your final answer]
63
64	Question: {task}
65
66	"""
67
68	def _parse_action(self, text: str) -> tuple:
69	"""Extract action and input from LLM output"""
70	# Match pattern: Action: tool_name[input]
71	match = re.search(r'Action:\s(\w+)\[([^\]])\]', text)
72	if match:
73	return match.group(1), match.group(2)
74
75	# Alternative format: Action: tool_name("input")
76	match = re.search(r'Action:\s(\w+)\("([^"])"\)', text)
77	if match:
78	return match.group(1), match.group(2)
79
80	return None, None
81
82	def _execute_action(self, action: str, action_input: str) -> str:
83	"""Execute the specified action"""
84	if action not in self.tools:
85	return f"Error: Unknown tool '{action}'. Available: {list(self.tools.keys())}"
86
87	try:
88	result = self.tools[action](action_input)
89	return str(result)
90	except Exception as e:
91	return f"Error executing {action}: {str(e)}"
92
93	def _extract_final_answer(self, text: str) -> str:
94	"""Extract the final answer from output"""
95	match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
96	if match:
97	return match.group(1).strip()
98	return text
99
100
101	# Define tools
102	def search(query: str) -> str:
103	"""Search the web for information"""
104	# In production, use a real search API
105	return f"Search results for '{query}': ..."
106
107	def calculate(expression: str) -> str:
108	"""Evaluate a mathematical expression"""
109	try:
110	return str(eval(expression))
111	except:
112	return "Error: Could not evaluate expression"
113
114	def lookup(term: str) -> str:
115	"""Look up a term in the knowledge base"""
116	knowledge = {
117	"python": "A high-level programming language",
118	"react": "A JavaScript library for building UIs",
119	}
120	return knowledge.get(term.lower(), f"No entry found for '{term}'")
121
122
123	# Usage
124	agent = ReActAgent(tools={
125	"search": search,
126	"calculate": calculate,
127	"lookup": lookup
128	})
129
130	result = agent.run("What is 25% of the population of France?")
131	print(result)
132

1	Thought: I need to find the population of France first, then calculate 25% of it.
2	Action: search[population of France 2024]
3	Observation: The population of France is approximately 68 million people.
4
5	Thought: Now I can calculate 25% of 68 million.
6	Action: calculate[68000000 * 0.25]
7	Observation: 17000000.0
8
9	Thought: I now have the answer.
10	Final Answer: 25% of France's population is 17 million people.
11

1	from hopx import Sandbox
2	import openai
3	import re
4
5	class CodeReActAgent:
6	def __init__(self):
7	self.client = openai.OpenAI()
8	self.sandbox = None
9	self.max_steps = 15
10
11	def run(self, task: str) -> str:
12	# Create sandbox for the session
13	self.sandbox = Sandbox.create(template="code-interpreter")
14
15	try:
16	return self._react_loop(task)
17	finally:
18	self.sandbox.kill()
19
20	def _react_loop(self, task: str) -> str:
21	prompt = self._build_prompt(task)
22
23	for step in range(self.max_steps):
24	response = self.client.chat.completions.create(
25	model="gpt-4o",
26	messages=[{"role": "user", "content": prompt}],
27	stop=["Observation:"]
28	)
29
30	output = response.choices[0].message.content
31	prompt += output
32
33	print(f"\n--- Step {step + 1} ---")
34	print(output)
35
36	if "Final Answer:" in output:
37	return self._extract_answer(output)
38
39	# Parse action
40	action_type, action_content = self._parse_action(output)
41
42	if action_type == "python":
43	observation = self._run_code(action_content)
44	elif action_type == "bash":
45	observation = self._run_bash(action_content)
46	elif action_type == "read_file":
47	observation = self._read_file(action_content)
48	elif action_type == "write_file":
49	path, content = action_content.split("\|\|\|", 1)
50	observation = self._write_file(path.strip(), content.strip())
51	else:
52	observation = f"Unknown action type: {action_type}"
53
54	prompt += f"\nObservation: {observation}\n"
55	print(f"Observation: {observation[:500]}...")
56
57	return "Max steps reached"
58
59	def _build_prompt(self, task: str) -> str:
60	return f"""You are an AI assistant that solves tasks by writing and executing code.
61
62	Available actions:
63	- python[code]: Execute Python code
64	- bash[command]: Run a bash command
65	- read_file[path]: Read a file
66	- write_file[path\|\|\|content]: Write content to a file
67
68	Format:
69	Thought: [Your reasoning]
70	Action: action_type[content]
71	Observation: [Will be provided]
72
73	Rules:
74	- Always think before acting
75	- Check observations carefully for errors
76	- Use print() to see outputs in Python
77	- When done, provide Final Answer:
78
79	Task: {task}
80
81	"""
82
83	def _parse_action(self, text: str) -> tuple:
84	# Match: Action: python[...]
85	match = re.search(r'Action:\s(\w+)\[(.?)\](?:\s*$\|\n)', text, re.DOTALL)
86	if match:
87	return match.group(1), match.group(2)
88
89	# Match multi-line code blocks
90	match = re.search(r'Action:\s(\w+)\[(.)\]', text, re.DOTALL)
91	if match:
92	return match.group(1), match.group(2)
93
94	return None, None
95
96	def _run_code(self, code: str) -> str:
97	self.sandbox.files.write("/app/script.py", code)
98	result = self.sandbox.commands.run("python /app/script.py", timeout=30)
99
100	if result.exit_code == 0:
101	return result.stdout if result.stdout else "(No output)"
102	else:
103	return f"Error (exit code {result.exit_code}):\n{result.stderr}"
104
105	def _run_bash(self, command: str) -> str:
106	result = self.sandbox.commands.run(command, timeout=30)
107	output = result.stdout + result.stderr
108	return output if output else "(No output)"
109
110	def _read_file(self, path: str) -> str:
111	try:
112	return self.sandbox.files.read(path)
113	except Exception as e:
114	return f"Error reading file: {e}"
115
116	def _write_file(self, path: str, content: str) -> str:
117	try:
118	self.sandbox.files.write(path, content)
119	return f"Successfully wrote to {path}"
120	except Exception as e:
121	return f"Error writing file: {e}"
122
123	def _extract_answer(self, text: str) -> str:
124	match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
125	return match.group(1).strip() if match else text
126
127
128	# Usage
129	agent = CodeReActAgent()
130
131	result = agent.run("""
132	Download the iris dataset, train a simple classifier,
133	and report the accuracy.
134	""")
135
136	print(f"\nFinal Result:\n{result}")
137

1	--- Step 1 ---
2	Thought: I need to download the iris dataset and train a classifier.
3	I'll use scikit-learn which has the iris dataset built-in.
4	Action: python[
5	from sklearn.datasets import load_iris
6	from sklearn.model_selection import train_test_split
7	from sklearn.ensemble import RandomForestClassifier
8	from sklearn.metrics import accuracy_score
9
10	# Load data
11	iris = load_iris()
12	X, y = iris.data, iris.target
13
14	# Split
15	X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
16
17	# Train
18	clf = RandomForestClassifier(n_estimators=100)
19	clf.fit(X_train, y_train)
20
21	# Evaluate
22	predictions = clf.predict(X_test)
23	accuracy = accuracy_score(y_test, predictions)
24	print(f"Accuracy: {accuracy:.2%}")
25	]
26	Observation: Accuracy: 96.67%
27
28	--- Step 2 ---
29	Thought: The classifier trained successfully with 96.67% accuracy.
30	I have enough information to provide the final answer.
31	Final Answer: I trained a Random Forest classifier on the Iris dataset
32	and achieved 96.67% accuracy on the test set.
33

1	def _build_prompt_with_reflection(self, task: str) -> str:
2	return f"""...
3	Format:
4	Thought: [Your reasoning about what to do]
5	Action: action_type[content]
6	Observation: [Result - will be provided]
7	Reflection: [What did I learn? Did it work? What should I do next?]
8
9	..."""
10

1	class CriticalReActAgent(ReActAgent):
2	def _react_step(self, thought: str, action: str) -> str:
3	# First, critique the proposed action
4	critique = self._critique_action(thought, action)
5
6	if "PROBLEM:" in critique:
7	# Revise action based on critique
8	revised = self._revise_action(thought, action, critique)
9	return self._execute_action(*self._parse_action(revised))
10
11	return self._execute_action(*self._parse_action(action))
12
13	def _critique_action(self, thought: str, action: str) -> str:
14	response = self.client.chat.completions.create(
15	model="gpt-4o",
16	messages=[{
17	"role": "user",
18	"content": f"""Critique this action. Is it correct and safe?
19
20	Thought: {thought}
21	Action: {action}
22
23	If there's a problem, start with "PROBLEM:" and explain.
24	If it looks good, say "APPROVED"."""
25	}]
26	)
27	return response.choices[0].message.content
28

1	import asyncio
2
3	async def parallel_react(task: str, perspectives: list[str]) -> str:
4	"""Run multiple ReAct agents with different perspectives"""
5
6	async def run_with_perspective(perspective: str):
7	prompt = f"You are a {perspective}. {task}"
8	return await react_agent.run_async(prompt)
9
10	results = await asyncio.gather(*[
11	run_with_perspective(p) for p in perspectives
12	])
13
14	# Merge results
15	return merge_answers(results)
16
17	# Usage
18	answer = await parallel_react(
19	"What are the pros and cons of microservices?",
20	perspectives=["software architect", "DevOps engineer", "developer"]
21	)
22

1	def _build_prompt_structured(self, task: str) -> str:
2	return f"""...
3	Respond in JSON format:
4	{{
5	"thought": "your reasoning",
6	"action": {{
7	"tool": "tool_name",
8	"input": "tool input"
9	}}
10	}}
11
12	Or when finished:
13	{{
14	"thought": "final reasoning",
15	"final_answer": "your answer"
16	}}
17	..."""
18

1	class TokenAwareReActAgent:
2	def __init__(self, max_context_tokens: int = 8000):
3	self.max_tokens = max_context_tokens
4	self.history = []
5
6	def _manage_context(self, prompt: str) -> str:
7	"""Trim history if context too long"""
8	estimated_tokens = len(prompt) // 4
9
10	if estimated_tokens > self.max_tokens:
11	# Keep first (task) and last N steps
12	self.history = self.history[:1] + self.history[-5:]
13	prompt = self._rebuild_prompt()
14
15	return prompt
16

1	def _execute_with_recovery(self, action: str, input: str, max_retries: int = 3):
2	for attempt in range(max_retries):
3	try:
4	result = self.tools[action](input)
5	return result
6	except Exception as e:
7	if attempt == max_retries - 1:
8	return f"Failed after {max_retries} attempts: {e}"
9	# Let agent know about failure
10	return f"Attempt {attempt + 1} failed: {e}. You can retry."
11

1	def _truncate_observation(self, obs: str, max_length: int = 2000) -> str:
2	if len(obs) <= max_length:
3	return obs
4
5	return obs[:max_length] + f"\n... (truncated, {len(obs) - max_length} chars omitted)"
6