LlamaIndex + HopX: Building RAG Agents with Code Execution

LlamaIndex excels at Retrieval-Augmented Generation—connecting LLMs to your data. But what happens when the answer isn't in your documents? What if the LLM needs to compute something?

That's where code execution comes in. This tutorial shows how to build LlamaIndex agents that can both retrieve information AND execute Python code to analyze, calculate, and visualize.

The Power of RAG + Code

text

┌─────────────────────────────────────────────────────────────────┐
│ User: "What was our Q3 revenue and how does it compare to       │
│        the industry average growth rate?"                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ LlamaIndex Agent                                                │
│                                                                 │
│ 1. Query Vector Index → "Q3 revenue was $2.4M"                  │
│ 2. Query Vector Index → "Industry avg growth is 12%"            │
│ 3. Execute Python     → Calculate comparison, growth rate       │
│ 4. Generate Response  → Synthesize with computed values         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Answer: "Q3 revenue was $2.4M, representing 18% YoY growth.     │
│ This outperforms the industry average of 12% by 6 percentage    │
│ points, ranking us in the top quartile of our sector."          │
└─────────────────────────────────────────────────────────────────┘

Prerequisites

bash

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai hopx-ai

Set environment variables:

bash

export OPENAI_API_KEY="sk-..."
export HOPX_API_KEY="..."

Step 1: Create the Code Execution Tool

Build a LlamaIndex-compatible tool for sandboxed execution:

python

from llama_index.core.tools import FunctionTool
from hopx import Sandbox
from typing import Optional
 
def execute_python(code: str) -> str:
    """
    Execute Python code in an isolated sandbox.
    
    Use this tool when you need to:
    - Perform calculations or mathematical operations
    - Analyze data with pandas
    - Create visualizations
    - Process or transform data
    
    Args:
        code: Python code to execute. Must be complete and runnable.
              Always use print() to output results.
    
    Returns:
        The output from code execution or error message.
    """
    sandbox = None
    try:
        sandbox = Sandbox.create(template="code-interpreter")
        result = sandbox.runCode(code, language="python", timeout=60)
        
        if result.exitCode == 0:
            return result.stdout or "Code executed successfully (no output)"
        else:
            return f"Error: {result.stderr}"
    except Exception as e:
        return f"Execution failed: {str(e)}"
    finally:
        if sandbox:
            sandbox.kill()
 
 
# Create LlamaIndex tool
python_tool = FunctionTool.from_defaults(
    fn=execute_python,
    name="python_executor",
    description="""Execute Python code in a secure sandbox.
Use for calculations, data analysis, and any computational task.
The sandbox has pandas, numpy, matplotlib, scipy installed.
Always print() results you want to see."""
)
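
The branching in execute_python can be exercised without an API key or network access. The sketch below is one way to do that. FakeSandbox and FakeResult are hypothetical stand-ins (not part of the HopX SDK) that mimic only the runCode, exitCode, stdout, stderr, and kill names used in this tutorial, and run_in repeats the tool's logic with the sandbox passed in as an argument.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in mirroring the result fields used in this tutorial."""
    stdout: str
    stderr: str
    exitCode: int


class FakeSandbox:
    """Hypothetical stub: it runs nothing and only returns a canned result."""

    def __init__(self, result: FakeResult):
        self._result = result
        self.killed = False

    def runCode(self, code: str, language: str = "python", timeout: int = 60) -> FakeResult:
        return self._result

    def kill(self) -> None:
        self.killed = True


def run_in(sandbox, code: str) -> str:
    """Same branching as execute_python, with the sandbox passed in."""
    try:
        result = sandbox.runCode(code, language="python", timeout=60)
        if result.exitCode == 0:
            return result.stdout or "Code executed successfully (no output)"
        return f"Error: {result.stderr}"
    except Exception as e:
        return f"Execution failed: {e}"
    finally:
        sandbox.kill()


ok = FakeSandbox(FakeResult(stdout="42\n", stderr="", exitCode=0))
bad = FakeSandbox(FakeResult(stdout="", stderr="NameError: name 'x' is not defined", exitCode=1))
print(run_in(ok, "print(6 * 7)"))
print(run_in(bad, "print(x)"))
```

Both stubs end up with killed set to True, which is the property the finally block is there to guarantee.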
 

Step 2: Build a RAG Index

Create a simple vector index from documents:

python

from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
 
# Configure LlamaIndex
Settings.llm = OpenAI(model="gpt-4o", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
 
# Sample documents (replace with your data)
documents = [
    Document(text="""
    Q3 2024 Financial Report
    
    Revenue: $2.4 million
    Operating Expenses: $1.8 million
    Net Profit: $600,000
    
    Year-over-year revenue growth: 18%
    Customer acquisition: 450 new customers
    Churn rate: 3.2%
    
    Key metrics:
    - Average revenue per user (ARPU): $89
    - Customer lifetime value (LTV): $2,340
    - Customer acquisition cost (CAC): $156
    """),
    
    Document(text="""
    Industry Benchmarks 2024
    
    SaaS Industry Average Metrics:
    - Revenue growth: 12% YoY
    - Churn rate: 5.2%
    - ARPU: $75
    - LTV/CAC ratio: 3:1
    
    Top quartile performance:
    - Revenue growth: >15%
    - Churn rate: <3%
    - LTV/CAC ratio: >4:1
    """),
    
    Document(text="""
    Customer Segments Analysis
    
    Enterprise (>1000 employees):
    - 45 customers
    - $450 ARPU
    - 1.5% churn
    
    Mid-Market (100-1000 employees):
    - 180 customers  
    - $120 ARPU
    - 2.8% churn
    
    SMB (<100 employees):
    - 675 customers
    - $45 ARPU
    - 4.1% churn
    """)
]
 
# Create index
index = VectorStoreIndex.from_documents(documents)
 

Step 3: Create a Query Engine Tool

Wrap the index as a tool the agent can use:

python

from llama_index.core.tools import QueryEngineTool
 
# Create query engine
query_engine = index.as_query_engine(similarity_top_k=3)
 
# Wrap as tool
rag_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="company_knowledge",
    description="""Search the company knowledge base for information about:
- Financial metrics and reports
- Industry benchmarks
- Customer segments
- Performance data
 
Use this to find specific facts before doing calculations."""
)
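
The similarity_top_k=3 setting asks the query engine to pass the three highest-scoring chunks to the LLM. As a rough illustration of what ranking by similarity means, the toy sketch below scores texts with cosine similarity over word counts. This is a stand-in under loud assumptions: LlamaIndex uses learned embeddings and its own retrievers, and bag_of_words, cosine, and top_k are names invented for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts: a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, docs: list[str], k: int) -> list[tuple[float, str]]:
    """Score every document against the query and keep the k best."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


docs = [
    "Q3 2024 Financial Report: revenue, operating expenses, net profit",
    "Industry Benchmarks 2024: SaaS average revenue growth and churn rate",
    "Customer Segments Analysis: enterprise, mid-market, SMB churn",
]
for score, doc in top_k("industry average churn rate", docs, k=2):
    print(f"{score:.2f}  {doc}")
```

A larger k gives the LLM more context to work with, at the cost of more tokens and a higher chance of pulling in loosely related text.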
 

Step 4: Build the Agent

Combine RAG and code execution in an agent:

python

from llama_index.core.agent import ReActAgent
 
# Create agent with both tools
agent = ReActAgent.from_tools(
    tools=[rag_tool, python_tool],
    llm=Settings.llm,
    verbose=True,
    max_iterations=10
)
 
# Test it
response = agent.chat(
    "What was our Q3 revenue and how does it compare to industry average? "
    "Calculate the exact percentage difference."
)
 
print(response)
 

Example output:

text

Thought: I need to find our Q3 revenue and the industry average, then calculate the comparison.
 
Action: company_knowledge
Action Input: {"input": "Q3 2024 revenue"}
Observation: Q3 revenue was $2.4 million with 18% YoY growth...
 
Action: company_knowledge  
Action Input: {"input": "industry average revenue growth"}
Observation: SaaS industry average revenue growth is 12% YoY...
 
Action: python_executor
Action Input: {"code": "our_growth = 18\nindustry_avg = 12\ndiff = our_growth - industry_avg\npercentage_better = (diff / industry_avg) * 100\nprint(f'Difference: {diff} percentage points')\nprint(f'We outperform by: {percentage_better:.1f}%')"}
Observation: Difference: 6 percentage points
We outperform by: 50.0%
 
Answer: Our Q3 revenue was $2.4 million with 18% year-over-year growth. 
Compared to the industry average of 12%, we outperform by 6 percentage points, 
which represents a 50% better growth rate than the industry benchmark.
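
The example answer reports two different numbers that are easy to mix up: the gap in percentage points (18 minus 12) and the relative difference (that gap divided by the benchmark). A small sketch of the arithmetic, with compare_growth as a hypothetical helper name:

```python
def compare_growth(ours_pct: float, benchmark_pct: float) -> tuple[float, float]:
    """Return (difference in percentage points, relative difference in percent)."""
    point_diff = ours_pct - benchmark_pct
    relative_diff = point_diff / benchmark_pct * 100
    return point_diff, relative_diff


points, relative = compare_growth(18, 12)
print(f"{points:g} percentage points, {relative:.1f}% relative")
```

Stating which of the two is meant avoids reading "50% better" as a 50-point gap.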
 

Step 5: Persistent Sandbox for Complex Analysis

For multi-step analyses, use a persistent sandbox:

python

from llama_index.core.tools import FunctionTool
from hopx import Sandbox
from typing import Optional
 
class PersistentSandbox:
    """Manage a persistent sandbox for multi-step analysis."""
    
    _instance: Optional['PersistentSandbox'] = None
    
    def __init__(self):
        self.sandbox: Optional[Sandbox] = None
    
    @classmethod
    def get(cls) -> 'PersistentSandbox':
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance
    
    def execute(self, code: str) -> str:
        if self.sandbox is None:
            self.sandbox = Sandbox.create(template="code-interpreter", ttl=600)
        
        result = self.sandbox.runCode(code, language="python", timeout=60)
        
        if result.exitCode == 0:
            return result.stdout or "Executed (no output)"
        return f"Error: {result.stderr}"
    
    def cleanup(self):
        if self.sandbox:
            self.sandbox.kill()
            self.sandbox = None
 
 
def execute_python_persistent(code: str) -> str:
    """
    Execute Python with persistent state.
    Variables and imports persist between calls.
    """
    return PersistentSandbox.get().execute(code)
 
 
persistent_python = FunctionTool.from_defaults(
    fn=execute_python_persistent,
    name="python_persistent",
    description="""Execute Python code with PERSISTENT STATE.
Variables, DataFrames, and imports persist between calls.
Use this for multi-step analysis where you need to build on previous results.
"""
)
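
To make "state persists between calls" concrete, here is a local analogy that keeps one namespace alive across calls. It is only a sketch: LocalSession is a hypothetical class, it uses exec() in the current process, and it offers none of the isolation a sandbox provides, so it should not be used on untrusted code.

```python
import contextlib
import io


class LocalSession:
    """Keeps one namespace across calls, like a long-lived interpreter.

    NOT a sandbox: exec() runs in this process with no isolation.
    """

    def __init__(self) -> None:
        self.namespace: dict = {}

    def execute(self, code: str) -> str:
        buffer = io.StringIO()
        try:
            # Capture anything the code prints, as the sandbox tool does
            with contextlib.redirect_stdout(buffer):
                exec(code, self.namespace)
        except Exception as e:
            return f"Error: {type(e).__name__}: {e}"
        return buffer.getvalue() or "Executed (no output)"


session = LocalSession()
session.execute("revenue = [180000, 195000, 210000]")
print(session.execute("print(sum(revenue))"))  # second call still sees `revenue`

fresh = LocalSession()
print(fresh.execute("print(sum(revenue))"))    # a new session has no such name
```

The second session fails with a NameError, which mirrors what happens when each tool call creates a new sandbox, as in Step 1.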
 

Step 6: Data Analysis Agent

Build a specialized agent for data analysis:

python

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from hopx import Sandbox
import json
 
# Data upload tool
def upload_data(filename: str, data: str) -> str:
    """
    Upload CSV data to the sandbox for analysis.
    
    Args:
        filename: Name for the file (e.g., 'sales.csv')
        data: CSV content as a string
    """
    sandbox = PersistentSandbox.get()
    if sandbox.sandbox is None:
        sandbox.sandbox = Sandbox.create(template="code-interpreter", ttl=600)
    
    sandbox.sandbox.files.write(f"/app/{filename}", data)
    return f"Uploaded {filename} to /app/{filename}"
 
 
upload_tool = FunctionTool.from_defaults(
    fn=upload_data,
    name="upload_data",
    description="Upload CSV data to sandbox. Provide filename and CSV content."
)
 
 
# Create data analysis agent
data_agent = ReActAgent.from_tools(
    tools=[rag_tool, persistent_python, upload_tool],
    llm=Settings.llm,
    verbose=True,
    system_prompt="""You are a data analyst assistant.
 
When analyzing data:
1. First check if relevant context exists in the knowledge base
2. Upload data files as needed using upload_data
3. Use python_persistent for multi-step analysis (state persists!)
4. Always show your calculations and explain your methodology
5. Create visualizations when helpful (save to /app/chart.png)
 
For calculations, always use Python to ensure accuracy."""
)
 
 
# Example usage
response = data_agent.chat("""
Here's our monthly revenue data:
 
month,revenue,customers
Jan,180000,520
Feb,195000,545
Mar,210000,580
Apr,225000,610
May,240000,650
Jun,260000,695
 
Upload this data and analyze:
1. Calculate month-over-month growth rates
2. What's the average growth rate?
3. Project July revenue based on the trend
4. Compare to industry benchmark from our knowledge base
""")
 
print(response)
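
For reference, the first three requests in that prompt come down to a few lines of arithmetic. The sketch below is the kind of code the agent might run in the sandbox, not its actual output. The projection simply applies the average monthly growth to June, which is an assumption rather than a forecasting method from this tutorial.

```python
revenue = {
    "Jan": 180000, "Feb": 195000, "Mar": 210000,
    "Apr": 225000, "May": 240000, "Jun": 260000,
}

months = list(revenue)
values = list(revenue.values())

# Month-over-month growth in percent: (current - previous) / previous * 100
growth = [
    (curr - prev) / prev * 100
    for prev, curr in zip(values, values[1:])
]
average_growth = sum(growth) / len(growth)

# Naive projection: apply the average monthly growth to the last month
july_projection = values[-1] * (1 + average_growth / 100)

for month, rate in zip(months[1:], growth):
    print(f"{month}: {rate:.2f}%")
print(f"Average MoM growth: {average_growth:.2f}%")
print(f"Projected July revenue: {july_projection:,.0f}")
```

Note that these are monthly rates, while the benchmark in the knowledge base is year-over-year, so the two are not directly comparable without annualizing.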
 

Advanced: Sub-Question Query Engine

For complex queries, break them into sub-questions:

python

from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata
 
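# Multiple specialized indices
# (financial_docs, customer_docs, and market_docs are lists of Document
# objects that you load yourself; they are not defined in this tutorial)
financial_index = VectorStoreIndex.from_documents(financial_docs)
customer_index = VectorStoreIndex.from_documents(customer_docs)
market_index = VectorStoreIndex.from_documents(market_docs)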
 
# Create query engine tools
query_engine_tools = [
    QueryEngineTool(
        query_engine=financial_index.as_query_engine(),
        metadata=ToolMetadata(
            name="financial_data",
            description="Financial reports, revenue, expenses, profits"
        )
    ),
    QueryEngineTool(
        query_engine=customer_index.as_query_engine(),
        metadata=ToolMetadata(
            name="customer_data", 
            description="Customer segments, churn, acquisition metrics"
        )
    ),
    QueryEngineTool(
        query_engine=market_index.as_query_engine(),
        metadata=ToolMetadata(
            name="market_data",
            description="Industry benchmarks, competitor analysis, market trends"
        )
    )
]
 
# Create sub-question query engine
sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)
 
# Wrap as tool for agent
sub_question_tool = QueryEngineTool.from_defaults(
    query_engine=sub_question_engine,
    name="comprehensive_search",
    description="""Search across all company data sources.
Use for complex questions that span multiple topics.
Automatically breaks down into sub-questions."""
)
 
# Create powerful agent
comprehensive_agent = ReActAgent.from_tools(
    tools=[sub_question_tool, persistent_python],
    llm=Settings.llm,
    verbose=True
)
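
Conceptually, the sub-question approach splits one query into (tool, question) pairs, sends each to the matching engine, and collects the answers for synthesis. The toy sketch below shows only that routing step. Everything in it is hypothetical: the engines are lambdas returning canned text, and the sub-questions are written by hand, whereas the real engine has an LLM generate them.

```python
from typing import Callable

# Hypothetical stand-ins for the three query engines: each returns a canned answer
engines: dict[str, Callable[[str], str]] = {
    "financial_data": lambda q: "Revenue growth was 18% YoY.",
    "customer_data": lambda q: "Churn rate was 3.2%.",
    "market_data": lambda q: "Industry average growth is 12% and churn is 5.2%.",
}

# In the real engine an LLM writes these pairs; here they are fixed by hand
sub_questions = [
    ("financial_data", "What was our revenue growth?"),
    ("customer_data", "What was our churn rate?"),
    ("market_data", "What are the industry averages for growth and churn?"),
]


def answer_sub_questions(pairs, engines):
    """Route each sub-question to its named engine and collect the answers."""
    results = []
    for tool_name, question in pairs:
        results.append((tool_name, question, engines[tool_name](question)))
    return results


for tool_name, question, answer in answer_sub_questions(sub_questions, engines):
    print(f"[{tool_name}] {question} -> {answer}")
```

The tool names double as routing keys, which is why each tool's name and description should be distinct and specific.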
 

Multi-Document Analysis with Code

Analyze documents and compute insights:

python

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
 
# Load documents
documents = SimpleDirectoryReader("./data/reports/").load_data()
 
# Parse into nodes
parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
nodes = parser.get_nodes_from_documents(documents)
 
# Create index
index = VectorStoreIndex(nodes)
 
# Agent for document analysis
doc_analysis_agent = ReActAgent.from_tools(
    tools=[
        QueryEngineTool.from_defaults(
            query_engine=index.as_query_engine(),
            name="document_search",
            description="Search uploaded documents for information"
        ),
        persistent_python
    ],
    llm=Settings.llm,
    verbose=True,
    system_prompt="""You are a document analysis agent.
 
Your workflow:
1. Search documents to extract relevant data points
2. Use Python to compute statistics, comparisons, trends
3. Always verify calculations by showing the code
4. Provide data-driven conclusions
 
When extracting numbers from documents, use Python to validate and compute."""
)
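
The chunk_size and chunk_overlap arguments control how documents are cut into nodes: consecutive chunks share some content so that a fact straddling a boundary still appears whole in at least one chunk. The sketch below shows the sliding-window idea on a plain word list. It is a simplification under stated assumptions: SentenceSplitter counts tokens and respects sentence boundaries, and chunk_words is a hypothetical helper.

```python
def chunk_words(words: list[str], chunk_size: int, chunk_overlap: int) -> list[list[str]]:
    """Split a word list into windows of chunk_size that share chunk_overlap words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


words = [f"w{i}" for i in range(10)]
for chunk in chunk_words(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each window starts chunk_size minus chunk_overlap words after the previous one, so a larger overlap produces more chunks and a larger index.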
 

Structured Output with Code Validation

Ensure accuracy by validating with code:

python

from llama_index.core.tools import FunctionTool
from pydantic import BaseModel
from typing import List
 
class FinancialAnalysis(BaseModel):
    revenue: float
    growth_rate: float
    profit_margin: float
    industry_comparison: str
    recommendations: List[str]
 
 
def validated_analysis(query: str) -> str:
    """
    Perform financial analysis with code validation.
    
    Retrieves data, computes metrics in sandbox, returns validated results.
    """
    sandbox = PersistentSandbox.get()
    
    # Step 1: Query for raw data
    raw_data = query_engine.query(query)
    
    # Step 2: Validate and compute in sandbox
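    validation_code = rf'''
import json
import re
 
# Parse extracted values (from RAG); repr() keeps quotes and backslashes
# in the retrieved text from breaking the generated code
raw_text = {str(raw_data)!r}
 
# Extract and validate numbers (each match must start with a digit,
# so a stray comma in the text is not captured as a "number")
numbers = re.findall(r'\$?(\d[\d,]*(?:\.\d+)?)\s*(?:million|M)?', raw_text)
numbers = [float(n.replace(',', '')) for n in numbers]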
 
# Compute derived metrics
if len(numbers) >= 2:
    revenue = numbers[0]
    if 'million' in raw_text.lower():
        revenue *= 1_000_000
    
    # Calculate metrics
    analysis = {{
        "revenue": revenue,
        "extracted_values": numbers,
        "validation": "passed" if revenue > 0 else "failed"
    }}
    print(json.dumps(analysis, indent=2))
else:
    print(json.dumps({{"error": "Could not extract values"}}))
'''
    
    result = sandbox.execute(validation_code)
    return result
 
 
validation_tool = FunctionTool.from_defaults(
    fn=validated_analysis,
    name="validated_financial_analysis",
    description="Perform validated financial analysis with code verification"
)
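
The validation snippet above applies the million multiplier whenever the word appears anywhere in the text, even if the first number is not the one expressed in millions. One possible refinement, sketched here under assumptions, is to attach the unit to each match. AMOUNT and parse_dollar_amounts are hypothetical names, and the pattern only picks up dollar-prefixed amounts with a case-sensitive "million" or standalone "M" suffix.

```python
import re

# Dollar-prefixed number with an optional unit directly after it
AMOUNT = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)\s*(million|M\b)?")


def parse_dollar_amounts(text: str) -> list[float]:
    """Return dollar amounts in plain dollars, scaling only the tagged ones."""
    amounts = []
    for number, unit in AMOUNT.findall(text):
        value = float(number.replace(",", ""))
        if unit:  # "million" or a standalone "M"
            value *= 1_000_000
        amounts.append(value)
    return amounts


sample = "Q3 2024: Revenue: $2.4 million. Net Profit: $600,000. ARPU: $89"
print(parse_dollar_amounts(sample))
```

Requiring the dollar sign means the year and the quarter number in the sample are ignored, which keeps them out of the revenue figures.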
 

Complete Working Example

Here's a complete, self-contained implementation:

python

"""
LlamaIndex RAG Agent with HopX Code Execution
"""
 
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from hopx import Sandbox
from typing import Optional
import os
 
# Verify environment
assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY"
assert os.environ.get("HOPX_API_KEY"), "Set HOPX_API_KEY"
 
# Configure LlamaIndex
Settings.llm = OpenAI(model="gpt-4o", temperature=0)
Settings.embed_model = OpenAIEmbedding()
 
 
class SandboxManager:
    """Singleton sandbox manager."""
    _sandbox: Optional[Sandbox] = None
    
    @classmethod
    def execute(cls, code: str) -> str:
        if cls._sandbox is None:
            cls._sandbox = Sandbox.create(template="code-interpreter", ttl=600)
        result = cls._sandbox.runCode(code, language="python", timeout=60)
        return result.stdout if result.exitCode == 0 else f"Error: {result.stderr}"
    
    @classmethod
    def cleanup(cls):
        if cls._sandbox:
            cls._sandbox.kill()
            cls._sandbox = None
 
 
def python_executor(code: str) -> str:
    """Execute Python code with persistent state."""
    return SandboxManager.execute(code)
 
 
def create_rag_agent(documents: list) -> ReActAgent:
    """Create a RAG agent with code execution."""
    
    # Build index
    index = VectorStoreIndex.from_documents(
        [Document(text=d) for d in documents]
    )
    
    # Tools
    tools = [
        QueryEngineTool.from_defaults(
            query_engine=index.as_query_engine(),
            name="knowledge_base",
            description="Search the knowledge base for information"
        ),
        FunctionTool.from_defaults(
            fn=python_executor,
            name="python",
            description="Execute Python for calculations. State persists."
        )
    ]
    
    return ReActAgent.from_tools(
        tools=tools,
        llm=Settings.llm,
        verbose=True,
        system_prompt="""You are an analytical assistant.
1. Search knowledge base for facts
2. Use Python for all calculations
3. Always verify numbers with code
4. Explain your methodology"""
    )
 
 
# Example usage
if __name__ == "__main__":
    docs = [
        "Q3 2024: Revenue $2.4M, Growth 18%, Profit margin 25%",
        "Industry benchmark: Average growth 12%, Top quartile >15%",
        "Customers: 900 total, 45 enterprise ($450 ARPU), 675 SMB ($45 ARPU)"
    ]
    
    agent = create_rag_agent(docs)
    
    try:
        response = agent.chat(
            "What's our revenue per customer segment? "
            "Calculate the contribution of each segment."
        )
        print("\n" + "="*50)
        print(response)
    finally:
        SandboxManager.cleanup()
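
For a sense of what the agent's Python step would compute for a question like this, here is the segment arithmetic done by hand. It is a sketch that uses the three segments from the Step 2 sample documents (the shorter strings in the example above omit Mid-Market), and it treats customers times ARPU as each segment's revenue for one ARPU period.

```python
# Segment figures from the Step 2 sample documents: (customers, ARPU in dollars)
segments = {
    "Enterprise": (45, 450),
    "Mid-Market": (180, 120),
    "SMB": (675, 45),
}

# Revenue per segment, then each segment's share of the total in percent
revenue = {name: customers * arpu for name, (customers, arpu) in segments.items()}
total = sum(revenue.values())
share = {name: value / total * 100 for name, value in revenue.items()}

for name in segments:
    print(f"{name}: ${revenue[name]:,} ({share[name]:.1f}% of total)")
print(f"Total: ${total:,}")
```

SMB has the lowest ARPU but the largest share here because of its customer count, which is the kind of result worth computing rather than estimating.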
 

Best Practices

1. Query First, Compute Second

python

# Good pattern:
# 1. Retrieve facts from RAG
# 2. Compute with Python
# 3. Synthesize response
 
# Don't hallucinate numbers - always verify with code
 

2. Use Persistent Sandbox for Multi-Step

python

# For complex analysis:
step1 = agent.chat("Load the sales data and show structure")
step2 = agent.chat("Calculate monthly averages")  # Uses same sandbox
step3 = agent.chat("Create visualization")  # State persists
 

3. Validate RAG Extractions

python

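# After RAG retrieval, validate numbers:
value = 2_400_000  # example: a number extracted from the RAG response
validation_code = f"""
extracted_value = {value}
# Sanity checks
assert extracted_value > 0, "Value should be positive"
assert extracted_value < 1e12, "Value seems too large"
# Doubled braces keep this placeholder for the sandbox's own f-string
print(f"Validated: {{extracted_value}}")
"""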
 

4. Clean Up Resources

python

try:
    result = agent.chat(query)
finally:
    SandboxManager.cleanup()
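
The same try/finally guarantee can be packaged as a context manager so that callers cannot forget the cleanup. The sketch below is one way to do it. StubManager is a hypothetical stand-in that only records whether cleanup ran, and sandbox_session is an illustrative name; in practice you would pass a manager with a real cleanup() method.

```python
from contextlib import contextmanager


class StubManager:
    """Hypothetical stand-in for SandboxManager that only records cleanup."""
    cleaned_up = False

    @classmethod
    def cleanup(cls) -> None:
        cls.cleaned_up = True


@contextmanager
def sandbox_session(manager):
    """Yield control to the caller, then always call manager.cleanup()."""
    try:
        yield manager
    finally:
        manager.cleanup()


try:
    with sandbox_session(StubManager):
        raise RuntimeError("agent call failed")
except RuntimeError:
    pass

print(StubManager.cleaned_up)
```

Cleanup runs even though the body raised, which is the behavior that matters when an agent call fails partway through.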
 

Conclusion

LlamaIndex + HopX enables agents that:

Retrieve facts from your documents
Compute accurate answers with Python
Validate numbers through code execution
Persist state for complex analyses

No more hallucinated calculations. Your agent can reason about data with the precision of code.

Ready to add code execution to your RAG app? Get started with HopX — sandboxes that spin up in 100ms.

