LlamaIndex + HopX: Building RAG Agents with Code Execution
LlamaIndex excels at Retrieval-Augmented Generation (RAG): connecting LLMs to your data. But what happens when the answer isn't stated in your documents and the LLM has to compute it?
That's where code execution comes in. This tutorial shows how to build LlamaIndex agents that can both retrieve information and execute Python code to analyze, calculate, and visualize.
The Power of RAG + Code
```
┌─────────────────────────────────────────────────────────────────┐
│ User: "What was our Q3 revenue and how does it compare to       │
│        the industry average growth rate?"                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ LlamaIndex Agent                                                │
│                                                                 │
│ 1. Query Vector Index → "Q3 revenue was $2.4M"                  │
│ 2. Query Vector Index → "Industry avg growth is 12%"            │
│ 3. Execute Python     → Calculate comparison, growth rate       │
│ 4. Generate Response  → Synthesize with computed values         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Answer: "Q3 revenue was $2.4M, representing 18% YoY growth.     │
│ This outperforms the industry average of 12% by 6 percentage    │
│ points, ranking us in the top quartile of our sector."          │
└─────────────────────────────────────────────────────────────────┘
```
Prerequisites
```bash
pip install llama-index llama-index-llms-openai llama-index-embeddings-openai hopx-ai
```
Set environment variables:
```bash
export OPENAI_API_KEY="sk-..."
export HOPX_API_KEY="..."
```
Step 1: Create the Code Execution Tool
Build a LlamaIndex-compatible tool for sandboxed execution:
```python
from llama_index.core.tools import FunctionTool
from hopx import Sandbox
from typing import Optional


def execute_python(code: str) -> str:
    """
    Execute Python code in an isolated sandbox.

    Use this tool when you need to:
    - Perform calculations or mathematical operations
    - Analyze data with pandas
    - Create visualizations
    - Process or transform data

    Args:
        code: Python code to execute. Must be complete and runnable.
              Always use print() to output results.

    Returns:
        The output from code execution or error message.
    """
    sandbox = None
    try:
        sandbox = Sandbox.create(template="code-interpreter")
        result = sandbox.runCode(code, language="python", timeout=60)

        if result.exitCode == 0:
            return result.stdout or "Code executed successfully (no output)"
        else:
            return f"Error: {result.stderr}"
    except Exception as e:
        return f"Execution failed: {str(e)}"
    finally:
        if sandbox:
            sandbox.kill()


# Create LlamaIndex tool
python_tool = FunctionTool.from_defaults(
    fn=execute_python,
    name="python_executor",
    description="""Execute Python code in a secure sandbox.
    Use for calculations, data analysis, and any computational task.
    The sandbox has pandas, numpy, matplotlib, scipy installed.
    Always print() results you want to see."""
)
```
Step 2: Build a RAG Index
Create a simple vector index from documents:
```python
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure LlamaIndex
Settings.llm = OpenAI(model="gpt-4o", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Sample documents (replace with your data)
documents = [
    Document(text="""
    Q3 2024 Financial Report

    Revenue: $2.4 million
    Operating Expenses: $1.8 million
    Net Profit: $600,000

    Year-over-year revenue growth: 18%
    Customer acquisition: 450 new customers
    Churn rate: 3.2%

    Key metrics:
    - Average revenue per user (ARPU): $89
    - Customer lifetime value (LTV): $2,340
    - Customer acquisition cost (CAC): $156
    """),

    Document(text="""
    Industry Benchmarks 2024

    SaaS Industry Average Metrics:
    - Revenue growth: 12% YoY
    - Churn rate: 5.2%
    - ARPU: $75
    - LTV/CAC ratio: 3:1

    Top quartile performance:
    - Revenue growth: >15%
    - Churn rate: <3%
    - LTV/CAC ratio: >4:1
    """),

    Document(text="""
    Customer Segments Analysis

    Enterprise (>1000 employees):
    - 45 customers
    - $450 ARPU
    - 1.5% churn

    Mid-Market (100-1000 employees):
    - 180 customers
    - $120 ARPU
    - 2.8% churn

    SMB (<100 employees):
    - 675 customers
    - $45 ARPU
    - 4.1% churn
    """)
]

# Create index
index = VectorStoreIndex.from_documents(documents)
```
Step 3: Create a Query Engine Tool
Wrap the index as a tool the agent can use:
```python
from llama_index.core.tools import QueryEngineTool

# Create query engine
query_engine = index.as_query_engine(similarity_top_k=3)

# Wrap as tool
rag_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="company_knowledge",
    description="""Search the company knowledge base for information about:
    - Financial metrics and reports
    - Industry benchmarks
    - Customer segments
    - Performance data

    Use this to find specific facts before doing calculations."""
)
```
Step 4: Build the Agent
Combine RAG and code execution in an agent:
```python
from llama_index.core.agent import ReActAgent

# Create agent with both tools
agent = ReActAgent.from_tools(
    tools=[rag_tool, python_tool],
    llm=Settings.llm,
    verbose=True,
    max_iterations=10
)

# Test it
response = agent.chat(
    "What was our Q3 revenue and how does it compare to industry average? "
    "Calculate the exact percentage difference."
)

print(response)
```
Example output (abridged; the exact trace varies between runs):
```
Thought: I need to find our Q3 revenue and the industry average, then calculate the comparison.

Action: company_knowledge
Action Input: {"input": "Q3 2024 revenue"}
Observation: Q3 revenue was $2.4 million with 18% YoY growth...

Action: company_knowledge
Action Input: {"input": "industry average revenue growth"}
Observation: SaaS industry average revenue growth is 12% YoY...

Action: python_executor
Action Input: {"code": "our_growth = 18\nindustry_avg = 12\ndiff = our_growth - industry_avg\npercentage_better = (diff / industry_avg) * 100\nprint(f'Difference: {diff} percentage points')\nprint(f'We outperform by: {percentage_better:.1f}%')"}
Observation: Difference: 6 percentage points
We outperform by: 50.0%

Answer: Our Q3 revenue was $2.4 million with 18% year-over-year growth.
Compared to the industry average of 12%, we outperform by 6 percentage points,
which represents a 50% better growth rate than the industry benchmark.
```
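The trace reports two different numbers that are easy to confuse: the 6-point gap is an absolute difference in percentage points, while the 50% figure is that gap measured relative to the industry rate. A minimal plain-Python sketch of the distinction follows; the function name `compare_growth` is illustrative and not part of LlamaIndex or HopX.

```python
def compare_growth(ours_pct: float, benchmark_pct: float) -> tuple[float, float]:
    """Return (gap in percentage points, gap relative to the benchmark in %)."""
    if benchmark_pct == 0:
        raise ValueError("benchmark growth must be non-zero for a relative comparison")
    point_gap = ours_pct - benchmark_pct            # absolute difference
    relative_gap = point_gap / benchmark_pct * 100  # difference as a share of the benchmark
    return point_gap, relative_gap


points, relative = compare_growth(18, 12)
print(f"Difference: {points} percentage points")
print(f"Relative outperformance: {relative:.1f}%")
```

Asking the agent for one figure or the other by name in the prompt avoids an answer that mixes the two.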
Step 5: Persistent Sandbox for Complex Analysis
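The tool from Step 1 creates a fresh sandbox on every call and kills it afterwards, so variables never survive between calls. For multi-step analyses, keep a single sandbox alive and reuse it: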
```python
from llama_index.core.tools import FunctionTool
from hopx import Sandbox
from typing import Optional


class PersistentSandbox:
    """Manage a persistent sandbox for multi-step analysis."""

    _instance: Optional['PersistentSandbox'] = None

    def __init__(self):
        self.sandbox: Optional[Sandbox] = None

    @classmethod
    def get(cls) -> 'PersistentSandbox':
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def execute(self, code: str) -> str:
        if self.sandbox is None:
            self.sandbox = Sandbox.create(template="code-interpreter", ttl=600)

        result = self.sandbox.runCode(code, language="python", timeout=60)

        if result.exitCode == 0:
            return result.stdout or "Executed (no output)"
        return f"Error: {result.stderr}"

    def cleanup(self):
        if self.sandbox:
            self.sandbox.kill()
            self.sandbox = None


def execute_python_persistent(code: str) -> str:
    """
    Execute Python with persistent state.
    Variables and imports persist between calls.
    """
    return PersistentSandbox.get().execute(code)


persistent_python = FunctionTool.from_defaults(
    fn=execute_python_persistent,
    name="python_persistent",
    description="""Execute Python code with PERSISTENT STATE.
    Variables, DataFrames, and imports persist between calls.
    Use this for multi-step analysis where you need to build on previous results.
    """
)
```
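What "persistent state" buys the agent can be sketched without a sandbox at all. The class below is a hypothetical local stand-in, not the HopX API, and it is not isolated: it runs code in the current process with `exec`, so treat it as an illustration only. It keeps one namespace across calls, roughly the way a long-lived sandbox keeps its interpreter state.

```python
import contextlib
import io


class LocalSession:
    """Local stand-in for a persistent sandbox. NOT isolated; illustration only."""

    def __init__(self) -> None:
        self.namespace: dict = {}  # survives between execute() calls

    def execute(self, code: str) -> str:
        buffer = io.StringIO()
        try:
            # Capture anything the snippet prints, like a sandbox's stdout
            with contextlib.redirect_stdout(buffer):
                exec(code, self.namespace)
        except Exception as exc:
            return f"Error: {exc!r}"
        return buffer.getvalue() or "Executed (no output)"


session = LocalSession()
session.execute("revenue = 2_400_000")           # call 1 defines a variable
print(session.execute("print(revenue * 0.25)"))  # call 2 still sees it
```

A second `LocalSession()` starts with an empty namespace, which mirrors what happens in Step 1 when every tool call gets a brand-new sandbox.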
Step 6: Data Analysis Agent
Build a specialized agent for data analysis:
```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from hopx import Sandbox

# Reuses PersistentSandbox and persistent_python from Step 5,
# and rag_tool from Step 3.


# Data upload tool
def upload_data(filename: str, data: str) -> str:
    """
    Upload CSV data to the sandbox for analysis.

    Args:
        filename: Name for the file (e.g., 'sales.csv')
        data: CSV content as a string
    """
    sandbox = PersistentSandbox.get()
    if sandbox.sandbox is None:
        sandbox.sandbox = Sandbox.create(template="code-interpreter", ttl=600)

    sandbox.sandbox.files.write(f"/app/{filename}", data)
    return f"Uploaded {filename} to /app/{filename}"


upload_tool = FunctionTool.from_defaults(
    fn=upload_data,
    name="upload_data",
    description="Upload CSV data to sandbox. Provide filename and CSV content."
)


# Create data analysis agent.
# Note: in some LlamaIndex versions, ReActAgent.from_tools takes the prompt
# text as `context=` rather than `system_prompt=`; check the signature for
# your installed version.
data_agent = ReActAgent.from_tools(
    tools=[rag_tool, persistent_python, upload_tool],
    llm=Settings.llm,
    verbose=True,
    system_prompt="""You are a data analyst assistant.

    When analyzing data:
    1. First check if relevant context exists in the knowledge base
    2. Upload data files as needed using upload_data
    3. Use python_persistent for multi-step analysis (state persists!)
    4. Always show your calculations and explain your methodology
    5. Create visualizations when helpful (save to /app/chart.png)

    For calculations, always use Python to ensure accuracy."""
)


# Example usage
response = data_agent.chat("""
Here's our monthly revenue data:

month,revenue,customers
Jan,180000,520
Feb,195000,545
Mar,210000,580
Apr,225000,610
May,240000,650
Jun,260000,695

Upload this data and analyze:
1. Calculate month-over-month growth rates
2. What's the average growth rate?
3. Project July revenue based on the trend
4. Compare to industry benchmark from our knowledge base
""")

print(response)
```
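For reference, the first three requests can be worked through locally on the same sample data. This is a sketch of one reasonable approach, not what the agent will necessarily generate: it uses only the standard library, and the July projection simply applies the average month-over-month rate to June, which is an assumption about what "the trend" means.

```python
import csv
import io

CSV_DATA = """month,revenue,customers
Jan,180000,520
Feb,195000,545
Mar,210000,580
Apr,225000,610
May,240000,650
Jun,260000,695
"""

rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
revenue = [float(r["revenue"]) for r in rows]

# Month-over-month growth: (current - previous) / previous
growth = [(cur - prev) / prev for prev, cur in zip(revenue, revenue[1:])]
avg_growth = sum(growth) / len(growth)

# Naive projection: apply the average growth rate to the latest month
july_projection = revenue[-1] * (1 + avg_growth)

for row, g in zip(rows[1:], growth):
    print(f"{row['month']}: {g:.2%}")
print(f"Average MoM growth: {avg_growth:.2%}")
print(f"Projected July revenue: ${july_projection:,.0f}")
```

Note that a monthly growth rate is not directly comparable to the 12% year-over-year benchmark in the knowledge base; one of the two has to be converted to the other's period before the comparison in request 4 means anything.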
Advanced: Sub-Question Query Engine
For complex queries, break them into sub-questions:
```python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# financial_docs, customer_docs, and market_docs are assumed to be lists of
# Document objects you have already loaded, one list per data source.
# persistent_python is the tool defined in Step 5.

# Multiple specialized indices
financial_index = VectorStoreIndex.from_documents(financial_docs)
customer_index = VectorStoreIndex.from_documents(customer_docs)
market_index = VectorStoreIndex.from_documents(market_docs)

# Create query engine tools
query_engine_tools = [
    QueryEngineTool(
        query_engine=financial_index.as_query_engine(),
        metadata=ToolMetadata(
            name="financial_data",
            description="Financial reports, revenue, expenses, profits"
        )
    ),
    QueryEngineTool(
        query_engine=customer_index.as_query_engine(),
        metadata=ToolMetadata(
            name="customer_data",
            description="Customer segments, churn, acquisition metrics"
        )
    ),
    QueryEngineTool(
        query_engine=market_index.as_query_engine(),
        metadata=ToolMetadata(
            name="market_data",
            description="Industry benchmarks, competitor analysis, market trends"
        )
    )
]

# Create sub-question query engine
sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)

# Wrap as tool for agent
sub_question_tool = QueryEngineTool.from_defaults(
    query_engine=sub_question_engine,
    name="comprehensive_search",
    description="""Search across all company data sources.
    Use for complex questions that span multiple topics.
    Automatically breaks down into sub-questions."""
)

# Create powerful agent
comprehensive_agent = ReActAgent.from_tools(
    tools=[sub_question_tool, persistent_python],
    llm=Settings.llm,
    verbose=True
)
```
Multi-Document Analysis with Code
Analyze documents and compute insights:
```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool

# Load documents
documents = SimpleDirectoryReader("./data/reports/").load_data()

# Parse into nodes
parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
nodes = parser.get_nodes_from_documents(documents)

# Create index
index = VectorStoreIndex(nodes)

# Agent for document analysis (persistent_python comes from Step 5)
doc_analysis_agent = ReActAgent.from_tools(
    tools=[
        QueryEngineTool.from_defaults(
            query_engine=index.as_query_engine(),
            name="document_search",
            description="Search uploaded documents for information"
        ),
        persistent_python
    ],
    llm=Settings.llm,
    verbose=True,
    system_prompt="""You are a document analysis agent.

    Your workflow:
    1. Search documents to extract relevant data points
    2. Use Python to compute statistics, comparisons, trends
    3. Always verify calculations by showing the code
    4. Provide data-driven conclusions

    When extracting numbers from documents, use Python to validate and compute."""
)
```
Structured Output with Code Validation
Ensure accuracy by validating with code:
```python
from llama_index.core.tools import FunctionTool
from pydantic import BaseModel
from typing import List


class FinancialAnalysis(BaseModel):
    revenue: float
    growth_rate: float
    profit_margin: float
    industry_comparison: str
    recommendations: List[str]


def validated_analysis(query: str) -> str:
    """
    Perform financial analysis with code validation.

    Retrieves data, computes metrics in sandbox, returns validated results.
    """
    sandbox = PersistentSandbox.get()

    # Step 1: Query for raw data
    raw_data = query_engine.query(query)

    # Step 2: Validate and compute in sandbox.
    # The template is a raw f-string so the regex backslashes reach the
    # sandbox unchanged. The retrieved text is embedded with !r so quotes
    # or newlines in it cannot break the generated script.
    validation_code = rf'''
import json
import re

# Parse extracted values (from RAG)
raw_text = {str(raw_data)!r}

# Extract and validate numbers (each match must start with a digit)
numbers = re.findall(r'\$?(\d[\d,]*(?:\.\d+)?)\s*(?:million|M)?', raw_text)
numbers = [float(n.replace(',', '')) for n in numbers]

# Compute derived metrics
if len(numbers) >= 2:
    revenue = numbers[0]
    if 'million' in raw_text.lower():
        revenue *= 1_000_000

    # Calculate metrics
    analysis = {{
        "revenue": revenue,
        "extracted_values": numbers,
        "validation": "passed" if revenue > 0 else "failed"
    }}
    print(json.dumps(analysis, indent=2))
else:
    print(json.dumps({{"error": "Could not extract values"}}))
'''

    result = sandbox.execute(validation_code)
    return result


validation_tool = FunctionTool.from_defaults(
    fn=validated_analysis,
    name="validated_financial_analysis",
    description="Perform validated financial analysis with code verification"
)
```
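The extraction step is the fragile part of this pattern: treating the first number in the retrieved text as revenue will pick up the "3" in "Q3" or a year if those come first. One possible tightening, sketched below as an assumption rather than part of the tutorial's tool, is to match only amounts that carry a dollar sign and to scale each by its own unit suffix. The names `extract_dollar_amounts`, `_SCALE`, and `_MONEY` are illustrative.

```python
import re

# Multipliers for the unit suffixes this sketch recognizes
_SCALE = {"million": 1_000_000, "m": 1_000_000, "k": 1_000}

# "$" + digits with optional thousands separators and decimals + optional unit
_MONEY = re.compile(
    r"\$(\d+(?:,\d{3})*(?:\.\d+)?)\s*(million|m|k)?\b",
    re.IGNORECASE,
)


def extract_dollar_amounts(text: str) -> list[float]:
    """Return every dollar amount in `text`, scaled by its unit suffix."""
    amounts = []
    for number, unit in _MONEY.findall(text):
        value = float(number.replace(",", ""))
        amounts.append(value * _SCALE.get(unit.lower(), 1))
    return amounts


sample = "Q3 2024 Revenue: $2.4 million, Net Profit: $600,000, CAC: $156"
print(extract_dollar_amounts(sample))
```

Because the pattern requires a `$`, plain numbers such as quarter labels, years, and percentages are skipped. That is a deliberate trade-off; non-dollar metrics would need their own pattern.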
Complete Working Example
Here's a complete, runnable example that combines the pieces above:
```python
"""
LlamaIndex RAG Agent with HopX Code Execution
"""

from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from hopx import Sandbox
from typing import Optional
import os

# Verify environment
assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY"
assert os.environ.get("HOPX_API_KEY"), "Set HOPX_API_KEY"

# Configure LlamaIndex
Settings.llm = OpenAI(model="gpt-4o", temperature=0)
Settings.embed_model = OpenAIEmbedding()


class SandboxManager:
    """Singleton sandbox manager."""
    _sandbox: Optional[Sandbox] = None

    @classmethod
    def execute(cls, code: str) -> str:
        if cls._sandbox is None:
            cls._sandbox = Sandbox.create(template="code-interpreter", ttl=600)
        result = cls._sandbox.runCode(code, language="python", timeout=60)
        return result.stdout if result.exitCode == 0 else f"Error: {result.stderr}"

    @classmethod
    def cleanup(cls):
        if cls._sandbox:
            cls._sandbox.kill()
            cls._sandbox = None


def python_executor(code: str) -> str:
    """Execute Python code with persistent state."""
    return SandboxManager.execute(code)


def create_rag_agent(documents: list) -> ReActAgent:
    """Create a RAG agent with code execution."""

    # Build index
    index = VectorStoreIndex.from_documents(
        [Document(text=d) for d in documents]
    )

    # Tools
    tools = [
        QueryEngineTool.from_defaults(
            query_engine=index.as_query_engine(),
            name="knowledge_base",
            description="Search the knowledge base for information"
        ),
        FunctionTool.from_defaults(
            fn=python_executor,
            name="python",
            description="Execute Python for calculations. State persists."
        )
    ]

    return ReActAgent.from_tools(
        tools=tools,
        llm=Settings.llm,
        verbose=True,
        system_prompt="""You are an analytical assistant.
        1. Search knowledge base for facts
        2. Use Python for all calculations
        3. Always verify numbers with code
        4. Explain your methodology"""
    )


# Example usage
if __name__ == "__main__":
    docs = [
        "Q3 2024: Revenue $2.4M, Growth 18%, Profit margin 25%",
        "Industry benchmark: Average growth 12%, Top quartile >15%",
        "Customers: 900 total, 45 enterprise ($450 ARPU), 675 SMB ($45 ARPU)"
    ]

    agent = create_rag_agent(docs)

    try:
        response = agent.chat(
            "What's our revenue per customer segment? "
            "Calculate the contribution of each segment."
        )
        print("\n" + "="*50)
        print(response)
    finally:
        SandboxManager.cleanup()
```
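As a reference point for the answer the agent should arrive at, the segment contribution can be computed directly. This sketch uses the fuller figures from the Customer Segments Analysis sample document in Step 2, which include the 180 mid-market customers that the shortened `docs` list above omits. It assumes segment revenue is simply customers times ARPU, in whatever period the ARPU covers; the sample data does not state that period.

```python
# Figures from the "Customer Segments Analysis" sample document in Step 2
segments = {
    "Enterprise": {"customers": 45, "arpu": 450},
    "Mid-Market": {"customers": 180, "arpu": 120},
    "SMB": {"customers": 675, "arpu": 45},
}

# Segment revenue = customers x ARPU
revenue = {name: s["customers"] * s["arpu"] for name, s in segments.items()}
total = sum(revenue.values())

# Contribution = segment revenue as a share of the total
share = {name: r / total for name, r in revenue.items()}

for name in segments:
    print(f"{name:<11} ${revenue[name]:>7,}  {share[name]:.1%}")
print(f"{'Total':<11} ${total:>7,}")
```

With only the three one-line `docs` above, the agent has no mid-market figures to retrieve, so its answer will cover two segments unless it infers the remaining 180 customers from the 900 total.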
Best Practices
1. Query First, Compute Second
```python
# Good pattern:
# 1. Retrieve facts from RAG
# 2. Compute with Python
# 3. Synthesize response

# Don't hallucinate numbers - always verify with code
```
2. Use Persistent Sandbox for Multi-Step
```python
# For complex analysis:
step1 = agent.chat("Load the sales data and show structure")
step2 = agent.chat("Calculate monthly averages")  # Uses same sandbox
step3 = agent.chat("Create visualization")        # State persists
```
3. Validate RAG Extractions
```python
# After RAG retrieval, validate numbers.
# `value` is the number extracted from the retrieved text.
validation_code = f"""
extracted_value = {value}
# Sanity checks
assert extracted_value > 0, "Value should be positive"
assert extracted_value < 1e12, "Value seems too large"
# Doubled braces keep this placeholder for the sandbox's own f-string
print(f"Validated: {{extracted_value}}")
"""
```
4. Clean Up Resources
```python
try:
    result = agent.chat(query)
finally:
    SandboxManager.cleanup()
```
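If the try/finally block ends up repeated at every call site, one option is to wrap it in a context manager. The sketch below is an assumption layered on top of the tutorial, not part of HopX or LlamaIndex: `managed` is an illustrative helper, and `FakeManager` is a hypothetical stand-in so the example runs without a sandbox.

```python
from contextlib import contextmanager


@contextmanager
def managed(resource):
    """Yield `resource`, then call its cleanup() even if the body raises."""
    try:
        yield resource
    finally:
        resource.cleanup()


class FakeManager:
    """Hypothetical stand-in exposing the same cleanup() as SandboxManager."""

    def __init__(self) -> None:
        self.cleaned_up = False

    def cleanup(self) -> None:
        self.cleaned_up = True


manager = FakeManager()
try:
    with managed(manager):
        raise RuntimeError("agent call failed")  # simulate a failing agent.chat()
except RuntimeError:
    pass
print(manager.cleaned_up)
```

Because `SandboxManager.cleanup` is a classmethod, the class itself could be passed in the same way: `with managed(SandboxManager): agent.chat(query)`.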
Conclusion
LlamaIndex + HopX enables agents that:
- Retrieve facts from your documents
- Compute accurate answers with Python
- Validate numbers through code execution
- Persist state for complex analyses
No more hallucinated calculations. Your agent can reason about data with the precision of code.
Ready to add code execution to your RAG app? Get started with HopX — sandboxes that spin up in 100ms.