
Microsoft AutoGen with Isolated Code Execution Using HopX

Tutorials · Alin Dobra · 12 min read


⚠️ Update: AutoGen is now in maintenance mode. Microsoft recommends migrating to Agent Framework, the unified successor combining AutoGen and Semantic Kernel. This tutorial remains useful for existing AutoGen projects.

Microsoft's AutoGen framework makes building multi-agent systems remarkably intuitive. Agents converse, collaborate, and execute code—all through natural conversation. But AutoGen's default code execution relies on Docker or local execution, both problematic in production.

This tutorial shows how to replace AutoGen's execution backends with HopX sandboxes: faster startup, better isolation, and no Docker dependency.

Why Replace Docker?

AutoGen's default DockerCommandLineCodeExecutor comes with several drawbacks in production:

Challenge          | Docker                       | HopX
-------------------|------------------------------|-------------------
Cold start         | 2-5 seconds                  | ~100ms
Resource overhead  | Heavy                        | Minimal
Setup complexity   | Docker daemon required       | API key only
Security           | Container escape risks       | MicroVM isolation
Cleanup            | Manual container management  | Automatic
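
Concretely, the difference in setup looks roughly like this. The DockerCommandLineCodeExecutor arguments below follow AutoGen's built-in Docker executor; HopXCodeExecutor is the custom executor built in Step 1 of this tutorial.

python
# AutoGen's default path: requires a local Docker daemon
from autogen.coding import DockerCommandLineCodeExecutor

docker_executor = DockerCommandLineCodeExecutor(
    image="python:3-slim",   # image pulled and started per session
    timeout=60,
    work_dir="coding"        # host directory mounted into the container
)

# HopX path: just an API key, sandbox starts in ~100ms
# (HopXCodeExecutor is defined in Step 1 below)
hopx_executor = HopXCodeExecutor(timeout=60)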

Architecture

text
┌─────────────────────────────────────────────────────┐
│                 AutoGen Conversation                 │
│                                                      │
│    User Proxy Agent  ◄───────►  Assistant Agent      │
│      Code to execute              Generates code     │
└──────────────────────────┬──────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                  HopX Code Executor                  │
│                                                      │
│   ┌────────────┐   ┌────────────┐   ┌────────────┐  │
│   │ MicroVM 1  │   │ MicroVM 2  │   │ MicroVM 3  │  │
│   │ (Python)   │   │ (Bash)     │   │ (Node)     │  │
│   └────────────┘   └────────────┘   └────────────┘  │
│                                                      │
│           ~100ms startup, full isolation             │
└─────────────────────────────────────────────────────┘

Prerequisites

bash
pip install pyautogen hopx-ai

Set environment variables:

bash
export OPENAI_API_KEY="sk-..."
export HOPX_API_KEY="..."

Step 1: Create the HopX Code Executor

Build an AutoGen-compatible executor using HopX. Besides execute_code_blocks, AutoGen's executor protocol expects a code_extractor property that tells it how to pull code blocks out of assistant messages, so the class below reuses the built-in MarkdownCodeExtractor:

python
from typing import List, Optional

from autogen.coding import CodeBlock, CodeExecutor, CodeResult, MarkdownCodeExtractor
from autogen.coding.base import CodeExtractor
from hopx import Sandbox


class HopXCodeExecutor(CodeExecutor):
    """Execute code in HopX sandboxes instead of Docker."""

    def __init__(
        self,
        timeout: int = 60,
        template: str = "code-interpreter",
        sandbox_ttl: int = 300
    ):
        self.timeout = timeout
        self.template = template
        self.sandbox_ttl = sandbox_ttl
        self._sandbox: Optional[Sandbox] = None

    @property
    def code_extractor(self) -> CodeExtractor:
        """Extractor AutoGen uses to pull code blocks out of messages."""
        return MarkdownCodeExtractor()

    @property
    def sandbox(self) -> Sandbox:
        """Get or create the sandbox (lazy initialization)."""
        if self._sandbox is None:
            self._sandbox = Sandbox.create(
                template=self.template,
                ttl=self.sandbox_ttl
            )
        return self._sandbox

    def execute_code_blocks(
        self,
        code_blocks: List[CodeBlock]
    ) -> CodeResult:
        """Execute a list of code blocks and return the combined result."""

        outputs = []
        exit_code = 0

        for block in code_blocks:
            language = block.language.lower()
            code = block.code

            # Map AutoGen's language tags to HopX languages
            lang_map = {
                "python": "python",
                "python3": "python",
                "py": "python",
                "bash": "bash",
                "sh": "bash",
                "shell": "bash",
                "javascript": "javascript",
                "js": "javascript",
                "typescript": "typescript",
                "ts": "typescript"
            }

            exec_lang = lang_map.get(language, "python")

            try:
                result = self.sandbox.runCode(
                    code,
                    language=exec_lang,
                    timeout=self.timeout
                )

                output = ""
                if result.stdout:
                    output += result.stdout
                if result.stderr:
                    output += f"\nSTDERR:\n{result.stderr}"

                outputs.append(output.strip())

                if result.exitCode != 0:
                    exit_code = result.exitCode

            except Exception as e:
                outputs.append(f"Execution error: {e}")
                exit_code = 1

        return CodeResult(
            exit_code=exit_code,
            output="\n\n".join(outputs)
        )

    def reset(self):
        """Reset the executor (destroy the sandbox)."""
        if self._sandbox:
            try:
                self._sandbox.kill()
            except Exception:
                pass
            self._sandbox = None

    def __del__(self):
        """Clean up on garbage collection."""
        self.reset()
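
Before wiring the executor into agents, it is worth a quick standalone check. A minimal sketch, assuming HOPX_API_KEY is set:

python
# Sanity-check the executor directly, without any LLM in the loop
executor = HopXCodeExecutor(timeout=30)

test_blocks = [
    CodeBlock(language="python", code="print(2 + 2)"),
    CodeBlock(language="bash", code="echo hello from the sandbox"),
]

result = executor.execute_code_blocks(test_blocks)
print("exit code:", result.exit_code)
print(result.output)

executor.reset()  # destroy the sandbox when done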

Step 2: Basic Two-Agent Conversation

Create a simple assistant that can execute code:

python
import os

from autogen import AssistantAgent, UserProxyAgent

# LLM configuration
config_list = [
    {
        "model": "gpt-4o",
        "api_key": os.environ["OPENAI_API_KEY"]
    }
]

llm_config = {
    "config_list": config_list,
    "temperature": 0,
    "timeout": 120
}

# Create executor
executor = HopXCodeExecutor(timeout=60)

# Assistant agent - generates code
assistant = AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="""You are a helpful AI assistant that can write and execute Python code.

When asked to solve problems:
1. Write clear, well-documented Python code
2. Use print() to show results
3. Handle potential errors gracefully

Available libraries: pandas, numpy, matplotlib, seaborn, scipy, scikit-learn, requests.
For visualizations, save to /app/plot.png using plt.savefig('/app/plot.png')
"""
)

# User proxy - handles code execution
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # No human intervention
    code_execution_config={
        "executor": executor
    },
    max_consecutive_auto_reply=10
)

# Start conversation
result = user_proxy.initiate_chat(
    assistant,
    message="Calculate the first 100 prime numbers and find their sum."
)

print("\nFinal result:", result.summary)

# Cleanup
executor.reset()

Step 3: Persistent State Across Messages

For multi-turn conversations that need persistent state:

python
class PersistentHopXExecutor(CodeExecutor):
    """Executor that maintains state across messages."""

    def __init__(self, timeout: int = 60):
        self.timeout = timeout
        self._sandbox: Optional[Sandbox] = None

    @property
    def code_extractor(self) -> CodeExtractor:
        """Extractor AutoGen uses to pull code blocks out of messages."""
        return MarkdownCodeExtractor()

    @property
    def sandbox(self) -> Sandbox:
        if self._sandbox is None:
            self._sandbox = Sandbox.create(
                template="code-interpreter",
                ttl=600  # 10 minute TTL for long conversations
            )
            # Initialize with common imports
            self._sandbox.runCode("""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json
from datetime import datetime
print("Environment ready!")
""", language="python", timeout=30)
        return self._sandbox

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult:
        outputs = []
        exit_code = 0

        for block in code_blocks:
            try:
                result = self.sandbox.runCode(
                    block.code,
                    language="python",
                    timeout=self.timeout
                )

                output = result.stdout or ""
                if result.stderr and result.exitCode != 0:
                    output += f"\nError: {result.stderr}"
                outputs.append(output.strip())

                if result.exitCode != 0:
                    exit_code = result.exitCode

            except Exception as e:
                outputs.append(f"Error: {e}")
                exit_code = 1

        return CodeResult(exit_code=exit_code, output="\n\n".join(outputs))

    def upload_file(self, local_path: str, sandbox_path: str):
        """Upload a file to the sandbox."""
        with open(local_path, 'rb') as f:
            self.sandbox.files.write(sandbox_path, f.read())

    def download_file(self, sandbox_path: str) -> bytes:
        """Download a file from the sandbox."""
        return self.sandbox.files.read(sandbox_path)

    def reset(self):
        if self._sandbox:
            self._sandbox.kill()
            self._sandbox = None


# Usage
executor = PersistentHopXExecutor()

# Rebind the user proxy to the persistent executor
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"executor": executor}
)

# Multiple turns that build on each other
user_proxy.initiate_chat(
    assistant,
    message="Load pandas and create a DataFrame called 'df' with columns A, B, C and 100 random rows"
)
user_proxy.send(message="Add a column D that is A + B * C", recipient=assistant)
user_proxy.send(message="Show statistics and save a histogram of column D", recipient=assistant)

# State persists across all messages!
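
Since the sandbox outlives individual turns, files the agents created can be pulled back out at the end. A small sketch, assuming the assistant saved its histogram to /app/plot.png as its system message in Step 2 instructs:

python
# Retrieve the histogram the assistant saved inside the sandbox
# (path assumed from the system message in Step 2)
png_bytes = executor.download_file("/app/plot.png")

with open("histogram.png", "wb") as f:
    f.write(png_bytes)

print(f"Saved {len(png_bytes)} bytes to histogram.png")

executor.reset()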

Step 4: Multi-Agent Group Chat

Build a team of specialized agents:

python
from autogen import GroupChat, GroupChatManager

# Create specialized agents
coder = AssistantAgent(
    name="Coder",
    llm_config=llm_config,
    system_message="""You are an expert Python programmer.
Write clean, efficient code. Focus on implementation.
Always include docstrings and comments."""
)

analyst = AssistantAgent(
    name="Analyst",
    llm_config=llm_config,
    system_message="""You are a data analyst.
Interpret code outputs and explain findings in plain English.
Ask clarifying questions if needed."""
)

reviewer = AssistantAgent(
    name="Reviewer",
    llm_config=llm_config,
    system_message="""You are a code reviewer.
Check for bugs, suggest improvements, verify correctness.
Be constructive and specific."""
)

# User proxy with HopX executor
executor = PersistentHopXExecutor()
user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config={"executor": executor}
)

# Create group chat
group_chat = GroupChat(
    agents=[user_proxy, coder, analyst, reviewer],
    messages=[],
    max_round=15
)

manager = GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config
)

# Start the conversation
user_proxy.initiate_chat(
    manager,
    message="""
    Analyze this problem:

    We have sales data with columns: date, product, region, amount.
    Create synthetic data, then:
    1. Calculate total sales by product
    2. Find the best performing region
    3. Identify monthly trends
    4. Create a visualization

    Work together to solve this step by step.
    """
)

executor.reset()
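
By default the GroupChatManager lets the LLM decide who speaks next. If you prefer a fixed Coder, Analyst, Reviewer rotation, GroupChat's speaker_selection_method parameter supports that; a brief variant:

python
# Deterministic turn order instead of LLM-chosen speakers
group_chat = GroupChat(
    agents=[user_proxy, coder, analyst, reviewer],
    messages=[],
    max_round=15,
    speaker_selection_method="round_robin"  # cycle through agents in listed order
)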

Step 5: Tool-Using Agents

Create agents with specific tools backed by HopX:

python
from autogen import register_function
from hopx import Sandbox

# Global sandbox for tools
tool_sandbox: Optional[Sandbox] = None


def get_tool_sandbox() -> Sandbox:
    global tool_sandbox
    if tool_sandbox is None:
        tool_sandbox = Sandbox.create(template="code-interpreter", ttl=600)
    return tool_sandbox


def run_data_analysis(code: str) -> str:
    """Run Python code for data analysis in a secure sandbox."""
    sandbox = get_tool_sandbox()
    result = sandbox.runCode(code, language="python", timeout=60)

    if result.exitCode == 0:
        return result.stdout or "Code executed successfully"
    return f"Error: {result.stderr}"


def create_visualization(code: str) -> str:
    """Create a visualization and save it."""
    sandbox = get_tool_sandbox()

    # Ensure matplotlib backend is set
    full_code = f"""
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

{code}

plt.savefig('/app/chart.png', dpi=150, bbox_inches='tight')
plt.close()
print("Chart saved to /app/chart.png")
"""

    result = sandbox.runCode(full_code, language="python", timeout=60)

    if result.exitCode == 0:
        return "Visualization created and saved to /app/chart.png"
    return f"Error: {result.stderr}"


def install_package(package: str) -> str:
    """Install a Python package in the sandbox."""
    sandbox = get_tool_sandbox()
    result = sandbox.runCode(f"pip install {package}", language="bash", timeout=120)

    if result.exitCode == 0:
        return f"Successfully installed {package}"
    return f"Failed to install {package}: {result.stderr}"


# Register tools with assistant
assistant_with_tools = AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="""You have access to tools for data analysis.

Available tools:
- run_data_analysis: Execute Python code for data analysis
- create_visualization: Create charts and plots
- install_package: Install additional Python packages

Use these tools to help the user with their analysis tasks."""
)

# Register the functions
register_function(
    run_data_analysis,
    caller=assistant_with_tools,
    executor=user_proxy,
    name="run_data_analysis",
    description="Execute Python code for data analysis"
)

register_function(
    create_visualization,
    caller=assistant_with_tools,
    executor=user_proxy,
    name="create_visualization",
    description="Create a visualization with matplotlib"
)

register_function(
    install_package,
    caller=assistant_with_tools,
    executor=user_proxy,
    name="install_package",
    description="Install a Python package"
)
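
With the tools registered, driving the conversation works the same as before; the only extra step is tearing down the shared tool sandbox afterwards. A minimal sketch (the task message is illustrative):

python
# Drive the tool-using assistant, then clean up the shared sandbox
try:
    user_proxy.initiate_chat(
        assistant_with_tools,
        message="Generate 200 random data points and plot their distribution."
    )
finally:
    if tool_sandbox is not None:
        tool_sandbox.kill()
        tool_sandbox = None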

Step 6: Sequential Agent Pipeline

Chain agents in a specific order:

python
from autogen import initiate_chats

# Define the pipeline
executor = PersistentHopXExecutor()

# Data Engineer - Prepares data
data_engineer = AssistantAgent(
    name="DataEngineer",
    llm_config=llm_config,
    system_message="You prepare and clean data. Write code to load, clean, and transform data."
)

# Data Scientist - Analyzes data
data_scientist = AssistantAgent(
    name="DataScientist",
    llm_config=llm_config,
    system_message="You perform statistical analysis and modeling. Build on the prepared data."
)

# Report Writer - Creates reports
report_writer = AssistantAgent(
    name="ReportWriter",
    llm_config=llm_config,
    system_message="You create clear, concise reports from analysis results."
)

user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config={"executor": executor}
)

# Define chat sequence
chat_sequence = [
    {
        "sender": user_proxy,
        "recipient": data_engineer,
        "message": "Create a sample sales dataset with 1000 rows",
        "summary_method": "last_msg"
    },
    {
        "sender": user_proxy,
        "recipient": data_scientist,
        "message": "Analyze the data created by the data engineer",
        "summary_method": "last_msg"
    },
    {
        "sender": user_proxy,
        "recipient": report_writer,
        "message": "Write a summary report based on the analysis",
        "summary_method": "last_msg"
    }
]

# Execute pipeline
results = initiate_chats(chat_sequence)

# Print final report
print("\n=== Final Report ===")
print(results[-1].summary)

executor.reset()

Error Handling and Recovery

Build robust agents that handle failures:

python
import time


class RobustHopXExecutor(CodeExecutor):
    """Executor with retry logic and error recovery."""

    def __init__(self, max_retries: int = 3):
        self.max_retries = max_retries
        self._sandbox: Optional[Sandbox] = None

    @property
    def code_extractor(self) -> CodeExtractor:
        """Extractor AutoGen uses to pull code blocks out of messages."""
        return MarkdownCodeExtractor()

    def _create_sandbox(self) -> Sandbox:
        return Sandbox.create(template="code-interpreter", ttl=300)

    @property
    def sandbox(self) -> Sandbox:
        if self._sandbox is None:
            self._sandbox = self._create_sandbox()
        return self._sandbox

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult:
        outputs = []
        exit_code = 0

        for block in code_blocks:
            result = self._execute_with_retry(block)
            outputs.append(result.output)
            if result.exit_code != 0:
                exit_code = result.exit_code

        return CodeResult(exit_code=exit_code, output="\n\n".join(outputs))

    def _execute_with_retry(self, block: CodeBlock) -> CodeResult:
        """Execute a single code block with retries."""
        last_error = None

        for attempt in range(self.max_retries):
            try:
                result = self.sandbox.runCode(
                    block.code,
                    language="python",
                    timeout=60
                )

                return CodeResult(
                    exit_code=result.exitCode,
                    output=result.stdout or result.stderr or ""
                )

            except Exception as e:
                last_error = str(e)
                # Reset sandbox on failure
                self._sandbox = None

                if attempt < self.max_retries - 1:
                    time.sleep(1)

        return CodeResult(
            exit_code=1,
            output=f"Failed after {self.max_retries} attempts: {last_error}"
        )

    def reset(self):
        if self._sandbox:
            try:
                self._sandbox.kill()
            except Exception:
                pass
            self._sandbox = None
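
The robust executor drops into a user proxy exactly like the earlier ones; for example:

python
# Swap the retry-capable executor into any user proxy
robust_executor = RobustHopXExecutor(max_retries=3)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"executor": robust_executor}
)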

Complete Working Example

Production-ready AutoGen with HopX:

python
"""
AutoGen Multi-Agent System with HopX Code Execution
"""

import os
from typing import List, Optional

from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent
from autogen.coding import CodeBlock, CodeExecutor, CodeResult, MarkdownCodeExtractor
from autogen.coding.base import CodeExtractor
from hopx import Sandbox

# Verify environment
assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY"
assert os.environ.get("HOPX_API_KEY"), "Set HOPX_API_KEY"


class HopXExecutor(CodeExecutor):
    """Production-ready HopX executor for AutoGen."""

    def __init__(self, timeout: int = 60, persist: bool = True):
        self.timeout = timeout
        self.persist = persist
        self._sandbox: Optional[Sandbox] = None

    @property
    def code_extractor(self) -> CodeExtractor:
        """Extractor AutoGen uses to pull code blocks out of messages."""
        return MarkdownCodeExtractor()

    @property
    def sandbox(self) -> Sandbox:
        if self._sandbox is None:
            self._sandbox = Sandbox.create(
                template="code-interpreter",
                ttl=600 if self.persist else 60
            )
        return self._sandbox

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult:
        outputs = []
        final_exit_code = 0

        for block in code_blocks:
            try:
                lang = "bash" if block.language.lower() in ["bash", "sh", "shell"] else "python"
                result = self.sandbox.runCode(block.code, language=lang, timeout=self.timeout)

                output = result.stdout or ""
                if result.stderr and result.exitCode != 0:
                    output += f"\n{result.stderr}"
                outputs.append(output.strip())

                if result.exitCode != 0:
                    final_exit_code = result.exitCode

            except Exception as e:
                outputs.append(f"Error: {e}")
                final_exit_code = 1
                self._sandbox = None  # Reset on error

        return CodeResult(exit_code=final_exit_code, output="\n---\n".join(outputs))

    def reset(self):
        if self._sandbox:
            self._sandbox.kill()
            self._sandbox = None


def create_analysis_team():
    """Create a multi-agent analysis team."""

    llm_config = {
        "config_list": [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}],
        "temperature": 0
    }

    executor = HopXExecutor(persist=True)

    # Agents
    coder = AssistantAgent(
        name="Coder",
        llm_config=llm_config,
        system_message="Expert Python coder. Write clear, efficient code."
    )

    analyst = AssistantAgent(
        name="Analyst",
        llm_config=llm_config,
        system_message="Data analyst. Interpret results and provide insights."
    )

    user = UserProxyAgent(
        name="User",
        human_input_mode="NEVER",
        code_execution_config={"executor": executor},
        max_consecutive_auto_reply=5
    )

    # Group chat
    group_chat = GroupChat(
        agents=[user, coder, analyst],
        messages=[],
        max_round=12
    )

    manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

    return user, manager, executor


if __name__ == "__main__":
    user, manager, executor = create_analysis_team()

    try:
        user.initiate_chat(
            manager,
            message="""
            Create a sales analysis:
            1. Generate sample sales data (500 rows: date, product, region, amount)
            2. Calculate total sales by product and region
            3. Identify the top 3 products
            4. Show monthly trends

            Execute code and explain findings.
            """
        )
    finally:
        executor.reset()
        print("\n✅ Sandbox cleaned up")

Best Practices

1. Always Clean Up

python
try:
    result = user_proxy.initiate_chat(assistant, message=task)
finally:
    executor.reset()

2. Use Persistent Sandbox for Multi-Turn

python
# For conversations that build on previous results
executor = HopXExecutor(persist=True)  # 10 min TTL

3. Set Appropriate Timeouts

python
executor = HopXExecutor(
    timeout=120  # Longer for complex computations
)

4. Handle Large Outputs

python
def execute_code_blocks(self, code_blocks):
    # ... execution ...
    output = result.stdout[:10000]  # Truncate large outputs
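
The snippet above is only a fragment; one way to apply it is a small helper that caps whatever goes back to the LLM (the 10,000-character limit is arbitrary):

python
MAX_OUTPUT_CHARS = 10_000  # arbitrary cap; tune to your model's context window

def truncate_output(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Trim very long sandbox output before returning it to the agents."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"\n... [truncated {len(text) - limit} characters]"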

Conclusion

AutoGen + HopX gives you:

  • Faster execution: 100ms vs 2-5s Docker cold starts
  • Better isolation: MicroVM security instead of shared-kernel containers
  • Simpler setup: No Docker daemon required
  • Auto cleanup: Sandboxes destroyed automatically

Your multi-agent systems can collaborate and execute code safely, without the operational overhead of Docker.


Ready to upgrade your AutoGen agents? Get started with HopX — sandboxes that spin up in 100ms.
