Securely Execute LLM-Generated Code with Ease
LLM Sandbox is a lightweight and portable sandbox environment designed to run large language model (LLM) generated code in a safe and isolated manner using Docker containers. This project aims to provide an easy-to-use interface for setting up, managing, and executing code in a controlled Docker environment, simplifying the process of running code generated by LLMs.
- Easy Setup: Quickly create sandbox environments with minimal configuration.
- Isolation: Run your code in isolated Docker containers to prevent interference with your host system.
- Flexibility: Support for multiple programming languages.
- Portability: Use predefined Docker images or custom Dockerfiles.
- Scalability: Support Kubernetes and remote Docker host.
- Ensure you have Poetry installed.
- Add the package to your project:
poetry add llm-sandbox
- Ensure you have pip installed.
- Install the package:
pip install llm-sandbox
The SandboxSession
class manages the lifecycle of the sandbox environment, including the creation and destruction of Docker containers. Here’s a typical lifecycle:
- Initialization: Create a
SandboxSession
object with the desired configuration. - Open Session: Call the
open()
method to build/pull the Docker image and start the Docker container. - Run Code: Use the
run()
method to execute code inside the sandbox. Currently, it supports Python, Java, JavaScript, C++, Go, and Ruby. See examples for more details. - Close Session: Call the
close()
method to stop and remove the Docker container. If thekeep_template
flag is set toTrue
, the Docker image will not be removed, and the last container state will be committed to the image.
Here's a simple example to demonstrate how to use LLM Sandbox:
from llm_sandbox import SandboxSession
# Create a new sandbox session
with SandboxSession(image="python:3.9.19-bullseye", keep_template=True, lang="python") as session:
result = session.run("print('Hello, World!')")
print(result)
# With custom Dockerfile
with SandboxSession(dockerfile="Dockerfile", keep_template=True, lang="python") as session:
result = session.run("print('Hello, World!')")
print(result)
# Or default image
with SandboxSession(lang="python", keep_template=True) as session:
result = session.run("print('Hello, World!')")
print(result)
LLM Sandbox also supports copying files between the host and the sandbox:
from llm_sandbox import SandboxSession
with SandboxSession(lang="python", keep_template=True) as session:
# Copy a file from the host to the sandbox
session.copy_to_runtime("test.py", "/sandbox/test.py")
# Run the copied Python code in the sandbox
result = session.run("python /sandbox/test.py")
print(result)
# Copy a file from the sandbox to the host
session.copy_from_runtime("/sandbox/output.txt", "output.txt")
For other languages usage, please refer to the examples.
You can also use remote Docker host as below:
import docker
from llm_sandbox import SandboxSession
tls_config = docker.tls.TLSConfig(
client_cert=("path/to/cert.pem", "path/to/key.pem"),
ca_cert="path/to/ca.pem",
verify=True
)
docker_client = docker.DockerClient(base_url="tcp://<your_host>:<port>", tls=tls_config)
with SandboxSession(
client=docker_client,
image="python:3.9.19-bullseye",
keep_template=True,
lang="python",
) as session:
result = session.run("print('Hello, World!')")
print(result)
For Kubernetes usage, please refer to the examples. Essentially, you just need to set the use_kubernetes flag to True and provide the Kubernetes client, or leave it as the default for the local context.
With Langchain integration, you can easily run the generated code in a safe and isolated environment. Here's an example of how to use LLM Sandbox with Langchain:
from typing import Optional, List
from llm_sandbox import SandboxSession
from langchain import hub
from langchain_openai import ChatOpenAI
from langchain.tools import tool
from langchain.agents import AgentExecutor, create_tool_calling_agent
@tool
def run_code(lang: str, code: str, libraries: Optional[List] = None) -> str:
"""
Run code in a sandboxed environment.
:param lang: The language of the code.
:param code: The code to run.
:param libraries: The libraries to use, it is optional.
:return: The output of the code.
"""
with SandboxSession(lang=lang, verbose=False) as session: # type: ignore[attr-defined]
return session.run(code, libraries).text
if __name__ == "__main__":
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = hub.pull("hwchase17/openai-functions-agent")
tools = [run_code]
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
output = agent_executor.invoke(
{
"input": "Write python code to calculate Pi number by Monte Carlo method then run it."
}
)
print(output)
output = agent_executor.invoke(
{
"input": "Write python code to calculate the factorial of a number then run it."
}
)
print(output)
output = agent_executor.invoke(
{"input": "Write python code to calculate the Fibonacci sequence then run it."}
)
print(output)
For Llama-Index:
from typing import Optional, List
from llm_sandbox import SandboxSession
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgentWorker
import nest_asyncio
nest_asyncio.apply()
def run_code(lang: str, code: str, libraries: Optional[List] = None) -> str:
"""
Run code in a sandboxed environment.
:param lang: The language of the code, must be one of ['python', 'java', 'javascript', 'cpp', 'go', 'ruby'].
:param code: The code to run.
:param libraries: The libraries to use, it is optional.
:return: The output of the code.
"""
with SandboxSession(lang=lang, verbose=False) as session: # type: ignore[attr-defined]
return session.run(code, libraries).text
if __name__ == "__main__":
llm = OpenAI(model="gpt-4o", temperature=0)
code_execution_tool = FunctionTool.from_defaults(fn=run_code)
agent_worker = FunctionCallingAgentWorker.from_tools(
[code_execution_tool],
llm=llm,
verbose=True,
allow_parallel_tool_calls=False,
)
agent = agent_worker.as_agent()
response = agent.chat(
"Write python code to calculate Pi number by Monte Carlo method then run it."
)
print(response)
response = agent.chat(
"Write python code to calculate the factorial of a number then run it."
)
print(response)
response = agent.chat(
"Write python code to calculate the Fibonacci sequence then run it."
)
print(response)
response = agent.chat("Calculate the sum of the first 10000 numbers.")
print(response)
We welcome contributions to improve LLM Sandbox! Since I am a Python developer, I am not familiar with other languages. If you are interested in adding better support for other languages, please feel free to submit a pull request.
Here is a list of things you can do to contribute:
- Add Java maven support.
- Add support for JavaScript.
- Add support for C++.
- Add support for Go.
- Add support for Ruby.
- Add remote Docker host support.
- Add remote Kubernetes cluster support.
- Langchain integration.
- LlamaIndex integration.
- Commit the last container state to the image before closing kubernetes session.
This project is licensed under the MIT License. See the LICENSE file for details.