Introduction

Agentic AI is a type of artificial intelligence system, powered by large language models (LLMs), that operates autonomously under minimal human supervision. These AI agents use tool calling to interact with the outside world and perform tasks on the user's behalf. These tasks can include programming (including long-running programming tasks, such as Ralph loops), sending emails, responding to events, executing code, searching the web, using a computer, and generally anything else that can be represented as a tool call. While this gives LLMs great utility, it also gives them destructive power: an agent can delete files, execute harmful code, or consume resources heavily.

This is where sandboxing is useful: if AI agents run in an ephemeral machine where irreversible changes to the system have few consequences, some of the pitfalls of giving LLMs this much power are alleviated.

Sandboxes are ephemeral machines that are easy and fast to create and destroy, and that can also run for long periods of time. They’re usually deployed on a cloud provider, since cloud machines are easy to spin up and down and can be created with various hardware configurations, but there are also tools that let you create sandboxes on your local machine (Docker, for example). This makes sandboxes a great use case for Virtual Private Servers (VPS) such as DigitalOcean Droplets: they can be created and removed with ease and are billed by the second, so you only pay for what you use. Sandboxes are meant to be ephemeral, i.e. when you delete a sandbox, it loses all its data. Each sandbox is created from an image, so it starts off with a set configuration (such as installed and configured programs, users, keys, etc.).
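
To make the lifecycle described above concrete, here is a minimal local sketch that uses the Docker CLI through Python's `subprocess` module. It assumes Docker is installed and that a container is an acceptable stand-in for a sandbox machine. The helper names `build_sandbox_command` and `run_in_sandbox`, the `python:3.12-slim` image, and the resource limits are illustrative choices, not part of any particular sandboxing product.

```python
import shlex
import subprocess


def build_sandbox_command(image, command, memory="512m", cpus="1", network="none"):
    """Build a `docker run` invocation for a throwaway container.

    Hypothetical helper: the default limits are assumptions chosen for
    illustration, not recommended values.
    """
    return [
        "docker", "run",
        "--rm",                # remove the container (and its data) when it exits
        "--network", network,  # "none" cuts the container off from the network
        "--memory", memory,    # cap memory usage
        "--cpus", cpus,        # cap CPU usage
        image,                 # the image defines the starting configuration
        *command,
    ]


def run_in_sandbox(image, command, timeout=60):
    """Run `command` in a fresh container and return (exit_code, stdout)."""
    result = subprocess.run(
        build_sandbox_command(image, command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Print the command that would be executed, without requiring Docker.
    cmd = build_sandbox_command(
        "python:3.12-slim",
        ["python", "-c", "print('hello from the sandbox')"],
    )
    print(shlex.join(cmd))
```

Because of `--rm`, anything the command writes inside the container is discarded when it exits, which mirrors the ephemerality described above. A cloud-based sandbox would replace the `docker run` call with the provider's create and destroy operations, but the shape stays the same: start from an image, run the work, throw the machine away.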