The rapid adoption of autonomous AI agents like OpenClaw has introduced a fundamental security challenge: traditional defenses cannot predict what an LLM-driven application will do at runtime.

While containerization and virtual sandboxes isolate malicious execution from the underlying host machine, recent findings show that a sandbox alone cannot prevent an agent from being manipulated into leaking data or rewriting its own instructions.

Security firm Lasso recently disclosed multiple vulnerabilities in NemoClaw, Nvidia’s sandboxed environment for running OpenClaw. The research shows that malicious actors can use subtle prompt injection attacks to turn an agent’s autonomy against it: distributing malware, bypassing static detection filters, and persistently altering the agent’s core identity.

Because an agent’s execution path is determined dynamically by the text it reads, standard security measures are insufficient to protect the systems that host it.

The promise of isolated agency in NemoClaw