What if your computer-use agent could learn a new Command Line Interface (CLI)—and operate it safely without ever writing files or free-typing shell commands?
In Part 1 of our series on building a computer use agent, we built a custom Bash computer-use agent using NVIDIA Nemotron in just one hour. In this sequel, we’ll take it further by teaching the same reasoning model with no prior knowledge to safely operate the LangGraph Platform CLI. This shows how easily a large reasoning model can be specialized to perform new, agentic tasks.Instead of simple file operations, our new agent will learn to start local servers, build containers, and generate Dockerfiles—entirely through a verifiable, human-in-the-loop command interface.
We’ll combine synthetic data generation (SDG) and Reinforcement Learning with Verifiable Rewards (RLVR), optimized via Group Relative Policy Optimization (GRPO), to make training both efficient and safe.
What you’ll build: a specialized agent to run a new CLI tool
You’ll fine-tune an AI agent that can:







