In this tutorial, we implement an instrumented workflow for Microsoft SkillOpt. We set up the SkillOpt repository, connect it to OpenAI-compatible model access, configure the optimizer and target models, and run the SearchQA optimization pipeline with a controlled sample limit to keep costs manageable. We first evaluate the original seed skill as a baseline, then run a real optimization loop in which SkillOpt improves the skill through rollout, reflection, aggregation, selection, updating, and validation-based gating. Along the way, we inspect the training history, visualize changes in accuracy, review edit-budget behavior, monitor cumulative token usage, and compare the evolved skill with the original baseline.
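To make the idea of validation-based gating concrete before we touch the real pipeline, here is a minimal, self-contained sketch. It is not SkillOpt's implementation: `gated_optimize`, `toy_propose`, and `toy_score` are hypothetical names invented for illustration, the "skill" is just a string, and the gate assumes a candidate is kept only when it strictly beats the best validation score so far. SkillOpt's actual acceptance rule may differ.

```python
from typing import Callable, List, Tuple


def gated_optimize(
    seed_skill: str,
    propose: Callable[[str, int], str],  # hypothetical: returns an edited skill for a step
    score: Callable[[str], float],       # hypothetical: validation accuracy in [0, 1]
    steps: int,
) -> Tuple[str, List[dict]]:
    """Run a toy optimization loop that only accepts validated improvements."""
    best_skill = seed_skill
    best_score = score(seed_skill)  # baseline measured before any edits
    history = [{"step": 0, "score": best_score, "accepted": True}]

    for step in range(1, steps + 1):
        candidate = propose(best_skill, step)
        cand_score = score(candidate)
        # Assumed gate: keep the candidate only on a strict improvement.
        accepted = cand_score > best_score
        if accepted:
            best_skill, best_score = candidate, cand_score
        history.append({"step": step, "score": cand_score, "accepted": accepted})

    return best_skill, history


# Toy stand-ins so the sketch runs without any model calls.
def toy_propose(skill: str, step: int) -> str:
    additions = ["cite the source passage", "", "answer in one short phrase"]
    extra = additions[(step - 1) % len(additions)]
    return (skill + " " + extra).strip()


def toy_score(skill: str) -> float:
    keywords = ["cite", "short phrase"]
    return sum(k in skill for k in keywords) / len(keywords)


final_skill, history = gated_optimize(
    "Answer the question.", toy_propose, toy_score, steps=3
)
for row in history:
    print(row)
```

The `history` list plays the same role as the training history we inspect later: each entry records the score of a candidate and whether the gate let it through, so rejected edits remain visible even though they never replace the current best skill.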

SkillOpt Environment Setup

import os, re, json, glob, subprocess, pathlib, difflib

try:
    from google.colab import userdata