In this tutorial, we work with GLM-5.2 and use its hosted, OpenAI-compatible API instead of running the full model locally. We begin by setting up multiple provider options, securely loading the API key, and creating a reusable chat wrapper that supports normal chat, thinking mode, streaming, tool calling, and token tracking. Then we move beyond a simple chatbot example and test the model in more practical situations, including reasoning-effort control, streamed reasoning and answers, function calling, a small tool-using agent, structured JSON output, long-context retrieval, and cost estimation.
Setting Up the GLM-5.2 OpenAI-Compatible Client and Reusable Chat Wrapper
import sys, subprocess
subprocess.run([sys.executable, "-m", "pip", "install", "-q", "-U", "openai"], check=False)
import os, re, json, time, getpass













