Three providers, three modalities, under 55 lines of Python — and a PNG file on disk at the end. Claude writes a sunset description, an image generation model paints it, and Qwen Vision analyzes the result. Each model does one thing well; the script wires them together.
This article walks through building exactly that pipeline using yait_aichain's Skill and Model primitives. We'll go step by step: generate text with Claude, turn that text into an image, then feed the image to Qwen Vision for analysis.
What We're Building
The pipeline has three stages:
Text → Text (Claude claude-3-5-sonnet-20241022): Generate a one-sentence description of a sunset.






