The two-host YouTube pipeline I described last week turns a JSON spec into an MP4 by rendering a still image for each dialogue segment, running TTS, and concatenating the resulting clips with ffmpeg. The part that deserved more than a paragraph was the still-image renderer. Every frame is produced by slides.py, a 480-line Python module built on Pillow. No browser, no Puppeteer, no headless Chromium, no screenshot API.
Here's how it works, what decisions I made along the way, and what I'd change.
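For orientation before the details, here is a minimal sketch of the core idea: one dialogue segment in, PNG bytes out, using only Pillow. This is not the actual slides.py. The function name render_slide, the segment fields speaker and text, the 1280x720 frame size, and the colors are all assumptions made for illustration, and it uses Pillow's built-in default font so it has no font files to depend on.

```python
import hashlib
import io

from PIL import Image, ImageDraw, ImageFont

# Assumed frame size; the real module may use something else.
WIDTH, HEIGHT = 1280, 720


def render_slide(segment: dict) -> bytes:
    """Render one dialogue segment to PNG bytes (hypothetical helper)."""
    # Solid dark background, no external assets.
    img = Image.new("RGB", (WIDTH, HEIGHT), color=(18, 18, 24))
    draw = ImageDraw.Draw(img)

    # Pillow's bundled default font avoids depending on system fonts.
    font = ImageFont.load_default()

    # "speaker" and "text" are assumed field names, not the real spec.
    draw.text((64, 64), segment["speaker"], fill=(255, 200, 80), font=font)
    draw.text((64, 96), segment["text"], fill=(240, 240, 240), font=font)

    # Encode to PNG in memory so the caller decides where it goes.
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()


segment = {"speaker": "Host A", "text": "Welcome back to the show."}
png = render_slide(segment)

# Rendering the same data twice should give byte-identical output.
digest_a = hashlib.sha256(png).hexdigest()
digest_b = hashlib.sha256(render_slide(segment)).hexdigest()
```

Comparing the two digests is a cheap way to check that the same input always yields the same bytes on a given Pillow version, which is the property a commit-triggered render relies on.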
Why render slides in Python at all
Three constraints made a purpose-built renderer the right call:
CI-only execution. Every video renders inside a GitHub Actions job triggered by a commit. Tools that require a GUI, a running browser, or a persistent daemon don't fit. The job has to turn data into a deterministic PNG with no human in the loop.







