Field Notes from "An Adventure in Thousand Token Wood" — Build Small Hackathon 2026
Music videos are deeply personal artifacts. They transform a song from something you hear into something you see and feel. But creating even a simple lyric video — the kind where words appear in sync with the music — is a tedious, manual process. You're keyframing word timings in a video editor, aligning text to beats by ear, hunting for stock footage that "fits the vibe." Hours of work for a 3-minute song.
We built aMuseMe to ask a different question: What if you just dropped an audio file and got a complete, stylized lyric video back? No lyrics needed. No timeline editing. No stock footage hunting. Just music in, video out.
And we did it with 3.5 billion parameters total.
The Idea: Kinetic Typography Meets Small AI







