Pick a better video thumbnail automatically with FFmpeg, PySceneDetect, and CLIP


Sunday, 31 May 2026


TL;DR

We'll build a pipeline that takes any video file, extracts candidate frames with FFmpeg and PySceneDetect, filters out blurry ones with OpenCV, scores each candidate with OpenCLIP against a small prompt set, and picks the top-K thumbnails with a diversity constraint. It comes to about 200 lines of Python, is GPU-accelerated, and runs fully locally.

The default thumbnail your encoder generates is "the middle frame." For most videos, the middle frame is motion-blurred, caught in a transition, or shows someone mid-blink. We can do much better with about an hour of effort. Here's the pipeline.
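To make the last two stages concrete, here is a minimal sketch of the blur filter and the top-K selection with a diversity constraint. It assumes the earlier stages have already produced, for each candidate frame, a sharpness value, a CLIP score, and an image embedding. The `Candidate` class, the `pick_thumbnails` function, the threshold values, and the choice of greedy selection with a cosine-similarity cap are all illustrative assumptions, not necessarily how the full pipeline implements it.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate frame (names are illustrative)."""
    timestamp: float        # seconds into the video
    sharpness: float        # assumed blur metric, e.g. variance of the Laplacian
    score: float            # assumed CLIP similarity against the prompt set
    embedding: list[float]  # image embedding, used only for the diversity check


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_thumbnails(candidates, k=3, min_sharpness=100.0, max_similarity=0.9):
    """Greedy top-K: best score first, skipping near-duplicates of earlier picks.

    Both thresholds are assumed defaults and would need tuning per video set.
    """
    # Stage 1: drop frames below the sharpness threshold.
    sharp = [c for c in candidates if c.sharpness >= min_sharpness]
    # Stage 2: walk candidates from highest to lowest score.
    ranked = sorted(sharp, key=lambda c: c.score, reverse=True)
    chosen = []
    for cand in ranked:
        if len(chosen) >= k:
            break
        # Diversity constraint: reject frames too similar to any earlier pick.
        if all(cosine(cand.embedding, c.embedding) <= max_similarity for c in chosen):
            chosen.append(cand)
    return chosen


candidates = [
    Candidate(1.0, 250.0, 0.31, [1.0, 0.0]),
    Candidate(1.5, 240.0, 0.30, [0.99, 0.05]),  # near-duplicate of the first
    Candidate(9.0, 40.0, 0.35, [0.0, 1.0]),     # best score, but too blurry
    Candidate(20.0, 300.0, 0.25, [0.0, 1.0]),
]
picks = pick_thumbnails(candidates, k=2)
print([c.timestamp for c in picks])
```

In this toy input the highest-scoring frame is rejected for blur and the second-best sharp frame is rejected as a near-duplicate, so the picks come from two different moments in the video. Greedy selection is simple and fast; a stricter approach could trade off score against similarity explicitly.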

Versions

Python 3.12
