Violin: An open-source video translation skill that breaks language barriers

Video has become one of the most popular media for sharing information. Yet the language distribution of popular video content on the internet does not necessarily reflect the diversity of global audiences. For example, a prior study found that 66% of videos from the top 250 YouTube channels are in English, while Spanish, the second most common language, accounts for only 15% [1,2], leaving much of this content inaccessible to viewers around the world. This gap highlights the need for scalable video translation solutions. Can cutting-edge AI help break down language barriers and make video content more accessible to global audiences?

Today, we are excited to introduce Violin, a fully open-source video translation tool powered by the Together API. The Violin pipeline uses state-of-the-art speech recognition, large language models, and speech synthesis to achieve high-quality video translation. Beyond standard translation, we have developed interactive and personalized features, such as a video-content-aware chat assistant and a natural-language voice picker. We hope Violin can empower users across languages to access information more easily and help high-quality video content travel further across the web.

Violin: Breaking the language barriers of video sharing

To illustrate Violin's capabilities, we took a recent technical talk from Together AI and translated it into a different language.
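As a rough illustration of how the three stages named above (speech recognition, a large language model for translation, and speech synthesis) might fit together, here is a minimal Python sketch. It is not Violin's actual code: the `Segment` type, the `translate_video` function, and the three callables passed into it are hypothetical names introduced for this example, and no real API is called. Each stage is injected as a plain function so that any speech-recognition, translation, or synthesis backend could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """A timed piece of speech (hypothetical type for this sketch)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def translate_video(
    audio: bytes,
    transcribe: Callable[[bytes], List[Segment]],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
    target_lang: str,
) -> List[Tuple[Segment, bytes]]:
    """Run the three stages in order and keep the original timings.

    transcribe: audio -> timed source-language segments (speech recognition)
    translate:  (text, target language) -> translated text (e.g. an LLM call)
    synthesize: (text, target language) -> audio bytes (speech synthesis)
    """
    results = []
    for seg in transcribe(audio):
        translated_text = translate(seg.text, target_lang)
        speech = synthesize(translated_text, target_lang)
        # Keep the source timestamps so the new audio can be aligned to the video.
        results.append((Segment(seg.start, seg.end, translated_text), speech))
    return results
```

Keeping the timestamps from the recognition stage is one simple way to line the synthesized speech up with the original video; a real system would also need to handle translated speech that runs longer or shorter than the source segment.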

