Google has launched Gemini Omni, a new family of artificial intelligence (AI) models designed to merge advanced text reasoning with multimedia creation. The models accept any combination of text, images, audio and video as an input prompt and use it to generate and edit high-quality video content. The launch comes as Google pursues its ultimate goal of creating Artificial General Intelligence (AGI).

Google's new multimodal AI model powers updates to Flow and Flow Music, including conversational video editing and AI-generated media tools.

The model marks Google's bid to collapse the multimodal generative stack — text-to-image, image-to-video, video-to-video, audio generation — into a single foundation model with a…