A four-model video suite for generation, continuation, reference-driven workflows, and editing, rolling out on Together AI starting with text-to-video.

Summary

Four-model suite: Wan 2.7 brings video generation, continuation, and editing to Together AI, starting today with text-to-video and expanding soon to image-to-video, reference-to-video, and video edit.
Tighter creative control: Drive generation with optional audio, frame-level conditioning, reference inputs, and continuation workflows to reduce workflow fragmentation.
Available on Together AI: Wan 2.7 runs on Together AI, the AI Native Cloud, with the same fast, reliable APIs, authentication, and billing surface developers already use across the rest of their multimodal stack.
Simple pricing: Available starting at $0.10 per second of generated video through the Together AI API.

AI video is easy to generate and hard to steer. A team can get a promising clip from a prompt, but continuing it, matching a reference, or revising it without starting over usually means leaving the model that made it and patching the rest together somewhere else. The more control a project needs, the more the workflow turns into re-renders, handoffs, and manual cleanup. That is the gap Wan 2.7 is built to close across generation, continuation, reference-driven workflows, and editing.

On Together AI, that expanded control surface becomes one platform instead of another disconnected toolchain. Wan 2.7 comes to Together AI, the AI Native Cloud, as a four-model suite rolling out from text-to-video into image-to-video, reference-to-video, and video edit. That gives teams a clearer path from first generation to continuation, reference-driven control, and revision through the same APIs, authentication, and billing surface they already use across the rest of their multimodal stack.

Text-to-video available now

Wan 2.7 Text-to-Video (Wan-AI/wan2.7-t2v) is available today on Together AI.
It provides a stronger starting point for campaign content, product videos, and creative prototyping than a plain prompt-to-video surface by supporting:

Flexible resolution: 720P and 1080P generation.
Duration control: Video outputs ranging from 2 to 15 seconds.
Audio support: Optional audio input to drive the generation.
Prompt-driven direction: Multi-shot narrative control directly through prompt language.
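
To make those limits concrete, here is a minimal Python sketch that checks a text-to-video request against the constraints listed above before anything is sent. Only the model ID, the two resolutions, the 2 to 15 second range, and the $0.10 starting price come from this post. The function names, the payload field names (`prompt`, `resolution`, `seconds`, `audio_url`), and the whole-second duration rule are assumptions made for illustration and may not match the actual Together AI request schema, so check the API reference before relying on them.

```python
from typing import Optional

MODEL_ID = "Wan-AI/wan2.7-t2v"           # model ID as given in the announcement
ALLOWED_RESOLUTIONS = ("720P", "1080P")  # resolutions listed in the announcement
MIN_SECONDS, MAX_SECONDS = 2, 15         # duration range listed in the announcement
STARTING_PRICE_CENTS_PER_SECOND = 10     # "starting at $0.10 per second"


def build_t2v_request(
    prompt: str,
    resolution: str = "720P",
    seconds: int = 5,
    audio_url: Optional[str] = None,
) -> dict:
    """Validate inputs and return a request payload.

    The field names are hypothetical; they are not taken from the
    Together AI API reference.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    # Assumption: durations are whole seconds.
    if not isinstance(seconds, int) or not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(
            f"seconds must be an integer from {MIN_SECONDS} to {MAX_SECONDS}"
        )

    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "resolution": resolution,
        "seconds": seconds,
    }
    if audio_url is not None:
        # Audio input is optional, so the field is only set when provided.
        payload["audio_url"] = audio_url
    return payload


def estimate_min_cost_usd(seconds: int) -> float:
    """Lower-bound cost at the starting rate; the actual rate may be higher."""
    return seconds * STARTING_PRICE_CENTS_PER_SECOND / 100


if __name__ == "__main__":
    request = build_t2v_request(
        "Shot 1: a wide view of a harbor at dawn. Shot 2: close-up of a fishing boat.",
        resolution="1080P",
        seconds=10,
    )
    print(request)
    print(f"Estimated minimum cost: ${estimate_min_cost_usd(10):.2f}")
```

Validating on the client keeps out-of-range requests from reaching the API. The cost helper works in integer cents to avoid floating-point drift, and because the post quotes a "starting at" price, its result should be read as a floor, not a quote.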