Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

This year, Together AI is excited to be part of NVIDIA GTC with multiple major announcements and conversations shaping the AI ecosystem — from cutting-edge model releases to new voice AI capabilities, and technical sessions with our research and engineering leaders.If you’re attending GTC, we’d love to connect. Key announcementsAt GTC 2026, several of the announcements we’re participating in highlight a core theme: AI systems are becoming more open, agentic, and production ready. Together AI, the AI Native Cloud, is designed to support this shift — helping developers train, shape, and deploy large-scale AI systems with the performance and cost-efficiency required for real-world applications. We are making multiple announcements today at GTC.Use NVIDIA Dynamo 1.0 in Together AI NVIDIA has launched NVIDIA Dynamo 1.0, an open-source software for generative and agentic inference at scale. We are excited to work with NVIDIA on Dynamo 1.0 and have already been using Dynamo as part of our inference stack to deliver more optimized performance in production use cases. At Together AI, we are committed to open innovation and are looking forward to exploring use cases that Dynamo 1.0 can be applied to.Connect to Together’s high-performance inference through NVIDIA OpenShellTogether AI and NVIDIA are working together on NVIDIA NemoClaw — an open source stack that simplifies running OpenClaw always-on assistants, more safely, with a single command. As part of the NVIDIA Agent Toolkit, it installs the NVIDIA OpenShell runtime—a secure environment for running autonomous agents, and open source models like NVIDIA Nemotron. Together is excited to host NVIDIA OpenShell runtime created for customers who want high performance models to build agents. Together AI has a model library with over 150 optimized models that can now be easily accessed via NemoClaw. Paired with Together’s dedicated endpoints, developers get the speed and cost efficiency of its inference engine at production scale. Leverage NVIDIA Nemotron 3 Super for multi-agent workflowsNVIDIA Nemotron 3 Super is a hybrid mixture-of-experts model designed for high-performance reasoning and multi-agent workflows. It combines a Mamba-Transformer architecture with a 1M-token context window to support long-horizon reasoning and complex agent interactions. With 120B total parameters (12B active per token), the model is optimized to run multiple collaborating agents efficiently — even on a single GPU — making it well suited for AI-native workflows like software development agents, financial analysis, and cybersecurity automation. Nemotron 3 Super can be deployed through our Dedicated Model Inference, providing developers with a simple and scalable way to run advanced reasoning models in production. Build voice agents with NVIDIA Parakeet TDT 0.6B V3As part of our recent voice solutions launch, NVIDIA Parakeet TDT 0.6b V3 automatic speech recognition (ASR) model is now available in the Together AI Model Library, giving developers access to high-performance, low-latency transcription optimized for real-time voice applications. By combining Parakeet’s ASR accuracy with Together’s high-performance inference infrastructure, AI natives can build production-ready voice agents that deliver fast, reliable, and scalable transcription.Together sessions The Together AI team, along with customers like Cursor and Decagon, will share insights across multiple GTC sessions, covering topics from production inference to open AI research.Sessions include:Engineering real-world LLM inference: Bridging open-source and production systemsMarch 17 • 2:00 PMYineng Zhang — Senior Director, Together AIHard-Won Lessons From Production Inference at ScaleMarch 17 • 4:00 PMYuchen Wu, Engineer, Cursor | Ce Zhang — CTO, Together AI Build Trust and Discovery Through Open-Source AI in ResearchMarch 18 • 2:00 PMPercy Liang — Co-Founder, Together AIUnder the Hood of Building and Scaling AI-Native ApplicationsMarch 18 • 4:00 PMAlan Yiu, VP of Product, Decagon | Charles Zedlewski — Chief Product Officer, Together AIVisit us at booth #1213Beyond sessions, the Together team will be hosting booth activations and side events throughout the week, including curated executive meetups focused on next-generation AI infrastructure and AI-native applications.Stop by to:See live demos of Together AI infrastructure and modelsLearn how teams are scaling production inference and agentic systemsMeet researchers and engineers building the future of open AI models and infrastructureTry Nemotron models now on Together AI serverless endpoints: https://www.together.ai/modelsLearn more and request a meeting: https://www.together.ai/gtc-san-jose-2026

Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

Other newsrooms on this story

Related reading

Highlights from Ai2 at NVIDIA GTC 2026 | Ai2

NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI

NVIDIA GTC Washington, DC: Live Updates on What’s Next in AI

Microsoft at NVIDIA GTC: New solutions for Microsoft Foundry, Azure AI…

Nvidia GTC 2026 And The Ambitious Path To $1 Trillion In AI Revenue

Together AI at ICML 2026: frontier research across the full stack

Other newsrooms on this story

Related reading

Highlights from Ai2 at NVIDIA GTC 2026 | Ai2

NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI

NVIDIA GTC Washington, DC: Live Updates on What’s Next in AI

Microsoft at NVIDIA GTC: New solutions for Microsoft Foundry, Azure AI…

Nvidia GTC 2026 And The Ambitious Path To $1 Trillion In AI Revenue

Together AI at ICML 2026: frontier research across the full stack