Build with DeepSeek V4 Using NVIDIA Blackwell and GPU-Accelerated Endpoints | NVIDIA Technical Blog

DeepSeek just launched its fourth generation of flagship models, DeepSeek-V4-Pro and DeepSeek-V4-Flash, both designed for highly efficient inference over million-token contexts.

DeepSeek-V4-Pro is the largest model in the family, with 1.6T total parameters and 49B active parameters. DeepSeek-V4-Flash is a smaller 284B-parameter model with 13B active parameters, designed for higher-speed, higher-efficiency workloads. Both models support up to a 1M-token context window, opening new possibilities for long-context coding, document analysis, retrieval, and agentic AI workflows.

| Specification | DeepSeek-V4-Pro | DeepSeek-V4-Flash |
| --- | --- | --- |
| Modality | Text | Text |
| Total parameters | 1.6T | 284B |
| Active parameters | 49B | 13B |
| Context length | 1M tokens | 1M tokens |
| Max output length | Up to 384K tokens through DeepSeek API docs | Up to 384K tokens through DeepSeek API docs |
| Primary use cases | Advanced reasoning, coding, long-context agents | High-speed efficiency, chat, routing, summarization |
| License | MIT | MIT |

Table 1. Specifications for the DeepSeek V4 model family
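To make the gap between total and active parameters concrete, the short sketch below computes the share of each model's weights that is active per token, using only the parameter counts quoted above. The dictionary layout and the `active_fraction` helper are illustrative names, not part of any DeepSeek or NVIDIA API.

```python
# Parameter counts quoted in the article, in billions.
# The structure and names here are illustrative only.
MODELS = {
    "DeepSeek-V4-Pro": {"total_b": 1600.0, "active_b": 49.0},
    "DeepSeek-V4-Flash": {"total_b": 284.0, "active_b": 13.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of total parameters that are active per token."""
    return active_b / total_b


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

By this arithmetic, roughly 3% of the Pro model's parameters and under 5% of the Flash model's parameters are active for any given token, which is what the sparse MoE design implies for per-token compute.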

Architectural innovations for long-context inference

The V4 family builds on the DeepSeek MoE architecture, with an increased focus on optimizing the attention component of the transformer architecture. These innovations are designed to achieve a 73% reduction in per-token inference FLOPs and a 90% reduction in KV cache memory burden compared with DeepSeek-V3.2.
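As a rough illustration of what those percentages mean, the sketch below converts a fractional reduction into the remaining cost and the equivalent improvement factor. The two reduction figures come from the paragraph above; the 100 GB baseline KV cache size is a made-up number chosen only to make the arithmetic easy to follow, and the helper names are hypothetical.

```python
# Reductions quoted in the article, relative to DeepSeek-V3.2.
FLOPS_REDUCTION = 0.73      # per-token inference FLOPs
KV_CACHE_REDUCTION = 0.90   # KV cache memory


def remaining(baseline: float, reduction: float) -> float:
    """Cost left after applying a fractional reduction to a baseline."""
    return baseline * (1.0 - reduction)


def improvement_factor(reduction: float) -> float:
    """How many times smaller the cost is after the reduction."""
    return 1.0 / (1.0 - reduction)


# Hypothetical baseline: 100 GB of KV cache for some long-context request.
baseline_kv_gb = 100.0
print(f"KV cache: {baseline_kv_gb:.0f} GB -> "
      f"{remaining(baseline_kv_gb, KV_CACHE_REDUCTION):.0f} GB")
print(f"FLOPs improvement factor: {improvement_factor(FLOPS_REDUCTION):.1f}x")
print(f"KV cache improvement factor: {improvement_factor(KV_CACHE_REDUCTION):.1f}x")
```

Under these assumptions, a 90% reduction corresponds to about a 10x smaller KV cache and a 73% reduction to roughly 3.7x fewer FLOPs per token. Actual savings depend on context length and deployment configuration.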

