DeepSeek V4: A New Era in Efficient AI Models

DeepSeek has unveiled its latest large language model, V4, which promises significant reductions in inference costs and enhanced performance capabilities.

DeepSeek, a prominent player in the AI landscape, says V4 aims to deliver performance comparable to leading proprietary models while dramatically cutting inference costs. The model is available for preview on repositories such as Hugging Face, as well as through the company’s API and web service.

Model Specifications

DeepSeek V4 comes in two configurations: V4 Flash, a smaller mixture-of-experts (MoE) model with 284 billion total parameters, of which 13 billion are active per token, and the larger V4-Pro, with 1.6 trillion total parameters and 49 billion active at any given time. V4-Pro was trained on an extensive dataset of 33 trillion tokens, positioning it as a strong contender against both open-weight and proprietary models in benchmark tests.
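The sparsity of the MoE designs is easy to see with simple arithmetic on the figures quoted above. This is a back-of-the-envelope sketch, not an official specification; it only divides the reported active counts by the reported totals.

```python
# Parameter counts as reported for DeepSeek V4; simple ratio arithmetic.
flash_total, flash_active = 284e9, 13e9   # V4 Flash
pro_total, pro_active = 1.6e12, 49e9      # V4-Pro

flash_ratio = flash_active / flash_total  # fraction of weights active per token
pro_ratio = pro_active / pro_total

print(f"Flash: {flash_ratio:.1%} of parameters active per token")  # ~4.6%
print(f"Pro:   {pro_ratio:.1%} of parameters active per token")    # ~3.1%
```

In both configurations, only a few percent of the weights participate in any given forward pass, which is where the inference-cost savings of MoE models come from.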

Architectural Innovations

Among the notable advancements in DeepSeek V4 are several architectural changes aimed at improving efficiency. A hybrid attention mechanism, combining Compressed Sparse Attention and Heavy Compressed Attention, is designed to minimize the computational load of generating output tokens from a prompt. It allows the model to support a million-token context window while consuming 9.5x to 13.7x less memory than its predecessor, DeepSeek V3.2.

Additionally, the model continues the trend toward lower-precision data types, employing a mix of FP8 and FP4. This reduces the memory required for model weights, enabling more efficient deployment.
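Since the exact FP8/FP4 split has not been disclosed, the weight footprint can only be bracketed. The sketch below bounds V4-Pro's weight memory between the all-FP8 case (1 byte per parameter) and the all-FP4 case (0.5 bytes per parameter); the real figure would fall somewhere in between.

```python
# Weight-memory bounds for the 1.6T-parameter V4-Pro model.
# The FP8/FP4 mix is not public, so we bracket the two extremes.
params = 1.6e12

fp8_bytes = params * 1.0   # FP8: 1 byte per parameter
fp4_bytes = params * 0.5   # FP4: 0.5 bytes per parameter

TB = 1e12  # decimal terabyte
print(f"All FP8: {fp8_bytes / TB:.1f} TB of weights")  # 1.6 TB
print(f"All FP4: {fp4_bytes / TB:.1f} TB of weights")  # 0.8 TB
```

Either way, the weights alone span multiple accelerators, which is why the quantization work matters for deployment cost.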

Hardware Compatibility

DeepSeek V4 has been validated for use on both Nvidia and Huawei accelerators, although the specifics of the training process remain unclear. The model’s architecture is designed to leverage the capabilities of these platforms, which may include a combination of Nvidia GPUs for pre-training and Huawei accelerators for post-training reinforcement learning.

Cost and Accessibility

DeepSeek V4 is currently available in both base and instruct-tuned versions. The smaller model is priced at $0.14 per million input tokens and $0.28 per million output tokens, while the larger Pro model costs $1.74 per million input tokens and $3.48 per million output tokens. These rates are significantly lower than those of many Western AI vendors, such as OpenAI, which charges $5 per million input tokens and $30 per million output tokens for its models.
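The gap is easiest to see on a concrete workload. The rates below are the per-million-token prices quoted above; the example workload (2 million input tokens, 0.5 million output tokens) is hypothetical.

```python
# Cost comparison using the per-million-token rates quoted above.
def cost(m_in, m_out, rate_in, rate_out):
    """Total USD cost for m_in / m_out million input/output tokens."""
    return m_in * rate_in + m_out * rate_out

workload = (2.0, 0.5)  # millions of input, output tokens (hypothetical)

flash = cost(*workload, 0.14, 0.28)
pro = cost(*workload, 1.74, 3.48)
openai = cost(*workload, 5.00, 30.00)

print(f"V4 Flash: ${flash:.2f}")   # $0.42
print(f"V4 Pro:   ${pro:.2f}")     # $5.22
print(f"OpenAI:   ${openai:.2f}")  # $25.00
```

On this workload, even the Pro model comes in at roughly a fifth of the quoted OpenAI price, and Flash at under 2% of it.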

As DeepSeek continues to innovate, the introduction of V4 marks a significant step in the evolution of AI models, emphasizing efficiency and accessibility in a competitive landscape.

This article was produced by NeonPulse.today using human and AI-assisted editorial processes, based on publicly available information. Content may be edited for clarity and style.

LYRA-9
