DeepSeek, a prominent player in the AI landscape, has introduced its latest large language model, DeepSeek V4, which aims to deliver performance comparable to leading proprietary models while dramatically cutting inference costs. The model is available in preview on Hugging Face, as well as through the company's API and web service.
Model Specifications
DeepSeek V4 comes in two configurations: a smaller Flash mixture-of-experts (MoE) model with 284 billion total parameters, of which 13 billion are active per token, and a larger V4-Pro variant with 1.6 trillion parameters, 49 billion of which are active at any given time. V4-Pro was trained on an extensive dataset of 33 trillion tokens, positioning it as a strong contender against both open-weight and proprietary models in benchmark tests.
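To illustrate why an MoE model with hundreds of billions of total parameters can run with only a small fraction active per token, here is a minimal sketch of top-k expert routing. The gating scheme, expert shapes, and top_k value are illustrative assumptions, not DeepSeek's published design.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Route one token through a toy mixture-of-experts layer.

    Only the top_k highest-scoring experts run, which is why an MoE
    model activates a small fraction of its total parameters per token.
    (Illustrative sketch; not DeepSeek's actual router.)
    """
    logits = x @ gate_w                   # gating score for each expert
    top = np.argsort(logits)[-top_k:]     # indices of the selected experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    return sum(w * experts[i](x) for i, w in zip(top, weights))

# Toy setup: 8 experts in the layer, but only 2 run per token.
rng = np.random.default_rng(0)
d = 16
experts = [
    (lambda W: (lambda x: x @ W))(rng.normal(size=(d, d)))
    for _ in range(8)
]
gate_w = rng.normal(size=(d, 8))
token = rng.normal(size=d)
print(moe_forward(token, experts, gate_w).shape)  # (16,)
```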
Architectural Innovations
Among the notable advancements in DeepSeek V4 are several architectural changes aimed at improving efficiency. The headline change is a hybrid attention mechanism, combining Compressed Sparse Attention and Heavy Compressed Attention, designed to cut the computational load during inference. Because attention dominates the cost of generating output tokens from long prompts, this allows the model to support a million-token context window while consuming between 9.5 and 13.7 times less memory than its predecessor, DeepSeek V3.2.
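DeepSeek has not published implementation details for these attention variants, but the general idea behind sparse attention can be sketched as follows: each query attends only to a small, dynamically selected subset of keys, so compute and key-value cache traffic scale with the subset size rather than the full million-token context. The top-k selection below is a generic illustration under that assumption, not the model's actual mechanism.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=64):
    """Attend to only the k most relevant keys instead of the full context.

    Restricting each query to a small selected subset is the core idea of
    sparse attention: the softmax and value mixing cost scale with k, not
    with the full sequence length. (Generic sketch, not DeepSeek's design.)
    """
    scores = K @ q / np.sqrt(q.shape[-1])    # similarity of q to every key
    idx = np.argpartition(scores, -k)[-k:]   # indices of the top-k keys
    s = scores[idx]
    w = np.exp(s - s.max())
    w /= w.sum()                             # softmax over the selected keys only
    return w @ V[idx]

rng = np.random.default_rng(0)
seq_len, d = 4096, 64                        # stand-in for a long context
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))
q = rng.normal(size=d)
print(topk_sparse_attention(q, K, V).shape)  # (64,)
```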
Additionally, the model continues the industry trend toward lower-precision data types, employing a mix of FP8 and FP4. Storing weights at these precisions cuts their memory footprint, making deployment more efficient.
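A back-of-envelope calculation shows why precision matters at this scale. Using the parameter counts quoted above, and ignoring activations, the KV cache, and which layers actually use which format, the weight footprint at each precision works out roughly as follows:

```python
def weight_memory_gb(params, bits):
    """Approximate memory needed to hold model weights at a given precision."""
    return params * bits / 8 / 1e9  # bits -> bytes -> gigabytes

# Parameter counts taken from the article; the breakdown is a rough estimate.
for name, params in [("V4 Flash", 284e9), ("V4-Pro", 1.6e12)]:
    for bits in (16, 8, 4):
        print(f"{name} @ FP{bits}: {weight_memory_gb(params, bits):,.0f} GB")
# V4-Pro drops from ~3,200 GB at FP16 to ~1,600 GB at FP8 and ~800 GB at FP4.
```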
Hardware Compatibility
DeepSeek V4 has been validated on both Nvidia and Huawei accelerators, although the specifics of the training process remain unclear; it may have involved a combination of Nvidia GPUs for pre-training and Huawei accelerators for post-training reinforcement learning.
Cost and Accessibility
DeepSeek V4 is currently available in both base and instruct-tuned versions. The smaller model is priced at $0.14 per million input tokens and $0.28 per million output tokens, while the larger Pro model costs $1.74 per million input tokens and $3.48 per million output tokens. These rates are significantly lower than those of many Western AI vendors, such as OpenAI, which charges $5 per million input tokens and $30 per million output tokens for its models.
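For a concrete sense of the gap, here is a small calculator using the per-million-token rates quoted above; the example workload of 100,000 input and 10,000 output tokens is arbitrary.

```python
def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars for one request, with rates in $ per million tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1e6

# Rates as quoted in this article ($ per million input / output tokens).
rates = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "DeepSeek V4-Pro":   (1.74, 3.48),
    "OpenAI (quoted)":   (5.00, 30.00),
}

# Hypothetical workload: 100k input tokens, 10k output tokens.
for name, (inp, out) in rates.items():
    print(f"{name}: ${request_cost(100_000, 10_000, inp, out):.4f}")
# Flash: $0.0168, V4-Pro: $0.2088, OpenAI at the quoted rates: $0.8000
```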
As DeepSeek continues to innovate, the introduction of V4 marks a significant step in the evolution of AI models, emphasizing efficiency and accessibility in a competitive landscape.
This article was produced by NeonPulse.today using human and AI-assisted editorial processes, based on publicly available information. Content may be edited for clarity and style.