DeepSeek Releases Prover-V2 Model with 671 Billion Parameters
2025-04-30 10:39:59
DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the open-source AI community Hugging Face. According to reports, DeepSeek-Prover-V2-671B uses the more efficient safetensors file format and supports multiple numerical precisions, which makes the model faster to train and deploy with fewer resources. It has 671 billion parameters and appears to be an upgraded version of the Prover-V1.5 mathematical model released last year. In terms of architecture, the model is based on DeepSeek-V3 and adopts a MoE (Mixture of Experts) design, with 61 Transformer layers and a hidden dimension of 7168. It also supports very long contexts, with a maximum position embedding of 163,840, which allows it to handle complex mathematical proofs, and it adopts FP8 quantization, which reduces model size and improves inference efficiency. (Jin10)
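As a rough illustration of why FP8 quantization reduces model size, the sketch below estimates weight storage for 671 billion parameters at several precisions. It assumes one fixed byte width per parameter and ignores quantization scaling factors, activations, and the KV cache, so the figures are back-of-the-envelope estimates rather than the actual size of the released checkpoint. The function name `weight_size_gb` is hypothetical and used only for this example.

```python
# Back-of-the-envelope weight-storage estimate (illustrative assumptions only).
PARAMS = 671_000_000_000  # parameter count reported in the article

# Standard storage widths, in bytes, for common numeric formats.
BYTES_PER_PARAM = {"FP32": 4, "BF16": 2, "FP8": 1}


def weight_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_size_gb(PARAMS, width):,.0f} GB")
```

Under these assumptions, storing the weights in FP8 takes about half the space of BF16 and a quarter of FP32, which is the size reduction the report refers to.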