DeepSeek V3.2

DeepSeek · China
A 685B-parameter successor to V3, adding DeepSeek Sparse Attention (DSA) for efficient long-context processing and scalable RL for agentic tasks. The vendor claims parity with GPT-5, with the Speciale variant said to exceed it. The MIT licence keeps the weights unencumbered; Chinese-origin considerations are unchanged from V3.
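As a rough intuition for how DSA-style sparse attention cuts long-context cost, the sketch below has each query attend only to the top-k past tokens chosen by a cheap indexer, rather than to the full prefix. It is a minimal single-head illustration under assumed shapes and a plain dot-product indexer; the function names, dimensions, and the value of k are hypothetical and do not reflect DeepSeek's published implementation.

    # Toy top-k sparse attention in the spirit of DSA: a cheap indexer
    # scores past tokens per query, and full attention is computed only
    # over the k highest-scoring ones. Shapes, the indexer, and top_k
    # are illustrative assumptions, not DeepSeek's code.
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def sparse_attention(q, k, v, idx_q, idx_k, top_k=64):
        """q, k, v: (T, d) main projections; idx_q, idx_k: (T, d_idx)
        cheap indexer projections used only to pick which keys each
        query actually reads."""
        T, d = q.shape
        out = np.zeros_like(v)
        scores = idx_q @ idx_k.T                       # (T, T) cheap indexer scores
        causal = np.tril(np.ones((T, T), dtype=bool))  # query t sees tokens <= t
        scores = np.where(causal, scores, -np.inf)
        for t in range(T):
            kk = min(top_k, t + 1)
            sel = np.argpartition(scores[t], -kk)[-kk:]   # top-k visible tokens
            att = softmax(q[t] @ k[sel].T / np.sqrt(d))   # dense attention on subset
            out[t] = att @ v[sel]
        return out

    rng = np.random.default_rng(0)
    T, d, d_idx = 256, 32, 8
    q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
    idx_q, idx_k = (rng.standard_normal((T, d_idx)) for _ in range(2))
    print(sparse_attention(q, k, v, idx_q, idx_k).shape)  # (256, 32)

The payoff is that the expensive softmax attention runs over at most top_k tokens per query instead of the whole prefix, while the indexer stays cheap because d_idx is much smaller than d.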
Licence characteristics
Licence: MIT
Commercial use: Unrestricted
Derivatives: Allowed (including distillation)
Attribution: Minimal

Model details
Parameters: 685B MoE
Architecture: DeepSeek Sparse Attention (DSA)
Training data: Not disclosed
Last updated: 2025-Q4
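For teams weighing self-hosting, a hypothetical loading sketch with Hugging Face transformers follows. The repo id is an assumption to be verified against the vendor's actual release; the API calls themselves are standard transformers ones.

    # Hypothetical self-hosting sketch with Hugging Face transformers.
    # A 685B MoE will not fit on one GPU; device_map="auto" shards the
    # weights across whatever devices are visible.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "deepseek-ai/DeepSeek-V3.2"  # assumed repo id; verify before use

    tok = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,   # DeepSeek checkpoints ship custom modeling code
        torch_dtype="auto",       # keep the checkpoint's native precision
        device_map="auto",        # shard weights across all visible devices
    )
    prompt = "Explain sparse attention in one sentence."
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))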
Known risks
  • Chinese-origin alignment biases (political and historical topics)
  • Self-hosting a 685B MoE is infrastructure-heavy (see the sizing sketch after this list)
  • Reasoning traces may expose chain-of-thought content that deployers consider sensitive
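To put the hosting burden in numbers, here is a back-of-envelope weight-memory estimate. The bytes-per-parameter figures are standard precision assumptions, not vendor-published deployment requirements, and the totals cover weights only.

    # Back-of-envelope weight memory for a 685B-parameter model at
    # common precisions. Illustrative only: excludes KV cache,
    # activations, and serving overhead, which add substantially.
    PARAMS = 685e9
    for name, bytes_per_param in [("FP16/BF16", 2), ("FP8", 1), ("INT4", 0.5)]:
        gib = PARAMS * bytes_per_param / 2**30
        print(f"{name:>9}: ~{gib:,.0f} GiB of weights")
    # FP16/BF16: ~1,276 GiB -> multi-node territory
    #       FP8: ~  638 GiB -> exceeds a full 8x80 GB node for weights alone
    #      INT4: ~  319 GiB -> roughly five 80 GB GPUs for weights alone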
Reviewed by Ali Madjaji · Last reviewed 2026-04-15