Papers
Part 10: DeepSeek Papers
Hybrid attention, mHC, Engram, and the Muon optimizer converge at extreme scale.
-
Chapter 19. Dissecting DeepSeek-V4
-
Chapter 20. DeepSeek-V4 Beyond Basics
-
Chapter 21. DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
-
Chapter 22. Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
-
Chapter 23. mHC: Manifold-Constrained Hyper-Connections
-
Chapter 24. DeepSeek-V4