Build DeepSeek from Scratch
Table of Contents
Part 1: Introduction
1. Introduction to DeepSeek
2. Overcoming Performance Bottlenecks
3. Overcoming Performance Bottlenecks
4. Performance Bottleneck Solutions
5. Jia Bin Huang DSA
6. Jia Bin Huang DeepSeek V4
7. Jia Bin Huang DeepSeek V4
Part 2: KV Cache
8. 2.1 The LLM inference loop: Generating text one token at a time
9. DeepSeek's Great Breakthroughs
10. Increasing Intelligence
11. Guessing Many Words at Once
Part 3: MLA
12. 3.1 MLA: The best of both worlds
Part 4: MoE
13. 4.1 The intuition behind mixture of experts
Part 5: MTP and FP8
14. 5.1 The core idea: From single-token to multi-token prediction
15. 5.2 The four key advantages of MTP
Part 6: DSA
16. 6.1 DSA Prototype: Lightning Indexer and Fine-Grained Token Selection
17. 6.2 DSA Continued Pre-Training: Warm-up and Sparse Stages
18. 6.3 Parity Evaluation and Inference Cost Reduction
Part 7: Papers
19. Dissecting DeepSeek-V4
20. DeepSeek-V4 Beyond Basics
21. DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
22. Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
23. mHC: Manifold-Constrained Hyper-Connections
24. DeepSeek-V4
References
Part 1: Introduction to DeepSeek
Chapter 1. Introduction to DeepSeek
Chapter 2. Overcoming Performance Bottlenecks
Chapter 3. Overcoming Performance Bottlenecks
Chapter 4. Performance Bottleneck Solutions
Chapter 5. Jia Bin Huang DSA
Chapter 6. Jia Bin Huang DeepSeek V4
Chapter 7. Jia Bin Huang DeepSeek V4