How is the tid2eid in DeepSeek V4 Generated?
科学空间 (Scientific Spaces) · 3,057 characters (about 13 minutes)
The article explores the generation mechanism of the tid2eid mapping table in the DeepSeek V4 model.
Reason for selection: DeepSeek V4 adopts hash routing in place of the first_k_dense strategy.
Featured · Article · #Deep Learning · #Model Architecture · #MoE · Chinese

