GB 200s Change How One Does the Prefill and Decode Disaggregation When Serving Large MoEs Like Qwen

TL;DR · AI Summary
GB200s change how prefill and decode disaggregation is done when serving large MoE models like Qwen. Perplexity has published details of its serving stack that quantify the throughput benefits compared with serving on Hopper.
Aravind Srinivas on X

Aravind Srinivas 
GB 200s change how one does the prefill and decode disaggregation when serving large MoEs like Qwen. We’ve published details of our stack quantifying the throughput benefits compared to serving on Hoppers.
Quote: Perplexity (@perplexity_ai)
We published new research on how we serve post-trained Qwen3 235B models on NVIDIA GB200 NVL72 Blackwell racks. GB200 is a major step up over Hopper for high-throughput inference on large MoE models, not just a training platform.