---
title: "SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell. \n\nGood to see f..."
source_name: "NVIDIA AI(@NVIDIAAI)"
original_url: "https://x.com/NVIDIAAI/status/2049964864240791877"
canonical_url: "https://www.traeai.com/articles/f0b3332a-8e74-4c6c-bcc5-eaaae94244a4"
content_type: "tweet"
language: "中文"
score: 7
tags: ["NVIDIA","DeepSeek-V4","SGLang","Blackwell","LMSYS"]
published_at: "2026-04-30T21:31:24+00:00"
created_at: "2026-05-01T01:55:13.866963+00:00"
---

# SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell.

Canonical URL: https://www.traeai.com/articles/f0b3332a-8e74-4c6c-bcc5-eaaae94244a4
Original source: https://x.com/NVIDIAAI/status/2049964864240791877

## Summary

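NVIDIA AI reports that SGLang reaches 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell hardware. The result comes from Blackwell-specific optimizations by LMSYS that make better use of the model's hybrid sparse attention.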

## Key Takeaways

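- SGLang reaches 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context.
- The result comes from Blackwell-specific optimizations by LMSYS that make better use of the model's hybrid sparse attention.
- LMSYS had SGLang ready for V4 on Day 0 and also delivered a verified RL training pipeline for V4 in Miles (by @radixark) at launch.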

## Content

Title: NVIDIA AI on X: "SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell. 

Good to see fast progress in open source DeepSeek-V4 inference on new hardware. 

This comes from Blackwell-specific optimizations by @lmsysorg that better use the model’s hybrid sparse" / X

URL Source: http://x.com/NVIDIAAI/status/2049964864240791877

Published Time: Fri, 01 May 2026 01:54:21 GMT

Markdown Content:

SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell. Good to see fast progress in open source DeepSeek-V4 inference on new hardware. This comes from Blackwell-specific optimizations by @lmsysorg that better use the model’s hybrid sparse attention.

Quote: LMSYS Org (@lmsysorg), Apr 24

DeepSeek V4 by @deepseek_ai just dropped! SGLang is ready on Day 0 with a full stack of optimizations from architectures to low-level kernels. We also deliver a verified RL training pipeline in Miles (by @radixark) for V4 at launch: 1️⃣ Native "ShadowRadix" Design: DeepSeek V4's

[![Image 3: Image](https://pbs.twimg.com/media/HGo4qljagAAH3UF?format=jpg&name=small)](https://x.com/lmsysorg/status/2047511629919932623/photo/1)
