
Product

SGLang

SGLang is an open-source framework for serving large language models, maintained by the LMSYS organization.

6 highly relevant items tracked

TraeAI Watch

Related materials

6 items related to SGLang have been indexed, sorted by score.


Reve 2 and Ideogram 4: Layouts in Imagegen

Latent Space · 1547 words (about 7 minutes)
Score: 87

Reve 2 and Ideogram 4 both push image composition and layout forward, with Ideogram 4 now the top-ranked open image model on Arena. Microsoft released MAI-Thinking-1, which achieves 97% on AIME 2025 without synthetic data or distillation, and published details of its training stack and MoE scaling. Frontier Tuning lets enterprise workflow models reach GPT-5.4 quality with up to 10× efficiency gains, while Gemma 4 12B and others add momentum to local-first deployment.

Why it was selected: Ideogram 4.0 tops the Arena leaderboard for open image models, with markedly improved image layout capability.

Featured · Article · #ImageGen #Layouts #MAI-Thinking-1 #Frontier Tuning #Gemma 4 12B · English

Benchmarking inference at scale: coding agents

Together AI Blog · 1358 words (about 6 minutes)
Score: 85

The Together Inference Engine delivers 31% more tokens per second (TPS) than the next-fastest open-source engine on the same hardware and maintains 2× better time to first token (TTFT) at saturation. The gains come from full-stack optimization.

Why it was selected: ThunderMLA, custom kernel rewrites, and end-to-end optimization give the Together engine 31% more TPS than other open-source engines.

Featured · Article · #Together AI #Inference Engine #Coding Agent #Performance Optimization #TTFT · English
163: DeepSeekV4 in Depth: The Infra Whale, Million-Token Context Becomes Reality, and Extreme Efficiency Optimization

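DeepSeekV4 has been released. By combining innovations with engineering optimization, and staying within R1's "test-time scaling" paradigm, it takes million-token context from theory to practical use, which matters for agents and complex multi-step tasks.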

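Why it was selected: DeepSeek V4 keeps the existing paradigm rather than introducing a new one, but a series of technical innovations significantly improves its long-context handling.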

Featured · Podcast · #DeepSeek #Large Models #Attention Mechanism #Optimizer #Sparse Attention · Chinese
SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell. 

Good to see f...

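NVIDIA AI reports that SGLang reaches 180 tok/s/GPU when decoding DeepSeek-V4 on Blackwell hardware at roughly 1M context, thanks to Blackwell-specific optimizations from the LMSYS organization that improve the utilization of hybrid sparse attention.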

Why it was selected: SGLang achieves high performance on DeepSeek-V4 decoding, reaching 180 tok/s/GPU.

Featured · Tweet · #NVIDIA #DeepSeek-V4 #SGLang #Blackwell #LMSYS · Chinese
> Ecosystem: Compatible with llama.cpp, MLX, @LMStudio, vLLM, @ollama, @UnslothAI, and SGLang.
> ...

Google AI Developers: Gemma 4 Ecosystem Compatibility and Downloads

Google AI Developers (@googleaidevs) · 78 words (about 1 minute)
Score: 65

Google announces its model weights are compatible with major open-source ecosystems and can be directly downloaded from Hugging Face and Kaggle, lowering deployment barriers.

Why it was selected: Gemma 4 weights are compatible with llama.cpp, vLLM, Ollama, and other ecosystems, which makes local deployment and inference easier.

Featured · Tweet · #Gemma #Open-source Ecosystem #Model Deployment #Hugging Face #Kaggle · English
