ROSE

Q: What's new with ROSE?

traeai has indexed 2 items related to ROSE. The most recent is "Perplexity runs on NVIDIA. Nice breakdown from the team on how they’re using the CUTLASS Python st...", posted by NVIDIA AI (@NVIDIAAI).

Alias: runtime_optimized_serving_engine

An inference engine developed by Perplexity that serves AI models across a wide range of sizes.

2 highly relevant items tracked

TraeAI Observations

If you read only 3 items

Perplexity runs on NVIDIA. Nice breakdown from the team on how they’re using the CUTLASS Python st...

NVIDIA AI (@NVIDIAAI) · score 7.2

Perplexity uses NVIDIA's CUTLASS Python stack to optimize its inference models, significantly improving the performance of large language models.

We’ve developed our own inference engine Runtime-Optimized Serving Engine (ROSE) to serve models ran...

Perplexity (@perplexity_ai) · score 6.5

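Perplexity has launched its in-house inference engine ROSE, which efficiently serves models ranging from embedding models to trillion-parameter LLMs. ROSE integrates CuTeDSL to speed up custom GPU kernel development and to optimize performance on the NVIDIA Hopper and Blackwell architectures.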

Perplexity runs on NVIDIA.

NVIDIA AI (@NVIDIAAI) · May 8 · 118 words (about 1 minute)

Perplexity leverages NVIDIA's CUTLASS Python stack to optimize its inference models, significantly enhancing the performance of large-scale language models.

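Why it was selected: Perplexity developed the ROSE inference engine, which serves models ranging from embedding models to trillion-parameter LLMs.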

Featured · Tweet · #NVIDIA #AI #CUTLASS #Inference Engine · English

We’ve developed our own inference engine ROSE

Perplexity (@perplexity_ai) · May 6 · 302 words (about 2 minutes)

Perplexity has launched its in-house inference engine ROSE, enabling efficient serving from embedding models to trillion-parameter LLMs, with CuTeDSL integration for faster GPU kernel customization.

Why it was selected: Perplexity built its own inference engine, ROSE, to make serving large models more efficient.

Featured · Tweet · #ROSE #CuTeDSL #GPU optimization #large model inference #Perplexity · English
