Perplexity (@perplexity_ai) · May 27, 2026

We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x. Small reran...

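TL;DR · AI Summary

Perplexity has open-sourced its rebuilt Unigram tokenizer, which reduces CPU utilization by 5-6x.

Key Points

Perplexity has open-sourced a Unigram tokenizer that reduces CPU utilization by 5-6x.
Small rerankers and embedders run in single-digit milliseconds on GPU.
As a result, CPU tokenization becomes a meaningful share of total latency.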

Outline

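§Introduction
Perplexity announces that it is open-sourcing its rebuilt Unigram tokenizer, which substantially reduces CPU utilization.
·Performance Improvement
The rebuilt tokenizer reduces CPU utilization by 5-6x.
·Context of Usage
Small rerankers and embedders run quickly on GPU, so CPU tokenization becomes a meaningful share of total latency.
·Open-Sourcing Details
The open-source project is hosted at github.com/perplexityai/p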

Highlights

We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x.
(first paragraph)
Small rerankers and embedders run in single-digit milliseconds on GPU, making CPU tokenization a meaningful share of total latency.
(first paragraph)
CPU tokenization becomes a significant part of the total latency when small rerankers and embedders are fast on GPU.
(first paragraph)

#Unigram #Tokenizer #Perplexity #CPUOptimization #NLP

Small rerankers and embedders run in single-digit milliseconds on GPU, making CPU tokenization a meaningful share of total latency.

https://t.co/QUnHeiho56 https://t.co/Oh29f1lo51

Source URL: https://x.com/perplexity_ai/status/2059664738087469511

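We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x. Small rerankers and embedders run in single-digit milliseconds on GPU, making CPU tokenization a meaningful share of total latency. github.com/perplexityai/p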
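
For readers unfamiliar with the term: a Unigram tokenizer segments text by picking the split whose pieces have the highest total log-probability under a unigram language model, usually found with a Viterbi search. The sketch below illustrates that general technique only. It is not Perplexity's implementation, and `viterbi_tokenize`, `TOY_VOCAB`, the scores, and the `max_piece_len` and `unk_logprob` parameters are hypothetical names and values chosen for illustration.

```python
import math

# Hypothetical toy vocabulary: piece -> log-probability (made-up values).
TOY_VOCAB = {
    "unigram": -4.0, "uni": -2.5, "gram": -2.5,
    "u": -5.0, "n": -5.0, "i": -5.0, "g": -5.0,
    "r": -5.0, "a": -5.0, "m": -5.0, "s": -5.0,
}


def viterbi_tokenize(text, vocab_logprobs, max_piece_len=8, unk_logprob=-20.0):
    """Return the highest-scoring segmentation of `text` into vocabulary pieces."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece in text[:i]
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_piece_len), end):
            piece = text[start:end]
            score = vocab_logprobs.get(piece)
            if score is None:
                if end - start != 1:
                    continue
                # Assumption: unknown single characters get a fixed penalty
                # so that every position stays reachable.
                score = unk_logprob
            candidate = best[start] + score
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the pieces.
    pieces = []
    pos = n
    while pos > 0:
        start = back[pos]
        pieces.append(text[start:pos])
        pos = start
    pieces.reverse()
    return pieces


if __name__ == "__main__":
    for word in ["unigram", "unigrams", "gramuni"]:
        print(word, viterbi_tokenize(word, TOY_VOCAB))
```

The inner loop runs on the CPU for every input string, which is why tokenization cost can matter once the GPU forward pass itself takes only single-digit milliseconds.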

3:55 PM · May 27, 2026