Every millisecond matters. We’re open sourcing the tokenizer we built and deployed on production; th...

Aravind Srinivas (@AravSrinivas), May 27, 2026

Score: 8.5

TL;DR · AI Summary

Perplexity is open-sourcing its Unigram tokenizer, rebuilt to cut CPU utilization by 5-6x and reduce the latency spent on tokenization.

Key Takeaways

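Perplexity is open-sourcing a Unigram tokenizer that cuts CPU utilization by 5-6x.
Small rerankers and embedders run in single-digit milliseconds on GPU.
Open-sourcing the tokenizer helps improve overall system efficiency.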

Outline

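§Introduction
Perplexity is open-sourcing its efficient Unigram tokenizer.
·Tokenizer performance
The Unigram tokenizer significantly lowers CPU utilization and improves system efficiency.
·Technical details
Small rerankers and embedders run in single-digit milliseconds on GPU.
·Open-source contribution
Open-sourcing the tokenizer benefits the community and developers.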

Highlights

We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x.
Small rerankers and embedders run in single-digit milliseconds on GPU.
CPU tokenization is a meaningful share of total latency.

#Unigram tokenizer #Perplexity #CPU utilization #Tokenization optimization #Open-source project

Every millisecond matters. We’re open-sourcing the tokenizer we built and deployed in production; it’s far more efficient than Hugging Face and SentencePiece.

Quote

Perplexity

@perplexity_ai

9h

We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x. Small rerankers and embedders run in single-digit milliseconds on GPU, making CPU tokenization a meaningful share of total latency. github.com/perplexityai/p
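
For readers unfamiliar with the term, a Unigram tokenizer assigns each vocabulary piece a log-probability and picks the segmentation of the input with the highest total score, usually with a Viterbi-style dynamic program. The sketch below illustrates that general technique only. It is not Perplexity's implementation and says nothing about how their 5-6x CPU saving is achieved. The function name `unigram_tokenize`, the `TOY_LOG_PROBS` vocabulary, its probabilities, and the `unk_log_prob` fallback are all assumptions made up for illustration.

```python
import math


def unigram_tokenize(text, log_probs, unk_log_prob=-20.0):
    """Return the highest-scoring split of `text` into vocabulary pieces.

    `log_probs` maps piece -> log-probability. Characters that appear in no
    piece fall back to a single-character token scored `unk_log_prob`
    (a hypothetical penalty, not taken from any real tokenizer).
    """
    n = len(text)
    max_len = max(len(piece) for piece in log_probs)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            score = log_probs.get(piece)
            if score is None:
                if end - start != 1:
                    continue  # unknown multi-character span: not a piece
                score = unk_log_prob  # unknown single character
            candidate = best[start] + score
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the pieces.
    pieces = []
    pos = n
    while pos > 0:
        pieces.append(text[back[pos]:pos])
        pos = back[pos]
    return pieces[::-1]


# Toy vocabulary with invented probabilities, for illustration only.
TOY_LOG_PROBS = {
    "uni": math.log(0.08),
    "un": math.log(0.10),
    "gram": math.log(0.10),
    "i": math.log(0.05),
    "u": math.log(0.01),
    "n": math.log(0.01),
    "g": math.log(0.01),
    "r": math.log(0.01),
    "a": math.log(0.01),
    "m": math.log(0.01),
}

print(unigram_tokenize("unigram", TOY_LOG_PROBS))  # ['uni', 'gram']
```

With this toy vocabulary, "uni" + "gram" scores log(0.08 × 0.10), which beats "un" + "i" + "gram" at log(0.10 × 0.05 × 0.10), so the two-piece split wins. The inner loop runs on the CPU for every request, which is why this step can become a visible share of latency once the GPU model itself takes only single-digit milliseconds.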