TLMs: Tiny LLMs and Agents on Edge Devices with @cormacb https://t.co/u0fHD7j5kZ Function Gemma s...

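TL;DR · AI Summary
This post covers tiny LLMs and agents on edge devices, in particular how the Function Gemma model performs on a Pixel 7, and the two paths developers have for on-device AI: a skill harness built on Gemma 4, and Eloquent, a production transcription app.
Key points
- Function Gemma has 270M parameters and runs on a Pixel 7 at nearly 2,000 tokens per second prefill; out of the box it reaches 46% accuracy on a fixed set of app intents.
- After fine-tuning on a synthetically generated dataset, accuracy exceeds 90% on eight of the ten functions.
- Developers have two paths to on-device AI: a skill harness built on Gemma 4, or chaining two sub-billion-parameter models together, as the Eloquent transcription app does.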
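
The "eight of ten functions" figure implies accuracy was measured per function rather than as one overall number. A minimal sketch of how such a per-function score could be computed follows. The function names (`set_alarm`, `open_app`), the exact-match scoring rule, and the data layout are illustrative assumptions, not details from the talk.

```python
from collections import defaultdict

def per_function_accuracy(examples):
    """Return {function_name: accuracy} for a list of (expected, predicted) pairs.

    Each call is a (name, args_dict) tuple; predicted may be None when the
    model produced no parsable call. Scoring is exact match on name and
    arguments, which is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for expected, predicted in examples:
        name = expected[0]
        totals[name] += 1
        if predicted == expected:
            correct[name] += 1
    return {name: correct[name] / totals[name] for name in totals}

def functions_clearing(accuracy_by_function, threshold=0.9):
    """List the functions whose accuracy is strictly above the threshold."""
    return sorted(n for n, acc in accuracy_by_function.items() if acc > threshold)

# Hypothetical evaluation data: two made-up intents, two examples each.
examples = [
    (("set_alarm", {"time": "07:00"}), ("set_alarm", {"time": "07:00"})),
    (("set_alarm", {"time": "08:00"}), ("set_alarm", {"time": "09:00"})),
    (("open_app", {"name": "maps"}), ("open_app", {"name": "maps"})),
    (("open_app", {"name": "camera"}), None),
]
accuracy = per_function_accuracy(examples)
```

Running the same evaluation before and after fine-tuning, then counting how many functions `functions_clearing` returns, gives a summary of the form "N of ten functions above 90%".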
Highlights
Function Gemma ships at 270 million parameters and runs nearly 2,000 tokens per second prefill on a Pixel 7.
Out of the box, it hits 46% accuracy on a fixed set of app intents.
Fine-tune it on a synthetically generated dataset and accuracy clears 90% on eight of ten functions.
https://t.co/32bANDfdr8
TLMs: Tiny LLMs and Agents on Edge Devices with @cormacb
youtube.com/watch?v=-TiET_

Function Gemma ships at 270 million parameters and runs at nearly 2,000 tokens per second prefill on a Pixel 7. Out of the box, it hits 46% accuracy on a fixed set of app intents. Fine-tune it on a synthetically generated dataset and accuracy clears 90% on eight of ten functions.

Cormac walks through the two paths developers have for on-device AI. The first is a skill harness built on Gemma 4, shown with a restaurant roulette demo running fully on device. The second is Eloquent, a production transcription app built by chaining two sub-billion-parameter models together.
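
Chaining two small models means feeding the output of the first straight into the second. The sketch below shows only that composition pattern. The split into a transcription stage and a cleanup stage is an assumption, not a description of how Eloquent works, and both stage functions are hypothetical stand-ins that use plain string handling instead of real models.

```python
from typing import Callable

Stage = Callable[[str], str]

def chain(first: Stage, second: Stage) -> Stage:
    """Compose two stages so the first stage's output feeds the second."""
    def pipeline(x: str) -> str:
        return second(first(x))
    return pipeline

def fake_transcriber(audio_ref: str) -> str:
    # Stand-in for a small speech-to-text model (hypothetical); returns a
    # fixed raw transcript regardless of input.
    return "um so the meeting is at three"

def fake_cleanup(raw: str) -> str:
    # Stand-in for a small text-cleanup model (hypothetical): drops filler
    # words, capitalizes the first letter, and adds a final period.
    fillers = {"um", "uh"}
    text = " ".join(w for w in raw.split() if w not in fillers)
    return text[:1].upper() + text[1:] + "."

transcribe = chain(fake_transcriber, fake_cleanup)
result = transcribe("clip-001")
```

Because each stage is just a callable from text to text, either stand-in can be swapped for a real on-device model call without changing `chain`.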