AI Engineer · Video

From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google

TL;DR · AI Summary

The talk covers fine-tuning Tiny LLMs for on-device agent skills, with a reported improvement from 46% to 90%, and Google's AI Edge platform, which supports cross-platform deployment through the TensorFlow Lite runtime alongside system models such as Gemini Nano.

Key Takeaways

  • TensorFlow Lite and LiteRT-LM achieve 90% inference performance for Tiny LLMs
  • Gemini Nano pre-installed via AI Core API provides summarization APIs with optim
  • TensorFlow Lite supports 2.7B Android devices, with Gemini 4 models achieving ef

Outline

  1. Introduces motivations for deploying Tiny LLMs (<1B parameters) on-device, including low latency, privacy, and offline use.

  2. Presents TensorFlow Lite as a cross-framework runtime supporting MediaPipe and LiteRT-LM deployment across CPU, GPU, and NPU.

  3. Gemini Nano, pre-installed via AI Core, provides summarization APIs covering 2.7B Android devices.

  4. Gemini 4 achieves efficient inference on NPU/GPU, supporting iOS/Android platforms.

  5. Demonstrates methods to build custom agent skills atop system GenAI with toolchains.
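
Item 5 mentions building custom agent skills on top of a system model. As a rough illustration only, the sketch below shows one common pattern: the model emits a JSON tool call, and the app dispatches it to a registered function. The registry, the skill names (`set_timer`, `summarize`), and the JSON shape are all hypothetical and are not taken from the talk or from any Google API.

```python
import json
from typing import Callable, Dict

# Hypothetical skill registry; names and signatures are illustrative only.
SKILLS: Dict[str, Callable[..., str]] = {}


def skill(name: str):
    """Register a function as a callable agent skill under `name`."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        SKILLS[name] = fn
        return fn
    return decorator


@skill("set_timer")
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"


@skill("summarize")
def summarize(text: str) -> str:
    # Placeholder standing in for a call to a system summarization API.
    return text[:20]


def dispatch(model_output: str) -> str:
    """Parse the model's JSON tool call and run the matching skill.

    Assumed shape: {"name": "<skill>", "arguments": {...}}.
    Malformed output, unknown skills, and bad arguments yield an error string.
    """
    try:
        call = json.loads(model_output)
        fn = SKILLS[call["name"]]
        return fn(**call.get("arguments", {}))
    except (json.JSONDecodeError, KeyError, TypeError) as err:
        return f"error: {type(err).__name__}"
```

A well-formed call such as `dispatch('{"name": "set_timer", "arguments": {"minutes": 5}}')` runs the registered function, while malformed output falls through to the error branch. Counting how often a small model produces a dispatchable call is one plausible way to score agent skills, though the summary does not say which metric the 46% and 90% figures refer to.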

Mindmap

  • On-device Tiny LLM optimization
    • AI Edge platform
      • TensorFlow Lite
      • MediaPipe
    • Gemini Nano deployment
      • System-level GenAI
      • AI Core API
    • Cross-platform support
      • Android
      • NPU/GPU

#Tiny LLMs #TensorFlow Lite #Gemini Nano #AI Edge #Google
