---
title: "Building real-world on-device AI with LiteRT and NPU"
source_name: "Google Developers Blog"
original_url: "https://developers.googleblog.com/building-real-world-on-device-ai-with-litert-and-npu/"
canonical_url: "https://www.traeai.com/articles/43858cb1-c2da-45ad-b645-f06f40774733"
content_type: "article"
language: "中文"
score: 5
tags: []
published_at: null
created_at: "2026-04-23T22:46:31.847321+00:00"
---

# Building real-world on-device AI with LiteRT and NPU

Canonical URL: https://www.traeai.com/articles/43858cb1-c2da-45ad-b645-f06f40774733
Original source: https://developers.googleblog.com/building-real-world-on-device-ai-with-litert-and-npu/

## Content

APRIL 23, 2026

[Chintan Parikh](http://developers.googleblog.com/search/?author=Chintan+Parikh), Product Manager

[Shuangfeng Li](http://developers.googleblog.com/search/?author=Shuangfeng+Li), Software Engineer

[Weiyi Wang](http://developers.googleblog.com/search/?author=Weiyi+Wang), Software Engineer

[Gerardo Carranza](http://developers.googleblog.com/search/?author=Gerardo+Carranza), Software Engineer

![Image 3: Gemini_Generated_Image_ignk8signk8signk (1)](https://storage.googleapis.com/gweb-developer-goog-blog-assets/images/Gemini_Generated_Image_ignk8signk8signk_1.original.png)

Users benefit from instant AI features like real-time video effects, automatic speech recognition (ASR), and motion capture in their mobile apps. For developers, however, running sophisticated models on-device means balancing device thermals, battery life, and frame rate. To deliver fast, responsive AI experiences without compromising performance, **LiteRT** unlocks **Neural Processing Units (NPUs)**, hardware built specifically for these workloads.

[LiteRT](https://developers.googleblog.com/litert-the-universal-framework-for-on-device-ai/) is a cross-platform, production-ready framework for on-device AI, offering **CPU, GPU, and NPU acceleration** across mobile, desktop, and IoT platforms. Designed for performance and scalability, LiteRT simplifies the deployment of high-speed AI features through a unified API. This API abstracts the complexity of integrating with multiple NPU SDKs, allowing developers to target diverse silicon without writing vendor-specific code.
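To make the idea of targeting CPU, GPU, and NPU through one code path concrete, here is a minimal sketch of accelerator selection with fallback. It does not use the LiteRT API: the `Accelerator` enum, the `PREFERENCE` order, and `pick_accelerator` are hypothetical names introduced only for illustration, and the NPU-first ordering is an assumption.

```python
from enum import Enum


class Accelerator(Enum):
    """Hypothetical accelerator labels; not LiteRT identifiers."""
    NPU = "npu"
    GPU = "gpu"
    CPU = "cpu"


# Preference order assumed for this sketch: most specialized hardware first.
PREFERENCE = (Accelerator.NPU, Accelerator.GPU, Accelerator.CPU)


def pick_accelerator(available):
    """Return the most preferred accelerator present in `available`."""
    for accelerator in PREFERENCE:
        if accelerator in available:
            return accelerator
    raise ValueError("no supported accelerator available")


if __name__ == "__main__":
    # A device that reports a GPU and CPU but no NPU falls back to the GPU.
    print(pick_accelerator({Accelerator.GPU, Accelerator.CPU}).value)
```

Application code written this way asks for "the best available accelerator" once, and the vendor-specific details stay behind that single decision point.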

### **Translating NPU performance into meaningful experiences**

LiteRT is already hardened across Google products, popular apps, and SDKs. It is used by industry leaders including Google Meet, Epic Games, and Argmax Inc. Here is what NPU acceleration looks like in real-world production apps.

[**Google Meet**](https://play.google.com/store/apps/details?id=com.google.android.apps.tachyon): By leveraging the mobile NPU, Google Meet successfully deployed an Ultra-HD segmentation model **25x larger** than previous versions without sacrificing inference speed. Crucially, it maintains a consistent power footprint, creating the thermal headroom needed to deliver higher-quality background replacement throughout a typical 20-30 minute session.

[**Epic Games, Inc**](https://www.epicgames.com/site/home): High-fidelity, real-time animation experiences demand exceptional efficiency. Epic’s [Live Link Face (Beta)](https://play.google.com/store/apps/details?id=com.epicgames.facelink&hl=en_US) app for Android enables creators to capture performances from a single camera, then generate and stream real-time MetaHuman facial animation directly from their devices into Unreal Engine.

Real-time facial solving is computationally intensive and requires consistently low latency. By using LiteRT on the NPU, Epic unlocks dedicated on-device acceleration on supported Android devices, enabling up to 30 FPS performance for real-time MetaHuman animation.

Real-time MetaHuman facial animation in Unreal Engine with NPU
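The 30 FPS figure above implies a per-frame time budget of 1000 / 30 ≈ 33.3 ms for everything that happens per frame. The sketch below works through that arithmetic. The stage latencies are invented placeholders, not measurements from Epic or LiteRT, and the function names are hypothetical.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a target frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_frame_budget(stage_latencies_ms, fps):
    """True if the summed per-frame stage latencies fit within the budget."""
    return sum(stage_latencies_ms) <= frame_budget_ms(fps)


if __name__ == "__main__":
    # Hypothetical per-frame stages: capture, inference, streaming (ms).
    stages = [8.0, 12.0, 6.0]
    print(round(frame_budget_ms(30), 1))  # 33.3
    print(fits_frame_budget(stages, 30))  # True: 26.0 ms <= 33.3 ms
```

A pipeline whose stages sum to more than the budget drops frames, which is why consistently low inference latency matters more here than peak throughput.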

[**Argmax Inc**](https://www.argmaxinc.com/) recently launched the [Argmax Pro SDK for Android](https://www.argmaxinc.com/blog/argmax-pro-sdk-for-android) for on-device speech recognition in collaboration with LiteRT. By using LiteRT and AI Pack feature delivery via Google Play, Argmax brought its top-tier accuracy and real-time speed to Android while respecting app size constraints. Crucially, they leveraged LiteRT's Ahead-Of-Time (AOT) compilation to eliminate costly on-device compilation steps, enabling frontier speech models like NVIDIA Parakeet TDT 0.6B v2 to run with industry-leading latency.

In performance testing across Google Tensor, MediaTek, and Qualcomm Technologies SoCs, Argmax Pro SDK showed that moving from GPU to NPU delivers over a **2x speedup**. Beyond the speedups, the power efficiency of NPUs enabled Argmax SDK Enterprise customers like [Heidi Health](https://www.argmaxinc.com/blog/heidi-health-ai-scribe-built-with-argmax-enterprise) to conduct reliable on-device live transcription for extended sessions while limiting the impact on battery life. Finally, because runtime libraries and models are offloaded to on-demand downloads via Play's AI Packs, each device dynamically obtains the model optimized for its specific NPU.

![Image 4: Untitled](https://storage.googleapis.com/gweb-developer-goog-blog-assets/images/Untitled.original.jpg)

 Argmax's Kotlin-first SDK brings top-tier accuracy and real-time speed to Android, with seamless NPU and GPU acceleration by Google LiteRT. 
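The idea that each device receives a model matched to its NPU can be pictured as a lookup from SoC family to model variant, with a generic fallback. The sketch below is only a conceptual illustration: it is not how Play's AI Packs are configured or how Argmax ships models, and every variant name and vendor key in it is hypothetical.

```python
# Hypothetical variant names; real AOT-compiled artifacts would be named
# and delivered by the app's own packaging, not by this table.
GENERIC_VARIANT = "speech_model_generic"

NPU_VARIANTS = {
    "google_tensor": "speech_model_npu_tensor",
    "mediatek": "speech_model_npu_mediatek",
    "qualcomm": "speech_model_npu_qualcomm",
}


def select_variant(soc_vendor):
    """Return the NPU-specific variant for a vendor, else the generic one."""
    key = soc_vendor.strip().lower()
    return NPU_VARIANTS.get(key, GENERIC_VARIANT)


if __name__ == "__main__":
    print(select_variant("Qualcomm"))
    print(select_variant("unknown_vendor"))
```

Keeping a generic fallback in the table means a device with an unrecognized SoC still gets a working model, just without the NPU-specific optimization.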

[**Google AI Edge Gallery App**](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery)**:** To help developers test and validate the performance of NPU acceleration, we are happy to announce that the **Google AI Edge Gallery App** now features **NPU support** for select Gemma models and built-in benchmarking tools. Available on [Android](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery), AI Edge Gallery lets you quickly see the true potential of AI performance on mobile hardware. Developers can also access the [Google AI Edge Gallery](https://github.com/google-ai-edge/gallery) on GitHub to build their own experiences.

Explore various on-device LLM use cases with Google AI Edge Gallery

### **Scaling performance across the hardware spectrum**

While the performance gains in speech, animation, and video are clear, the NPU has historically been difficult for developers to reach because each vendor ships its own SDK with its own complexities. By providing a streamlined workflow and cross-platform support, LiteRT enables developers to deploy advanced models, from mobile phones to industrial IoT and AI PCs, without sacrificing performance or portability.

**Cross-platform NPU support**

As highlighted in the recent [Google AI Edge Gemma 4 blog](https://developers.googleblog.com/bring-state-of-the-art-agentic-skills-to-the-edge-with-gemma-4/) post, LiteRT extends NPU acceleration beyond mobile, allowing you to deploy your models across a range of hardware using a single framework. For the industrial edge, LiteRT supports platforms like the [Qualcomm Dragonwing™ IQ8 Series](https://www.qualcomm.com/internet-of-things/products/iq8-series), which also powers the [Arduino VENTUNO Q](https://www.qualcomm.com/news/releases/2026/03/arduino-announces-arduino-ventuno-q----powered-by-qualcomm-drago), enabling high-reliability use cases like robotics and smart manufacturing with models like [Gemma 4](https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm). For desktop, LiteRT is preparing for AI PCs through [OpenVINO™ integration](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Gemma-4-Models-optimized-for-Intel-Hardware-Enabling-instant/post/1742983) with **Intel® Core™ Ultra series 2** and **3** processors, delivering significant power savings and responsiveness for local GenAI workloads.

**Performance validation at scale**

[**Google AI Edge Portal**](https://ai.google.dev/edge/ai-edge-portal) provides a benchmark service across 100+ of the most popular mobile phones with insights on ML workloads across devices, accelerators and configurations. Developers can now make data-driven deployment decisions, such as whether to use AOT or JIT, that best suit their use cases and their target devices. To use the latest Portal NPU features, sign up for our private preview [here](https://docs.google.com/forms/d/e/1FAIpQLSfTcGPycQve8TLAsfH46pBlXBZe9FrgJAClwbF7DeL1LgVn4Q/viewform).

Google AI Edge Portal Benchmarking Results
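One way to turn benchmark numbers into an AOT-versus-JIT decision is to compare the total time a session would spend under each mode: the one-time cost before the first inference plus the steady-state cost of every inference after it. The sketch below assumes a simple record shape and invented numbers. It does not reflect the Portal's actual output format, and `BenchmarkResult`, `total_session_ms`, and `pick_mode` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Hypothetical record of one benchmark run on one device."""
    mode: str             # "AOT" or "JIT"
    first_load_ms: float  # one-time cost before the first inference
    inference_ms: float   # steady-state latency per inference


def total_session_ms(result, num_inferences):
    """One-time load cost plus steady-state cost for a whole session."""
    return result.first_load_ms + result.inference_ms * num_inferences


def pick_mode(results, num_inferences):
    """Return the mode with the lowest total session time."""
    best = min(results, key=lambda r: total_session_ms(r, num_inferences))
    return best.mode


if __name__ == "__main__":
    # Invented numbers for illustration only.
    runs = [
        BenchmarkResult(mode="AOT", first_load_ms=200.0, inference_ms=10.0),
        BenchmarkResult(mode="JIT", first_load_ms=3000.0, inference_ms=10.0),
    ]
    print(pick_mode(runs, num_inferences=100))
```

A real decision would likely weigh more than latency, for example app size and how many device-specific artifacts must be shipped, so this comparison is a starting point rather than a complete policy.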

### **Get started with your NPU journey**

With our production-ready NPU integrations, LiteRT provides a unified workflow that abstracts away low-level complexities across both **Just-In-Time (JIT)** and **Ahead-Of-Time (AOT)** deployment.

Dive into our documentation and start your journey with NPU acceleration today.

*   **Documentation:** Explore the [LiteRT](https://ai.google.dev/edge/litert) and [LiteRT-LM](https://ai.google.dev/edge/litert-lm) documentation for comprehensive development guides.
*   **GitHub repos:** Visit the [LiteRT](https://github.com/google-ai-edge/litert) and [LiteRT-LM](https://github.com/google-ai-edge/LiteRT-LM?tab=readme-ov-file) GitHub repos for the latest updates and implementation details.
*   **Samples:** Check out the [LiteRT-Samples](https://github.com/google-ai-edge/litert-samples) GitHub repo for reference code. Use the [AI Edge Gallery app](https://github.com/google-ai-edge/gallery) as a starting point to build your own app.
*   **Models:** Visit the [LiteRT Hugging Face Community](https://huggingface.co/litert-community) for ready-to-use open models like [Gemma 4](https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm). We continue to optimize the open-weight model family so that its architectural improvements map directly to high-speed NPU kernels. You can access those models using the [LiteRT-LM CLI](https://ai.google.dev/edge/litert-lm/cli). More details are in [‘Bring state-of-the-art agentic skills to the edge with Gemma 4.’](https://developers.googleblog.com/bring-state-of-the-art-agentic-skills-to-the-edge-with-gemma-4/)
*   **Google Tensor:** Sign up for experimental access to the [Google Tensor ML SDK](https://ai.google.dev/edge/litert/next/tensor_ml_sdk).

Let us know your feedback and feature requests by opening an issue on our [GitHub channel](https://github.com/google-ai-edge/LiteRT/issues). We can’t wait to see what you build!

### **Acknowledgements**

_Google: Akshat Sharma, Alice Zheng, Andrew Zhang, Ashley Lin, Byungchul Kim, Changming Sun, Charlie Xu, Chenchen Tang, Chunlei Niu, Cormac Brick, Derek Bekebrede, Fabian Bergmark, Fengwu Yao, Gerardo Carranza, Gregory Karpiak, Jae Yoo, Jing Jin, Jingjiang Li, Julius Kammerl, Jun Jiang, Lu Wang, Maria Lyubimtseva, Mariana Quesada, Marissa Ikonomidis, Matt Kreileder, Matthias Grundmann, Meghna Johar, Na Li, Ping Yu, Renjie Wu, Rishika Sinha, Sachin Kotwani, Salil Tambe, Siargey Pisarchyk, Somdatta Banerjee, Steven Toribio, Suleman Shahid, Terry Heo, Wai Hon Law, Weiyi Wang, Xiaoming Hu_

_Partners: Alen Huang, Ankit Kapoor, Arda Atahan Ibis, Atila Orhon, Brian Keene, Chen Cen, Cheng-Dao Lee, Cheng-Yen Lin, Chun-Ting Lin (Graham), Code Lin, Deep Yap, Dylan Angus, Felix Baum, HungChun Liu, Jhih-Kuan Lin, Jiun-Kai Yang (Kelvin), Kedar Gharat, Ken Sieger, Laxmi Rayapudi, Lei Chen, Mike Tremaine, Ming-Che Lin (Vincent), Poyuan Jeng, MetaHuman Team, Vinesh Sukumar, Waimun Wong, Yi-Ru Chen, Yu-Ting Wan, Zach Nagengast_
