T
traeai
Sign in

产品

LiteRT-LM

轻量级大语言模型运行时,支持通过CLI在本地启动兼容OpenAI格式的服务端点。

已跟踪 4 条高相关材料

TraeAI 观察

相关材料

已收录 4 条与 LiteRT-LM 相关的内容,按评分排序。

Benchmark and optimize LLMs on-device with AI Edge Portal

Benchmark and optimize LLMs on-device with AI Edge Portal

Google Cloud Blog924 字 (约 4 分钟)
85

Google AI Edge Portal introduces new LLM benchmarking and debugging capabilities, enabling performance optimization across over 120 Android devices with key metrics like initialization time and decode speed analysis, plus visualization tools.

入选理由:AI Edge Portal支持在120+ Android设备上测试LLM,提供初始化时间、预填速度等4项核心性能指标

FeaturedArticle#LLM optimization#Edge computing#Android devices#Google AI Edge Portal#Model Explorer英文
Bringing Gemma 4 12B to your Laptop: Unlocking Local, Agentic Workflows with Google AI Edge

Gemma 4 12B combined with Google AI Edge enables local execution on laptops, supporting code generation, voice editing, and OpenAI-compatible APIs on macOS. This setup facilitates on-device agentic workflows with a 60%+ quality boost in instruction following while ensuring offline privacy.

入选理由:Gemma 4 12B通过LiteRT-LM在消费级笔记本运行,支持本地Agent与多模态任务。

FeaturedArticle#Gemma 4#Google AI Edge#On-device AI#LiteRT-LM#Agentic Workflow英文
Blazing fast on-device GenAI with LiteRT-LM

Blazing fast on-device GenAI with LiteRT-LM

Google Developers Blog1574 字 (约 7 分钟)
75

Google AI Edge introduces LiteRT-LM, an optimized inference engine for deploying Gemma 4 models on edge devices, supporting Android, iOS, and web platforms with GPU inference reaching 76 tokens/sec and Multi-Token Prediction delivering up to 2.2x speedup.

入选理由:LiteRT-LM 在 Android GPU (OpenCL) 上实现 52 tokens/sec 解码速度,iOS (Metal) 达 56 tokens/sec,WebGPU 在 MacBook Pro 上可达 76 tokens/sec

FeaturedArticle#Google AI Edge#LiteRT-LM#Gemma 4#Edge AI#On-device Inference英文
TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google

Google 提出 TLMs(Tiny Language Models)与 LiteRT-LM 框架,支持在边缘设备上高效部署轻量级 LLM 和自主 Agent,兼顾低延迟、隐私保护与离线能力。

入选理由:TLMs 是专为边缘设备优化的 sub-100M 参数 LLM,通过结构压缩与量化实现毫秒级推理。

FeaturedVideo#LLM#edge computing#Google#LiteRT-LM#TLM英文

跨材料问答 · LiteRT-LM

回答基于:LiteRT-LM 相关 4 条材料
    0 / 500

    AI may generate inaccurate information. Please verify important content.