traeai

Models

Gemma 4

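Aliases: Gemma4, Gemma 4 12B

A 12-billion-parameter native multimodal large language model released by Google, with unified processing of text, images, and audio.

21 highly relevant items tracked

TraeAI Observations

Related materials

21 items related to Gemma 4 have been collected, sorted by score.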

Google AI Studio 3.0 (Fully Free): This is ACTUALLY AWESOME!

AICodeKing · 979 words (about 4 min) · Score: 87

Google AI Studio 3.0 launches fully free with integrated Gemma 4 model and multimodal capabilities, enabling real-time inference, custom model deployment, and API access, significantly lowering the barrier for developers.

Why selected: Gemma 4 is completely free in Google AI Studio 3.0 and supports a 128K context length.

Featured · Video · #Google AI Studio #Gemma 4 #AI development tool #free AI platform · Chinese
Building a Multi-Tool Gemma 4 Agent with Error Recovery

Machine Learning Mastery · 3,497 words (about 14 min) · Score: 85

Build a multi-tool Gemma 4 agent with an error-recovery mechanism to learn how to handle tool-call failures gracefully.

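Why selected: An iterative agent loop needs a maximum iteration count to prevent infinite loops.

Featured · Article · #Gemma 4 #tool calling #error recovery #iterative agent · English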
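
The point about capping iterations can be illustrated with a minimal sketch. It is not the article's code: `run_agent`, `scripted_model`, and the toy `divide` tool are hypothetical names, and the "model" is a scripted stand-in rather than a real Gemma 4 call. It only shows the shape of a loop that feeds tool errors back to the model and still terminates.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical interface for illustration: a model step inspects the history
# and returns (tool_name, argument), or (None, final_answer) when it is done.
ModelStep = Callable[[List[str]], Tuple[Optional[str], str]]
Tool = Callable[[str], str]


def run_agent(model_step: ModelStep, tools: Dict[str, Tool],
              max_iterations: int = 5) -> str:
    """Run a tool-calling loop that always terminates."""
    history: List[str] = []
    for _ in range(max_iterations):
        tool_name, payload = model_step(history)
        if tool_name is None:
            return payload  # the model produced a final answer
        tool = tools.get(tool_name)
        if tool is None:
            history.append(f"error: unknown tool {tool_name!r}")
            continue
        try:
            history.append(f"result: {tool(payload)}")
        except Exception as exc:  # report the failure instead of crashing
            history.append(f"error: {tool_name} failed: {exc}")
    return "stopped: iteration limit reached"


def divide(arg: str) -> str:
    """Toy tool: divides 'a/b'; raises ZeroDivisionError when b is 0."""
    a, b = arg.split("/")
    return str(int(a) / int(b))


def scripted_model(history: List[str]) -> Tuple[Optional[str], str]:
    """Stand-in for the model: fails once, then retries with valid input."""
    if not history:
        return ("divide", "1/0")
    if history[-1].startswith("error"):
        return ("divide", "6/3")
    return (None, "answer: " + history[-1].split(": ", 1)[1])


print(run_agent(scripted_model, {"divide": divide}))
```

Because the error text is appended to the history, the next model step can see what went wrong and adjust its arguments; the `for` loop bound guarantees the run ends even if the model never recovers.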
Reachy Mini goes fully local

Hugging Face Blog · 1,966 words (about 8 min) · Score: 85

Reachy Mini now runs its voice backend locally, eliminating the need for cloud servers.

Why selected: Deploys a local voice backend on Reachy Mini.

Featured · Article · #Reachy Mini #Voice Backend #Local Service · Chinese
Easy Agentic Tool Calling with Gemma 4

KDnuggets · 2,859 words (about 12 min) · Score: 85

Gemma 4 enables true agentic behavior through local sandboxed tools like filesystem exploration and restricted Python execution.

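Why selected: Gemma 4 supports local tool calling, such as filesystem exploration and restricted Python execution, which increases the model's autonomy.

Featured · Article · #Gemma 4 #Agent #Tool Calling #Security #Python · English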
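
As a rough illustration of what a sandboxed filesystem tool might involve, the sketch below confines every path to a root directory before listing it. It is an assumption about the general technique, not the article's implementation; `resolve_in_sandbox` and `list_dir` are hypothetical names.

```python
from pathlib import Path
from typing import List


def resolve_in_sandbox(root: Path, user_path: str) -> Path:
    """Resolve user_path under root and reject anything that escapes it."""
    root = root.resolve()
    candidate = (root / user_path).resolve()
    # After resolving symlinks and "..", the path must be root or inside it.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes sandbox: {user_path}")
    return candidate


def list_dir(root: Path, user_path: str = ".") -> List[str]:
    """Filesystem-exploration tool: list entry names inside the sandbox."""
    target = resolve_in_sandbox(root, user_path)
    return sorted(p.name for p in target.iterdir())
```

Resolving the path before checking it matters: a request such as `../secrets` or an absolute path only reveals its real target after `resolve()`, so the containment check runs on the resolved form.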
TLMs: Tiny LLMs and Agents on Edge Devices with @cormacb 

https://t.co/u0fHD7j5kZ

Function Gemma s...

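This post covers Tiny LLMs and agents on edge devices, in particular the performance of the Function Gemma model on a Pixel 7, and two paths for developers to implement on-device AI: a skills framework based on Gemma 4 and the Eloquent production transcription app.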

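Why selected: Function Gemma runs on a Pixel 7 with 270M parameters, reaches a prefill speed of nearly 2,000 tokens/s, and out of the box achieves 46% accuracy on fixed app intents.

Featured · Tweet · #Tiny LLMs #Edge Devices #Function Gemma #AI on Devices #Machine Learning · Chinese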
Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention

Recent developments in LLM architectures focus on KV sharing, mHC, and compressed attention to improve long-context efficiency.

Why selected: Gemma 4 introduces KV sharing and per-layer embeddings to optimize memory use.

Featured · Article · #LLM #Architecture Optimization #Attention Mechanism · English
A Smarter Google AI Edge Gallery: MCP integration, notifications, and session continuity

Google Developers Blog · 1,169 words (about 5 min) · Score: 80

Google AI Edge Gallery introduces three major capabilities: MCP protocol support for cross-data-source tool calling, local notification scheduling for proactive interactions, and persistent chat history, shifting mobile Agent development from reactive to automated and continuous experiences.

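Why selected: By registering an MCP URL, the app can dynamically import tool definitions into the local model's system prompt; inference runs entirely on the phone, while requests are executed by the MCP server.

Featured · Article · #Google AI Edge Gallery #MCP #On-device AI #Gemma 4 #Mobile Agent · English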
Ok that's so cool

Multi-token prediction makes Gemma 4 run way faster locally!

Same model, same la...

Paul Couvert (@itsPaulAi) · 281 words (about 2 min) · Score: 78

Multi-token prediction (MTP) makes Gemma 4 run roughly 1.4 times faster locally, going from 97 to 138 tokens/s.

Why selected: With MTP, Gemma 4 throughput rises from 97 tokens/s to 138 tokens/s.

Featured · Tweet · #Gemma 4 #MTP #Open Source · Chinese
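
The speedup implied by the two throughput figures quoted for this item (97 and 138 tokens/s) can be checked with a few lines of arithmetic. The helper name `speedup` is only for illustration.

```python
def speedup(baseline_tps: float, accelerated_tps: float) -> float:
    """Ratio of accelerated to baseline throughput (tokens per second)."""
    return accelerated_tps / baseline_tps


ratio = speedup(97.0, 138.0)
print(f"{ratio:.2f}x")  # 1.42x
```

On these figures the gain is about 42%, which is well below the "up to" multipliers quoted elsewhere for MTP; those are upper bounds and depend on hardware and workload.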
We released Gemma 4 12B yesterday. Here is a visual guide that explains the full architecture.

→ Ho...

Gemma 4 12B Released: Visual Guide to Native Multimodal Architecture

Philipp Schmid (@_philschmid) · 169 words (about 1 min) · Score: 75

Gemma 4 12B achieves native multimodal processing for text, images, and audio by removing separate vision and audio encoders. This architecture replaces traditional encoder-patching approaches with joint representation learning, reducing inference latency and improving edge deployment efficiency.

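Why selected: Gemma 4 12B removes the separate vision and audio encoders and adopts a unified native multimodal architecture.

Featured · Tweet · #Gemma 4 #Multimodal LLM #Native Multimodality #Edge AI · English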
Gemma 4 Multi-Token Prediction Delivers Up to ~3x Faster Token Generation

Gemma 4 introduces multi-token prediction technology, achieving up to 3x faster token generation, significantly improving large model inference efficiency.

Why selected: Gemma 4 uses multi-token prediction to raise token generation speed to as much as 3x the original.

Featured · Article · #AI #LLM #Gemma #Transformer #Token Generation · English
AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind

Android developers can build intelligent experiences in three ways: purely on-device models, a hybrid mode (on-device first with cloud fallback), or pure cloud inference. Gemini Nano is presented as the most efficient on-device model; it is managed by the AICore system service, which supports both ML Kit GenAI API and LiteRT-LM implementations.

Why selected: Android supports three AI deployment modes: purely on-device, hybrid, and pure cloud inference.

Featured · Video · #Android #AI #Gemini Nano #ML Kit #On-device AI · English
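
The hybrid mode described for this item (on-device first, cloud as fallback) follows a simple control flow, sketched here in Python for illustration only. Nothing in it is an Android or ML Kit API: `OnDeviceUnavailable`, `hybrid_generate`, `local_model`, and `cloud_model` are hypothetical stand-ins, and the length check is an arbitrary trigger for the fallback.

```python
from typing import Callable, Tuple


class OnDeviceUnavailable(Exception):
    """Hypothetical error raised when the local model cannot serve a request."""


def hybrid_generate(prompt: str,
                    on_device: Callable[[str], str],
                    cloud: Callable[[str], str]) -> Tuple[str, str]:
    """Try the on-device model first; fall back to the cloud on failure."""
    try:
        return ("on-device", on_device(prompt))
    except OnDeviceUnavailable:
        return ("cloud", cloud(prompt))


def local_model(prompt: str) -> str:
    """Stand-in local model that refuses prompts it considers too long."""
    if len(prompt) > 20:
        raise OnDeviceUnavailable("prompt too long for the local model")
    return f"local reply to: {prompt}"


def cloud_model(prompt: str) -> str:
    """Stand-in cloud model that accepts any prompt."""
    return f"cloud reply to: {prompt}"


print(hybrid_generate("hi", local_model, cloud_model))
```

Returning the route alongside the reply makes it easy to log how often requests leave the device, which is usually the first thing to measure in a hybrid setup.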
All the news from the Google I/O 2026 Developer keynote

Google Developers Blog · 818 words (about 4 min) · Score: 75

Google announced a transition from assistive AI to autonomous agents at I/O 2026, highlighting the Gemini 3.5 model series, upgraded Antigravity 2.0 agent-first platform, and new tools including Android CLI, Android Bench, and WebMCP to help developers build high-quality applications.

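Why selected: Google launched the Gemini 3.5 model series and upgraded the Antigravity 2.0 platform, which supports sub-agent orchestration with cross-platform terminal sandboxing, credential masking, and hardened Git policies.

Featured · Article · #Google I/O #AI Agent #Android #Gemini #Web Development · English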
Blazing fast on-device GenAI with LiteRT-LM

Google Developers Blog · 1,574 words (about 7 min) · Score: 75

Google AI Edge introduces LiteRT-LM, an optimized inference engine for deploying Gemma 4 models on edge devices, supporting Android, iOS, and web platforms with GPU inference reaching 76 tokens/sec and Multi-Token Prediction delivering up to 2.2x speedup.

Why selected: LiteRT-LM reaches a decode speed of 52 tokens/sec on Android GPU (OpenCL), 56 tokens/sec on iOS (Metal), and up to 76 tokens/sec with WebGPU on a MacBook Pro.

Featured · Article · #Google AI Edge #LiteRT-LM #Gemma 4 #Edge AI #On-device Inference · English
MTP drafters for Gemma 4 are available today under the same open-source Apache 2.0 license. Read the...

Google has released MTP drafters for Gemma 4 under the Apache 2.0 open-source license, available for download from Kaggle and Hugging Face.

Why selected: The MTP drafters for Gemma 4 are now released under the open-source Apache 2.0 license.

Featured · Tweet · #Gemma 4 #MTP drafters #open source · English
.@GoogleDeepMind's Gemma 4 - 12B is available on Ollama!  

Chat: 
ollama run gemma4:12b-mlx

Hermes...

@GoogleDeepMind's Gemma 4 - 12B is available on Ollama!

ollama (@ollama) · 104 words (about 1 min) · Score: 60

Ollama announces that the Gemma 4 12B model is now available on its platform. Users can run the model via MLX, and it works with tools such as Hermes Agent and Claude Code.

Why selected: Ollama announced that the Gemma 4 12B model is available on its platform.

Featured · Tweet · #ollama #Gemma 4 #MLX · Chinese
End-of-week call for community builds!

Have a project or demo that showcases Gemma 4 Multi-Token Pr...

Google AI Developers (@googleaidevs) · 163 words (about 1 min) · Score: 45

Google AI invites developers to showcase projects on Gemma 4 MTP.

Why selected: Google AI invites developers to share their Gemma 4 MTP projects.

Featured · Tweet · #Google AI #Developer Community · Chinese
Accelerating Gemma 4: faster inference with multi-token prediction drafters

The Keyword (blog.google) · 1,732 words (about 7 min) · Score: 45

The article briefly mentions that Gemma 4 uses multi-token prediction to accelerate inference but provides no technical details, experimental data, or implementation methods, making it a promotional lightweight announcement with little engineering value.

Why selected: Gemma 4 accelerates inference with multi-token prediction (MTP), with speedups of up to 3x.

Featured · Article · #Gemma #multi-token prediction #inference optimization #Google DeepMind · English
