Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX — Adrien Grondin, Locally AI

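- Achieves high-performance inference at 40 tokens/s.
- Covers the technical advantages and implementation details of the MLX framework.
- Offers new ideas for building on-device AI applications for mobile.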

AI Engineer • 411K subscribers
4,211 views • Apr 20, 2026 • 110 likes
See more: https://x.com/adrgrondin/status/20405... Speaker info:…

3 Comments
[@rayr268](http://www.youtube.com/@rayr268)
This gives me so many ideas. Already have a good project in mind.
[@cedricmanouan2333](http://www.youtube.com/@cedricmanouan2333)
Insightful presentation!!!
[@jam_daniels](http://www.youtube.com/@jam_daniels)
great app!