
A practical guide to running Gemma 4 on your iPhone, Android phone, and Mac Mini. Which model, which quantization, which tool, and what performance to expect.
I downloaded all four Gemma 4 models onto my Mac Mini. Then I tried the 31B on my iPhone.
It crashed.
The phone got hot. The app froze. I wasted 40 minutes waiting for a download that was never going to work on 8 GB of RAM.
Gemma 4 is the first open model family that genuinely runs on phones. But "runs on phones" doesn't mean "every model runs on every phone." Pick the wrong variant and you'll burn an afternoon downloading a model that won't fit in memory.
This guide is the cheat sheet I wish I'd had. For every device — iPhone, Android, Mac Mini — I'll tell you exactly which Gemma 4 model to run, at what quantization, using which tool, and what speed to expect.
No guessing. No trial and error.
Google released Gemma 4 on April 2, 2026, with something the open-source community had been begging for: an Apache 2.0 license. No usage restrictions. No 700-million MAU limits. Just open.
The family has four members, and they're wildly different:
E2B — The pocket rocket. 2.3 billion effective parameters (5.1B total). Fits in 3.1 GB quantized. Runs on basically anything, including a Raspberry Pi.
E4B — The balanced pick. 4.5 billion effective (8B total). Needs 5 GB quantized. The sweet spot for flagship phones.
26B A4B — The magician. 26 billion total parameters, but only 4 billion active per token. Near-31B quality at half the compute.
31B — The heavy hitter. 31 billion dense parameters. Ranks #3 among all open models. Needs serious hardware.
All four are natively multimodal — text, images, video. E2B and E4B also handle audio. Context: 128K (small) to 256K (large).

You quantize models — compress weights from 16-bit floats to 4-bit or 8-bit integers. Here's what matters:
Q4_K_M — ~92% quality. ~75% smaller. Default for phones and lower-RAM Macs.
Q8_0 — ~99% quality. ~50% smaller than full precision. Noticeably better for reasoning.
BF16 — Full precision. Only for 64+ GB RAM machines.
One thing nobody tells you upfront: the KV cache adds 30-50% memory on top of model weights. Always budget 30% headroom.
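As a sanity check before downloading anything, here's a quick back-of-envelope budget. The file sizes are the quantized figures quoted in this guide, and the 1.3 multiplier is the ~30% headroom rule above — treat it as a rough sketch, not a precise measurement:

```bash
# Rule of thumb: budget the model's file size plus ~30% for KV cache and overhead.
# Sizes below are the approximate quantized file sizes quoted in this guide (GB).
for entry in "E2B-Q4:3.1" "E4B-Q4:5.0" "26B-A4B-Q4:16.9" "31B-Q4:18.3"; do
  name=${entry%%:*}
  size=${entry##*:}
  need=$(awk -v s="$size" 'BEGIN { printf "%.1f", s * 1.3 }')
  echo "$name: ${size} GB on disk -> budget ~${need} GB of RAM"
done
```

If the budgeted number is bigger than your device's free RAM, pick a smaller model or a lower quantization before you start the download.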
You'll run E2B or E4B. The 26B and 31B don't fit on any current iPhone.
iPhone 15/16 Pro (8 GB): E2B Q4 comfortably, or E4B Q4 if you want more quality and don't mind a tighter fit.
iPhone 14 Pro (6 GB): E2B Q4 only.
Google AI Edge Gallery — Google's official app. Download, select model, go. Easiest path.
LM Studio iOS (beta) — GGUF models with more control. Most flexible for power users.
LiteRT-LM — Google's production runtime. For building iOS apps with embedded Gemma 4.
Fair warning: After 3-5 minutes of sustained inference, your iPhone will thermal-throttle. Short bursts work great. Hour-long conversations — not ideal.
Android has a wider hardware range — more options and more confusion.
8 GB RAM — E2B Q4 comfortable. E4B Q4 tight.
12 GB RAM — E2B Q8 or E4B Q4. The sweet spot.
16 GB RAM — E4B Q8 runs great.
Google AI Edge Gallery — Fastest path. Play Store, pick model, go.
LiteRT-LM — Production runtime for Android apps.
Android AICore — System-level Gemma 4 as a shared service.
Termux + llama.cpp — Full control, command line.
```bash
# Termux needs a compiler and Python for the tooling below
pkg install clang cmake git python
pip install -U "huggingface_hub[cli]"

# Build llama.cpp
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build -j4

# Download the E2B Q4 model from Hugging Face into the current directory
MODEL=gemma-4-E2B-it-Q4_K_M.gguf
HF=unsloth/gemma-4-E2B-it-GGUF
huggingface-cli download $HF $MODEL --local-dir .

# Run (the default Termux build is CPU-only, so no GPU-offload flags are needed)
./build/bin/llama-cli -m $MODEL -c 4096
```
Best chipsets: Snapdragon 8 Gen 2+, Dimensity 9000+, Tensor G3/G4. On Snapdragon 8 Gen 3: E2B 20-35 tok/s, E4B 12-20 tok/s.
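Those numbers vary with chipset, RAM speed, and how warm the phone already is, so it's worth measuring your own device. llama.cpp ships a benchmark tool, built alongside `llama-cli` in the steps above, that reports prompt-processing and generation tok/s (the model filename is the one downloaded earlier):

```bash
# Benchmark the E2B Q4 model: 128-token prompt, 64 tokens of generation.
./build/bin/llama-bench -m gemma-4-E2B-it-Q4_K_M.gguf -p 128 -n 64
```

Run it twice — once cold and once after a few minutes of use — and you'll see the thermal throttling described below show up in the numbers.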
This is where Gemma 4 gets exciting.
A Mac Mini sits on your desk, draws less power than a light bulb, and serves Gemma 4 to every device on your network. 24/7, for about $15 a year in electricity.

Recommendation: Mac Mini M4 24 GB + 26B A4B Q4. Near-31B quality, runs comfortably. M4 Pro 48 GB if budget allows.
Why Mac Mini punches above its weight: unified memory. CPU and GPU share the same RAM pool. No copying data back and forth.
```bash
# Install (macOS -- the curl installer at ollama.com/install.sh is Linux-only)
brew install ollama
ollama --version # need v0.20.0+

# Pull a model (pick one for your RAM)
ollama pull gemma4:e2b # 8 GB Macs
ollama pull gemma4:e4b # 16 GB Macs
ollama pull gemma4:26b # 24 GB+ Macs
ollama pull gemma4:31b # 48 GB+ Macs

# Test
ollama run gemma4:26b "Hello, world"

# Listen on all interfaces
launchctl setenv OLLAMA_HOST 0.0.0.0
# Keep model in memory forever
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
brew services restart ollama
```
Now any device on your network can hit http://<mac-mini-ip>:11434 with the OpenAI-compatible API.
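A chat request from any machine on the LAN looks like this. The IP is a placeholder for your Mac Mini's actual address, and the final `curl` assumes the server configured above is up and reachable:

```bash
# Placeholder -- substitute your Mac Mini's LAN IP.
MAC_MINI_IP=192.168.1.50

# OpenAI-style chat payload for Ollama's /v1/chat/completions endpoint.
BODY='{"model":"gemma4:26b","messages":[{"role":"user","content":"Say hello in one sentence."}]}'
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send it (uncomment once the Ollama server above is running):
# curl -s "http://$MAC_MINI_IP:11434/v1/chat/completions" \
#   -H "Content-Type: application/json" -d "$BODY"
```

Because the endpoint is OpenAI-compatible, any client library that accepts a custom base URL can point at the Mac Mini instead of a cloud provider.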
Ollama's MLX backend (preview) is up to 93% faster on Apple Silicon. For maximum speed:
```bash
pip install rapid-mlx
rapid-mlx serve gemma-4-26b-a4b-it --quant 4bit
```
This hit 85 tok/s on the 26B A4B on M3 Ultra. ChatGPT-like speed, running on your desk.
Here's the table that would've saved me an afternoon:

| Device | RAM | Model | Quant | Size | Tool | tok/s |
|---|---|---|---|---|---|---|
| iPhone 14 Pro | 6 GB | E2B | Q4_K_M | 3.1 GB | AI Edge Gallery | 15-25 |
| iPhone 15/16 Pro | 8 GB | E2B | Q4_K_M | 3.1 GB | AI Edge Gallery | 20-35 |
| iPhone 15/16 Pro | 8 GB | E4B | Q4_K_M | 5.0 GB | LM Studio iOS | 12-20 |
| Android 8 GB | 8 GB | E2B | Q8_0 | 5.1 GB | AI Edge Gallery | 20-35 |
| Android 12 GB | 12 GB | E4B | Q4_K_M | 5.0 GB | AI Edge Gallery | 12-20 |
| Android 16 GB | 16 GB | E4B | Q8_0 | 8.2 GB | LiteRT-LM | 12-20 |
| Mac Mini M1 | 8 GB | E2B | Q8_0 | 5.1 GB | Ollama | 30-50 |
| Mac Mini M2 | 16 GB | E4B | Q8_0 | 8.2 GB | Ollama | 40-60 |
| Mac Mini M4 | 24 GB | 26B A4B | Q4_K_M | 16.9 GB | Ollama/MLX | 15-25 |
| Mac Mini M4 Pro | 48 GB | 31B | Q4_K_M | 18.3 GB | Ollama/MLX | 20-35 |
For most people: Mac Mini M4 24 GB + 26B A4B Q4. Near-GPT-4-class quality, $15/year in power.
KV cache will blindside you. At 128K context, the 31B needs 30 GB total — not 18.3 GB. Check total memory with context, not just file size.
Phones thermal-throttle after 3-5 minutes. Use phones for quick queries, not extended sessions.
MLX vs. llama.cpp: pick one. MLX is 20-30% faster on Apple Silicon. llama.cpp for cross-platform. Don't install both.
Q4 for chat, Q8 for code. Quantization hurts reasoning more than conversation.
MediaPipe is deprecated. Use LiteRT-LM instead. Same Google team, better runtime.
Ollama needs v0.20.0+ for Gemma 4. Check with ollama --version first.
E2B for your phone. 26B A4B for your Mac Mini. That's the 80/20 answer.
The E2B fits on any modern smartphone, runs fully offline, 20-35 tokens per second. The 26B A4B delivers near-flagship quality on a $600 Mac Mini that costs $15 a year to run.
You don't need a $3,000 GPU rig. You don't need a cloud subscription. A phone and a Mac Mini running Apache 2.0 Gemma 4 — that's a genuinely useful local AI setup in 2026.
