Quick Start
Run your first local AI model in under 5 minutes. No account. No API key. No internet after setup.
Step 1 - Download Off Grid
iOS: Download on the App Store - requires iPhone 12 or newer (4GB+ RAM)
Android: Get it on Google Play - requires Android 10 or newer (4GB+ RAM)
Or grab the latest APK directly from GitHub Releases.
Step 2 - Pick a model
When you open the app, you’ll see the model picker. If you’re unsure, start here:
| You want | Start with | Size |
|---|---|---|
| Fast chat, 3–4GB RAM | Qwen 3.5 0.8B | ~0.8GB |
| Best for most phones | Qwen 3.5 2B | ~1.7GB |
| Best quality (8GB RAM) | Qwen 3.5 9B | ~5.5GB |
| Vision + reasoning | Gemma 4 E2B | ~1.5GB |
| Image generation | SD 1.5 Palettized (iOS) / Absolute Reality (Android) | ~1GB |
Not sure? Pick Qwen 3.5 2B. It fits comfortably in 4GB RAM, supports 262K context, and is the best starting point for most phones.
Step 3 - Download and run
Tap a model → Download. This is the only time you need internet; the model file is saved to your device's storage.
Once downloaded, tap Load - the model loads into RAM. On first load this takes 5–15 seconds depending on model size.
Type your first message. You’re now running AI locally.
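The 5–15 second first-load figure is mostly the time it takes to stream the weight file from flash storage into RAM. A back-of-envelope estimate, assuming a mid-range storage read speed (the 0.5 GB/s figure is an assumption, not a measured value):

```python
def est_load_seconds(model_gb: float, read_gb_per_s: float) -> float:
    # First load is roughly bounded by storage read throughput;
    # runtime setup adds a few extra seconds on top of this.
    return model_gb / read_gb_per_s

# A ~1.7GB model at an assumed 0.5 GB/s UFS read speed:
print(round(est_load_seconds(1.7, 0.5), 1))  # → 3.4
```

Larger models and slower storage push you toward the top of that 5–15 second range.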
Step 4 - Go offline (optional)
Put your phone in airplane mode. Everything still works.
What’s next
- Which model should I use? - full comparison table by device and use case
- Connect your home Ollama server - use bigger models from your desktop via LAN
- Run Stable Diffusion on Android - generate images completely on-device
Community
Stuck, or want to share what you’re building? Join the Slack community.
The app is open source - view it on GitHub.