Beginner · Mac / Apple · 7 min read
M1 / M2 Mac 8GB: Realistic Ollama Limits
Unified memory is shared with macOS — here is what actually works on base MacBooks without swapping.
M1 · M2 · 8GB RAM · Ollama · Metal
Memory budget
macOS and open apps typically use 3–4GB, which leaves roughly 4GB for the model and its KV cache on an 8GB Mac. Stick to 3B models at Q4, or 7B models at Q2/Q3 with a short context window. Close browsers before loading a 7B model.
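As a rough check on that budget, the sketch below estimates a model's footprint as quantized weights plus KV cache. The function names, the 4.5 bits-per-weight figure for Q4, and the per-token KV cache sizes (112 KiB and 128 KiB) are assumptions for illustration, not measurements. Runtime buffers are not counted, so treat the result as a lower bound.

```shell
# Rough footprint of a quantized model in GB (decimal): weights + KV cache.
# Usage: estimate_model_gb PARAMS_BILLIONS BITS_PER_WEIGHT KV_KIB_PER_TOKEN CONTEXT_TOKENS
# All inputs are caller-supplied assumptions; nothing is read from Ollama.
estimate_model_gb() {
  awk -v p="$1" -v b="$2" -v kv="$3" -v ctx="$4" 'BEGIN {
    weights_gb = p * b / 8            # billions of params * bits / 8 = GB
    kv_gb = kv * 1024 * ctx / 1e9     # KiB per token -> bytes -> GB
    printf "%.2f\n", weights_gb + kv_gb
  }'
}

# Succeeds (exit 0) when the estimate fits inside the budget, both in GB.
# Usage: fits_budget NEEDED_GB BUDGET_GB
fits_budget() {
  awk -v need="$1" -v budget="$2" 'BEGIN { exit !(need <= budget) }'
}

# Illustrative inputs (assumed, not measured):
#   estimate_model_gb 3 4.5 112 2048   # prints 1.92
#   estimate_model_gb 8 4.5 128 2048   # prints 4.77
```

On these assumptions a 3B model at Q4 leaves headroom inside the ~4GB budget, while an 8B model at Q4 overshoots it before any runtime overhead is added.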
text
M1 8GB safe picks:
llama3.2:3b → smooth chat
qwen2.5:3b → good Chinese
phi3.5:mini → fast responses
Avoid on 8GB:
llama3.1:8b @ Q4 → swap thrashing
any 14B+ model
Ollama settings
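On 8GB machines, set OLLAMA_NUM_PARALLEL=1 so the server handles one request at a time, set OLLAMA_MAX_LOADED_MODELS=1 so only one model stays in memory, and keep the context window at 2048 tokens. These variables are read by the Ollama server, so set them before it starts. Watch the Memory Pressure graph in Activity Monitor while the model is loaded.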
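One way to keep the context at 2048 is to bake it into a model variant with a Modelfile, and one way to spot swapping from the terminal is to read `sysctl vm.swapusage`. The sketch below is a minimal version of both. The helper names and the variant name `llama3.2-3b-2k` are hypothetical, and the swap parser assumes output shaped like `total = 2048.00M  used = 512.25M  free = 1535.75M`.

```shell
# Print a Modelfile that pins the context window for a base model.
# Usage: make_modelfile BASE_MODEL NUM_CTX
make_modelfile() {
  printf 'FROM %s\nPARAMETER num_ctx %s\n' "$1" "$2"
}

# Create a named variant of BASE_MODEL with a fixed context (default 2048).
# Requires the ollama CLI and a running Ollama server.
# Usage: build_small_ctx_variant BASE_MODEL VARIANT_NAME [NUM_CTX]
build_small_ctx_variant() {
  base="$1"; name="$2"; ctx="${3:-2048}"
  modelfile="$(mktemp)" || return 1
  make_modelfile "$base" "$ctx" > "$modelfile"
  ollama create "$name" -f "$modelfile"
  status=$?
  rm -f "$modelfile"
  return "$status"
}

# Read a `sysctl -n vm.swapusage` line on stdin and print the "used" value in MB.
# Assumes the figure is reported with an M suffix, as in the sample above.
swap_used_mb() {
  sed -n 's/.*used = \([0-9.]*\)M.*/\1/p'
}
```

For example, `build_small_ctx_variant llama3.2:3b llama3.2-3b-2k` followed by `ollama run llama3.2-3b-2k` runs the 3B model with the smaller context, and `sysctl -n vm.swapusage | swap_used_mb` prints the swap currently in use. A figure that keeps climbing while the model generates suggests the model or context is too large for the machine.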
bash
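# Serve one request at a time and keep a single model in memory.
export OLLAMA_NUM_PARALLEL=1
export OLLAMA_MAX_LOADED_MODELS=1
# Download the 3B model, then start an interactive chat.
ollama pull llama3.2:3b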
ollama run llama3.2:3b
Deployment guides are educational. Each model is subject to its own license; read the official Hugging Face model card before downloading or deploying.