Phi-3.5 Mini Instruct

3.8B

Microsoft Phi

Microsoft's tiny powerhouse: a 3.8B-parameter instruct model and a strong pick in the 4B class for on-device deployment.

⬇ 68.1K HF downloads · ♥ 81 likes · bartowski/Phi-3.5-mini-instruct-GGUF · stats from 6/24/2026

Consumer GPU · Mac / Apple Silicon · CPU / VPS

131K

Max Context

Quant Variants

GGUF Q8_0

Best Quality

99.8%

Accuracy Retained

Quantization Variants

Per-quant VRAM, perplexity (PPL) loss, and inference speed on an RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	2.8 GB	3.8%	298 tok/s
GGUF	Q8_0	8.5	4.2 GB	0.2%	255 tok/s
AWQ	INT4	4	2.5 GB	5.1%	385 tok/s
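
As a rough cross-check of the BPW and VRAM columns, the size of the quantized weights alone can be estimated as parameters × bits per weight ÷ 8. The sketch below applies that to the three rows, assuming the 3.8B parameter count from the header and decimal gigabytes (10^9 bytes). The function and variable names are illustrative, and reading the gap to the listed VRAM as runtime overhead (KV cache, buffers) is an assumption, not something this page states.

```python
# Hypothetical helper names; figures are copied from the table above.
PARAMS_BILLION = 3.8  # assumed from the "3.8B" header

# (label, bits per weight, VRAM listed in the table in GB)
TABLE_ROWS = [
    ("GGUF Q4_K_M", 4.85, 2.8),
    ("GGUF Q8_0", 8.5, 4.2),
    ("AWQ INT4", 4.0, 2.5),
]


def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def estimate_rows(params_billion: float = PARAMS_BILLION):
    """Return (label, weights_gb, listed_vram_gb, gap_gb) for each table row."""
    results = []
    for label, bpw, listed_vram in TABLE_ROWS:
        weights = weight_size_gb(params_billion, bpw)
        # Gap between listed VRAM and weights; assumed to be runtime overhead.
        results.append((label, weights, listed_vram, listed_vram - weights))
    return results


if __name__ == "__main__":
    for label, weights, listed, gap in estimate_rows():
        print(f"{label}: weights ~{weights:.2f} GB, listed {listed} GB, gap ~{gap:.2f} GB")
```

Under these assumptions the weights alone come to about 2.3 GB, 4.0 GB, and 1.9 GB respectively, each below the listed VRAM figure.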