Phi-3.5 Mini Instruct

3.8B

Microsoft Phi

Microsoft's compact 3.8B-parameter instruct model, aimed at on-device deployment in the 4B class.

68.1K HF downloads · 81 likes · bartowski/Phi-3.5-mini-instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 131K
Quant Variants: 3
Best Quality: GGUF Q8_0
Accuracy Retained: 99.8%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level    BPW    VRAM     PPL Loss  Speed
GGUF    Q4_K_M   4.85   2.8 GB   3.8%      298 tok/s
GGUF    Q8_0     8.5    4.2 GB   0.2%      255 tok/s
AWQ     INT4     4      2.5 GB   5.1%      385 tok/s
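
As a rough sanity check on the table above, the BPW (bits per weight) column can be turned into an approximate weight size and compared with the listed VRAM. The sketch below is illustrative only: it assumes the rounded 3.8B parameter count from the page header, treats 1 GB as 1e9 bytes, and uses a hypothetical helper named `weight_gb`. The page does not say how VRAM was measured, so the remainder is presumably KV cache, activations, and runtime buffers, but that is an assumption.

```python
PARAMS = 3.8e9  # rounded parameter count from the page header (3.8B)

# (format, level, bits per weight, listed VRAM in GB), copied from the table
VARIANTS = [
    ("GGUF", "Q4_K_M", 4.85, 2.8),
    ("GGUF", "Q8_0", 8.5, 4.2),
    ("AWQ", "INT4", 4.0, 2.5),
]


def weight_gb(params: float, bpw: float) -> float:
    """Approximate weight storage in GB, assuming 1 GB = 1e9 bytes."""
    return params * bpw / 8 / 1e9


for fmt, level, bpw, listed_gb in VARIANTS:
    est = weight_gb(PARAMS, bpw)
    # The remainder is whatever the listed VRAM includes beyond raw weights.
    print(
        f"{fmt} {level}: weights ~{est:.2f} GB, "
        f"listed {listed_gb} GB, remainder ~{listed_gb - est:.2f} GB"
    )
```

Under these assumptions each estimate comes out below the listed VRAM figure, which is consistent with the listed numbers including some runtime overhead on top of the weights. Because the parameter count is rounded, the estimates are approximate.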