Qwen2.5-Coder 32B Instruct

32B

Alibaba Qwen2.5

Top-tier open coding model. HumanEval results are competitive with GPT-4o at the 32B scale.

39.6K HF downloads · 111 likes · bartowski/Qwen2.5-Coder-32B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Pro GPU

Max Context: 131K
Quant Variants: 3
Best Quality: GGUF Q4_K_M
Accuracy Retained: 97.5%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level    BPW   VRAM     PPL Loss  Speed
GGUF    Q4_K_M   4.85  22.0 GB  2.5%      44 tok/s
EXL2    3.5bpw   3.5   16.4 GB  4.5%      65 tok/s
AWQ     INT4     4     19.5 GB  3.5%      52 tok/s
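
The trade-offs in the table can be explored with a short script. The sketch below is illustrative only. It copies the three rows as listed and rests on three assumptions that are not stated on this page: a nominal 32e9 parameter count (taken from the "32B" label), a 24 GB VRAM budget for the RTX 4090, and the reading that "Accuracy Retained" is simply 100% minus the listed PPL loss (which matches 97.5% for Q4_K_M). The `Quant` class and the helper functions are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quant:
    fmt: str
    level: str
    bpw: float           # bits per weight, as listed
    vram_gb: float       # listed VRAM on RTX 4090
    ppl_loss_pct: float  # listed perplexity loss
    tok_per_s: float     # listed generation speed


# Rows copied from the table above.
VARIANTS = [
    Quant("GGUF", "Q4_K_M", 4.85, 22.0, 2.5, 44),
    Quant("EXL2", "3.5bpw", 3.5, 16.4, 4.5, 65),
    Quant("AWQ", "INT4", 4.0, 19.5, 3.5, 52),
]

NOMINAL_PARAMS = 32e9  # assumption: nominal count from the "32B" label
GPU_VRAM_GB = 24.0     # assumption: RTX 4090 memory budget


def weights_only_gb(q: Quant, params: float = NOMINAL_PARAMS) -> float:
    """Rough size of the weights alone: params * bits / 8, in decimal GB."""
    return params * q.bpw / 8 / 1e9


def accuracy_retained_pct(q: Quant) -> float:
    """Assumed definition: 100% minus the listed PPL loss."""
    return 100.0 - q.ppl_loss_pct


def fits(q: Quant, budget_gb: float = GPU_VRAM_GB) -> bool:
    """True if the listed VRAM figure is within the budget."""
    return q.vram_gb <= budget_gb


best_quality = min(VARIANTS, key=lambda q: q.ppl_loss_pct)
fastest = max(VARIANTS, key=lambda q: q.tok_per_s)

for q in VARIANTS:
    print(
        f"{q.fmt:5} {q.level:7} "
        f"weights~{weights_only_gb(q):5.1f} GB  "
        f"listed {q.vram_gb:4.1f} GB  "
        f"headroom {GPU_VRAM_GB - q.vram_gb:4.1f} GB  "
        f"retained {accuracy_retained_pct(q):.1f}%"
    )
print("best quality:", best_quality.fmt, best_quality.level)
print("fastest:", fastest.fmt, fastest.level)
```

The gap between the weights-only estimate and the listed VRAM figure presumably covers KV cache, activations, and runtime overhead, and that gap grows with context length. A variant that fits at short context may therefore not fit near the 131K maximum.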