Qwen2.5-Coder 32B Instruct

32B

Alibaba Qwen2.5

Top-tier open coding model. Competitive with GPT-4o on HumanEval at the 32B scale.

⬇ 39.6K HF downloads · ♥ 111 likes · bartowski/Qwen2.5-Coder-32B-Instruct-GGUF · stats from 6/24/2026

Consumer GPU · Pro GPU

Max Context: 131K

Quant Variants

Best Quality: GGUF Q4_K_M

Accuracy Retained: 97.5%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	22.0 GB	2.5%	44 tok/s
EXL2	3.5bpw	3.5	16.4 GB	4.5%	65 tok/s
AWQ	INT4	4	19.5 GB	3.5%	52 tok/s
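
The table's BPW and VRAM columns can be sanity-checked with simple arithmetic: weight memory is roughly parameters × bits per weight ÷ 8. The sketch below is a rough illustration, not the page's actual calculator. It assumes a nominal 32e9 parameters (taken from the "32B" label, not an exact count) and decimal gigabytes. The names `Quant`, `weight_gb`, and `fits` are hypothetical helpers introduced here. The gap between the estimated weight size and the listed VRAM presumably covers KV cache and runtime buffers, though the page does not say how its VRAM figures were derived.

```python
from dataclasses import dataclass

# Assumption: nominal parameter count from the "32B" label, not an exact figure.
NOMINAL_PARAMS = 32e9


@dataclass(frozen=True)
class Quant:
    fmt: str
    level: str
    bpw: float           # bits per weight, from the table
    vram_gb: float       # listed VRAM, from the table
    ppl_loss_pct: float  # listed perplexity loss, from the table
    tok_per_s: float     # listed speed on RTX 4090, from the table


QUANTS = [
    Quant("GGUF", "Q4_K_M", 4.85, 22.0, 2.5, 44),
    Quant("EXL2", "3.5bpw", 3.5, 16.4, 4.5, 65),
    Quant("AWQ", "INT4", 4.0, 19.5, 3.5, 52),
]


def weight_gb(bpw: float, params: float = NOMINAL_PARAMS) -> float:
    """Estimate weight storage in decimal GB: params * bits / 8 bits-per-byte."""
    return params * bpw / 8 / 1e9


def fits(q: Quant, vram_budget_gb: float) -> bool:
    """Check the listed VRAM figure against a card's memory budget."""
    return q.vram_gb <= vram_budget_gb


# The variant with the lowest perplexity loss is the "best quality" pick.
best = min(QUANTS, key=lambda q: q.ppl_loss_pct)

if __name__ == "__main__":
    for q in QUANTS:
        est = weight_gb(q.bpw)
        print(f"{q.fmt} {q.level}: weights ~{est:.1f} GB, "
              f"listed {q.vram_gb} GB, other ~{q.vram_gb - est:.1f} GB, "
              f"fits 24 GB card: {fits(q, 24.0)}")
    print(f"Best quality: {best.fmt} {best.level}, "
          f"retained {100 - best.ppl_loss_pct}%")
```

Under these assumptions the Q4_K_M weights alone come to about 19.4 GB, so the listed 22.0 GB leaves only a few gigabytes for everything else on a 24 GB RTX 4090. In practice that limits how much of the 131K context is usable at that quant level.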