Qwen2.5-Coder 7B Instruct

7B

Alibaba Qwen2.5

Best 7B coding model. Ideal for local dev assistants on 8–16GB VRAM.

150.9K HF downloads · 289 likes · Qwen/Qwen2.5-Coder-7B-Instruct-GGUF · stats from 6/24/2026

Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 131K tokens

Quant Variants: 3

Best Quality: EXL2 4.65bpw

Accuracy Retained: 98.0%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	5.4 GB	2.8%	158 tok/s
AWQ	INT4	4	4.8 GB	4.0%	225 tok/s
EXL2	4.65bpw	4.65	5.2 GB	2.0%	248 tok/s
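
To see how the BPW and VRAM columns relate, the rough sketch below estimates the weight-only footprint of each variant as parameters × bits per weight ÷ 8 and treats whatever remains of the listed VRAM as everything else (KV cache, activations, runtime buffers). The parameter count of about 7.6 billion is an assumption that is not stated on this page, and the helper names `weight_gb` and `summarize` are hypothetical. The sketch also assumes that "Accuracy Retained" is simply 100% minus PPL loss, which matches the 98.0% shown for EXL2 but is not confirmed by the source.

```python
# Rows copied from the table above:
# (format, level, bits per weight, listed VRAM in GB, PPL loss in %, tok/s)
VARIANTS = [
    ("GGUF", "Q4_K_M", 4.85, 5.4, 2.8, 158),
    ("AWQ", "INT4", 4.0, 4.8, 4.0, 225),
    ("EXL2", "4.65bpw", 4.65, 5.2, 2.0, 248),
]

# Assumption: roughly 7.6e9 parameters; this figure is not given on the page.
PARAM_COUNT = 7.6e9


def weight_gb(bpw: float, params: float = PARAM_COUNT) -> float:
    """Weight-only footprint in GB (10^9 bytes): params * bits / 8."""
    return params * bpw / 8 / 1e9


def summarize(variants=VARIANTS):
    """Return one dict per variant with derived, rounded figures."""
    rows = []
    for fmt, level, bpw, vram, ppl_loss, speed in variants:
        weights = weight_gb(bpw)
        rows.append({
            "name": f"{fmt} {level}",
            "weights_gb": round(weights, 2),
            # Listed VRAM minus estimated weights: cache, activations, buffers.
            "other_gb": round(vram - weights, 2),
            # Assumed convention: retained = 100% - PPL loss.
            "retained_pct": round(100.0 - ppl_loss, 1),
            "tok_s": speed,
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
    best = min(VARIANTS, key=lambda v: v[4])  # lowest PPL loss
    print("lowest PPL loss:", best[0], best[1])
```

Under these assumptions the weights alone account for roughly 3.8 to 4.6 GB, so the listed VRAM figures appear to include about 0.8 to 1.0 GB of non-weight overhead. That share grows with longer contexts, which is worth keeping in mind when sizing a GPU.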