Qwen2.5-Coder 7B Instruct

7B

Alibaba Qwen2.5

A leading 7B coding model, well suited to local dev assistants on GPUs with 8–16 GB of VRAM.

150.9K HF downloads · 289 likes · Qwen/Qwen2.5-Coder-7B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 131K

Quant Variants: 3

Best Quality: EXL2 4.65bpw

Accuracy Retained: 98.0%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

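| Format | Level | BPW | VRAM | PPL Loss | Speed |
|--------|---------|------|--------|----------|-----------|
| GGUF | Q4_K_M | 4.85 | 5.4 GB | 2.8% | 158 tok/s |
| AWQ | INT4 | 4 | 4.8 GB | 4.0% | 225 tok/s |
| EXL2 | 4.65bpw | 4.65 | 5.2 GB | 2.0% | 248 tok/s |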
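
As a rough illustration of how these figures might be used, the sketch below picks the lowest-loss variant that fits a given VRAM budget. The numbers are copied from the table above. The `QuantVariant` class, the `pick_variant` helper, and the 1 GB headroom for KV cache and activations are hypothetical and not part of this page. The "accuracy retained" property assumes the headline 98.0% is simply 100% minus the PPL loss of the best variant, which matches the table but is not stated by the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuantVariant:
    """One row of the quantization table (values as listed for an RTX 4090)."""
    fmt: str
    level: str
    bpw: float
    vram_gb: float
    ppl_loss_pct: float
    speed_tok_s: float

    @property
    def accuracy_retained_pct(self) -> float:
        # Assumption: "accuracy retained" = 100% minus perplexity loss.
        return round(100.0 - self.ppl_loss_pct, 1)


VARIANTS = [
    QuantVariant("GGUF", "Q4_K_M", 4.85, 5.4, 2.8, 158.0),
    QuantVariant("AWQ", "INT4", 4.0, 4.8, 4.0, 225.0),
    QuantVariant("EXL2", "4.65bpw", 4.65, 5.2, 2.0, 248.0),
]


def pick_variant(vram_budget_gb: float,
                 headroom_gb: float = 1.0) -> Optional[QuantVariant]:
    """Return the lowest-PPL-loss variant that fits the budget, or None.

    headroom_gb is a hypothetical allowance for KV cache and activations;
    real usage grows with context length, so tune it for your workload.
    Ties on PPL loss are broken in favor of higher speed.
    """
    fitting = [v for v in VARIANTS if v.vram_gb + headroom_gb <= vram_budget_gb]
    if not fitting:
        return None
    return min(fitting, key=lambda v: (v.ppl_loss_pct, -v.speed_tok_s))


if __name__ == "__main__":
    for budget in (5.0, 6.0, 8.0):
        choice = pick_variant(budget)
        label = f"{choice.fmt} {choice.level}" if choice else "no variant fits"
        print(f"{budget:.1f} GB budget: {label}")
```

With the default headroom, only AWQ INT4 fits a 6 GB budget, while an 8 GB budget allows EXL2 4.65bpw, which has both the lowest PPL loss and the highest listed speed.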