WizardLM-2 7B

Parameters: 7B

Microsoft / WizardLM

A Mistral-based 7B model fine-tuned with Evol-Instruct. Handles complex instructions well.

Consumer GPU, Mac / Apple Silicon, CPU / VPS

Max Context: 33K tokens
Quant Variants: 2
Best Quality: GGUF Q4_K_M
Accuracy Retained: 96.9%

Quantization Variants

Per-variant VRAM use, perplexity (PPL) loss, and inference speed on an RTX 4090

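| Format | Level  | BPW  | VRAM   | PPL Loss | Speed     |
|--------|--------|------|--------|----------|-----------|
| GGUF   | Q4_K_M | 4.85 | 5.4 GB | 3.1%     | 152 tok/s |
| AWQ    | INT4   | 4    | 4.8 GB | 4.3%     | 218 tok/s |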
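
The figures in the table can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that the model has roughly 7.24 billion parameters (the commonly cited count for Mistral-7B-class models), and that "Accuracy Retained" is simply 100% minus the PPL loss, which matches the 96.9% and 3.1% shown above. The helper names `weight_size_gb` and `accuracy_retained` are hypothetical.

```python
# Rough cross-check of the table above. All constants marked "assumption"
# are NOT taken from the page.

N_PARAMS = 7.24e9  # assumption: approximate parameter count of a Mistral-based 7B model

# Figures copied from the Quantization Variants table.
VARIANTS = [
    {"format": "GGUF", "level": "Q4_K_M", "bpw": 4.85, "vram_gb": 5.4,
     "ppl_loss_pct": 3.1, "tok_s": 152},
    {"format": "AWQ", "level": "INT4", "bpw": 4.0, "vram_gb": 4.8,
     "ppl_loss_pct": 4.3, "tok_s": 218},
]


def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9


def accuracy_retained(ppl_loss_pct: float) -> float:
    """Assumption: accuracy retained is 100% minus the reported PPL loss."""
    return 100.0 - ppl_loss_pct


for v in VARIANTS:
    weights = weight_size_gb(N_PARAMS, v["bpw"])
    # Whatever is left of the reported VRAM is KV cache, activations,
    # and runtime overhead (not broken down on the page).
    remainder = v["vram_gb"] - weights
    print(
        f"{v['format']} {v['level']}: weights ~{weights:.2f} GB, "
        f"other VRAM ~{remainder:.2f} GB, "
        f"accuracy retained ~{accuracy_retained(v['ppl_loss_pct']):.1f}%"
    )

# The variant with the lowest PPL loss is the one listed as "Best Quality".
best = min(VARIANTS, key=lambda v: v["ppl_loss_pct"])
print("Lowest PPL loss:", best["format"], best["level"])
```

Under these assumptions the weights account for most of the reported VRAM in both rows, and the remainder is what the runtime needs on top of them. The trade-off in the table is then easy to read: AWQ INT4 is smaller and faster, while GGUF Q4_K_M loses less quality.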