Mistral Nemo 12B Instruct

12B

Mistral AI

Built by Mistral AI in collaboration with NVIDIA. 128K-token context window (131,072 tokens) and strong multilingual support.

15.1K HF downloads · 117 likes · bartowski/Mistral-Nemo-Instruct-2407-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon

Max Context: 131K
Quant Variants: 3
Best Quality: GGUF Q6_K
Accuracy Retained: 99.1%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format    Level     BPW     VRAM       PPL Loss    Speed
GGUF      Q4_K_M    4.85    8.5 GB     3.1%        112 tok/s
GGUF      Q6_K      6.56    11.0 GB    0.9%        95 tok/s
AWQ       INT4      4       7.8 GB     4.4%        148 tok/s
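
The BPW column can be roughly sanity-checked against the VRAM column: weight size is about parameters × bits per weight ÷ 8. The sketch below is a rough estimate, not the page's own method. It assumes exactly 12e9 parameters (read off the "12B" label), decimal gigabytes, and that the gap between the reported VRAM and the estimated weights is everything else the page counts. The page does not say what that includes; KV cache and runtime buffers are a plausible guess. The names `QUANTS`, `PARAMS`, `weight_gb`, and `report` are illustrative only.

```python
# Rows copied from the table above:
# (format, level, bits per weight, reported VRAM in GB).
QUANTS = [
    ("GGUF", "Q4_K_M", 4.85, 8.5),
    ("GGUF", "Q6_K", 6.56, 11.0),
    ("AWQ", "INT4", 4.0, 7.8),
]

# Assumption: exactly 12e9 parameters, taken from the "12B" label.
PARAMS = 12e9


def weight_gb(bpw: float, params: float = PARAMS) -> float:
    """Approximate size of the quantized weights in decimal GB."""
    return params * bpw / 8 / 1e9


def report() -> list[tuple[str, float, float]]:
    """Return (name, estimated weight GB, reported VRAM minus weights) per row."""
    rows = []
    for fmt, level, bpw, vram in QUANTS:
        weights = weight_gb(bpw)
        rows.append((f"{fmt} {level}", weights, vram - weights))
    return rows


if __name__ == "__main__":
    for name, weights, rest in report():
        print(f"{name:12s} weights ~{weights:.2f} GB, remainder ~{rest:.2f} GB")
```

Under these assumptions the estimated weights come out below the reported VRAM for all three variants, which is consistent with the VRAM column covering more than the weights alone.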