OpenChat 3.6 8B

8B

OpenChat

Llama 3 8B fine-tuned with C-RLFT. Known for its natural conversational tone.

Consumer GPU, Mac / Apple Silicon, CPU / VPS

Max Context: 8K
Quant Variants: 2
Best Quality: EXL2 4.65bpw
Accuracy Retained: 97.6%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format   Level      BPW    VRAM     PPL Loss   Speed
GGUF     Q4_K_M     4.85   5.7 GB   3.1%       146 tok/s
EXL2     4.65bpw    4.65   5.4 GB   2.4%       228 tok/s
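
As a rough sanity check on the VRAM column, the weight footprint can be estimated from bits per weight (BPW). The sketch below is illustrative, not how the page's figures were measured. It assumes about 8.03 billion parameters for an 8B Llama-family model and reads GB as 10^9 bytes; `weight_size_gb` is a hypothetical helper. Whatever remains after the weights would plausibly be KV cache, activations, and runtime overhead.

```python
def weight_size_gb(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


# Assumption: approximate parameter count of an 8B Llama-family model.
N_PARAMS = 8.03e9

# BPW and VRAM values copied from the table above.
variants = {
    "GGUF Q4_K_M": {"bpw": 4.85, "vram_gb": 5.7},
    "EXL2 4.65bpw": {"bpw": 4.65, "vram_gb": 5.4},
}

for name, v in variants.items():
    weights = weight_size_gb(N_PARAMS, v["bpw"])
    # Remainder is not weights: KV cache, activations, runtime buffers.
    remainder = v["vram_gb"] - weights
    print(f"{name}: weights ~{weights:.2f} GB, remainder ~{remainder:.2f} GB")
```

Under these assumptions the weights account for most of the listed VRAM in both rows, with well under 1 GB left for everything else. That remainder grows with context length and batch size.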