Enum QLoRAAdapter<T>.QuantizationType
Specifies the type of 4-bit quantization to use for base layer weights.
public enum QLoRAAdapter<T>.QuantizationType
Fields
INT4 = 0
4-bit integer quantization with uniform spacing (-8 to 7).
Simple linear quantization mapping 16 values uniformly across the range. Fast and straightforward, but not optimal for normally distributed weights.
NF4 = 1
4-bit Normal Float quantization optimized for normally distributed weights.
Uses information-theoretically optimal quantization levels for normal distributions. Provides better accuracy for typical neural network weights at the same bit width. This is the recommended and default quantization type for QLoRA.
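To make the difference between the two fields concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not this library's implementation. The helper names, the absmax scaling, and the single-block layout are all assumptions. The NF4 level table is the one published with the QLoRA reference implementation; whether QLoRAAdapter<T> uses exactly these values is also an assumption.

```python
import numpy as np

# NF4 levels as published with the QLoRA reference implementation
# (assumed here; not confirmed for this library).
NF4_LEVELS = np.array([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
])


def quantize_int4(w):
    """Uniform quantization to integer codes clipped to [-8, 7]."""
    absmax = np.max(np.abs(w))
    # Assumption: symmetric scaling so that absmax maps to code 7.
    scale = absmax / 7.0 if absmax > 0 else 1.0
    codes = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return codes, scale


def dequantize_int4(codes, scale):
    return codes.astype(np.float64) * scale


def quantize_nf4(w):
    """Map each weight to the index of the nearest NF4 level."""
    absmax = np.max(np.abs(w))
    scale = absmax if absmax > 0 else 1.0
    normalized = w / scale  # now in [-1, 1]
    distances = np.abs(normalized[:, None] - NF4_LEVELS[None, :])
    codes = np.argmin(distances, axis=1).astype(np.uint8)
    return codes, scale


def dequantize_nf4(codes, scale):
    return NF4_LEVELS[codes] * scale


# Synthetic, normally distributed "weights" treated as one block.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=4096)

int4_codes, int4_scale = quantize_int4(weights)
nf4_codes, nf4_scale = quantize_nf4(weights)

int4_mse = np.mean((weights - dequantize_int4(int4_codes, int4_scale)) ** 2)
nf4_mse = np.mean((weights - dequantize_nf4(nf4_codes, nf4_scale)) ** 2)

print(f"INT4 mean squared error: {int4_mse:.3e}")
print(f"NF4  mean squared error: {nf4_mse:.3e}")
```

Both schemes store one 4-bit code per weight plus a scale. The NF4 table packs its levels more densely near zero, where most normally distributed weights fall, which is why it tends to reconstruct such weights with lower error than uniform spacing.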
Remarks
For Beginners: This determines how numbers are compressed from full precision down to 4 bits. Think of it like choosing between different image compression algorithms: each has trade-offs.