Enum FlashAttentionPrecision
Namespace: AiDotNet.NeuralNetworks.Attention
Assembly: AiDotNet.dll
Precision modes for Flash Attention computation.
public enum FlashAttentionPrecision
Fields
Float16 = 0
Use 16-bit floating point (half precision). Fastest, but may have numerical issues with very long sequences.

Float32 = 1
Use 32-bit floating point (single precision). Good balance of speed and accuracy.

Mixed = 2
Use mixed precision (FP16 for matrix multiplications, FP32 for softmax). Best combination of speed and numerical stability.
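
Examples

A minimal sketch of selecting a precision mode at runtime. The ChoosePrecision helper and the 2048-token threshold are illustrative assumptions, not part of the AiDotNet API; only the FlashAttentionPrecision enum values come from this page.

using AiDotNet.NeuralNetworks.Attention;

class PrecisionSelectionExample
{
    // Hypothetical helper: pick a precision mode from the sequence length.
    static FlashAttentionPrecision ChoosePrecision(int sequenceLength)
    {
        // Half precision is fastest, but very long sequences can accumulate
        // rounding error, so fall back to mixed precision (FP16 matmul,
        // FP32 softmax) beyond an illustrative threshold of 2048 tokens.
        return sequenceLength <= 2048
            ? FlashAttentionPrecision.Float16
            : FlashAttentionPrecision.Mixed;
    }

    static void Main()
    {
        var precision = ChoosePrecision(sequenceLength: 8192);
        System.Console.WriteLine($"Selected precision: {precision}"); // Mixed
    }
}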