Enum FlashAttentionPrecision

Namespace: AiDotNet.NeuralNetworks.Attention
Assembly: AiDotNet.dll

Precision modes for Flash Attention computation.

public enum FlashAttentionPrecision
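
For example, a caller would pick one of the modes below and pass it to an attention component. Note the FlashAttentionLayer type and its constructor in this sketch are hypothetical placeholders for illustration; only the enum itself is documented here.

using AiDotNet.NeuralNetworks.Attention;

// Choose a speed/stability trade-off for the attention computation.
var precision = FlashAttentionPrecision.Mixed;

// Hypothetical layer; substitute the library's actual attention type.
// var attention = new FlashAttentionLayer(headCount: 8, precision: precision);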

Fields

Float16 = 0

Use 16-bit floating point (half precision). Fastest, but FP16's narrow range (largest finite value is about 65504) and reduced precision can cause overflow or accumulation error on very long sequences.
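
A minimal .NET sketch of the overflow risk (assumes System.Half, available since .NET 5):

using System;

// Accumulate in FP16: each partial sum is rounded back to Half.
Half sum = (Half)60000f;
sum = (Half)((float)sum + 10000f);        // exceeds Half.MaxValue (~65504)
Console.WriteLine(Half.IsInfinity(sum));  // True: the running sum overflowed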

Float32 = 1

Use 32-bit floating point (single precision). Good balance of speed and accuracy.

Mixed = 2

Use mixed precision (FP16 for matmul, FP32 for softmax). Best combination of speed and numerical stability.
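
A minimal sketch of the idea, not AiDotNet's implementation: dot-product scores are rounded through Half to mimic FP16 matmul output, while the softmax itself runs in FP32.

using System;
using System.Linq;

static float[] MixedPrecisionAttentionRow(Half[] query, Half[][] keys)
{
    // "Matmul" in FP16: each score is rounded back to half precision.
    float[] scores = keys
        .Select(k => (float)(Half)Enumerable.Range(0, query.Length)
            .Sum(i => (float)query[i] * (float)k[i]))
        .ToArray();

    // FP32 softmax, stabilized by subtracting the row max before exp.
    float max = scores.Max();
    float[] exps = scores.Select(s => MathF.Exp(s - max)).ToArray();
    float denom = exps.Sum();
    return exps.Select(e => e / denom).ToArray();
}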