Class DefaultLoRAConfiguration<T>
Default LoRA configuration that applies LoRA to all layers with trainable weight matrices.
public class DefaultLoRAConfiguration<T> : ILoRAConfiguration<T>
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
object → DefaultLoRAConfiguration<T>
- Implements
ILoRAConfiguration<T>
Remarks
This configuration implements an intelligent strategy: wrap all layers that have trainable weight matrices with StandardLoRAAdapter, and leave utility layers (activation, pooling, etc.) unchanged. This maximizes the benefits of LoRA across all applicable layer types.
Supported Layer Types (30+ layer types):
- Dense/Linear layers (Dense, FullyConnected, FeedForward)
- Convolutional layers (all Conv variants including depthwise, separable, dilated, etc.)
- Recurrent layers (LSTM, GRU, ConvLSTM, Bidirectional)
- Attention layers (Attention, MultiHeadAttention, SelfAttention)
- Transformer layers (Encoder, Decoder)
- Embedding layers (Embedding, PatchEmbedding)
- Specialized layers (Highway, GatedLinearUnit, SqueezeAndExcitation, Capsule, CRF, etc.)
Available LoRA Variants: AiDotNet includes 32 cutting-edge LoRA variants for different use cases:
- StandardLoRAAdapter: Generic LoRA for all layer types
- QLoRAAdapter: 4-bit quantization for 75% memory reduction
- DoRAAdapter: Weight decomposition (+3.7% on LLaMA-7B)
- AdaLoRAAdapter: Adaptive rank allocation
- VeRAAdapter: Shared matrices (10x fewer parameters)
- LoRAPlusAdapter: Dual learning rates (2x faster convergence)
- LoHaAdapter: Hadamard products for CNNs
- LoKrAdapter: Kronecker products (57x compression)
- DyLoRAAdapter: Dynamic rank training
- RoSAAdapter: Robust to distribution shifts
- DVoRAAdapter: DoRA+VeRA hybrid
- LoRAFAAdapter: Frozen A matrix (50% reduction)
- DeltaLoRAAdapter: Delta-based updates with momentum
- LoRADropAdapter: Dropout regularization
- PiSSAAdapter: SVD initialization (NeurIPS 2024)
- GLoRAAdapter: Weight + activation adaptation
- LongLoRAAdapter: Context length extension
- MultiLoRAAdapter: Multi-task learning with routing
- XLoRAAdapter: Mixture of experts
- TiedLoRAAdapter: Weight tying (90% reduction)
- ReLoRAAdapter: Restart mechanism prevents forgetting
- LoftQAdapter: Alternating quantization+LoRA
- QALoRAAdapter: Quantization-aware training
- VBLoRAAdapter: Vector banks (2024)
- SLoRAAdapter: Scalable serving (1000+ adapters)
- MoRAAdapter: High-rank updates for knowledge tasks
- LoRAXSAdapter: Extreme efficiency (100x compression)
- FloraAdapter: Gradient compression view
- ChainLoRAAdapter: Sequential task chaining
- HRAAdapter: Hybrid low-rank + sparse
- LoRETTAAdapter: Tensor-train decomposition
- NOLAAdapter: Random basis (20x compression)
To use a specific variant, pass an adapter instance via the loraAdapter parameter of the constructor (see the example configurations under the constructor remarks below).
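For example, a sketch of that pattern end to end (the model variable is assumed to expose a Layers collection, as in the usage example further below; the adapter arguments mirror the constructor examples):
// Wrap every eligible layer with QLoRA instead of the standard adapter
var qloraAdapter = new QLoRAAdapter<double>(null, 8, 8, true);
var qloraConfig = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, loraAdapter: qloraAdapter);
var qloraLayers = model.Layers.Select(layer => qloraConfig.ApplyLoRA(layer)).ToList();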
For Beginners: This is a ready-to-use LoRA configuration for most common scenarios.
When you apply this configuration to a model:
- All layers with trainable weight matrices (Dense, FullyConnected, Convolutional, Recurrent, Attention, Transformer, etc.) get wrapped with LoRA adapters
- Utility layers (activation, pooling, dropout, etc.) pass through unchanged
This is perfect for:
- Fine-tuning pre-trained models on new tasks
- Adapting large language models with limited resources
- Training multiple task-specific adapters for the same base model
Example usage:
// Create a configuration with rank=8, alpha=8, and frozen base layers
var loraConfig = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, freezeBaseLayer: true);
// Apply to all layers in your model
var adaptedLayers = model.Layers.Select(layer => loraConfig.ApplyLoRA(layer)).ToList();
The configuration respects these parameters:
- Rank: Controls compression (lower rank = fewer parameters)
- Alpha: Controls adaptation strength (typically same as rank)
- FreezeBaseLayer: Whether to freeze original weights (true for efficiency)
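As a rough illustration of how rank trades parameters for flexibility (back-of-the-envelope arithmetic only, not part of the library API):
// For one Dense layer of shape 1024x1024, full fine-tuning vs. LoRA at rank 8
int inputSize = 1024, outputSize = 1024, rank = 8;
int fullParams = inputSize * outputSize;          // 1,048,576 trainable weights
int loraParams = rank * (inputSize + outputSize); // 16,384 trainable weights (~1.6%)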
Constructors
DefaultLoRAConfiguration(int, double, bool, ILoRAAdapter<T>?)
Initializes a new DefaultLoRAConfiguration with the specified parameters.
public DefaultLoRAConfiguration(int rank, double alpha = -1, bool freezeBaseLayer = true, ILoRAAdapter<T>? loraAdapter = null)
Parameters
rank (int): The rank of the low-rank decomposition (must be positive).
alpha (double): The scaling factor for LoRA contributions (defaults to rank if negative).
freezeBaseLayer (bool): Whether to freeze base layers during training (default: true).
loraAdapter (ILoRAAdapter<T>?): Optional LoRA adapter to use. Defaults to StandardLoRAAdapter if null.
Remarks
For Beginners: This creates a configuration that will be applied to your model's layers.
Parameters explained:
- rank: How many "compression channels" to use (8 is a good starting point)
- alpha: How strong the LoRA effect is (use -1 to auto-set to rank value)
- freezeBaseLayer: Whether to lock original weights (true = more efficient, recommended)
Example configurations:
// Standard LoRA (default)
var standard = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8);
// QLoRA for 4-bit quantization (75% memory reduction)
var qloraAdapter = new QLoRAAdapter<double>(null, 8, 8, true);
var qlora = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, loraAdapter: qloraAdapter);
// DoRA for improved weight decomposition (+3.7% accuracy on LLaMA-7B)
var doraAdapter = new DoRAAdapter<double>(null, 8, 8, true);
var dora = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, loraAdapter: doraAdapter);
// VeRA for extreme parameter efficiency (10x fewer parameters)
var veraAdapter = new VeRAAdapter<double>(null, 8, 8, true);
var vera = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, loraAdapter: veraAdapter);
Exceptions
- ArgumentException
Thrown when rank is not positive.
Properties
Alpha
Gets the scaling factor (alpha) for LoRA adaptations.
public double Alpha { get; }
Property Value
double
Remarks
Alpha controls how strongly LoRA adaptations affect outputs. Common practice: alpha = rank (for a scaling factor of 1.0). Set to -1 to use rank as alpha (automatic scaling).
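In the usual LoRA formulation the effective scaling is alpha / rank; a small sketch of that relationship (illustrative of the standard convention, the adapter's internal formula may differ):
// adaptedOutput = baseOutput + (alpha / rank) * (B * A * input)
double rank = 8.0, alpha = 8.0;
double scaling = alpha / rank; // 1.0 when alpha == rank; alpha = 16 with rank 8 doubles the LoRA contribution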
FreezeBaseLayer
Gets whether base layers should be frozen during training.
public bool FreezeBaseLayer { get; }
Property Value
bool
Remarks
When true (typical), only LoRA parameters are trained while base layer weights remain frozen. This dramatically reduces memory and compute requirements.
When false, both base layer and LoRA parameters are trained. This uses more resources but may achieve better results in some scenarios.
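For example, the two modes side by side:
// Frozen base (default): only the small LoRA matrices receive gradient updates
var frozen = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, freezeBaseLayer: true);
// Unfrozen base: base weights and LoRA matrices are both trained (more memory and compute)
var unfrozen = new DefaultLoRAConfiguration<double>(rank: 8, alpha: 8, freezeBaseLayer: false);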
Rank
Gets the rank of the low-rank decomposition to use for adapted layers.
public int Rank { get; }
Property Value
int
Remarks
The rank determines the number of parameters in the LoRA adaptation. Lower rank = fewer parameters = more efficient but less flexible.
Common values:
- 1-4: Minimal parameters, very efficient
- 8: Good default balance
- 16-32: More flexibility
- 64+: Approaching full fine-tuning
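To make the trade-off concrete, a rough calculation for a single 4096x4096 weight matrix (illustrative arithmetic, not a library call):
int dim = 4096;
int fullParams = dim * dim;           // 16,777,216 weights in the base matrix
foreach (var r in new[] { 1, 8, 32, 64 })
{
    int loraParams = r * (dim + dim); // rank 8 -> 65,536 (~0.4% of the base matrix)
    Console.WriteLine($"rank {r}: {loraParams} trainable LoRA parameters");
}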
Methods
ApplyLoRA(ILayer<T>)
Applies LoRA adaptation to layers with trainable weight matrices.
public ILayer<T> ApplyLoRA(ILayer<T> layer)
Parameters
layer (ILayer<T>): The layer to potentially adapt with LoRA.
Returns
- ILayer<T>
A StandardLoRAAdapter wrapping the layer if it has trainable weights; otherwise, the original layer unchanged.
Remarks
This method examines the layer type and wraps it with StandardLoRAAdapter if it's a layer type that benefits from LoRA adaptation (has trainable weight matrices).
Supported Layer Types:
- Dense/Linear: DenseLayer, FullyConnectedLayer, FeedForwardLayer
- Convolutional: ConvolutionalLayer, DeconvolutionalLayer, DepthwiseSeparableConvolutionalLayer, DilatedConvolutionalLayer, SeparableConvolutionalLayer, SubpixelConvolutionalLayer
- Recurrent: LSTMLayer, GRULayer, RecurrentLayer, ConvLSTMLayer, BidirectionalLayer
- Attention: AttentionLayer, MultiHeadAttentionLayer, SelfAttentionLayer
- Transformer: TransformerEncoderLayer, TransformerDecoderLayer
- Embedding: EmbeddingLayer, PatchEmbeddingLayer
- Specialized: LocallyConnectedLayer, HighwayLayer, GatedLinearUnitLayer, SqueezeAndExcitationLayer
- Advanced: CapsuleLayer, PrimaryCapsuleLayer, DigitCapsuleLayer, ConditionalRandomFieldLayer
Excluded Layer Types:
- Activation, Pooling, Dropout, Flatten, Reshape, Normalization (no trainable weights)
- GraphConvolutionalLayer (requires specialized adapter that implements IGraphConvolutionLayer)
For Beginners: This method decides whether to add LoRA to each layer.
Decision logic:
- If the layer has trainable weight matrices → Wrap it with StandardLoRAAdapter
- If the layer is just doing math operations (activation, pooling, etc.) → Return unchanged
This intelligent approach means:
- LoRA is applied to all layers that can benefit from it
- Works with Dense, Convolutional, Recurrent, Attention, and Transformer layers
- Utility layers (pooling, dropout, etc.) pass through unchanged
Example:
var config = new DefaultLoRAConfiguration<double>(rank: 8);
// Dense layer gets adapted
var denseLayer = new DenseLayer<double>(100, 50);
var adapted1 = config.ApplyLoRA(denseLayer); // Returns StandardLoRAAdapter
// Convolutional layer gets adapted
var convLayer = new ConvolutionalLayer<double>(...);
var adapted2 = config.ApplyLoRA(convLayer); // Returns StandardLoRAAdapter
// Attention layer gets adapted
var attnLayer = new MultiHeadAttentionLayer<double>(...);
var adapted3 = config.ApplyLoRA(attnLayer); // Returns StandardLoRAAdapter
// Pooling layer passes through (no weights to adapt)
var poolLayer = new MaxPoolingLayer<double>(...);
var unchanged = config.ApplyLoRA(poolLayer); // Returns original poolLayer
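Putting it together, one possible way to adapt an entire model and see how many layers were wrapped (the model variable and the counting logic are illustrative, assuming System.Linq is in scope):
var wholeModelConfig = new DefaultLoRAConfiguration<double>(rank: 8);
var allLayers = model.Layers.Select(layer => wholeModelConfig.ApplyLoRA(layer)).ToList();
int wrapped = allLayers.Count(layer => layer is StandardLoRAAdapter<double>);
Console.WriteLine($"{wrapped} of {allLayers.Count} layers received LoRA adapters");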