Class AdaLoRAAdapter<T>

Namespace
AiDotNet.LoRA.Adapters
Assembly
AiDotNet.dll

Adaptive Low-Rank Adaptation (AdaLoRA) adapter that dynamically allocates parameter budgets among weight matrices.

public class AdaLoRAAdapter<T> : LoRAAdapterBase<T>, IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LoRAAdapterBase<T> → AdaLoRAAdapter<T>

Implements
IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>

Remarks

AdaLoRA improves upon standard LoRA by dynamically adjusting the rank allocation based on importance scores. Instead of using a fixed rank for all weight matrices, AdaLoRA:

  • Starts with a maximum rank and adaptively reduces it during training
  • Computes importance scores for each singular value component
  • Prunes less important components to focus the parameter budget on critical adaptations
  • Allows different layers to have different effective ranks

This leads to more efficient parameter usage compared to fixed-rank LoRA, especially for large models where some layers need more adaptation capacity than others.

For Beginners: AdaLoRA is like smart LoRA that learns which parts of the adaptation matter most.

Think of standard LoRA as giving every layer the same budget (rank=8 everywhere). AdaLoRA is smarter:

  • Some layers get more budget (rank=16) because they're important for the task
  • Other layers get less budget (rank=2) because small changes are enough
  • The model learns this automatically during training

How it works:

  1. Start with a large rank (e.g., maxRank=32)
  2. During training, track how important each component is
  3. Prune components with low importance scores
  4. Focus parameters on what actually helps

Benefits:

  • More parameter-efficient than fixed-rank LoRA
  • Better performance with same parameter budget
  • Automatically finds optimal rank per layer

Reference: Zhang et al., "AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning" (ICLR 2023). https://arxiv.org/abs/2303.10512
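The sketch below shows how such an adapter is typically wired up. The DenseLayer<float> type, its constructor arguments, and the training-loop variables are illustrative placeholders; only the AdaLoRAAdapter<T> members documented on this page are taken from the API.

// Wrap an existing layer; start with a generous rank budget and let pruning trim it.
// DenseLayer<float> and its constructor are hypothetical placeholders for any ILayer<T>.
ILayer<float> baseLayer = new DenseLayer<float>(inputSize: 768, outputSize: 768);

var adapter = new AdaLoRAAdapter<float>(
    baseLayer,
    maxRank: 32,                 // initial budget; pruned down during training
    rankPruningThreshold: 0.05,  // prune components in roughly the bottom 5% of importance
    minRank: 2,                  // never prune below this
    pruningInterval: 100);       // re-evaluate importance every 100 steps

// One schematic training step (input and outputGradient come from your training loop).
Tensor<float> output = adapter.Forward(input);
Tensor<float> inputGradient = adapter.Backward(outputGradient);
Console.WriteLine($"Active rank: {adapter.CurrentRank} / {adapter.MaxRank}");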

Constructors

AdaLoRAAdapter(ILayer<T>, int, double, bool, double, int, int, double)

Initializes a new AdaLoRA adapter with adaptive rank allocation.

public AdaLoRAAdapter(ILayer<T> baseLayer, int maxRank, double alpha = -1, bool freezeBaseLayer = true, double rankPruningThreshold = 0.05, int minRank = 1, int pruningInterval = 100, double importanceScoreEMA = 0.95)

Parameters

baseLayer ILayer<T>

The layer to adapt with AdaLoRA.

maxRank int

The maximum rank for the LoRA decomposition.

alpha double

The LoRA scaling factor (defaults to maxRank if negative).

freezeBaseLayer bool

Whether to freeze the base layer's parameters during training.

rankPruningThreshold double

Threshold for pruning based on importance scores (default: 0.05).

minRank int

Minimum rank to maintain after pruning (default: 1).

pruningInterval int

Number of steps between pruning operations (default: 100).

importanceScoreEMA double

EMA factor for importance score updates (default: 0.95).

Remarks

For Beginners: This creates an AdaLoRA adapter with smart rank allocation.

Parameters:

  • baseLayer: The layer you want to adapt (typically Dense or FullyConnected)
  • maxRank: Start with this many components (will prune down during training)
  • alpha: How strongly the adaptation is scaled (defaults to maxRank when left negative)
  • freezeBaseLayer: Lock the original weights (usually true for efficiency)
  • rankPruningThreshold: How unimportant a component must be to get pruned (0.05 = bottom 5%)
  • minRank: Never prune below this rank (safety net)
  • pruningInterval: How often to check for pruning (in training steps)
  • importanceScoreEMA: How smooth importance tracking is (higher = more stable)

The adapter will automatically adjust its rank during training to focus parameters on the most important components.
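As a concrete illustration of the parameters above (the numeric values are examples rather than recommendations, and existingLayer stands in for any ILayer<float> you already have):

var adapter = new AdaLoRAAdapter<float>(
    baseLayer: existingLayer,          // any ILayer<float>, e.g. a dense layer (placeholder)
    maxRank: 16,
    alpha: 16.0,                       // often set equal to maxRank
    freezeBaseLayer: true,             // train only the LoRA matrices
    rankPruningThreshold: 0.05,
    minRank: 1,
    pruningInterval: 100,
    importanceScoreEMA: 0.95);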

Exceptions

ArgumentNullException

Thrown when baseLayer is null.

ArgumentException

Thrown when rank parameters are invalid.

Properties

CurrentRank

Gets the current active rank after pruning.

public int CurrentRank { get; }

Property Value

int

MaxRank

Gets the maximum rank this adapter can use.

public int MaxRank { get; }

Property Value

int
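For example, these two properties can be logged during training to watch pruning take effect (totalSteps, batchInput, and batchGradient are placeholders from your own training loop):

for (int step = 0; step < totalSteps; step++)
{
    adapter.Forward(batchInput);
    adapter.Backward(batchGradient);

    if (step % 500 == 0)
    {
        // CurrentRank starts at MaxRank and shrinks as components are pruned.
        Console.WriteLine($"step {step}: rank {adapter.CurrentRank} of {adapter.MaxRank}");
    }
}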

Methods

Backward(Tensor<T>)

Performs the backward pass and updates importance scores based on gradients.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

Gradient flowing back from the next layer.

Returns

Tensor<T>

Gradient to pass to the previous layer.

Remarks

During backpropagation, AdaLoRA computes importance scores based on the magnitude of gradients for each singular value component. Components with consistently large gradients are considered more important.

For Beginners: This is where we learn which components are important! As gradients flow back:

  1. We see which components have large gradients (they're actively learning)
  2. We update their importance scores (high gradients = high importance)
  3. We use an exponential moving average to smooth out noise

Components that consistently get small gradients aren't helping much, so they'll get low importance scores and eventually be pruned.
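Conceptually, the update described above is an exponential moving average over per-component gradient magnitudes, roughly like the sketch below (a simplified illustration, not the adapter's actual implementation):

// scores[i] tracks how important component i has been so far.
// ema is the importanceScoreEMA factor (e.g. 0.95): higher = smoother, slower to change.
static double[] UpdateImportance(double[] scores, double[] gradientMagnitudes, double ema)
{
    var updated = new double[scores.Length];
    for (int i = 0; i < scores.Length; i++)
    {
        // Blend the old score with the new evidence from this step's gradients.
        updated[i] = ema * scores[i] + (1 - ema) * gradientMagnitudes[i];
    }
    return updated;
}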

ExpandRank(int)

Expands the rank by adding new components (for cases where more capacity is needed).

public void ExpandRank(int additionalRank)

Parameters

additionalRank int

Number of components to add.

Remarks

This is the opposite of pruning - it adds new components when the model needs more capacity. New components are initialized with low importance and will need to prove their worth. The corresponding matrix elements are reinitialized with small random values so they can learn.

For Beginners: Sometimes the model realizes it needs more capacity. This method adds new components, giving the model more flexibility to learn.

Think of it like hiring more workers when the team is overloaded. The new components start with low importance and have to earn their keep.
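For example, if training has stalled and you suspect the adapter is capacity-limited, you might grow the budget (the plateau check is a placeholder for your own criterion):

if (validationLossPlateaued && adapter.CurrentRank < adapter.MaxRank)
{
    // Add 4 fresh components; they start with low importance and must earn their place.
    adapter.ExpandRank(4);
}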

Forward(Tensor<T>)

Performs the forward pass using only the top-k most important singular values.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

Input tensor.

Returns

Tensor<T>

Sum of base layer output and AdaLoRA output (using current rank).

Remarks

Unlike standard LoRA which uses all rank components, AdaLoRA only uses the currentRank most important components based on importance scores. This is more efficient and focuses computation on the most impactful adaptations.

For Beginners: This computes the output using only the important components. If we started with rank=32 but pruned to rank=8, we only use the top 8 most important singular values. This makes computation faster and more focused.
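In effect, the adaptation term only multiplies through the top currentRank components of the low-rank factors. The sketch below uses plain arrays rather than the library's tensor types and assumes components are kept sorted by importance; it illustrates the idea rather than the adapter's internal code:

// delta = scale * B[:, :r] * (A[:r, :] * x), using only the top r = currentRank components.
// A has shape (maxRank x inDim), B has shape (outDim x maxRank).
static double[] LowRankDelta(double[][] A, double[][] B, double[] x, int currentRank, double scale)
{
    int outDim = B.Length;
    var delta = new double[outDim];
    for (int r = 0; r < currentRank; r++)        // pruned components (r >= currentRank) are skipped
    {
        double ax = 0;
        for (int j = 0; j < x.Length; j++)
            ax += A[r][j] * x[j];                // A[r, :] · x
        for (int o = 0; o < outDim; o++)
            delta[o] += scale * B[o][r] * ax;    // accumulate B[:, r] * ax
    }
    return delta;
}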

GetImportanceScores()

Gets a copy of the current importance scores.

public Vector<T> GetImportanceScores()

Returns

Vector<T>

A copy of the importance scores, one entry per rank component.
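For example, the scores can be inspected to see which components are carrying the adaptation (indexing into Vector<T> is assumed here):

Vector<float> scores = adapter.GetImportanceScores();
for (int i = 0; i < adapter.CurrentRank; i++)
{
    // Higher scores mean the component contributes more and is less likely to be pruned.
    Console.WriteLine($"component {i}: importance {scores[i]}");
}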

MergeToOriginalLayer()

Merges the AdaLoRA adaptation into the base layer and returns the merged layer.

public override ILayer<T> MergeToOriginalLayer()

Returns

ILayer<T>

A new layer with AdaLoRA weights merged into the base layer's weights.

Remarks

For Dense/FullyConnected layers, this merges the LoRA matrices into the base layer weights. Only the currently active components (based on currentRank) are merged.

For Beginners: This "bakes in" your adaptive LoRA to create a regular layer. Only the components that survived pruning (the important ones) are included in the merge.

This gives you a final layer that:

  • Includes only the useful adaptations
  • Is as fast as a regular layer
  • Can be deployed without AdaLoRA infrastructure
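Once training is complete, the merge step produces a standalone layer, as in this sketch (assuming ILayer<T> exposes the same Forward method the adapter overrides):

// Bake the surviving components into a plain layer for deployment.
ILayer<float> merged = adapter.MergeToOriginalLayer();

// The merged layer no longer needs any AdaLoRA bookkeeping at inference time.
Tensor<float> prediction = merged.Forward(deploymentInput);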