Class LoRAAdapterBase<T>
Abstract base class for LoRA (Low-Rank Adaptation) adapters that wrap existing layers.
public abstract class LoRAAdapterBase<T> : LayerBase<T>, IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance: LayerBase<T> → LoRAAdapterBase<T>
- Implements: IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>
Remarks
This base class provides common functionality for all LoRA adapter implementations. It manages the base layer, LoRA layer, and parameter synchronization, while allowing derived classes to implement layer-type-specific logic such as merging and validation.
For Beginners: This is the foundation for all LoRA adapters in the library.
A LoRA adapter wraps an existing layer (like a dense or convolutional layer) and adds a small "correction layer" that learns what adjustments are needed. This base class:
- Manages both the original layer and the LoRA correction layer
- Handles parameter synchronization between them
- Provides common forward/backward pass logic (original + correction)
- Lets specialized adapters handle layer-specific details
This design allows you to create LoRA adapters for any layer type by:
- Inheriting from this base class
- Implementing layer-specific validation
- Implementing how to merge the LoRA weights back into the original layer
The result is parameter-efficient fine-tuning that works across different layer architectures!
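As a minimal sketch of this pattern, the hypothetical adapter below wraps a dense-style layer; the class name is invented for illustration, and it assumes MergeToOriginalLayer() is the only abstract member that needs to be supplied. All other members used come from the API documented on this page.
// Hypothetical adapter for dense-style layers; only the class name is new here.
public class MyDenseLoRAAdapter<T> : LoRAAdapterBase<T>
{
    public MyDenseLoRAAdapter(ILayer<T> baseLayer, int rank, double alpha = -1, bool freezeBaseLayer = true)
        : base(baseLayer, rank, alpha, freezeBaseLayer)
    {
        // Layer-specific validation would go here, for example checking that
        // baseLayer is a dense/fully connected layer before adapting it.
    }

    // Bake the learned LoRA weights back into the wrapped layer.
    public override ILayer<T> MergeToOriginalLayer()
    {
        // The base class already provides merge logic for Dense/FullyConnected layers.
        return MergeToDenseOrFullyConnected();
    }
}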
Constructors
LoRAAdapterBase(ILayer<T>, int, double, bool)
Initializes a new LoRA adapter base with the specified parameters.
protected LoRAAdapterBase(ILayer<T> baseLayer, int rank, double alpha = -1, bool freezeBaseLayer = true)
Parameters
baseLayer (ILayer<T>): The layer to adapt with LoRA.
rank (int): The rank of the LoRA decomposition.
alpha (double): The LoRA scaling factor (defaults to rank if negative).
freezeBaseLayer (bool): Whether to freeze the base layer's parameters during training.
Remarks
For Beginners: This creates the foundation for a LoRA adapter.
Parameters:
- baseLayer: The layer you want to make more efficient to fine-tune
- rank: How much compression (lower = fewer parameters, less flexibility)
- alpha: How strong the LoRA adaptation is
- freezeBaseLayer: Whether to lock the original layer's weights (usually true for efficiency)
Derived classes will call this constructor and then add their own layer-specific logic.
Exceptions
- ArgumentNullException
Thrown when baseLayer is null.
Fields
_baseLayer
The base layer being adapted.
protected readonly ILayer<T> _baseLayer
Field Value
- ILayer<T>
_freezeBaseLayer
Whether the base layer's parameters are frozen (not trainable).
protected readonly bool _freezeBaseLayer
Field Value
- bool
_loraLayer
The LoRA layer that provides the adaptation.
protected readonly LoRALayer<T> _loraLayer
Field Value
- LoRALayer<T>
Properties
Alpha
Gets the scaling factor (alpha) for the LoRA adaptation.
public double Alpha { get; }
Property Value
- double
Remarks
Alpha controls how strongly the LoRA adaptation affects the output. The actual LoRA contribution is scaled by alpha / rank. Common practice is to set alpha = rank, which gives a scaling factor of 1.0.
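To see the scaling concretely, a small sketch (the numbers are illustrative, not library defaults):
double alpha = 16;
int rank = 8;

// The LoRA output is scaled by alpha / rank before being added to the base output.
double scaling = alpha / rank;   // 2.0 here; alpha = rank would give exactly 1.0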
BaseLayer
Gets the base layer being adapted with LoRA.
public ILayer<T> BaseLayer { get; }
Property Value
- ILayer<T>
Remarks
This is the original layer that's being enhanced with LoRA adaptations. It may be frozen (non-trainable) during fine-tuning for maximum efficiency.
IsBaseLayerFrozen
Gets whether the base layer's parameters are frozen during training.
public bool IsBaseLayerFrozen { get; }
Property Value
- bool
Remarks
When true, only the LoRA parameters are trained, dramatically reducing memory requirements and training time. This is the typical use case for LoRA.
LoRALayer
Gets the LoRA layer providing the low-rank adaptation.
public LoRALayer<T> LoRALayer { get; }
Property Value
- LoRALayer<T>
Remarks
This layer implements the low-rank decomposition (A and B matrices) that provides the adaptation to the base layer's behavior.
ParameterCount
Gets the total number of trainable parameters.
public override int ParameterCount { get; }
Property Value
- int
Remarks
If the base layer is frozen, this returns only the LoRA parameter count. Otherwise, it returns the sum of base and LoRA parameters.
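To get a feel for the savings, here is a back-of-the-envelope sketch for a hypothetical 1024x1024 dense layer adapted at rank 8 (the dimensions are illustrative only):
int inputSize = 1024, outputSize = 1024, rank = 8;

// Full fine-tuning would train every base weight (biases ignored for simplicity).
int baseWeights = inputSize * outputSize;               // 1,048,576

// LoRA trains only the low-rank factors A (inputSize x rank) and B (rank x outputSize).
int loraWeights = inputSize * rank + rank * outputSize; // 16,384

// With the base layer frozen, ParameterCount reports only the LoRA portion,
// roughly 1.6% of the base weight count in this example.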
Rank
Gets the rank of the low-rank decomposition.
public int Rank { get; }
Property Value
- int
Remarks
The rank determines how many parameters the LoRA adaptation uses. Lower rank = fewer parameters = more efficient but less flexible.
Typical values:
- rank=1-4: Very efficient, minimal parameters
- rank=8: Good balance (default for many applications)
- rank=16-32: More flexibility, more parameters
- rank=64+: Diminishing returns, approaching full fine-tuning
SupportsJitCompilation
Gets whether this LoRA adapter supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if both the base layer and LoRA layer support JIT compilation.
Remarks
LoRA adapters support JIT compilation when both their component layers (the base layer and the LoRA layer) support JIT compilation. The computation graph combines both layers: output = base_layer(input) + lora_layer(input)
For Beginners: JIT compilation makes layers run faster by converting their math operations into optimized native code.
A LoRA adapter can be JIT compiled when:
- The base layer supports JIT compilation (has its weights initialized)
- The LoRA layer supports JIT compilation (has its A and B matrices initialized)
The JIT-compiled version computes both the base layer's output and the LoRA adaptation in parallel, then adds them together. This can provide significant speedup (5-10x).
Alternatively, you can merge the LoRA weights into the base layer using MergeToOriginalLayer() for an even simpler and potentially faster deployment.
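A minimal deployment-time sketch using only the members documented on this page; adapter stands for an already-initialized LoRA adapter, and the hand-off to the JIT pipeline itself is omitted:
if (adapter.SupportsJitCompilation)
{
    // Export the combined base + LoRA computation graph for the JIT compiler.
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> outputNode = adapter.ExportComputationGraph(inputNodes);
    // ... hand outputNode and inputNodes to the JIT compilation pipeline ...
}
else
{
    // Fall back to baking the adaptation into a single ordinary layer.
    ILayer<float> merged = adapter.MergeToOriginalLayer();
}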
SupportsTraining
Gets whether this adapter supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Methods
Backward(Tensor<T>)
Performs the backward pass through both layers.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradientTensor<T>Gradient flowing back from the next layer.
Returns
- Tensor<T>
Gradient to pass to the previous layer.
Remarks
The backward pass propagates gradients through both the LoRA layer and (if not frozen) the base layer. The input gradients from both paths are summed.
For Beginners: During learning, this figures out how to improve both layers:
- Always updates the LoRA layer (that's what we're training)
- Only updates the base layer if it's not frozen
- Combines the gradients from both paths to tell earlier layers how to improve
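A single training step sketched with the documented Forward, Backward, and UpdateParameters members; input, outputGradient, and learningRate are assumed to come from the surrounding training loop and loss function:
// Forward: original behaviour plus the learned LoRA correction.
Tensor<float> prediction = adapter.Forward(input);

// Backward: outputGradient is the loss gradient with respect to the prediction.
Tensor<float> inputGradient = adapter.Backward(outputGradient);

// Apply the gradients; with a frozen base layer, only the LoRA parameters change.
adapter.UpdateParameters(learningRate);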
CreateLoRALayer(int, double)
Creates the LoRA layer for this adapter.
protected virtual LoRALayer<T> CreateLoRALayer(int rank, double alpha)
Parameters
rank (int): The rank of the LoRA decomposition.
alpha (double): The LoRA scaling factor.
Returns
- LoRALayer<T>
A LoRA layer configured for this adapter.
Remarks
This method can be overridden by derived classes to customize LoRA layer creation. By default, it creates a standard LoRA layer with the adapter's input and output dimensions.
For Beginners: This creates the "correction layer" that learns adaptations.
Different adapter types might need different LoRA layer configurations:
- Dense layers: Standard 1D LoRA
- Convolutional layers: LoRA with spatial dimensions
- Attention layers: LoRA for query/key/value projections
This method lets each adapter type create the right kind of LoRA layer.
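As a hedged sketch, a derived adapter could override this hook as shown below; the customization itself is only indicated in comments:
protected override LoRALayer<T> CreateLoRALayer(int rank, double alpha)
{
    // A derived adapter might clamp the rank, adjust alpha, or size the
    // LoRALayer<T> for its specific weight shapes before construction.
    // Here the default behaviour is kept as a placeholder.
    return base.CreateLoRALayer(rank, alpha);
}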
CreateMergedLayerWithClone(Vector<T>)
Helper method to create a merged layer by cloning the base layer and updating its parameters.
protected ILayer<T> CreateMergedLayerWithClone(Vector<T> mergedParams)
Parameters
mergedParamsVector<T>The merged parameters to set on the cloned layer.
Returns
- ILayer<T>
A cloned layer with merged parameters and preserved activation function.
Remarks
This helper method preserves the activation function and other settings from the base layer by using Clone() instead of creating a new layer. This ensures the merged layer behaves identically to the original adapted layer.
For Beginners: This is a utility method that derived classes can use to create a properly merged layer without duplicating the Clone() pattern everywhere.
Exceptions
- InvalidOperationException
Thrown when base layer is not DenseLayer or FullyConnectedLayer.
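A hedged sketch of how a derived adapter's merge implementation might use this helper; BuildMergedParameters is a hypothetical, layer-specific method and not part of this API:
public override ILayer<T> MergeToOriginalLayer()
{
    // Hypothetical helper: adds the scaled LoRA contribution to a copy of
    // the base layer's parameter vector in whatever layout the layer uses.
    Vector<T> mergedParams = BuildMergedParameters();

    // Clone the base layer and overwrite its parameters so the activation
    // function and other settings are preserved.
    return CreateMergedLayerWithClone(mergedParams);
}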
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodesList<ComputationNode<T>>List to which input nodes will be added.
Returns
- ComputationNode<T>
The output computation node representing the combined base + LoRA transformation.
Remarks
The computation graph implements: output = base_layer(input) + lora_layer(input)
This mirrors the Forward() method logic where:
- The input is passed through the base layer
- The same input is passed through the LoRA layer
- The two outputs are added element-wise
For Beginners: This exports the LoRA adapter's computation as a graph of operations that can be optimized and compiled to fast native code.
The graph represents:
- Input → base layer computation → base output
- Input → LoRA layer computation → LoRA output
- base output + LoRA output → final output
The JIT compiler can then fuse operations, apply SIMD vectorization, and perform other optimizations to make inference faster.
Exceptions
- ArgumentNullException
Thrown when inputNodes is null.
- InvalidOperationException
Thrown when component layers are not initialized.
Forward(Tensor<T>)
Performs the forward pass through both base and LoRA layers.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
inputTensor<T>Input tensor.
Returns
- Tensor<T>
Sum of base layer output and LoRA output.
Remarks
The forward pass computes: output = base_layer(input) + lora_layer(input)
For Beginners: This runs the input through both the original layer and the LoRA correction layer, then adds their outputs together. The result is the original behavior plus the learned adaptation.
GetParameters()
Gets the current parameters as a vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Vector containing parameters (LoRA only if base is frozen, otherwise both).
MergeToDenseOrFullyConnected()
Merges LoRA weights into the base layer for DenseLayer or FullyConnectedLayer.
protected ILayer<T> MergeToDenseOrFullyConnected()
Returns
- ILayer<T>
A new layer with merged weights.
Remarks
This helper method implements the standard LoRA merge logic for Dense and FullyConnected layers:
1. Get the LoRA weight contribution from the low-rank matrices
2. Add it to the base layer weights element-wise
3. Preserve biases unchanged
4. Create a new layer with the merged parameters
For Beginners: This combines the base weights with the LoRA adaptation, creating a single layer that doesn't need the adapter anymore. Useful for deployment!
Exceptions
- InvalidOperationException
Thrown when base layer is not DenseLayer or FullyConnectedLayer.
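Conceptually, for a dense weight matrix W with LoRA factors A and B, the merge computes W += (alpha / rank) * (B x A). The sketch below uses plain 2D arrays; the shapes follow the usual LoRA-paper convention, and the library's internal storage order may differ:
static void MergeLoRAIntoWeights(double[,] W, double[,] B, double[,] A, double alpha, int rank)
{
    int outputSize = W.GetLength(0);   // W: [outputSize, inputSize]
    int inputSize = W.GetLength(1);    // B: [outputSize, rank], A: [rank, inputSize]

    for (int o = 0; o < outputSize; o++)
    {
        for (int i = 0; i < inputSize; i++)
        {
            double delta = 0;
            for (int r = 0; r < rank; r++)
                delta += B[o, r] * A[r, i];

            // Scale by alpha/rank and add to the frozen base weight; biases stay unchanged.
            W[o, i] += (alpha / rank) * delta;
        }
    }
}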
MergeToOriginalLayer()
Merges the LoRA adaptation into the base layer and returns the merged layer.
public abstract ILayer<T> MergeToOriginalLayer()
Returns
- ILayer<T>
A new layer with LoRA weights merged into the base layer's weights.
Remarks
This method must be implemented by derived classes to handle layer-type-specific merging logic. Each type of adapter (Dense, Convolutional, etc.) needs to know how to combine its LoRA weights with the base layer's weights.
For Beginners: This "bakes in" your LoRA adaptation to create a regular layer. After training with LoRA, you can merge the adaptation into the original weights for:
- Faster inference (no need to compute LoRA separately)
- Simpler deployment (single layer instead of two)
- Compatibility with systems that don't support LoRA
Each layer type implements this differently because they have different internal structures.
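A short deployment sketch; adapter stands for an already-trained LoRA adapter:
// Bake the adaptation into a single ordinary layer.
ILayer<float> deployableLayer = adapter.MergeToOriginalLayer();

// deployableLayer now behaves like the adapted layer with the LoRA correction
// folded into its weights and can replace the adapter in the deployed model.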
ResetState()
Resets the internal state of both the base layer and LoRA layer.
public override void ResetState()
Remarks
For Beginners: This clears the memory of both the base layer and the LoRA layer. It's useful when starting to process a completely new, unrelated batch of data.
SetParameters(Vector<T>)
Sets the layer parameters from a vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parametersVector<T>Vector containing parameters.
UpdateParameters(T)
Updates parameters using the specified learning rate.
public override void UpdateParameters(T learningRate)
Parameters
learningRateTThe learning rate for parameter updates.
UpdateParametersFromLayers()
Updates the parameter vector from the current base and LoRA layer states.
protected virtual void UpdateParametersFromLayers()
Remarks
This helper method synchronizes the adapter's parameter vector with the current state of the base and LoRA layers after updates. It packs parameters in the standard order: base layer parameters (if not frozen) followed by LoRA parameters.
For Beginners: This ensures the adapter's parameter vector stays in sync with its component layers. Called after parameter updates.