Class LoRALayer<T>
Implements a Low-Rank Adaptation (LoRA) layer for parameter-efficient fine-tuning of neural networks.
public class LoRALayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
LayerBase<T> → LoRALayer<T>
- Implements
ILayer<T>
Remarks
LoRA works by decomposing weight updates into two low-rank matrices A (inputSize × rank) and B (rank × outputSize), where the actual update is computed as A * B, scaled by alpha / rank. This dramatically reduces the number of trainable parameters compared to fine-tuning all weights directly.
For Beginners: LoRA is a technique that makes it much cheaper to adapt large neural networks to new tasks. Instead of updating all the weights in a layer (which can be millions of parameters), LoRA adds two small matrices that work together to approximate the needed changes.
Think of it like this:
- Traditional fine-tuning: Adjusting every single knob on a massive control panel
- LoRA: Using just a few master controls that influence many knobs at once
The key insight is that the changes needed for fine-tuning often lie in a "low-rank" space, meaning we don't need full freedom to adjust every parameter independently.
Key parameters:
- Rank (r): Controls how many "master controls" you have. Higher rank = more flexibility but more parameters
- Alpha: A scaling factor that controls how much influence the LoRA adaptation has
For example, adapting a layer with 1000x1000 weights (1M parameters) using LoRA with rank=8 only requires 8x1000 + 8x1000 = 16,000 parameters (98.4% reduction!).
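The arithmetic in this example can be checked with a few lines of plain Python. This is only a sketch of the counting rule (elements of A plus elements of B); the helper names `lora_param_count` and `reduction_percent` are hypothetical and are not part of this library.

```python
def lora_param_count(input_size: int, output_size: int, rank: int) -> int:
    """Trainable parameters in A (input_size x rank) plus B (rank x output_size)."""
    return input_size * rank + rank * output_size


def reduction_percent(input_size: int, output_size: int, rank: int) -> float:
    """Percentage of parameters saved relative to a dense input_size x output_size layer."""
    dense = input_size * output_size
    return 100.0 * (1.0 - lora_param_count(input_size, output_size, rank) / dense)


print(lora_param_count(1000, 1000, 8))             # 16000
print(round(reduction_percent(1000, 1000, 8), 1))  # 98.4
```

Doubling the rank doubles the parameter count, so the saving shrinks as the rank approaches the layer's dimensions.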
Constructors
LoRALayer(int, int, int, double, IActivationFunction<T>?)
Initializes a new LoRA layer with the specified dimensions and hyperparameters.
public LoRALayer(int inputSize, int outputSize, int rank, double alpha = -1, IActivationFunction<T>? activationFunction = null)
Parameters
inputSize (int): The number of input features.
outputSize (int): The number of output features.
rank (int): The rank of the low-rank decomposition (must be positive and less than min(inputSize, outputSize)).
alpha (double): The scaling factor for LoRA contributions (typically similar to rank value).
activationFunction (IActivationFunction<T>): Optional activation function to apply after the LoRA transformation.
Remarks
The LoRA matrices are initialized as follows:
- Matrix A: Random values from a Gaussian distribution (similar to Kaiming initialization)
- Matrix B: Zero initialization (so LoRA starts with no effect)
For Beginners: This creates a new LoRA layer. You specify the input and output sizes (which should match the layer you're adapting), the rank (how much compression), and alpha (how strong the adaptation is).
The initialization is carefully chosen:
- Matrix A gets random values (so training can start moving in useful directions)
- Matrix B starts at zero (so initially, LoRA doesn't change anything)
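The effect of this initialization can be illustrated with a small plain-Python sketch. It is not the library's code: the helper names are hypothetical, and the standard deviation `sqrt(1 / input_size)` is an assumed stand-in for the "similar to Kaiming" scale, whose exact value is not stated here.

```python
import random


def init_lora(input_size, output_size, rank, seed=0):
    """Return (A, B): A is Gaussian, B is all zeros."""
    rng = random.Random(seed)
    std = (1.0 / input_size) ** 0.5  # assumed scale, not taken from the library
    a = [[rng.gauss(0.0, std) for _ in range(rank)] for _ in range(input_size)]
    b = [[0.0] * output_size for _ in range(rank)]
    return a, b


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


a, b = init_lora(input_size=4, output_size=3, rank=2)
x = [[1.0, -2.0, 0.5, 3.0]]        # one sample with 4 features
delta = matmul(matmul(x, a), b)    # LoRA contribution before any training
print(all(v == 0.0 for v in delta[0]))  # True: B = 0 makes the update vanish
```

Because B is zero, the adapted layer initially behaves exactly like the base layer, while the non-zero A still lets gradients reach B on the first training step.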
Exceptions
- ArgumentException
Thrown when rank is invalid.
Properties
Alpha
Gets the alpha scaling factor.
public T Alpha { get; }
Property Value
- T
ParameterCount
Gets the total number of trainable parameters (elements in A and B matrices).
public override int ParameterCount { get; }
Property Value
- int
Rank
Gets the rank of this LoRA layer.
public int Rank { get; }
Property Value
- int
Scaling
Gets the computed scaling factor (alpha / rank).
public T Scaling { get; }
Property Value
- T
SupportsJitCompilation
Gets whether this LoRA layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the LoRA matrices are initialized.
Remarks
LoRA layers support JIT compilation when their matrices (A and B) are properly initialized. The JIT-compiled version computes output = input * A * B * scaling using optimized tensor operations.
For Beginners: JIT compilation makes the LoRA layer run faster by converting its math operations into optimized native code. This is especially beneficial for inference when you want maximum speed.
The layer can be JIT compiled as long as it has been initialized, which happens automatically when the layer is created.
SupportsTraining
Gets whether this layer supports training (always true for LoRA).
public override bool SupportsTraining { get; }
Property Value
- bool
Methods
Backward(Tensor<T>)
Performs the backward pass through the LoRA layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient flowing back from the next layer.
Returns
- Tensor<T>
Gradient to pass to the previous layer.
Remarks
The backward pass computes gradients for both LoRA matrices and propagates gradients back to the input. Gradients are computed as:
- dL/dB = A^T * input^T * outputGradient * scaling
- dL/dA = input^T * outputGradient * B^T * scaling
- dL/dinput = outputGradient * B^T * A^T * scaling
For Beginners: This is where learning happens! The backward pass:
1. Figures out how to adjust matrices A and B to reduce error
2. Passes gradients back to earlier layers so they can learn too
It uses calculus (specifically, the chain rule) to figure out how each parameter contributed to the error.
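The three gradient formulas can be checked numerically. The sketch below is plain Python with hypothetical helper names, not the library's implementation: it evaluates the formulas on a toy example and compares one entry of dL/dA against a finite-difference estimate, using loss = sum of all outputs so that outputGradient is all ones.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(m):
    return [list(col) for col in zip(*m)]


def scale(m, s):
    return [[v * s for v in row] for row in m]


def forward(x, a, b, s):
    return scale(matmul(matmul(x, a), b), s)


def backward(x, a, b, s, grad_out):
    """Return (dL/dA, dL/dB, dL/dinput) for output = x * A * B * s."""
    grad_b = scale(matmul(transpose(matmul(x, a)), grad_out), s)             # A^T x^T G s
    grad_a = scale(matmul(transpose(x), matmul(grad_out, transpose(b))), s)  # x^T G B^T s
    grad_x = scale(matmul(matmul(grad_out, transpose(b)), transpose(a)), s)  # G B^T A^T s
    return grad_a, grad_b, grad_x


# Toy sizes: batch 2, inputSize 3, rank 2, outputSize 2.
x = [[1.0, 2.0, -1.0], [0.5, -0.5, 2.0]]
a = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
b = [[0.7, -0.8], [0.9, 1.0]]
s = 2.0                              # alpha / rank, e.g. alpha = 4 and rank = 2
grad_out = [[1.0, 1.0], [1.0, 1.0]]  # gradient of loss = sum(output)

grad_a, grad_b, grad_x = backward(x, a, b, s, grad_out)


def loss(a_mat):
    return sum(sum(row) for row in forward(x, a_mat, b, s))


# Finite-difference estimate of dL/dA[0][0].
eps = 1e-6
a_plus = [row[:] for row in a]
a_plus[0][0] += eps
numeric = (loss(a_plus) - loss(a)) / eps
print(abs(numeric - grad_a[0][0]) < 1e-4)  # True
```

Note the shapes: dL/dA matches A (inputSize × rank), dL/dB matches B (rank × outputSize), and dL/dinput matches the input ([batchSize, inputSize]).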
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to which input nodes will be added.
Returns
- ComputationNode<T>
The output computation node representing the LoRA transformation.
Remarks
The computation graph implements: output = input * A * B * scaling, where:
- A is the low-rank projection matrix (inputSize × rank)
- B is the reconstruction matrix (rank × outputSize)
- scaling = alpha / rank
For Beginners: This exports the LoRA computation as a graph of operations that can be optimized and compiled to fast native code.
The graph represents:
- Input → multiply by matrix A (compress to low rank)
- Result → multiply by matrix B (expand to output size)
- Result → multiply by scaling factor
The JIT compiler can then fuse these operations and apply optimizations like SIMD vectorization.
Exceptions
- ArgumentNullException
Thrown when inputNodes is null.
- InvalidOperationException
Thrown when matrices are not initialized.
Forward(Tensor<T>)
Performs the forward pass through the LoRA layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor of shape [batchSize, inputSize].
Returns
- Tensor<T>
Output tensor of shape [batchSize, outputSize].
Remarks
The forward pass computes: output = input * A * B * scaling where scaling = alpha / rank.
For Beginners: This processes data through the LoRA layer. The input is:
1. Multiplied by matrix A (compressing to rank dimensions)
2. Multiplied by matrix B (expanding back to output dimensions)
3. Scaled by alpha/rank (controlling the strength)
The result represents the adaptation that gets added to the base layer's output.
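The three steps of the forward pass can be sketched in plain Python as follows. This illustrates the formula output = input * A * B * scaling on nested lists; the function `lora_forward` is hypothetical and does not use the library's Tensor<T> type.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def lora_forward(x, a, b, alpha, rank):
    """Compute x * A * B * (alpha / rank)."""
    scaling = alpha / rank
    compressed = matmul(x, a)          # [batchSize, rank]
    expanded = matmul(compressed, b)   # [batchSize, outputSize]
    return [[v * scaling for v in row] for row in expanded]


x = [[1.0, 2.0, 3.0]]          # batchSize 1, inputSize 3
a = [[1.0], [0.0], [1.0]]      # inputSize 3 x rank 1
b = [[2.0, -1.0]]              # rank 1 x outputSize 2
out = lora_forward(x, a, b, alpha=2.0, rank=1)
print(out)  # [[16.0, -8.0]]
```

Multiplying by A first keeps the intermediate result at only `rank` columns, which is why the low-rank form is cheap to evaluate.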
GetMatrixA()
Gets matrix A (for inspection or advanced use cases).
public Matrix<T> GetMatrixA()
Returns
- Matrix<T>
GetMatrixB()
Gets matrix B (for inspection or advanced use cases).
public Matrix<T> GetMatrixB()
Returns
- Matrix<T>
GetParameters()
Gets the current parameters as a vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Vector containing all LoRA parameters (A and B matrices flattened).
MergeWeights()
Merges the LoRA weights into a dense weight matrix that can be added to a base layer.
public Matrix<T> MergeWeights()
Returns
- Matrix<T>
The merged weight matrix (inputSize × outputSize) representing the full LoRA contribution.
Remarks
This computes the full weight matrix W_lora = A * B * scaling, which can then be added to the base layer's weights. This is useful for deployment when you want to merge the adaptation back into the base model for inference efficiency.
For Beginners: This "bakes in" the LoRA adaptation into a regular weight matrix. Instead of storing two small matrices (A and B) and computing them during inference, you can merge them into one larger matrix and add it to the original weights.
This is like assembling separate parts into a finished product: once you're done training, you can simplify the model for faster inference.
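A small plain-Python sketch shows why merging is safe: applying the merged matrix gives the same result as running the base layer and the LoRA path separately. The `base` matrix and all helper names are hypothetical, and the base weights are assumed to use the same inputSize × outputSize layout as the merged matrix.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def add(m, n):
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(m, n)]


def merge_weights(a, b, alpha, rank):
    """Dense LoRA contribution W_lora = A * B * (alpha / rank)."""
    scaling = alpha / rank
    return [[v * scaling for v in row] for row in matmul(a, b)]


a = [[1.0], [0.0], [1.0]]                    # inputSize 3 x rank 1
b = [[2.0, -1.0]]                            # rank 1 x outputSize 2
base = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical base weights, 3 x 2
x = [[1.0, 2.0, 3.0]]

w_lora = merge_weights(a, b, alpha=2.0, rank=1)
merged_out = matmul(x, add(base, w_lora))    # single dense multiply
separate_out = add(matmul(x, base), matmul(x, w_lora))  # base path + LoRA path
print(merged_out)                  # [[20.0, -3.0]]
print(merged_out == separate_out)  # True
```

After merging, inference needs one matrix multiply per layer instead of three, at the cost of no longer being able to swap the adapter out without keeping a copy of the original weights.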
ResetState()
Resets the internal state of the layer.
public override void ResetState()
Remarks
For LoRA layers, this clears the stored input from the last forward pass.
For Beginners: This clears the layer's memory of the last input it processed. It's like hitting a reset button before processing a new, unrelated batch of data.
SetParameters(Vector<T>)
Sets the layer parameters from a vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Vector containing all LoRA parameters.
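The relationship between GetParameters() and SetParameters(Vector<T>) can be pictured as a flatten and unflatten round trip. The plain-Python sketch below assumes A is stored first, followed by B, each in row-major order; that ordering is an assumption for illustration and is not confirmed by this page. The helper names are hypothetical.

```python
def flatten_params(a, b):
    """Concatenate A then B, row by row (assumed ordering)."""
    return [v for row in a for v in row] + [v for row in b for v in row]


def unflatten_params(params, input_size, output_size, rank):
    """Rebuild A (input_size x rank) and B (rank x output_size) from a flat list."""
    expected = input_size * rank + rank * output_size
    if len(params) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(params)}")
    split = input_size * rank
    a = [params[i * rank:(i + 1) * rank] for i in range(input_size)]
    b = [params[split + i * output_size:split + (i + 1) * output_size]
         for i in range(rank)]
    return a, b


a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # 3 x 2
b = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]     # 2 x 3
flat = flatten_params(a, b)
print(len(flat))  # 12, matching the count of elements in A and B
a2, b2 = unflatten_params(flat, input_size=3, output_size=3, rank=2)
print(a2 == a and b2 == b)  # True
```

Whatever the actual ordering, the vector length should equal ParameterCount, so a length check like the one above is a reasonable guard before loading parameters.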
UpdateParameters(T)
Updates the layer's parameters using the specified learning rate.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate for parameter updates.
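This page does not state which update rule is applied. As a minimal sketch, assuming the simplest case of plain gradient descent (parameter minus learning rate times gradient), one update step on a matrix could look like the plain-Python code below. The function `sgd_update` is hypothetical and the library may behave differently.

```python
def sgd_update(matrix, grad, learning_rate):
    """Return a new matrix with each entry moved against its gradient (assumed rule)."""
    return [[w - learning_rate * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(matrix, grad)]


b = [[0.0, 0.0]]          # matrix B starts at zero
grad_b = [[0.5, -1.0]]    # gradient from a backward pass
b = sgd_update(b, grad_b, learning_rate=0.1)
print(b)  # [[-0.05, 0.1]]
```

Under this assumption the same step would be applied to both A and B, using the gradients stored by the most recent call to Backward.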