Class LoHaAdapter<T>
LoHa (Low-Rank Hadamard Product Adaptation) adapter for parameter-efficient fine-tuning.
public class LoHaAdapter<T> : LoRAAdapterBase<T>, IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance: LayerBase<T> → LoRAAdapterBase<T> → LoHaAdapter<T>
- Implements: ILoRAAdapter<T>, ILayer<T>
Remarks
LoHa uses element-wise Hadamard products (⊙) instead of matrix multiplication for adaptation. Instead of computing ΔW = B * A like standard LoRA, LoHa computes: ΔW = sum over rank of (A[i] ⊙ B[i])
This formulation can capture element-wise patterns that matrix multiplication may miss, making it particularly effective for:
- Convolutional layers (local spatial patterns)
- Element-wise transformations
- Fine-grained weight adjustments
Mathematical Formulation:
Standard LoRA: ΔW = B * A, where B is input×rank and A is rank×output, so ΔW is input×output.
LoHa: ΔW = Σ over rank of (A[i] ⊙ B[i]), where each A[i] and B[i] is input×output.
The Hadamard product (⊙) performs element-wise multiplication, allowing each element of the weight matrix to be adjusted independently across the rank dimensions.
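To make the element-wise nature concrete, here is a tiny worked example in plain C# (rank 2, 2×2 matrices, scaling by alpha/rank omitted). It is an illustration only, not the library's Matrix<T>-based implementation:

```csharp
// Worked example: rank = 2, 2×2 matrices. ΔW[i,j] = A1[i,j]*B1[i,j] + A2[i,j]*B2[i,j].
float[,] A1 = { { 1f, 2f }, { 3f, 4f } };
float[,] B1 = { { 5f, 6f }, { 7f, 8f } };
float[,] A2 = { { 1f, 0f }, { 0f, 1f } };
float[,] B2 = { { 2f, 2f }, { 2f, 2f } };

var deltaW = new float[2, 2];
for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
        deltaW[i, j] = A1[i, j] * B1[i, j] + A2[i, j] * B2[i, j];

// deltaW == { { 1*5 + 1*2, 2*6 + 0*2 },     == { {  7, 12 },
//             { 3*7 + 0*2, 4*8 + 1*2 } }         { 21, 34 } }
```

Note that each entry of ΔW depends only on the corresponding entries of A[i] and B[i]; there is no row/column mixing as in matrix multiplication.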
For Beginners: LoHa is a variant of LoRA that uses element-wise multiplication instead of matrix multiplication. Think of it this way:
- Standard LoRA: Learns "row and column patterns" that combine via matrix multiply
- LoHa: Learns "pixel-by-pixel patterns" that combine via element-wise multiply
LoHa is especially good when:
- You need to capture local, element-wise patterns (like in images)
- The weight matrix has spatial structure (like convolutional filters)
- You want each weight to be adjusted somewhat independently
Trade-offs compared to LoRA:
- More parameters: Both A and B must be full-sized (input×output) per rank dimension
- Different expressiveness: captures element-wise patterns rather than the row/column patterns that matrix multiplication captures
- Better for CNNs: The element-wise nature matches convolutional structure better
Example: A 100×100 weight matrix with rank=8
- Standard LoRA: 8×100 + 100×8 = 1,600 parameters
- LoHa: 2 × 8 × 100 × 100 = 160,000 parameters (each rank has 2 full-sized matrices)
LoHa uses MORE parameters than LoRA but models element-wise weight interactions via Hadamard products.
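A minimal sketch of this arithmetic, using hypothetical helper methods rather than anything from the library:

```csharp
// Hypothetical helpers illustrating the parameter-count comparison above.
static int LoRAParamCount(int inputSize, int outputSize, int rank)
    => rank * (inputSize + outputSize);        // A and B are rank-sized on one dimension each

static int LoHaParamCount(int inputSize, int outputSize, int rank)
    => 2 * rank * inputSize * outputSize;      // A[i] and B[i] are both full input×output, per rank

// For a 100×100 layer with rank 8:
// LoRAParamCount(100, 100, 8) == 1_600
// LoHaParamCount(100, 100, 8) == 160_000
```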
Constructors
LoHaAdapter(ILayer<T>, int, double, bool)
Initializes a new LoHa adapter wrapping an existing layer.
public LoHaAdapter(ILayer<T> baseLayer, int rank, double alpha = -1, bool freezeBaseLayer = true)
Parameters
baseLayer (ILayer<T>): The layer to adapt with LoHa.
rank (int): The rank of the low-rank decomposition.
alpha (double): The LoHa scaling factor (defaults to rank if negative).
freezeBaseLayer (bool): Whether to freeze the base layer's parameters during training.
Remarks
For Beginners: This creates a LoHa adapter for any layer with 1D input/output.
Parameters:
- baseLayer: The layer you want to make more efficient to fine-tune
- rank: How many element-wise patterns to learn (more = more flexibility, more parameters)
- alpha: How strong the LoHa adaptation is (typically same as rank)
- freezeBaseLayer: Whether to lock the original layer's weights (usually true for efficiency)
The adapter creates 2×rank full-sized matrices (A and B for each rank dimension), which are combined using element-wise Hadamard products during forward/backward passes.
Exceptions
- ArgumentNullException
Thrown when baseLayer is null.
- ArgumentException
Thrown when the base layer doesn't have 1D input/output shapes.
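A usage sketch for this constructor follows. Only the LoHaAdapter<T> signature shown above is taken from this page; the DenseLayer<float> constructor is assumed for illustration and may differ in the actual library:

```csharp
// Sketch: wrap an existing dense layer with a LoHa adapter for fine-tuning.
// NOTE: the DenseLayer<float> constructor here is assumed for illustration.
var baseLayer = new DenseLayer<float>(inputSize: 100, outputSize: 100);

var adapter = new LoHaAdapter<float>(
    baseLayer,
    rank: 8,               // number of element-wise patterns to learn
    alpha: 8,              // scaling factor; -1 (the default) falls back to rank
    freezeBaseLayer: true  // train only the LoHa matrices, not the original weights
);

// The adapter is itself an ILayer<float>, so it can be used anywhere the base layer was.
ILayer<float> layer = adapter;
```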
Properties
ParameterCount
Gets the total number of trainable parameters.
public override int ParameterCount { get; }
Property Value
- int
Remarks
LoHa has 2 * rank * inputSize * outputSize parameters (A and B matrices for each rank). This is more than standard LoRA but still far less than full fine-tuning.
Methods
Backward(Tensor<T>)
Performs the backward pass through both layers, computing gradients for LoHa matrices.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient flowing back from the next layer.
Returns
- Tensor<T>
Gradient to pass to the previous layer.
Remarks
The backward pass computes gradients using the chain rule for Hadamard products:
dL/dA[r] = (input^T * dL/doutput) ⊙ B[r] * scaling
dL/dB[r] = (input^T * dL/doutput) ⊙ A[r] * scaling
dL/dinput = base_gradient + dL/doutput * (sum over rank of A[r] ⊙ B[r])^T * scaling
The Hadamard product gradient rule: d/dx (f ⊙ g) = df ⊙ g + f ⊙ dg
For Beginners: This is the learning phase for LoHa. It computes:
- How to adjust each A[i] matrix to reduce error
- How to adjust each B[i] matrix to reduce error
- What gradient to send to earlier layers
The math is more complex than standard LoRA because Hadamard products have different derivative rules than matrix multiplication, but the idea is the same: figure out how each parameter contributed to the error and adjust accordingly.
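The following sketch expresses these gradient rules with plain float[,] arrays for a single input vector, assuming the forward delta is input * (A[r] ⊙ B[r]) * scaling as described under Forward. It omits batching, biases, and the library's Tensor<T> types:

```csharp
// Sketch: gradients for delta = scaling * x · Σ_r (A[r] ⊙ B[r]), single sample x (length inputSize).
// gOut is dL/doutput (length outputSize). Plain arrays, not the library's actual Tensor<T>.
static void LoHaBackward(
    float[] x, float[] gOut, float[][,] A, float[][,] B, float scaling,
    out float[][,] gradA, out float[][,] gradB, out float[] gradInput)
{
    int rank = A.Length;
    int nIn = x.Length;
    int nOut = gOut.Length;

    // outer = x^T · dL/doutput (inputSize × outputSize), shared by all rank dimensions
    var outer = new float[nIn, nOut];
    for (int i = 0; i < nIn; i++)
        for (int j = 0; j < nOut; j++)
            outer[i, j] = x[i] * gOut[j];

    gradA = new float[rank][,];
    gradB = new float[rank][,];
    gradInput = new float[nIn];

    for (int r = 0; r < rank; r++)
    {
        gradA[r] = new float[nIn, nOut];
        gradB[r] = new float[nIn, nOut];
        for (int i = 0; i < nIn; i++)
            for (int j = 0; j < nOut; j++)
            {
                float had = A[r][i, j] * B[r][i, j];              // (A[r] ⊙ B[r])[i,j]
                gradA[r][i, j] = scaling * outer[i, j] * B[r][i, j];
                gradB[r][i, j] = scaling * outer[i, j] * A[r][i, j];
                gradInput[i] += scaling * gOut[j] * had;           // LoHa part of dL/dinput
            }
    }
    // The full input gradient also includes the base layer's own gradient,
    // which the adapter adds on top of this LoHa contribution.
}
```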
Forward(Tensor<T>)
Performs the forward pass through both base layer and LoHa adaptation.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor.
Returns
- Tensor<T>
Sum of base layer output and LoHa delta (computed via Hadamard products).
Remarks
The forward pass computes:
1. base_output = base_layer(input)
2. loha_delta = input * (sum over rank of A[i] ⊙ B[i]) * scaling
3. output = base_output + loha_delta
The Hadamard product (⊙) multiplies corresponding elements, allowing element-wise adaptations.
For Beginners: This runs the input through the original layer and adds a correction.
The correction is computed by:
- Computing the Hadamard product A[i] ⊙ B[i] for each rank dimension
- Multiplying the input by each of these products (a standard matrix multiply)
- Summing all rank contributions together
- Scaling by alpha/rank
This element-wise approach lets LoHa learn fine-grained adjustments to each weight independently.
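A sketch of this computation for a single input vector, with the base layer's output passed in and plain arrays standing in for the library's Tensor<T>:

```csharp
// Sketch: output = base_output + scaling * x · Σ_r (A[r] ⊙ B[r]), single sample.
static float[] LoHaForward(float[] x, float[] baseOutput, float[][,] A, float[][,] B, float scaling)
{
    int rank = A.Length;
    int nIn = x.Length;
    int nOut = baseOutput.Length;

    var output = (float[])baseOutput.Clone();
    for (int j = 0; j < nOut; j++)
    {
        float delta = 0f;
        for (int i = 0; i < nIn; i++)
        {
            float had = 0f;
            for (int r = 0; r < rank; r++)
                had += A[r][i, j] * B[r][i, j];   // Σ_r (A[r] ⊙ B[r])[i,j]
            delta += x[i] * had;                   // matrix multiply with the input
        }
        output[j] += scaling * delta;              // scaling = alpha / rank
    }
    return output;
}
```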
GetParameters()
Gets the current parameters as a vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Vector containing all LoHa parameters (A and B matrices for all ranks).
MergeToOriginalLayer()
Merges the LoHa adaptation into the base layer and returns the merged layer.
public override ILayer<T> MergeToOriginalLayer()
Returns
- ILayer<T>
A new DenseLayer with LoHa weights merged into the base layer's weights.
Remarks
This method computes the full LoHa weight delta by summing all Hadamard products: ΔW = scaling * sum over rank of (A[i] ⊙ B[i])
The delta is then added to the base layer's weights to create a merged layer.
For Beginners: This "bakes in" your LoHa adaptation to create a regular Dense layer.
The merging process:
- Computes the full weight delta from all A[i] and B[i] matrices using Hadamard products
- Adds this delta to the base layer's existing weights
- Copies biases unchanged (LoHa doesn't modify biases)
- Creates a new DenseLayer with the merged weights
After merging, you have a single layer that includes all the learned adaptations, making inference faster and simpler.
Exceptions
- InvalidOperationException
Thrown when the base layer type is not DenseLayer or FullyConnectedLayer.
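The weight merge itself reduces to an element-wise addition. A sketch with plain arrays (constructing the new DenseLayer is omitted, since its constructor isn't documented here):

```csharp
// Sketch: merged weights = base weights + scaling * Σ_r (A[r] ⊙ B[r]).
// Biases are left untouched, matching the behavior described above.
static float[,] MergeWeights(float[,] baseWeights, float[][,] A, float[][,] B, float scaling)
{
    int rows = baseWeights.GetLength(0);
    int cols = baseWeights.GetLength(1);
    var merged = (float[,])baseWeights.Clone();

    for (int r = 0; r < A.Length; r++)
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                merged[i, j] += scaling * A[r][i, j] * B[r][i, j];

    return merged;
}
```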
ResetState()
Resets the internal state of both the base layer and LoHa adapter.
public override void ResetState()
Remarks
For Beginners: This clears the memory of the adapter and base layer. It's useful when starting to process a completely new, unrelated batch of data.
SetParameters(Vector<T>)
Sets the layer parameters from a vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Vector containing all LoHa parameters.
UpdateParameters(T)
Updates parameters using the specified learning rate.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate for parameter updates.
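The documentation doesn't specify the update rule, but a plain gradient-descent step over the A and B matrices would look like the sketch below; the adapter's actual implementation may store parameters differently or delegate to an optimizer:

```csharp
// Sketch: plain gradient-descent update p -= learningRate * grad for each LoHa matrix entry.
// This only illustrates the role of learningRate; it is not the adapter's actual code.
static void SgdStep(float[][,] param, float[][,] grad, float learningRate)
{
    for (int r = 0; r < param.Length; r++)
        for (int i = 0; i < param[r].GetLength(0); i++)
            for (int j = 0; j < param[r].GetLength(1); j++)
                param[r][i, j] -= learningRate * grad[r][i, j];
}

// Applied to both matrix stacks after Backward has computed their gradients:
// SgdStep(A, gradA, learningRate);
// SgdStep(B, gradB, learningRate);
```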