Class FloraAdapter<T>
Implements Flora (Low-Rank Adapters Are Secretly Gradient Compressors) adapter for memory-efficient fine-tuning.
public class FloraAdapter<T> : LoRAAdapterBase<T>, IDisposable, ILoRAAdapter<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
LayerBase<T> → LoRAAdapterBase<T> → FloraAdapter<T>
- Implements
ILoRAAdapter<T>, ILayer<T>
- Inherited Members
Remarks
Flora reinterprets LoRA as a gradient compression mechanism and achieves high-rank updates through periodic resampling of projection matrices while maintaining sublinear space complexity for optimizer states.
Research Paper: "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" by Yongchang Hao et al., ICML 2024. arXiv:2402.03293
Key Innovation: Unlike standard LoRA which restricts weight updates to a fixed low-rank subspace, Flora periodically resamples the projection matrices (A and B), allowing the effective rank of cumulative updates to grow over time. This achieves performance comparable to full-rank fine-tuning while maintaining the memory efficiency of LoRA.
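To make the resampling idea concrete, the following is a minimal, self-contained sketch of the update loop described above. It is not the FloraAdapter<T> internals: the dimensions, the stand-in random gradient, and helpers such as Gaussian, MatMul and Transpose are all illustrative, and the second-moment / adaptive learning-rate handling is omitted.
```csharp
using System;

// Conceptual sketch of Flora's compressed-momentum update (illustrative only).
class FloraSketch
{
    const int InputDim = 64, OutputDim = 32, Rank = 4;
    const int TotalSteps = 3000, ResamplingInterval = 1000;
    const double MomentumDecay = 0.9, LearningRate = 1e-3;

    static void Main()
    {
        var rng = new Random(42);
        double[,] weights = new double[OutputDim, InputDim];
        double[,] A = Gaussian(rng, Rank, InputDim, 1.0 / Math.Sqrt(Rank)); // down-projection
        double[,] m = new double[OutputDim, Rank];                          // compressed momentum

        for (int step = 1; step <= TotalSteps; step++)
        {
            double[,] grad = Gaussian(rng, OutputDim, InputDim); // stand-in for the true gradient

            // Periodic resampling: a fresh projection lets the cumulative update
            // escape the single rank-Rank subspace that plain LoRA is confined to.
            if (step % ResamplingInterval == 0)
            {
                double[,] newA = Gaussian(rng, Rank, InputDim, 1.0 / Math.Sqrt(Rank));
                // Re-express the stored momentum in the new subspace: m <- m * A * newA^T
                m = MatMul(MatMul(m, A), Transpose(newA));
                A = newA;
            }

            // Compress the gradient (the "gradient compressor" view of LoRA):
            // only an OutputDim x Rank momentum is stored, never OutputDim x InputDim.
            double[,] cg = MatMul(grad, Transpose(A));
            for (int i = 0; i < OutputDim; i++)
                for (int j = 0; j < Rank; j++)
                    m[i, j] = MomentumDecay * m[i, j] + (1 - MomentumDecay) * cg[i, j];

            // Decompress and apply the update to the full weight matrix.
            double[,] update = MatMul(m, A);
            for (int i = 0; i < OutputDim; i++)
                for (int j = 0; j < InputDim; j++)
                    weights[i, j] -= LearningRate * update[i, j];
        }
        Console.WriteLine($"final weights[0,0] = {weights[0, 0]:F6}");
    }

    // Matrix of i.i.d. normal samples (Box-Muller), optionally scaled.
    static double[,] Gaussian(Random rng, int rows, int cols, double scale = 1.0)
    {
        var x = new double[rows, cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                x[i, j] = scale * Math.Sqrt(-2.0 * Math.Log(1.0 - rng.NextDouble()))
                                * Math.Cos(2.0 * Math.PI * rng.NextDouble());
        return x;
    }

    static double[,] MatMul(double[,] a, double[,] b)
    {
        int n = a.GetLength(0), k = a.GetLength(1), p = b.GetLength(1);
        var c = new double[n, p];
        for (int i = 0; i < n; i++)
            for (int t = 0; t < k; t++)
                for (int j = 0; j < p; j++)
                    c[i, j] += a[i, t] * b[t, j];
        return c;
    }

    static double[,] Transpose(double[,] a)
    {
        var t = new double[a.GetLength(1), a.GetLength(0)];
        for (int i = 0; i < a.GetLength(0); i++)
            for (int j = 0; j < a.GetLength(1); j++)
                t[j, i] = a[i, j];
        return t;
    }
}
```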
Constructors
FloraAdapter(ILayer<T>, int, double, int, double, double, bool, bool, int)
public FloraAdapter(ILayer<T> baseLayer, int rank, double alpha = -1, int resamplingInterval = 1000, double momentumDecay = 0.9, double secondMomentDecay = 0.999, bool useAdaptiveLearningRate = true, bool freezeBaseLayer = true, int seed = 42)
Parameters
- baseLayer (ILayer<T>)
- rank (int)
- alpha (double)
- resamplingInterval (int)
- momentumDecay (double)
- secondMomentDecay (double)
- useAdaptiveLearningRate (bool)
- freezeBaseLayer (bool)
- seed (int)
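For illustration, a hedged construction example follows. DenseLayer<float> and its constructor arguments are stand-ins for whatever ILayer<float> implementation you are wrapping; the remaining argument values simply echo the defaults shown in the signature above.
```csharp
// Illustrative only: DenseLayer<float> stands in for any ILayer<float> you want to adapt.
ILayer<float> baseLayer = new DenseLayer<float>(inputSize: 768, outputSize: 768);

var flora = new FloraAdapter<float>(
    baseLayer,
    rank: 8,                     // dimension of the low-rank projections
    alpha: -1,                   // -1 keeps the default scaling
    resamplingInterval: 1000,    // resample projection matrices every 1000 steps
    momentumDecay: 0.9,
    secondMomentDecay: 0.999,
    useAdaptiveLearningRate: true,
    freezeBaseLayer: true,       // train only the adapter; base weights stay fixed
    seed: 42);
```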
Properties
CurrentStep
Gets the number of update steps taken so far.
public int CurrentStep { get; }
Property Value
int
ResamplingInterval
Gets the number of training steps between resamplings of the projection matrices.
public int ResamplingInterval { get; }
Property Value
int
Methods
MergeToOriginalLayer()
Merges the LoRA adaptation into the base layer and returns the merged layer.
public override ILayer<T> MergeToOriginalLayer()
Returns
- ILayer<T>
A new layer with LoRA weights merged into the base layer's weights.
Remarks
This method must be implemented by derived classes to handle layer-type-specific merging logic. Each type of adapter (Dense, Convolutional, etc.) needs to know how to combine its LoRA weights with the base layer's weights.
For Beginners: This "bakes in" your LoRA adaptation to create a regular layer. After training with LoRA, you can merge the adaptation into the original weights for:
- Faster inference (no need to compute LoRA separately)
- Simpler deployment (single layer instead of two)
- Compatibility with systems that don't support LoRA
Each layer type implements this differently because they have different internal structures.
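For example, reusing the hypothetical `flora` adapter constructed earlier:
```csharp
// After fine-tuning, bake the low-rank adaptation into a single ordinary layer.
ILayer<float> merged = flora.MergeToOriginalLayer();

// 'merged' can now be deployed like any other layer: no LoRA-specific support
// is needed and no extra adapter computation happens at inference time.
```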
ResetState()
Resets the internal state of both the base layer and LoRA layer.
public override void ResetState()
Remarks
For Beginners: This clears the memory of both the base layer and the LoRA layer. It's useful when starting to process a completely new, unrelated batch of data.
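For example:
```csharp
// Clear accumulated internal state before processing an unrelated batch of data.
flora.ResetState();
```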
UpdateParameters(T)
Updates parameters using the specified learning rate.
public override void UpdateParameters(T learningRate)
Parameters
- learningRate (T): The learning rate for parameter updates.
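A hedged sketch of where this call fits in a training step; Forward, Backward, inputBatch, targets and ComputeLossGradient are illustrative names, as only UpdateParameters is documented on this page.
```csharp
// One training step (only UpdateParameters is documented here; other names are illustrative).
var output = flora.Forward(inputBatch);
var lossGradient = ComputeLossGradient(output, targets);  // gradient of your loss w.r.t. output
flora.Backward(lossGradient);

// Apply the Flora update with the chosen learning rate. Given the CurrentStep and
// ResamplingInterval properties above, this is presumably where the step counter
// advances and periodic resampling is triggered (an assumption, not documented here).
flora.UpdateParameters(0.001f);
```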