Class GRUNeuralNetwork<T>

Namespace: AiDotNet.NeuralNetworks
Assembly: AiDotNet.dll

Represents a Gated Recurrent Unit (GRU) Neural Network for processing sequential data.

public class GRUNeuralNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
NeuralNetworkBase<T>
GRUNeuralNetwork<T>
Implements
INeuralNetworkModel<T>
INeuralNetwork<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>
IInterpretableModel<T>
IInputGradientComputable<T>
IDisposable

Remarks

A GRU Neural Network is a type of recurrent neural network designed to effectively model sequential data. GRU networks use gating mechanisms to control the flow of information through the network, allowing them to capture long-term dependencies in sequence data while mitigating the vanishing gradient problem that affects simple recurrent networks.

For Beginners: A GRU Neural Network is a special type of neural network that's good at processing data that comes in sequences.

Think of it like reading a book:

  • A regular neural network would look at each word in isolation
  • A GRU network remembers what it read earlier and uses that context to understand each new word

GRU networks have special "gates" that control what information to remember and what to forget (the standard gate equations are sketched after this list):

  • This helps them understand patterns that stretch over long sequences
  • For example, in a sentence like "John went to the store because he needed milk," a GRU can connect "he" with "John"
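For reference, the standard GRU step can be written with two gates and a candidate state. This is the general GRU formulation; the exact internal convention used by this class is not shown in this section:

    z_t  = sigmoid(W_z · x_t + U_z · h_{t-1} + b_z)          (update gate)
    r_t  = sigmoid(W_r · x_t + U_r · h_{t-1} + b_r)          (reset gate)
    h~_t = tanh(W_h · x_t + U_h · (r_t ⊙ h_{t-1}) + b_h)     (candidate state)
    h_t  = z_t ⊙ h_{t-1} + (1 - z_t) ⊙ h~_t                  (new hidden state)

Here x_t is the input at step t, h_{t-1} is the previous hidden state, and ⊙ is element-wise multiplication. The reset gate r_t decides how much of the past to use when forming the candidate, and the update gate z_t decides how much of the old state to keep versus replace.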

GRU networks are useful for:

  • Text processing and generation
  • Time series prediction (like stock prices or weather)
  • Speech recognition
  • Any task where the order and context of data matters

Constructors

GRUNeuralNetwork(NeuralNetworkArchitecture<T>, ILossFunction<T>?)

Initializes a new instance of the GRUNeuralNetwork<T> class with the specified architecture.

public GRUNeuralNetwork(NeuralNetworkArchitecture<T> architecture, ILossFunction<T>? lossFunction = null)

Parameters

architecture NeuralNetworkArchitecture<T>

The neural network architecture defining the structure of the network.

lossFunction ILossFunction<T>

The loss function used to measure prediction error during training. Optional; if null, the network selects a default loss function.

Remarks

This constructor creates a GRU neural network with the specified architecture. The architecture defines important properties like input size, hidden layer sizes, and output size of the network.

For Beginners: This creates a new GRU neural network with your chosen design.

When you create a GRU network, you specify its architecture, which is like a blueprint that defines:

  • How many inputs the network accepts
  • How many hidden units to use (the network's "memory capacity")
  • How many outputs the network produces
  • Other structural aspects of the network

Think of it like defining the floor plan before building a house - you're setting up the basic structure that will determine how information flows through your network.
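A minimal construction sketch, assuming float precision. The architecture setup is shown only as a placeholder helper call, because the NeuralNetworkArchitecture<T> constructor and its options are not documented in this section:

    using AiDotNet.NeuralNetworks;

    // Placeholder: how you configure the architecture (input size, hidden units,
    // output size) depends on the NeuralNetworkArchitecture<T> API, which is not
    // shown here. BuildArchitecture() is a hypothetical helper you would write.
    NeuralNetworkArchitecture<float> architecture = BuildArchitecture();

    // Create the GRU network. The loss function parameter is optional; omitting it
    // lets the network fall back to its default.
    var gru = new GRUNeuralNetwork<float>(architecture);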

Methods

CreateNewInstance()

Creates a new instance of the GRU Neural Network with the same architecture and configuration.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new GRU Neural Network instance with the same architecture and configuration.

Remarks

This method creates a new instance of the GRU Neural Network with the same architecture as the current instance. It's used in scenarios where a fresh copy of the model is needed while maintaining the same configuration.

For Beginners: This method creates a brand new copy of the neural network with the same setup.

Think of it like creating a clone of the network:

  • The new network has the same architecture (structure)
  • But it's a completely separate instance with its own parameters and learning state

This is useful when you need multiple instances of the same GRU model, such as for ensemble learning or comparing different training approaches.

DeserializeNetworkSpecificData(BinaryReader)

Loads GRU-specific data from a binary stream.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to load from.

Remarks

This method deserializes GRU-specific data that was previously saved using SerializeNetworkSpecificData. It restores any special configuration or state that is unique to GRU networks.

For Beginners: This method loads special GRU settings from a file.

When loading a saved model:

  • The base neural network parts are loaded by other methods
  • This method loads any GRU-specific settings or state

This ensures that the loaded model functions exactly like the original one that was saved.

ForwardWithMemory(Tensor<T>)

Performs a forward pass while storing intermediate values for backpropagation.

public override Tensor<T> ForwardWithMemory(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor.

Returns

Tensor<T>

The output tensor from the forward pass.

Remarks

For Beginners: This method processes the input through the network while remembering intermediate values needed for learning.

Think of it like solving a math problem and showing your work - the network needs to keep track of intermediate steps to understand how to improve.
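A minimal sketch, assuming gru is an already constructed GRUNeuralNetwork<float> and inputSequence is a prepared Tensor<float>:

    // Forward pass that also caches the intermediate values needed for
    // backpropagation, e.g. when driving gradient computation manually
    // instead of calling Train.
    Tensor<float> output = gru.ForwardWithMemory(inputSequence);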

GetModelMetadata()

Gets metadata about this GRU Neural Network model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing information about the model.

Remarks

This method returns metadata about the model, including its name, description, architecture, and other relevant information that might be useful for model management and serialization.

For Beginners: This method provides information about this neural network model.

The metadata includes:

  • The type of model (GRU Neural Network)
  • The network architecture (how many layers, neurons, etc.)
  • Configuration details specific to GRU networks

This information is useful for documentation, debugging, and when saving/loading models.
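A minimal sketch; the individual properties exposed by ModelMetadata<T> are not listed in this section, so only the call itself is shown:

    // Retrieve descriptive information about the network, e.g. for logging
    // or for keeping alongside a saved checkpoint.
    ModelMetadata<float> metadata = gru.GetModelMetadata();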

InitializeLayers()

Initializes the layers of the neural network based on the provided architecture.

protected override void InitializeLayers()

Remarks

This method sets up the layers of the GRU neural network. If the architecture provides specific layers, those are used directly. Otherwise, default layers appropriate for a GRU neural network are created.

For Beginners: This method sets up the building blocks of your neural network.

When initializing layers:

  • If you provided specific layers in the architecture, those are used
  • If not, the network creates a standard set of GRU layers automatically

This is like assembling all the components of your network before training begins. The standard GRU layers typically include:

  • Input layers to receive your data
  • GRU layers that process sequential information
  • Output layers that produce the final prediction

Predict(Tensor<T>)

Performs a forward pass through the network and generates predictions.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to the network, typically a sequence.

Returns

Tensor<T>

The output tensor produced by the network.

Remarks

This method processes the input tensor through all layers of the GRU network in sequence, applying the appropriate transformations at each step. For sequential data, the input is typically a 3D tensor with dimensions [batch_size, sequence_length, feature_size].

For Beginners: This method takes your input data and runs it through the neural network to get a prediction.

For example, if your input is a sequence of words:

  1. Each word is passed through the network one at a time
  2. The GRU remembers information from previous words
  3. After processing the entire sequence, the network produces its prediction

This is similar to how you might read a sentence and understand its meaning by considering each word in context with the ones before it.
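A minimal prediction sketch, assuming inputSequence is a Tensor<float> already shaped as [batch_size, sequence_length, feature_size]; how you construct the tensor depends on the Tensor<T> API, which is not covered in this section:

    // inputSequence shape: [batchSize, sequenceLength, featureSize],
    // e.g. 32 sequences, each 10 steps long, with 8 features per step.
    Tensor<float> predictions = gru.Predict(inputSequence);
    // predictions holds the network's output for every sequence in the batch.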

SerializeNetworkSpecificData(BinaryWriter)

Saves GRU-specific data to a binary stream.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to save to.

Remarks

This method serializes any GRU-specific data that isn't part of the base neural network. In the case of a GRU network, this might include sequence-specific settings or state.

For Beginners: This method saves special GRU settings to a file.

When saving the model:

  • The base neural network parts are saved by other methods
  • This method saves any GRU-specific settings or state

This ensures that when you reload the model, it will have all the same settings and capabilities as the original.

Train(Tensor<T>, Tensor<T>)

Trains the GRU network using the provided input and expected output.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input tensor for training.

expectedOutput Tensor<T>

The expected output tensor for calculating error.

Remarks

This method implements the training process for GRU networks using backpropagation through time (BPTT). It forward propagates the input, calculates the error by comparing with the expected output, and then backpropagates the error to update the network parameters.

For Beginners: This method is how the network learns from examples.

The training process works like this:

  1. The network makes a prediction based on the input sequence
  2. The prediction is compared to the expected output to calculate the error
  3. The error is used to adjust the network's internal values (parameters)
  4. Over time, these adjustments help the network make better predictions

In GRU networks, training is more complex because the error needs to flow backwards through time (across the sequence), but this complexity is handled internally.
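A minimal training-loop sketch, assuming inputs and expectedOutputs are prepared Tensor<float> batches and gru was constructed as shown earlier; shuffling, mini-batching, and validation are omitted:

    for (int epoch = 0; epoch < 100; epoch++)
    {
        // Each call runs a forward pass, measures the error against the
        // expected output, backpropagates it through time, and updates
        // the network's parameters.
        gru.Train(inputs, expectedOutputs);
    }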

UpdateParameters(Vector<T>)

Updates the parameters of all layers in the network using the provided parameter vector.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing updated parameters for all layers.

Remarks

This method distributes the provided parameter values to each layer in the network. It extracts the appropriate segment of the parameter vector for each layer based on the layer's parameter count.

For Beginners: This method updates all the learned values in the network.

During training, a neural network adjusts its internal values (parameters) to make better predictions. This method:

  1. Takes a long list of new parameter values
  2. Figures out which values belong to which layers
  3. Updates each layer with its corresponding values

Think of it like fine-tuning different parts of a machine based on how well it performed. GRU networks have several important parameters:

  • Update gate parameters: control how much of the previous hidden state is kept versus replaced with new information
  • Reset gate parameters: control how much past information is used when computing the new candidate state
  • Candidate state parameters: compute the new information that can be written into the hidden state

This method ensures all these parameters get updated correctly during training.
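A conceptual sketch of the slicing described above. The layer members used below (ParameterCount, SetParameters) and the Slice helper are hypothetical names chosen only to illustrate the idea; the actual AiDotNet layer API may differ:

    // Conceptual illustration only -- member names are hypothetical.
    int offset = 0;
    foreach (var layer in layers)
    {
        int count = layer.ParameterCount;            // values owned by this layer
        var slice = parameters.Slice(offset, count); // that layer's segment of the vector
        layer.SetParameters(slice);                  // hand the segment to the layer
        offset += count;                             // advance to the next segment
    }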