Class ResidualNeuralNetwork<T>
- Namespace: AiDotNet.NeuralNetworks
- Assembly: AiDotNet.dll
Represents a Residual Neural Network, which is a type of neural network that uses skip connections to address the vanishing gradient problem in deep networks.
public class ResidualNeuralNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IAuxiliaryLossLayer<T>, IDiagnosticsProvider
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- object → NeuralNetworkBase<T> → ResidualNeuralNetwork<T>
Remarks
A Residual Neural Network (ResNet) is an advanced neural network architecture that introduces "skip connections" or "shortcuts" that allow information to bypass one or more layers. These residual connections help address the vanishing gradient problem that occurs in very deep networks, enabling the training of networks with many more layers than previously possible. ResNets were a breakthrough in deep learning that significantly improved performance on image recognition and other tasks.
For Beginners: A Residual Neural Network is like a highway system for information in a neural network.
Think of it like this:
- In a traditional neural network, information must pass through every layer sequentially
- In a ResNet, there are "shortcut paths" or "highways" that let information skip ahead
For example, imagine trying to pass a message through a line of 100 people:
- In a regular network, each person must whisper to the next person in line
- In a ResNet, some people can also shout directly to someone 5 positions ahead
This design solves a major problem: in very deep networks (many layers), information and learning signals tend to fade away or "vanish" as they travel through many layers. The shortcuts in ResNets help information flow more easily through the network, allowing for much deeper networks (some with over 100 layers!) that can learn more complex patterns.
ResNets revolutionized image recognition and are now used in many AI systems that need to identify complex patterns in data.
Constructors
ResidualNeuralNetwork(NeuralNetworkArchitecture<T>, T?, int, int, ILossFunction<T>?)
Initializes a new instance of the ResidualNeuralNetwork<T> class with the specified architecture.
public ResidualNeuralNetwork(NeuralNetworkArchitecture<T> architecture, T? learningRate = default, int epochs = 10, int batchSize = 32, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture to use for the ResNet.
learningRate (T?): The learning rate for training. Default is 0.01 converted to type T.
epochs (int): The number of training epochs. Default is 10.
batchSize (int): The batch size for training. Default is 32.
lossFunction (ILossFunction<T>?): Optional custom loss function. If null, a default will be chosen based on the task type.
Remarks
This constructor creates a new Residual Neural Network with the specified architecture. It initializes the network layers based on the architecture, or creates default ResNet layers if no specific layers are provided.
For Beginners: This sets up the Residual Neural Network with its basic structure.
When creating a new ResNet:
- The architecture defines what the network looks like - how many layers it has, how they're connected, etc.
- The constructor prepares the network by either:
- Using the specific layers provided in the architecture, or
- Creating default layers designed for ResNets if none are specified
The default ResNet layers include special residual blocks that have both:
- A main path where information is processed through multiple layers
- A shortcut path that allows information to skip these layers
This combination of paths is what gives ResNets their special ability to train very deep networks.
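For example, a minimal construction sketch. The NeuralNetworkArchitecture<float> setup is illustrative only; its constructor arguments are assumptions and depend on your version of AiDotNet:

// Illustrative: the architecture's constructor arguments are assumptions.
var architecture = new NeuralNetworkArchitecture<float>(/* input shape, layer configuration, task type */);

// Create a ResNet with a custom learning rate and training schedule.
var resnet = new ResidualNeuralNetwork<float>(
    architecture,
    learningRate: 0.001f,
    epochs: 20,
    batchSize: 64);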
Properties
AuxiliaryLossWeight
Gets or sets the weight for the deep supervision auxiliary loss.
public T AuxiliaryLossWeight { get; set; }
Property Value
- T
Remarks
This weight controls how much the intermediate auxiliary classifiers contribute to the total loss. The total loss is: main_loss + (auxiliary_weight * auxiliary_loss). Typical values range from 0.1 to 0.5.
For Beginners: This controls how much the network should care about intermediate predictions.
The weight determines the balance between:
- Final output accuracy (main loss)
- Intermediate prediction accuracy (auxiliary loss)
Common values:
- 0.3 (default): Balanced contribution from intermediate classifiers
- 0.1-0.2: Less emphasis on intermediate predictions
- 0.4-0.5: More emphasis on intermediate predictions
Higher values make the network focus more on getting intermediate predictions correct, which can help with gradient flow but may slow convergence.
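A short sketch of tuning this weight on an existing network instance (resnet here is assumed to be a ResidualNeuralNetwork<float>):

// Enable deep supervision and give intermediate classifiers a modest influence.
resnet.UseAuxiliaryLoss = true;
resnet.AuxiliaryLossWeight = 0.3f; // total loss = main_loss + 0.3 * auxiliary_loss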
SupportsTraining
Indicates whether this network supports training (learning from data).
public override bool SupportsTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the network is capable of learning from data through training. For ResidualNeuralNetwork, this property always returns true since the network is designed for training.
For Beginners: This tells you if the network can learn from data.
The Residual Neural Network supports training, which means:
- It can adjust its internal values based on examples
- It can improve its performance over time
- It can learn to recognize patterns in data
This property always returns true because ResNets are specifically designed to be trainable, even when they're very deep (many layers).
UseAuxiliaryLoss
Gets or sets a value indicating whether deep supervision auxiliary loss is enabled during training.
public bool UseAuxiliaryLoss { get; set; }
Property Value
- bool
Methods
AddAuxiliaryClassifier(ILayer<T>, int)
Adds an auxiliary classifier at the specified layer position for deep supervision.
public void AddAuxiliaryClassifier(ILayer<T> classifier, int layerPosition)
Parameters
classifier (ILayer<T>): The classifier layer to add for intermediate predictions.
layerPosition (int): The layer index where this classifier should be applied.
Remarks
Auxiliary classifiers enable deep supervision by providing additional training signals at intermediate layers. This helps with gradient flow and can improve training stability.
For Beginners: Think of auxiliary classifiers as "checkpoints" in your network. They make predictions at intermediate stages, helping the network learn better representations at each layer rather than only at the final output.
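For example (DenseLayer<float> is a hypothetical layer type standing in for any ILayer<T> that maps intermediate features to the output shape):

// Hypothetical classifier layer; substitute any suitable ILayer<float> implementation.
ILayer<float> auxClassifier = new DenseLayer<float>(/* intermediate feature size, output size */);

// Attach it at layer index 10 so the network receives a training signal mid-way through.
resnet.AddAuxiliaryClassifier(auxClassifier, layerPosition: 10);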
ComputeAuxiliaryLoss()
Computes the auxiliary loss for deep supervision from intermediate auxiliary classifiers.
public T ComputeAuxiliaryLoss()
Returns
- T
The computed deep supervision auxiliary loss.
Remarks
This method computes the auxiliary loss from intermediate classifiers placed at strategic positions in the network. For very deep ResNets, these intermediate classifiers help maintain strong gradient signals throughout the network during backpropagation.
For Beginners: This calculates how well the network's intermediate layers are learning.
Deep supervision works by:
- Adding small classifiers at intermediate points in the network
- Each classifier tries to predict the final output from intermediate features
- Computing loss for each intermediate prediction
- Averaging these losses to get the auxiliary loss
This helps because:
- It provides learning signals to earlier layers
- It prevents gradients from becoming too weak in deep networks
- It encourages intermediate layers to learn meaningful features
The auxiliary loss is combined with the main loss during training to guide learning.
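Train(Tensor<T>, Tensor<T>) applies this combination automatically; the sketch below only illustrates the formula from the remarks (mainLoss is a stand-in for your main loss value):

// Combine the main loss with the deep supervision signal.
float auxLoss = resnet.ComputeAuxiliaryLoss();
float totalLoss = mainLoss + resnet.AuxiliaryLossWeight * auxLoss; // main_loss + weight * auxiliary_loss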
CreateNewInstance()
Creates a new instance of the residual neural network with the same configuration.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of ResidualNeuralNetwork<T> with the same configuration as the current instance.
Remarks
This method creates a new residual neural network that has the same configuration as the current instance. It's used for model persistence, cloning, and transferring the model's configuration to new instances. The new instance will have the same architecture, learning rate, epochs, batch size, and loss function as the original, but will not share parameter values unless they are explicitly copied after creation.
For Beginners: This method makes a fresh copy of the current model with the same settings.
It's like creating a blueprint copy of your network that can be used to:
- Save your model's settings
- Create a new identical model
- Transfer your model's configuration to another system
This is useful when you want to:
- Create multiple similar residual neural networks
- Save a model's configuration for later use
- Reset a model while keeping its settings
Note that while the settings are copied, the learned parameters (like the weights for detecting features) are not automatically transferred, so the new instance will need training or parameter copying to match the performance of the original.
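Because CreateNewInstance() is protected, user code typically goes through the ICloneable implementation instead. A sketch, assuming ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>> exposes a Clone() method:

// Produce a structurally identical network; learned weights are not copied automatically.
IFullModel<float, Tensor<float>, Tensor<float>> fresh = resnet.Clone();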
DeserializeNetworkSpecificData(BinaryReader)
Deserializes network-specific data for the Residual Neural Network.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The BinaryReader to read the data from.
Remarks
This method reads the training parameters specific to the Residual Neural Network from the provided BinaryReader. It restores the number of epochs, learning rate, and batch size, ensuring that the network's training configuration is accurately reconstructed during deserialization.
For Beginners: This method loads the special settings for training this ResNet.
It reads:
- The number of times to train on the entire dataset (epochs)
- How quickly the network learns from its mistakes (learning rate)
- How many examples the network looks at before updating (batch size)
Loading these settings ensures that you can continue training or use the network with the exact same configuration it had when it was saved.
GetAuxiliaryLossDiagnostics()
Gets diagnostic information about the deep supervision auxiliary loss.
public Dictionary<string, string> GetAuxiliaryLossDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic information about auxiliary losses.
Remarks
This method returns detailed diagnostics about the deep supervision system, including the number of auxiliary classifiers, their positions in the network, and the computed losses. This information is useful for monitoring training progress and debugging.
For Beginners: This provides information about how deep supervision is working.
The diagnostics include:
- Total auxiliary loss from all intermediate classifiers
- Weight applied to the auxiliary loss
- Number of auxiliary classifiers in the network
- Whether deep supervision is enabled
This helps you:
- Monitor if auxiliary classifiers are contributing to training
- Debug issues with deep supervision
- Understand the impact of intermediate supervision on learning
You can use this information to adjust the auxiliary loss weight or the placement of auxiliary classifiers for better training results.
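A minimal usage sketch (the exact key names in the returned dictionary may vary by version):

// Print every deep supervision diagnostic as a key/value pair.
foreach (var entry in resnet.GetAuxiliaryLossDiagnostics())
{
    Console.WriteLine($"{entry.Key}: {entry.Value}");
}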
GetDiagnostics()
Gets diagnostic information about this component's state and behavior. Includes auxiliary loss diagnostics from GetAuxiliaryLossDiagnostics().
public Dictionary<string, string> GetDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic metrics including auxiliary loss diagnostics.
GetModelMetadata()
Gets metadata about the Residual Neural Network model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
This method returns metadata that describes the Residual Neural Network, including its type, architecture details, and training parameters. This information can be useful for model management, documentation, and versioning.
For Beginners: This provides a summary of your network's configuration.
The metadata includes:
- The type of model (Residual Neural Network)
- The number of layers in the network
- Information about the network's structure
- Training parameters like learning rate and epochs
This is useful for:
- Documenting your model
- Comparing different model configurations
- Reproducing your model setup later
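A usage sketch (the individual members of ModelMetadata<T> are not enumerated here, since they depend on your AiDotNet version):

// Retrieve the model summary for documentation or comparison.
ModelMetadata<float> metadata = resnet.GetModelMetadata();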
InitializeLayers()
Initializes the neural network layers based on the provided architecture or default configuration.
protected override void InitializeLayers()
Remarks
This method sets up the neural network layers for the Residual Neural Network. If the architecture provides specific layers, those are used. Otherwise, a default configuration optimized for ResNets is created. In a typical ResNet, this involves creating residual blocks that combine a main path with a shortcut path, allowing information to either pass through layers or bypass them.
For Beginners: This method sets up the building blocks of the neural network.
When initializing layers:
- If the user provided specific layers, those are used
- Otherwise, default layers suitable for ResNets are created automatically
- The system checks that any custom layers will work properly with the ResNet
A typical ResNet has specialized building blocks called "residual blocks" that contain:
- Convolutional layers that process the input
- Batch normalization layers that stabilize learning
- Activation layers that introduce non-linearity
- Shortcut connections that allow information to bypass these layers
These blocks are then stacked together, often with increasing complexity as you go deeper into the network.
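The computation a residual block performs can be summarized as output = F(x) + x, where F is the main path. A conceptual sketch, not the actual AiDotNet layer code (Convolution, BatchNorm, Activation, and Add are hypothetical helpers):

// Conceptual residual block: F(x) is the main path, x is the shortcut.
Tensor<float> ResidualBlock(Tensor<float> x)
{
    var mainPath = Activation(BatchNorm(Convolution(x))); // hypothetical main-path helpers
    return Add(mainPath, x);                              // element-wise sum with the shortcut
}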
Predict(Tensor<T>)
Makes a prediction using the Residual Neural Network.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to make a prediction for.
Returns
- Tensor<T>
The predicted output tensor.
Remarks
This method performs a forward pass through the network to generate a prediction based on the input tensor. The input flows through all layers sequentially, with residual connections allowing information to bypass certain layers where applicable. The output represents the network's prediction, which depends on the task (e.g., class probabilities for classification or continuous values for regression).
For Beginners: This method uses the network to make a prediction based on input data.
The prediction process works like this:
- Input data enters the network at the first layer
- The data passes through each layer in sequence
- At residual blocks, there are two paths:
- A main path through multiple processing layers
- A shortcut path that bypasses these layers
- The outputs from both paths are combined at the end of each block
- The final layer produces the prediction result
For example, in an image recognition task:
- The input might be an image
- Each layer detects increasingly complex patterns
- The shortcuts help information flow through the entire network
- The output tells you what the image contains
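A usage sketch (how you build the input Tensor<T> depends on your data pipeline; LoadImageAsTensor is a hypothetical helper):

// Hypothetical helper that converts an image file into a tensor.
Tensor<float> image = LoadImageAsTensor("cat.jpg");

// Run a forward pass; for classification the output typically holds one score per class.
Tensor<float> prediction = resnet.Predict(image);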
SerializeNetworkSpecificData(BinaryWriter)
Serializes network-specific data for the Residual Neural Network.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The BinaryWriter to write the data to.
Remarks
This method writes the training parameters specific to the Residual Neural Network to the provided BinaryWriter. These parameters include the number of epochs, learning rate, and batch size, which are crucial for reconstructing the network's training configuration during deserialization.
For Beginners: This method saves the special settings for training this ResNet.
It writes:
- The number of times to train on the entire dataset (epochs)
- How quickly the network learns from its mistakes (learning rate)
- How many examples the network looks at before updating (batch size)
These settings are important because they affect how the network learns and performs. Saving them allows you to recreate the exact same training setup later.
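A conceptual sketch of the payload this method writes and DeserializeNetworkSpecificData(BinaryReader) reads back. The field order and encoding shown are assumptions; the real layout is internal to AiDotNet:

// Assumed write order, mirrored on the read side:
writer.Write(epochs);                         // int: passes over the dataset
writer.Write(Convert.ToDouble(learningRate)); // learning rate, stored as a double
writer.Write(batchSize);                      // int: examples per parameter update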
Train(Tensor<T>, Tensor<T>)
Trains the Residual Neural Network on the provided data.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input training data.
expectedOutput (Tensor<T>): The expected output for the given input.
Remarks
This method trains the Residual Neural Network on the provided data for the specified number of epochs. It divides the data into batches and trains on each batch using backpropagation and gradient descent. The method tracks and reports the average loss for each epoch to monitor training progress. If deep supervision is enabled and auxiliary classifiers are configured, auxiliary losses from intermediate classifiers are included.
For Beginners: This method teaches the ResNet to recognize patterns in your data.
The training process works like this:
- Divides your data into smaller batches for efficient processing
- For each batch:
- Feeds the input data through the network
- Compares the prediction with the expected output
- Calculates how wrong the prediction was (the "loss")
- If deep supervision is enabled, also computes losses from intermediate classifiers
- Adjusts the network's parameters to reduce errors
- Repeats this process for multiple epochs (complete passes through the data)
The special residual connections in the ResNet help the error signals flow backward through the network more effectively, making it possible to train very deep networks that would otherwise suffer from the vanishing gradient problem.
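A usage sketch (LoadTrainingInputs and LoadTrainingLabels are hypothetical helpers for building the batched tensors):

// Hypothetical helpers that produce the training tensors.
Tensor<float> trainingInputs = LoadTrainingInputs();
Tensor<float> trainingLabels = LoadTrainingLabels();

// Runs the configured number of epochs with the configured batch size.
resnet.Train(trainingInputs, trainingLabels);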
UpdateParameters(Vector<T>)
Updates the parameters of the residual neural network layers.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The vector of parameter updates to apply.
Remarks
This method updates the parameters of each layer in the residual neural network based on the provided parameter updates. The parameters vector is divided into segments corresponding to each layer's parameter count, and each segment is applied to its respective layer. In a ResNet, these parameters typically include weights for convolutional layers, as well as parameters for batch normalization and other operations within residual blocks.
For Beginners: This method updates how the ResNet makes decisions based on training.
During training:
- The network learns by adjusting its internal parameters
- This method applies those adjustments
- Each layer gets the portion of updates meant specifically for it
For a ResNet, these adjustments might include:
- How each convolutional filter detects patterns
- How the batch normalization layers stabilize learning
- How information should flow through both the main and shortcut paths
The residual connections (shortcuts) make it easier for these updates to flow backward through the network during training, which helps very deep networks learn effectively.
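UpdateParameters is normally invoked by the training loop or an external optimizer rather than by hand. A sketch of direct use (ComputeParameterUpdates is a hypothetical optimizer step):

// The update vector's length must match the network's total parameter count.
Vector<float> updates = ComputeParameterUpdates(); // hypothetical optimizer step
resnet.UpdateParameters(updates);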