Class VGGNetwork<T>

Namespace
AiDotNet.NeuralNetworks
Assembly
AiDotNet.dll

Represents a VGG (Visual Geometry Group) neural network architecture for image classification.

public class VGGNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations (typically float or double).

Inheritance
VGGNetwork<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

VGG networks are deep convolutional neural networks developed by the Visual Geometry Group at the University of Oxford. They are characterized by small (3x3) convolution filters stacked many layers deep, which lets them learn complex hierarchical features.

For Beginners: VGG networks are one of the foundational architectures in deep learning for image recognition. Despite being developed in 2014, they remain popular because:

  • They're simple to understand - just stacked convolutions and pooling
  • They serve as excellent baselines for comparing new architectures
  • They're great for transfer learning (using a pre-trained network as a starting point)
  • The features they learn are highly transferable to other visual tasks

Architecture: VGG networks consist of:

  • Multiple blocks of 3x3 convolutional layers with ReLU activation
  • Max pooling (2x2, stride 2) after each block to reduce spatial dimensions
  • Optional batch normalization after each convolution (in _BN variants)
  • Three fully connected layers (4096 -> 4096 -> num_classes)
  • Dropout regularization in the fully connected layers
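
The effect of the pooling layers on tensor shape can be traced with a short calculation. The sketch below assumes the standard VGG layout of five blocks whose padded 3x3 convolutions keep height and width unchanged, with output widths of 64, 128, 256, 512 and 512 channels. Neither the block count nor the channel widths are stated on this page, and the class and method names are illustrative, not part of AiDotNet.

```csharp
using System;

public static class VggShapeSketch
{
    // Channel widths of the five blocks in the original VGG design (assumed here).
    private static readonly int[] BlockChannels = { 64, 128, 256, 512, 512 };

    // Returns (channels, height, width) after all convolution blocks.
    public static (int Channels, int Height, int Width) FeatureMapShape(int height, int width)
    {
        int channels = 0;
        foreach (int blockChannels in BlockChannels)
        {
            channels = blockChannels; // padded 3x3 convolutions change only the channel count
            height /= 2;              // 2x2 max pooling with stride 2 halves each spatial dimension
            width /= 2;
        }
        return (channels, height, width);
    }

    // Number of values fed into the first fully connected layer.
    public static int FlattenedSize(int height, int width)
    {
        var (c, h, w) = FeatureMapShape(height, width);
        return c * h * w;
    }
}
```

Under these assumptions a 224x224 input shrinks to a 512 x 7 x 7 feature map, so the first 4096-unit fully connected layer receives 25,088 values.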

Typical Usage:

// Create VGG16 with batch normalization for 10-class classification
var config = new VGGConfiguration(VGGVariant.VGG16_BN, numClasses: 10);
var architecture = new NeuralNetworkArchitecture<float>(
    inputType: InputType.ThreeDimensional,
    inputHeight: 224,
    inputWidth: 224,
    inputDepth: 3,
    outputSize: 10,
    taskType: NeuralNetworkTaskType.MultiClassClassification);
var network = new VGGNetwork<float>(architecture, config);

Constructors

VGGNetwork(NeuralNetworkArchitecture<T>, VGGConfiguration, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?, double)

Initializes a new instance of the VGGNetwork class.

public VGGNetwork(NeuralNetworkArchitecture<T> architecture, VGGConfiguration configuration, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null, ILossFunction<T>? lossFunction = null, double maxGradNorm = 1)

Parameters

architecture NeuralNetworkArchitecture<T>

The architecture defining the structure of the neural network.

configuration VGGConfiguration

The VGG-specific configuration.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

Optional optimizer for training (default: Adam).

lossFunction ILossFunction<T>

Optional loss function (default: based on task type).

maxGradNorm double

Maximum gradient norm for gradient clipping (default: 1.0).

Remarks

VGG networks require three-dimensional input data (channels, height, width).

For Beginners: When creating a VGG network, you need to provide:

  • An architecture object that describes the input/output dimensions
  • A configuration that specifies which VGG variant to use
  • Optionally, a custom optimizer and loss function (sensible defaults are used if you omit them)
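
The maxGradNorm parameter limits how large a gradient may become before it is applied. This page does not say which norm or scope AiDotNet uses, so the following is only a minimal sketch of one common interpretation, clipping by the L2 norm of the whole gradient vector. The class and method names are hypothetical.

```csharp
using System;

public static class GradientClippingSketch
{
    // Scales the gradient in place so that its L2 norm does not exceed maxGradNorm.
    // Returns the norm measured before clipping.
    public static double ClipByNorm(double[] gradient, double maxGradNorm = 1.0)
    {
        double sumOfSquares = 0.0;
        foreach (double g in gradient) sumOfSquares += g * g;
        double norm = Math.Sqrt(sumOfSquares);

        if (norm > maxGradNorm)
        {
            // Rescaling keeps the gradient direction and shortens only its length.
            double scale = maxGradNorm / norm;
            for (int i = 0; i < gradient.Length; i++) gradient[i] *= scale;
        }
        return norm;
    }
}
```

With the default of 1.0, a gradient of (3, 4) has norm 5 and is rescaled to (0.6, 0.8); a gradient whose norm is already below the limit is left untouched.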

Exceptions

InvalidInputTypeException

Thrown when the input type is not three-dimensional.

ArgumentNullException

Thrown when configuration is null.

Properties

NumClasses

Gets the number of output classes for classification.

public int NumClasses { get; }

Property Value

int

UsesBatchNormalization

Gets whether this network uses batch normalization.

public bool UsesBatchNormalization { get; }

Property Value

bool

Remarks

For Beginners: Batch normalization is a technique that normalizes layer inputs, making training faster and more stable. VGG variants ending in "_BN" use this.
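
As a rough illustration of what "normalizes layer inputs" means, the sketch below applies the usual training-time batch normalization formula to the values of a single channel. It is not AiDotNet's implementation: the names, the epsilon value, and the use of plain arrays are assumptions, and real layers typically switch to running averages at inference time.

```csharp
using System;

public static class BatchNormSketch
{
    // Normalizes one channel's values to zero mean and unit variance,
    // then applies a learnable scale (gamma) and shift (beta).
    public static double[] Normalize(double[] values, double gamma = 1.0, double beta = 0.0, double epsilon = 1e-5)
    {
        double mean = 0.0;
        foreach (double v in values) mean += v;
        mean /= values.Length;

        double variance = 0.0;
        foreach (double v in values) variance += (v - mean) * (v - mean);
        variance /= values.Length;

        // epsilon guards against division by zero when all values are equal.
        double invStd = 1.0 / Math.Sqrt(variance + epsilon);
        var result = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
            result[i] = gamma * (values[i] - mean) * invStd + beta;
        return result;
    }
}
```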

Variant

Gets the VGG variant being used.

public VGGVariant Variant { get; }

Property Value

VGGVariant

Remarks

For Beginners: The variant determines how deep the network is (VGG11, 13, 16, or 19) and whether batch normalization is used (variants ending in _BN).
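
The numbers in the variant names count the layers that carry weights. The sketch below uses the per-block convolution counts from the original VGG configurations (an assumption about what each VGGVariant value builds, since this page does not list them) and adds the three fully connected layers. The dictionary keyed by depth is purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VggDepthSketch
{
    // Convolution layers in each of the five blocks, keyed by the number in the variant name.
    private static readonly Dictionary<int, int[]> ConvsPerBlock = new Dictionary<int, int[]>
    {
        { 11, new[] { 1, 1, 2, 2, 2 } },
        { 13, new[] { 2, 2, 2, 2, 2 } },
        { 16, new[] { 2, 2, 3, 3, 3 } },
        { 19, new[] { 2, 2, 4, 4, 4 } },
    };

    public static int[] GetConvsPerBlock(int depth) => ConvsPerBlock[depth];

    // Weighted layers = convolution layers + the three fully connected layers.
    public static int CountWeightedLayers(int depth) => ConvsPerBlock[depth].Sum() + 3;
}
```

For every depth, CountWeightedLayers returns the depth itself; VGG16, for example, has 13 convolution layers plus 3 fully connected layers. The _BN variants add normalization layers but keep the same name.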

Methods

Backward(Tensor<T>)

Performs a backward pass through the network to calculate gradients.

public Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the network's output.

Returns

Tensor<T>

The gradient of the loss with respect to the network's input.

Remarks

The backward pass propagates gradients through each layer in reverse order, computing the gradients needed for parameter updates during training.

For Beginners: This is how the network learns from its mistakes. After making a prediction, we calculate how wrong it was, and this method propagates that error backward through all the layers so each layer knows how to adjust its weights.
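
The reverse-order traversal can be sketched with scalar stand-ins for layers. Everything below (the ISketchLayer interface, the two layer types, and the helper class) is hypothetical and only illustrates the chain rule; AiDotNet's layers operate on Tensor<T> and are not shown on this page.

```csharp
using System;
using System.Collections.Generic;

public interface ISketchLayer
{
    double Forward(double input);
    double Backward(double outputGradient);
}

// y = weight * x
public sealed class ScaleLayer : ISketchLayer
{
    private readonly double _weight;
    private double _lastInput;
    public double WeightGradient { get; private set; }

    public ScaleLayer(double weight) { _weight = weight; }

    public double Forward(double input) { _lastInput = input; return _weight * input; }

    public double Backward(double outputGradient)
    {
        WeightGradient = outputGradient * _lastInput; // gradient used to update the weight
        return outputGradient * _weight;              // gradient passed to the previous layer
    }
}

// y = max(0, x)
public sealed class ReluLayer : ISketchLayer
{
    private double _lastInput;
    public double Forward(double input) { _lastInput = input; return Math.Max(0.0, input); }
    public double Backward(double outputGradient) => _lastInput > 0.0 ? outputGradient : 0.0;
}

public static class BackwardSketch
{
    public static double Forward(IReadOnlyList<ISketchLayer> layers, double input)
    {
        double value = input;
        foreach (var layer in layers) value = layer.Forward(value);
        return value;
    }

    // Walks the layers from last to first, handing each one the gradient of its output.
    public static double Backward(IReadOnlyList<ISketchLayer> layers, double outputGradient)
    {
        double gradient = outputGradient;
        for (int i = layers.Count - 1; i >= 0; i--) gradient = layers[i].Backward(gradient);
        return gradient;
    }
}
```

For a stack of ScaleLayer(2), ReluLayer, ScaleLayer(3) and an input of 1.5, the forward result is 9 and a backward pass with an output gradient of 1 returns an input gradient of 6, while each ScaleLayer records the gradient for its own weight along the way.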

CreateNewInstance()

Creates a new instance of the VGG network model.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new instance of the VGG network with the same configuration.

DeserializeNetworkSpecificData(BinaryReader)

Deserializes VGG network-specific data from a binary reader.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

Forward(Tensor<T>)

Performs a forward pass through the VGG network with the given input tensor.

public Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process (shape: [channels, height, width] for a single example, or [batch, channels, height, width] for a batch; a missing batch dimension is added internally).

Returns

Tensor<T>

The output tensor after processing through all layers.

Remarks

The forward pass sends the input through each layer in sequence (the convolution blocks with their pooling layers, then the fully connected layers) and produces class probabilities.

For Beginners: This is how the network makes predictions. You give it an image (as a tensor), and it processes it through all the VGG layers to produce a prediction. The output contains probabilities for each class.

Exceptions

TensorShapeMismatchException

Thrown when the input shape doesn't match the expected shape.
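
The shape handling described for the input parameter can be sketched on plain shape arrays. This is an illustration under stated assumptions, not the library's code: the helper name is hypothetical, shapes are modeled as int[] rather than Tensor<T>, and ArgumentException stands in for TensorShapeMismatchException.

```csharp
using System;

public static class InputShapeSketch
{
    // Returns a 4-element shape [batch, channels, height, width].
    // A 3-element shape [channels, height, width] is treated as a batch of one.
    public static int[] EnsureBatchDimension(int[] shape, int channels, int height, int width)
    {
        int[] batched;
        if (shape.Length == 3)
            batched = new[] { 1, shape[0], shape[1], shape[2] };
        else if (shape.Length == 4)
            batched = (int[])shape.Clone();
        else
            throw new ArgumentException("Expected 3 or 4 dimensions, got " + shape.Length + ".", nameof(shape));

        // The per-example dimensions must match what the network was built for.
        if (batched[1] != channels || batched[2] != height || batched[3] != width)
        {
            string actual = string.Join(", ", batched);
            throw new ArgumentException(
                "Expected [*, " + channels + ", " + height + ", " + width + "], got [" + actual + "].",
                nameof(shape));
        }
        return batched;
    }
}
```

A shape of [3, 224, 224] becomes [1, 3, 224, 224], a shape of [8, 3, 224, 224] passes through unchanged, and [3, 32, 32] is rejected for a network built for 224x224 input.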

GetModelMetadata()

Retrieves metadata about the VGG network model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing information about the network.

Remarks

For Beginners: This returns a summary of the network's structure and configuration, including the VGG variant, input/output shapes, and layer information.

GetParameterCount()

Gets the total number of trainable parameters in the network.

public int GetParameterCount()

Returns

int

The total parameter count.

Remarks

VGG networks are known for having a large number of parameters, primarily in the fully connected layers. For example:

  • VGG16: ~138 million parameters
  • VGG19: ~144 million parameters

For Beginners: Parameters are the learnable weights in the network. More parameters give the network more capacity to learn complex patterns, but they also require more memory and more training data. VGG networks have many parameters because of their large fully connected layers.
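
The figures above can be reproduced by counting weights and biases layer by layer. The sketch below assumes the original VGG layout (224x224 RGB input, five blocks with 64, 128, 256, 512 and 512 channels, no batch normalization) and 1000 output classes for the quoted totals; it is independent of GetParameterCount(), and its names are illustrative.

```csharp
using System;

public static class VggParameterCountSketch
{
    // Counts weights and biases for a plain (non-BN) VGG with 224x224 RGB input.
    // convsPerBlock: number of 3x3 convolutions in each of the five blocks.
    public static long Count(int[] convsPerBlock, int numClasses)
    {
        int[] blockChannels = { 64, 128, 256, 512, 512 };
        long total = 0;
        int inChannels = 3;

        for (int b = 0; b < blockChannels.Length; b++)
        {
            for (int i = 0; i < convsPerBlock[b]; i++)
            {
                // One 3x3 kernel per input/output channel pair, plus one bias per output channel.
                total += 3L * 3L * inChannels * blockChannels[b] + blockChannels[b];
                inChannels = blockChannels[b];
            }
        }

        // Five poolings reduce 224x224 to 7x7, so the classifier sees 512 * 7 * 7 inputs.
        long[] denseSizes = { 512L * 7 * 7, 4096, 4096, numClasses };
        for (int i = 0; i < denseSizes.Length - 1; i++)
            total += denseSizes[i] * denseSizes[i + 1] + denseSizes[i + 1];

        return total;
    }
}
```

Count(new[] { 2, 2, 3, 3, 3 }, 1000) gives 138,357,544 for VGG16 and Count(new[] { 2, 2, 4, 4, 4 }, 1000) gives 143,667,240 for VGG19. In the VGG16 case the three fully connected layers account for 123,642,856 of the total, roughly 89%.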

InitializeLayers()

Initializes the layers of the VGG network based on the configuration.

protected override sealed void InitializeLayers()

Remarks

This method either uses custom layers provided in the architecture or creates the standard VGG layers based on the configuration.

For Beginners: This method builds the VGG network layer by layer. If you've provided custom layers, it uses those. Otherwise, it creates the standard VGG architecture with the appropriate number of convolutional blocks for your chosen variant.

Predict(Tensor<T>)

Makes a prediction using the VGG network for the given input.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to make a prediction for.

Returns

Tensor<T>

The predicted output tensor containing class probabilities.

Remarks

For Beginners: This is the main method you'll use after training. Give it an image, and it returns probabilities for each class. The class with the highest probability is the network's prediction.
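
Picking the predicted class from the returned probabilities is an argmax. The sketch below assumes the probabilities have already been copied out of the Tensor<T> into a float array; how to read values from Tensor<T> is not covered on this page, and the helper name is hypothetical.

```csharp
using System;

public static class PredictionSketch
{
    // Returns the index of the largest probability, i.e. the predicted class.
    // Ties resolve to the lowest index.
    public static int ArgMax(float[] probabilities)
    {
        if (probabilities.Length == 0)
            throw new ArgumentException("At least one class probability is required.", nameof(probabilities));

        int best = 0;
        for (int i = 1; i < probabilities.Length; i++)
            if (probabilities[i] > probabilities[best]) best = i;
        return best;
    }
}
```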

SerializeNetworkSpecificData(BinaryWriter)

Serializes VGG network-specific data to a binary writer.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter
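
This page does not document which fields the serialization overrides write, so the following is only a sketch of the general BinaryWriter/BinaryReader pattern such methods tend to follow: write a fixed sequence of values, then read them back in the same order. The three fields and all names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class VggSerializationSketch
{
    // Writes hypothetical network-specific fields in a fixed order.
    public static void Write(BinaryWriter writer, int variantId, int numClasses, bool usesBatchNorm)
    {
        writer.Write(variantId);
        writer.Write(numClasses);
        writer.Write(usesBatchNorm);
    }

    // Reads the fields back in exactly the order they were written.
    public static (int VariantId, int NumClasses, bool UsesBatchNorm) Read(BinaryReader reader)
    {
        int variantId = reader.ReadInt32();
        int numClasses = reader.ReadInt32();
        bool usesBatchNorm = reader.ReadBoolean();
        return (variantId, numClasses, usesBatchNorm);
    }

    // Serializes to memory and deserializes again, to check that both sides agree.
    public static (int VariantId, int NumClasses, bool UsesBatchNorm) RoundTrip(int variantId, int numClasses, bool usesBatchNorm)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
            {
                Write(writer, variantId, numClasses, usesBatchNorm);
            }
            stream.Position = 0;
            using (var reader = new BinaryReader(stream))
            {
                return Read(reader);
            }
        }
    }
}
```

The key design constraint is that the read order mirrors the write order exactly, because the binary format carries no field names.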

Train(Tensor<T>, Tensor<T>)

Trains the VGG network using the provided input and expected output.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input tensor for training.

expectedOutput Tensor<T>

The expected output tensor (one-hot encoded class labels).

Remarks

This method performs one training iteration: forward pass, loss calculation, backward pass, and parameter update.

For Beginners: This is how the network learns. You show it an image and tell it what class the image belongs to. The network makes a guess, compares it to the correct answer, and adjusts its weights to do better next time.
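
The four stages of a training iteration can be shown on a model small enough to read in full. The sketch below swaps the VGG network for a single linear layer with softmax and uses plain gradient descent instead of the default Adam optimizer; all names are hypothetical, and only the forward, loss, backward, update sequence mirrors what Train is described as doing.

```csharp
using System;

public static class TrainingStepSketch
{
    // Forward pass: linear scores followed by a numerically stable softmax.
    public static double[] Forward(double[,] weights, double[] input)
    {
        int classes = weights.GetLength(0);
        var scores = new double[classes];
        double max = double.NegativeInfinity;
        for (int c = 0; c < classes; c++)
        {
            for (int f = 0; f < input.Length; f++) scores[c] += weights[c, f] * input[f];
            max = Math.Max(max, scores[c]);
        }

        double sum = 0.0;
        for (int c = 0; c < classes; c++) { scores[c] = Math.Exp(scores[c] - max); sum += scores[c]; }
        for (int c = 0; c < classes; c++) scores[c] /= sum;
        return scores;
    }

    // Cross-entropy loss against a one-hot target.
    public static double Loss(double[] probabilities, double[] oneHot)
    {
        double loss = 0.0;
        for (int c = 0; c < oneHot.Length; c++)
            loss -= oneHot[c] * Math.Log(probabilities[c] + 1e-12);
        return loss;
    }

    // One iteration: forward, loss, backward, update. Returns the loss before the update.
    public static double TrainStep(double[,] weights, double[] input, double[] oneHot, double learningRate)
    {
        double[] probabilities = Forward(weights, input);
        double loss = Loss(probabilities, oneHot);

        for (int c = 0; c < oneHot.Length; c++)
        {
            // For softmax with cross-entropy, dLoss/dScore = probability - target.
            double scoreGradient = probabilities[c] - oneHot[c];
            for (int f = 0; f < input.Length; f++)
                weights[c, f] -= learningRate * scoreGradient * input[f];
        }
        return loss;
    }
}
```

Calling TrainStep repeatedly on the same example should make the returned loss shrink, which is a quick sanity check that the gradient and the update have the right sign.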

UpdateParameters(Vector<T>)

Updates the parameters of all layers in the network.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing all parameters for the network.

Remarks

For Beginners: After calculating how to improve each layer's weights, this method actually applies those improvements to make the network better.
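
One way to picture a single vector holding "all parameters for the network" is as consecutive per-layer segments. The sketch below assumes that layout (layer order, no gaps), which this page does not confirm, and uses double[] in place of Vector<T>; the helper name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterVectorSketch
{
    // Splits a flat parameter vector into consecutive per-layer segments.
    public static List<double[]> SplitByLayer(double[] parameters, IReadOnlyList<int> layerParameterCounts)
    {
        int expected = 0;
        foreach (int count in layerParameterCounts) expected += count;
        if (parameters.Length != expected)
            throw new ArgumentException("Expected " + expected + " parameters, got " + parameters.Length + ".", nameof(parameters));

        var segments = new List<double[]>();
        int offset = 0;
        foreach (int count in layerParameterCounts)
        {
            // Each layer receives the next 'count' values from the flat vector.
            var segment = new double[count];
            Array.Copy(parameters, offset, segment, 0, count);
            segments.Add(segment);
            offset += count;
        }
        return segments;
    }
}
```

Checking the total length up front matters here: a vector that is one element short would otherwise shift every later layer's parameters without any visible error.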