Class VGGNetwork<T>
- Namespace
- AiDotNet.NeuralNetworks
- Assembly
- AiDotNet.dll
Represents a VGG (Visual Geometry Group) neural network architecture for image classification.
public class VGGNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations (typically float or double).
- Inheritance
- NeuralNetworkBase<T>
- VGGNetwork<T>
Remarks
VGG networks are deep convolutional neural networks developed by the Visual Geometry Group at the University of Oxford. They are characterized by stacks of small (3x3) convolution filters in networks of increasing depth, which lets them learn complex hierarchical features.
For Beginners: VGG networks are one of the foundational architectures in deep learning for image recognition. Despite being developed in 2014, they remain popular because:
- They're simple to understand - just stacked convolutions and pooling
- They serve as excellent baselines for comparing new architectures
- They're great for transfer learning (using a pre-trained network as a starting point)
- The features they learn are highly transferable to other visual tasks
Architecture: VGG networks consist of:
- Multiple blocks of 3x3 convolutional layers with ReLU activation
- Max pooling (2x2, stride 2) after each block to reduce spatial dimensions
- Optional batch normalization after each convolution (in _BN variants)
- Three fully connected layers (4096 -> 4096 -> num_classes)
- Dropout regularization in the fully connected layers
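The effect of the pooling steps listed above can be traced with a few lines of arithmetic. This is a sketch only: it assumes five blocks, a 224x224 input, the channel widths of the original VGG paper, and 3x3 convolutions that use padding 1 so that only pooling changes the spatial size. VggShapes is a hypothetical helper, not part of AiDotNet.

```csharp
using System;

// Trace the feature-map size through five blocks for a 224x224 input.
int size = 224;
int[] blockChannels = { 64, 128, 256, 512, 512 }; // assumed from the original VGG paper
foreach (int channels in blockChannels)
{
    size = VggShapes.AfterBlock(size);
    Console.WriteLine($"{channels} channels, {size}x{size}");
}
Console.WriteLine($"Flattened features: {VggShapes.FlattenedSize(224, 512, 5)}"); // 25088

// Hypothetical helper, not an AiDotNet type.
static class VggShapes
{
    // 3x3 convolutions with padding 1 and stride 1 keep height and width,
    // so only the 2x2, stride-2 max pool changes the size (it halves it).
    public static int AfterBlock(int size) => size / 2;

    // Length of the vector fed to the first fully connected layer.
    public static int FlattenedSize(int inputSize, int finalChannels, int blocks)
    {
        int size = inputSize;
        for (int i = 0; i < blocks; i++) size = AfterBlock(size);
        return finalChannels * size * size;
    }
}
```

Under these assumptions a 224x224 image shrinks to 7x7 after the fifth block, so the first 4096-unit layer receives 512 * 7 * 7 = 25088 values.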
Typical Usage:
// Create VGG16 with batch normalization for 10-class classification
var config = new VGGConfiguration(VGGVariant.VGG16_BN, numClasses: 10);
var architecture = new NeuralNetworkArchitecture<float>(
    inputType: InputType.ThreeDimensional,
    inputHeight: 224,
    inputWidth: 224,
    inputDepth: 3,
    outputSize: 10,
    taskType: NeuralNetworkTaskType.MultiClassClassification);
var network = new VGGNetwork<float>(architecture, config);
Constructors
VGGNetwork(NeuralNetworkArchitecture<T>, VGGConfiguration, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?, double)
Initializes a new instance of the VGGNetwork class.
public VGGNetwork(NeuralNetworkArchitecture<T> architecture, VGGConfiguration configuration, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null, ILossFunction<T>? lossFunction = null, double maxGradNorm = 1)
Parameters
architecture (NeuralNetworkArchitecture<T>): The architecture defining the structure of the neural network.
configuration (VGGConfiguration): The VGG-specific configuration.
optimizer (IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>): Optional optimizer for training (default: Adam).
lossFunction (ILossFunction<T>): Optional loss function (default: based on task type).
maxGradNorm (double): Maximum gradient norm for gradient clipping (default: 1.0).
Remarks
VGG networks require three-dimensional input data (channels, height, width).
For Beginners: When creating a VGG network, you need to provide:
- An architecture object that describes the input/output dimensions
- A configuration that specifies which VGG variant to use
- Optionally, custom optimizer and loss function (good defaults are provided)
Exceptions
- InvalidInputTypeException
Thrown when the input type is not three-dimensional.
- ArgumentNullException
Thrown when configuration is null.
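The maxGradNorm parameter above caps the size of the gradients before they are applied. The following is a minimal sketch of norm-based clipping, assuming the limit applies to the L2 norm of all gradients taken together; the GradientClipping class and its ClipByNorm method are hypothetical names for illustration and are not part of AiDotNet.

```csharp
using System;

double[] grads = { 3.0, 4.0 };            // L2 norm is 5
GradientClipping.ClipByNorm(grads, 1.0);  // rescaled in place so the norm is about 1
Console.WriteLine(string.Join(", ", grads));

// Hypothetical helper, not an AiDotNet type.
static class GradientClipping
{
    // Scales the gradients in place so their L2 norm does not exceed maxNorm.
    public static void ClipByNorm(double[] gradients, double maxNorm)
    {
        double sumSquares = 0.0;
        foreach (double g in gradients) sumSquares += g * g;
        double norm = Math.Sqrt(sumSquares);

        // Gradients already within the limit are left untouched.
        if (norm <= maxNorm || norm == 0.0) return;

        double scale = maxNorm / norm;
        for (int i = 0; i < gradients.Length; i++) gradients[i] *= scale;
    }
}
```

Clipping keeps the direction of the update and only shortens it, which is why it is commonly used to stabilize training of deep networks.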
Properties
NumClasses
Gets the number of output classes for classification.
public int NumClasses { get; }
Property Value
- int
UsesBatchNormalization
Gets whether this network uses batch normalization.
public bool UsesBatchNormalization { get; }
Property Value
- bool
Remarks
For Beginners: Batch normalization is a technique that normalizes layer inputs, making training faster and more stable. VGG variants ending in "_BN" use this.
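To make the idea of normalizing layer inputs concrete, here is a rough sketch of the standard batch normalization formula applied to a handful of values. The BatchNorm class, its Normalize method, and the default epsilon of 1e-5 are assumptions for illustration; the library's own layer may differ in detail (for example, it would also track running statistics for inference).

```csharp
using System;

double[] normalized = BatchNorm.Normalize(new[] { 1.0, 2.0, 3.0 });
Console.WriteLine(string.Join(", ", normalized)); // values centered on 0

// Hypothetical helper, not an AiDotNet type.
static class BatchNorm
{
    // y = gamma * (x - mean) / sqrt(variance + epsilon) + beta
    public static double[] Normalize(double[] x, double gamma = 1.0, double beta = 0.0, double epsilon = 1e-5)
    {
        double mean = 0.0;
        foreach (double v in x) mean += v;
        mean /= x.Length;

        // Biased (population) variance, as used during training.
        double variance = 0.0;
        foreach (double v in x) variance += (v - mean) * (v - mean);
        variance /= x.Length;

        var result = new double[x.Length];
        double invStd = 1.0 / Math.Sqrt(variance + epsilon);
        for (int i = 0; i < x.Length; i++)
            result[i] = gamma * (x[i] - mean) * invStd + beta;
        return result;
    }
}
```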
Variant
Gets the VGG variant being used.
public VGGVariant Variant { get; }
Property Value
- VGGVariant
Remarks
For Beginners: The variant determines how deep the network is (VGG11, 13, 16, or 19) and whether batch normalization is used (variants ending in _BN).
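The number in each variant name counts the layers that carry weights. The sketch below uses the per-block convolution counts from the original VGG paper (configurations A, B, D, and E) and adds the three fully connected layers; the string keys and the VggDepths class are illustrative only and do not mirror the VGGVariant enum or the library's internal tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

foreach (var entry in VggDepths.ConvsPerBlock)
{
    int weightLayers = VggDepths.WeightLayers(entry.Key);
    Console.WriteLine($"{entry.Key}: {entry.Value.Sum()} conv + 3 FC = {weightLayers} weight layers");
}

// Hypothetical helper, not an AiDotNet type.
static class VggDepths
{
    // Convolution layers in each of the five blocks (assumed from the VGG paper).
    public static readonly Dictionary<string, int[]> ConvsPerBlock = new()
    {
        ["VGG11"] = new[] { 1, 1, 2, 2, 2 },
        ["VGG13"] = new[] { 2, 2, 2, 2, 2 },
        ["VGG16"] = new[] { 2, 2, 3, 3, 3 },
        ["VGG19"] = new[] { 2, 2, 4, 4, 4 },
    };

    // Convolution layers plus the three fully connected layers.
    public static int WeightLayers(string variant) => ConvsPerBlock[variant].Sum() + 3;
}
```

Batch normalization does not change these counts, so a _BN variant has the same depth as its plain counterpart under this way of counting.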
Methods
Backward(Tensor<T>)
Performs a backward pass through the network to calculate gradients.
public Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the network's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the network's input.
Remarks
The backward pass propagates gradients through each layer in reverse order, computing the gradients needed for parameter updates during training.
For Beginners: This is how the network learns from its mistakes. After making a prediction, we calculate how wrong it was, and this method propagates that error backward through all the layers so each layer knows how to adjust its weights.
CreateNewInstance()
Creates a new instance of the VGG network model.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the VGG network with the same configuration.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes VGG network-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader)
Forward(Tensor<T>)
Performs a forward pass through the VGG network with the given input tensor.
public Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process (shape: [channels, height, width] for a single example, or [batch, channels, height, width] for a batch; a missing batch dimension is added internally).
Returns
- Tensor<T>
The output tensor after processing through all layers.
Remarks
The forward pass processes the input sequentially through each layer of the network (convolution blocks, pooling, and fully connected layers) and produces class probabilities.
For Beginners: This is how the network makes predictions. You give it an image (as a tensor), and it processes it through all the VGG layers to produce a prediction. The output contains probabilities for each class.
Exceptions
- TensorShapeMismatchException
Thrown when the input shape doesn't match expected shape.
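The input parameter description says that a missing batch dimension is added internally. A minimal sketch of that shape handling, working on plain int[] shapes rather than Tensor<T>, might look like the following; InputShapes is a hypothetical helper, and ArgumentException stands in for the library's TensorShapeMismatchException.

```csharp
using System;

int[] single = InputShapes.EnsureBatched(new[] { 3, 224, 224 });
Console.WriteLine(string.Join(", ", single)); // 1, 3, 224, 224

// Hypothetical helper, not an AiDotNet type.
static class InputShapes
{
    // Returns a [batch, channels, height, width] shape, adding a batch of 1 if needed.
    public static int[] EnsureBatched(int[] shape)
    {
        if (shape.Length == 4) return shape;
        if (shape.Length != 3)
            throw new ArgumentException("Expected a rank-3 or rank-4 shape.", nameof(shape));

        // Prepend a batch dimension of size 1 to [channels, height, width].
        return new[] { 1, shape[0], shape[1], shape[2] };
    }
}
```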
GetModelMetadata()
Retrieves metadata about the VGG network model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the network.
Remarks
For Beginners: This returns a summary of the network's structure and configuration, including the VGG variant, input/output shapes, and layer information.
GetParameterCount()
Gets the total number of trainable parameters in the network.
public int GetParameterCount()
Returns
- int
The total parameter count.
Remarks
VGG networks are known for having a large number of parameters, primarily in the fully connected layers. For example, with the standard 224x224 RGB input and 1000 output classes:
- VGG16: ~138 million parameters
- VGG19: ~144 million parameters
For Beginners: Parameters are the learnable weights in the network. More parameters means more capacity to learn complex patterns, but also requires more memory and training data. VGG networks have many parameters because of their large fully connected layers.
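The figures above can be reproduced with a short calculation. This sketch assumes the layout of the original VGG paper (block widths of 64, 128, 256, 512, and 512 channels, 3x3 kernels with a bias per output channel, 224x224 RGB input) and ignores batch normalization parameters; VggParameters is a hypothetical helper, not the library's GetParameterCount implementation.

```csharp
using System;

Console.WriteLine(VggParameters.Count(new[] { 2, 2, 3, 3, 3 }, 1000)); // VGG16: 138357544
Console.WriteLine(VggParameters.Count(new[] { 2, 2, 4, 4, 4 }, 1000)); // VGG19: 143667240

// Hypothetical helper, not an AiDotNet type.
static class VggParameters
{
    // Output channels of the five convolution blocks (assumed from the VGG paper).
    static readonly int[] BlockChannels = { 64, 128, 256, 512, 512 };

    public static long Count(int[] convsPerBlock, int numClasses, int inputSize = 224, int inputChannels = 3)
    {
        long total = 0;
        int inChannels = inputChannels;
        int size = inputSize;

        for (int b = 0; b < convsPerBlock.Length; b++)
        {
            for (int c = 0; c < convsPerBlock[b]; c++)
            {
                // 3x3 kernel weights plus one bias per output channel.
                total += 3L * 3 * inChannels * BlockChannels[b] + BlockChannels[b];
                inChannels = BlockChannels[b];
            }
            size /= 2; // 2x2 max pooling with stride 2 halves height and width
        }

        // Three fully connected layers: features -> 4096 -> 4096 -> numClasses.
        long features = (long)inChannels * size * size;
        total += features * 4096 + 4096;
        total += 4096L * 4096 + 4096;
        total += 4096L * numClasses + numClasses;
        return total;
    }
}
```

In the VGG16 case the first fully connected layer alone accounts for about 103 million of the 138 million parameters, which is why the fully connected layers dominate the total.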
InitializeLayers()
Initializes the layers of the VGG network based on the configuration.
protected override sealed void InitializeLayers()
Remarks
This method either uses custom layers provided in the architecture or creates the standard VGG layers based on the configuration.
For Beginners: This method builds the VGG network layer by layer. If you've provided custom layers, it uses those. Otherwise, it creates the standard VGG architecture with the appropriate number of convolutional blocks for your chosen variant.
Predict(Tensor<T>)
Makes a prediction using the VGG network for the given input.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to make a prediction for.
Returns
- Tensor<T>
The predicted output tensor containing class probabilities.
Remarks
For Beginners: This is the main method you'll use after training. Give it an image, and it returns probabilities for each class. The class with the highest probability is the network's prediction.
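Picking the class with the highest probability is a simple arg-max over the output values. The sketch below works on a plain float[] because this page does not show how to read elements out of a Tensor<T>; the Classification class and the sample probabilities are hypothetical.

```csharp
using System;

float[] probabilities = { 0.05f, 0.80f, 0.15f }; // made-up output for 3 classes
Console.WriteLine(Classification.ArgMax(probabilities)); // 1

// Hypothetical helper, not an AiDotNet type.
static class Classification
{
    // Returns the index of the largest value (the first one on ties).
    public static int ArgMax(float[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty", nameof(values));

        int best = 0;
        for (int i = 1; i < values.Length; i++)
            if (values[i] > values[best]) best = i;
        return best;
    }
}
```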
SerializeNetworkSpecificData(BinaryWriter)
Serializes VGG network-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter)
Train(Tensor<T>, Tensor<T>)
Trains the VGG network using the provided input and expected output.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input tensor for training.
expectedOutput (Tensor<T>): The expected output tensor (one-hot encoded class labels).
Remarks
This method performs one training iteration: forward pass, loss calculation, backward pass, and parameter update.
For Beginners: This is how the network learns. You show it an image and tell it what class the image belongs to. The network makes a guess, compares it to the correct answer, and adjusts its weights to do better next time.
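The expectedOutput tensor holds one-hot encoded labels: a vector with one entry per class, set to 1 for the correct class and 0 elsewhere. A small sketch of building such a vector follows, using a plain float[] and a hypothetical Labels helper; converting it to a Tensor<T> is left out because this page does not document that step.

```csharp
using System;

float[] target = Labels.OneHot(classIndex: 3, numClasses: 10);
Console.WriteLine(string.Join(" ", target)); // 0 0 0 1 0 0 0 0 0 0

// Hypothetical helper, not an AiDotNet type.
static class Labels
{
    // Builds a vector of length numClasses with a single 1 at classIndex.
    public static float[] OneHot(int classIndex, int numClasses)
    {
        if (classIndex < 0 || classIndex >= numClasses)
            throw new ArgumentOutOfRangeException(nameof(classIndex));

        var encoded = new float[numClasses]; // all zeros by default
        encoded[classIndex] = 1f;
        return encoded;
    }
}
```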
UpdateParameters(Vector<T>)
Updates the parameters of all layers in the network.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters for the network.
Remarks
For Beginners: After calculating how to improve each layer's weights, this method actually applies those improvements to make the network better.
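Because the parameters arrive as one flat vector, they have to be divided among the layers in some fixed order. The sketch below shows one plausible way to do that, assuming each layer consumes a contiguous slice whose length equals its own parameter count; ParameterVector and the plain double[] representation are illustrative stand-ins, not the library's Vector<T> handling.

```csharp
using System;
using System.Collections.Generic;

double[] flat = { 1, 2, 3, 4, 5, 6 };
List<double[]> perLayer = ParameterVector.Split(flat, new[] { 4, 2 });
Console.WriteLine(perLayer[1].Length); // 2

// Hypothetical helper, not an AiDotNet type.
static class ParameterVector
{
    // Cuts a flat parameter vector into consecutive per-layer slices.
    public static List<double[]> Split(double[] flat, int[] layerSizes)
    {
        int expected = 0;
        foreach (int s in layerSizes) expected += s;
        if (expected != flat.Length)
            throw new ArgumentException("Vector length does not match the total layer size.", nameof(flat));

        var chunks = new List<double[]>();
        int offset = 0;
        foreach (int s in layerSizes)
        {
            var chunk = new double[s];
            Array.Copy(flat, offset, chunk, 0, s);
            chunks.Add(chunk);
            offset += s;
        }
        return chunks;
    }
}
```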