Table of Contents

Namespace AiDotNet.NeuralNetworks

Classes

ACGAN<T>

Represents an Auxiliary Classifier Generative Adversarial Network (AC-GAN), which extends conditional GANs by having the discriminator also predict the class label of the input.

AttentionNetwork<T>

Represents a neural network that utilizes attention mechanisms for sequence processing.

AudioVisualCorrespondenceNetwork<T>

Audio-visual correspondence learning network for cross-modal understanding.

AudioVisualEventLocalizationNetwork<T>

Neural network for audio-visual event localization: identifying when and where events occur in a video, with precise temporal boundaries, by jointly analyzing the audio and visual streams.

Autoencoder<T>

Represents an autoencoder neural network that can compress data into a lower-dimensional representation and reconstruct it.

BGE<T>

BGE (BAAI General Embedding) neural network implementation. A state-of-the-art retrieval model known for its high accuracy across diverse benchmarks.

BigGAN<T>

BigGAN implementation for large-scale high-fidelity image generation.

For Beginners: BigGAN is a state-of-the-art GAN architecture that generates extremely high-quality images by scaling up training in several ways:

  1. Using very large batch sizes (256-2048 images at once)
  2. Increasing model capacity (more parameters and feature maps)
  3. Using class information to generate specific types of images

Think of it like training an artist:

  • Small batch = showing the artist 1-2 examples at a time
  • BigGAN batch = showing 256+ examples at once for better learning
  • Class conditioning = telling the artist exactly what to draw ("draw a cat" vs "draw something")

Key innovations:

  1. Large Batch Training: Uses batch sizes of 256-2048 (vs typical 32-128)
  2. Spectral Normalization: Stabilizes training for both G and D
  3. Self-Attention: Helps model long-range dependencies in images
  4. Class Conditioning: Uses class embeddings for controlled generation
  5. Truncation Trick: Trades diversity for quality at generation time (sketched below)
  6. Orthogonal Initialization: Better weight initialization
  7. Skip Connections: Direct paths in generator architecture

Based on "Large Scale GAN Training for High Fidelity Natural Image Synthesis" by Brock et al. (2019)

Blip2NeuralNetwork<T>

BLIP-2 (Bootstrapped Language-Image Pre-training 2) neural network for vision-language tasks.

BlipNeuralNetwork<T>

BLIP (Bootstrapped Language-Image Pre-training) neural network for vision-language tasks.

CapsuleNetwork<T>

Represents a Capsule Network, a type of neural network that preserves spatial relationships between features.

ClipModelConfig

Configuration for a CLIP model variant.

ClipModelLoader

Loads CLIP models from HuggingFace Hub or local directories.

ClipNeuralNetwork<T>

CLIP (Contrastive Language-Image Pre-training) neural network that encodes both text and images into a shared embedding space, enabling cross-modal similarity and zero-shot classification.

ColBERT<T>

ColBERT (Contextualized Late Interaction over BERT) neural network implementation. Uses token-level representations for high-precision document retrieval.

ConditionalGAN<T>

Represents a Conditional Generative Adversarial Network (cGAN), which generates data conditioned on additional information such as class labels, attributes, or other contextual data.

Connection<T>

Represents a connection between two nodes in a neural network, particularly used in evolving neural networks.

ConvolutionalNeuralNetwork<T>

Represents a Convolutional Neural Network (CNN) that processes multi-dimensional data.

CycleGAN<T>

Represents a CycleGAN for unpaired image-to-image translation.

DCGAN<T>

Represents a Deep Convolutional Generative Adversarial Network (DCGAN), an architecture that uses convolutional and transposed convolutional layers with specific design guidelines for stable training.

DeepBeliefNetwork<T>

Represents a Deep Belief Network, a generative graphical model composed of multiple layers of Restricted Boltzmann Machines.

DeepBoltzmannMachine<T>

Represents a Deep Boltzmann Machine (DBM), a hierarchical generative model consisting of multiple layers of stochastic neurons.

DeepQNetwork<T>

Represents a Deep Q-Network (DQN), a reinforcement learning algorithm that combines Q-learning with deep neural networks.

DenseNetNetwork<T>

Implements the DenseNet (Densely Connected Convolutional Network) architecture.

DifferentiableNeuralComputer<T>

Represents a Differentiable Neural Computer (DNC), a neural network architecture that combines neural networks with external memory resources.

EchoStateNetwork<T>

Represents an Echo State Network (ESN), a type of recurrent neural network with a sparsely connected hidden layer called a reservoir.

EfficientNetNetwork<T>

Implements the EfficientNet architecture with compound scaling.

ExtremeLearningMachine<T>

Represents an Extreme Learning Machine (ELM), a feedforward neural network whose hidden-layer weights are set randomly and left untrained; only the output weights are learned, typically via a direct analytical solve.

FastText<T>

FastText neural network implementation, an extension of Word2Vec that considers subword information.

FeedForwardNeuralNetwork<T>

Represents a Feed-Forward Neural Network (FFNN), in which data flows in a single direction from input to output.

FlamingoNeuralNetwork<T>

Flamingo neural network for in-context visual learning and few-shot tasks.

GRUNeuralNetwork<T>

Represents a Gated Recurrent Unit (GRU) Neural Network for processing sequential data.

GenerativeAdversarialNetwork<T>

Represents a Generative Adversarial Network (GAN), a deep learning architecture that consists of two neural networks (a generator and a discriminator) competing against each other in a zero-sum game.

Genome<T>

Represents a genome in a neuroevolutionary algorithm, containing a collection of connections between nodes.

GloVe<T>

GloVe (Global Vectors for Word Representation) neural network implementation.

Gpt4VisionNeuralNetwork<T>

GPT-4V-style neural network that combines vision understanding with large language model capabilities.

GraphAttentionNetwork<T>

Represents a Graph Attention Network (GAT) that uses attention mechanisms to process graph-structured data.

GraphGenerationModel<T>

Represents a Graph Generation Model using Variational Autoencoder (VAE) architecture.

GraphIsomorphismNetwork<T>

Represents a Graph Isomorphism Network (GIN) for powerful graph representation learning.

GraphNeuralNetwork<T>

Represents a Graph Neural Network that can process data represented as graphs.

GraphSAGENetwork<T>

Represents a GraphSAGE (Graph Sample and Aggregate) Network for inductive learning on graphs.

HTMNetwork<T>

Represents a Hierarchical Temporal Memory (HTM) network, a biologically-inspired sequence learning algorithm.

HopeNetwork<T>

Hope architecture: a self-modifying recurrent neural network variant of Titans with unbounded levels of in-context learning, and the core innovation of Google's Nested Learning paradigm.

HopfieldNetwork<T>

Represents a Hopfield Network, a recurrent neural network designed for pattern storage and retrieval.

HyperbolicNeuralNetwork<T>

Represents a Hyperbolic Neural Network for learning hierarchical representations in Poincaré ball space.

ImageBindNeuralNetwork<T>

ImageBind neural network for binding multiple modalities (6+) into a shared embedding space.

InfoGAN<T>

Represents an Information Maximizing Generative Adversarial Network (InfoGAN), which learns disentangled representations in an unsupervised manner by maximizing mutual information between latent codes and generated observations.

InstructorEmbedding<T>

Instructor/E5 (Instruction-Tuned) embedding model implementation. Uses task-specific instructions to adapt embeddings for different use cases.

LLaVANeuralNetwork<T>

LLaVA (Large Language and Vision Assistant) neural network for visual instruction following.

LSTMNeuralNetwork<T>

Represents a Long Short-Term Memory (LSTM) Neural Network, which is specialized for processing sequential data like text, time series, or audio.

LiquidStateMachine<T>

Represents a Liquid State Machine (LSM), a type of reservoir computing neural network.

MatryoshkaEmbedding<T>

Matryoshka Representation Learning (MRL) neural network implementation. Learns nested embeddings where smaller prefixes of the full vector are valid representations.
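
The nesting property can be exercised directly at query time: slice a prefix of the full embedding and re-normalize it. A minimal C# sketch follows; TruncateEmbedding is an illustrative helper, not part of the AiDotNet API.

    using System;

    // Illustrative sketch of using a Matryoshka (MRL) embedding prefix.
    static class MatryoshkaPrefixSketch
    {
        // Keep the first `dim` components and re-normalize to unit length,
        // yielding a smaller embedding that is still a valid representation.
        static double[] TruncateEmbedding(double[] full, int dim)
        {
            var prefix = new double[dim];
            Array.Copy(full, prefix, dim);

            double norm = 0;
            foreach (double v in prefix) norm += v * v;
            norm = Math.Sqrt(norm);

            for (int i = 0; i < dim; i++) prefix[i] /= norm;
            return prefix;
        }

        static void Main()
        {
            // Toy 8-dimensional embedding truncated to its first 4 dimensions.
            var full = new[] { 0.4, 0.3, -0.2, 0.1, 0.05, -0.05, 0.02, 0.01 };
            double[] small = TruncateEmbedding(full, dim: 4);
            Console.WriteLine(string.Join(", ", small));
        }
    }

Because training nests the representations, the truncated vector can serve cheaper storage and retrieval at a modest accuracy cost.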

MemoryNetwork<T>

Represents a Memory Network, a neural network architecture designed with explicit memory components for improved reasoning and question answering capabilities.

MeshCNN<T>

Implements the MeshCNN architecture for processing 3D triangle meshes.

MixtureOfExpertsNeuralNetwork<T>

Represents a Mixture-of-Experts (MoE) neural network that routes inputs through multiple specialist networks.

MobileNetV2Network<T>

Implements the MobileNetV2 architecture for efficient mobile inference.

MobileNetV3Network<T>

Implements the MobileNetV3 architecture for efficient mobile inference.

NEAT<T>

Represents a NeuroEvolution of Augmenting Topologies (NEAT) algorithm implementation, which evolves neural networks through genetic algorithms.

NeuralNetworkArchitecture<T>

Defines the structure and configuration of a neural network, including its layers, input/output dimensions, and task-specific properties.

NeuralNetworkBase<T>

Base class for all neural network implementations in AiDotNet.

NeuralNetwork<T>

A neural network implementation that processes data through multiple layers to make predictions.

NeuralTuringMachine<T>

Represents a Neural Turing Machine, which is a neural network architecture that combines a neural network with external memory.

OccupancyNeuralNetwork<T>

Represents a Neural Network specialized for occupancy detection and prediction in spaces.

OctonionNeuralNetwork<T>

Represents an Octonion-valued Neural Network for processing data in 8-dimensional hypercomplex space.

Pix2Pix<T>

Represents a Pix2Pix GAN for paired image-to-image translation tasks.

ProgressiveGAN<T>

Production-ready Progressive GAN (ProGAN) implementation that generates high-resolution images by progressively growing the generator and discriminator during training.

For Beginners: Progressive GAN is a technique for training GANs that can generate very high-resolution images (e.g., 1024x1024 pixels). Instead of trying to generate high-resolution images from the start, it begins by generating small images (e.g., 4x4) and progressively adds new layers to both the generator and discriminator to increase the resolution (4x4 → 8x8 → 16x16 → 32x32 → 64x64 → 128x128 → 256x256 → 512x512 → 1024x1024).

Key innovations:

  1. Progressive Growing: Start with low resolution and gradually add layers
  2. Smooth Fade-in: New layers are faded in smoothly using a blending parameter (alpha), as sketched below
  3. Minibatch Standard Deviation: Helps prevent mode collapse by adding diversity
  4. Equalized Learning Rate: Normalizes weights at runtime for better training dynamics
  5. Pixel Normalization: Normalizes feature vectors in the generator to keep signal magnitudes from escalating

Based on "Progressive Growing of GANs for Improved Quality, Stability, and Variation" by Karras et al. (2018)

QuantumNeuralNetwork<T>

Represents a Quantum Neural Network, which combines quantum computing principles with neural network architecture.

RadialBasisFunctionNetwork<T>

Represents a Radial Basis Function Network, which is a type of neural network that uses radial basis functions as activation functions.

RecurrentNeuralNetwork<T>

Represents a Recurrent Neural Network, which is a type of neural network designed to process sequential data by maintaining an internal state.

ResNetNetwork<T>

Represents a ResNet (Residual Network) neural network architecture for image classification.

ResidualNeuralNetwork<T>

Represents a Residual Neural Network, which is a type of neural network that uses skip connections to address the vanishing gradient problem in deep networks.

RestrictedBoltzmannMachine<T>

Represents a Restricted Boltzmann Machine, which is a type of neural network that learns probability distributions over its inputs.

SAGAN<T>

Self-Attention GAN (SAGAN) implementation that uses self-attention mechanisms to model long-range dependencies in generated images.

For Beginners: Traditional CNNs in GANs only look at nearby pixels (local receptive fields). This works well for textures and local patterns, but struggles with global structure and long-range relationships (like making sure both eyes of a face look similar, or ensuring consistent geometric patterns).

Self-Attention solves this by letting each pixel "attend to" all other pixels, similar to how Transformers work in NLP. Think of it as:

  • CNN: "I can only see my immediate neighbors"
  • Self-Attention: "I can see the entire image and decide what's important"

Example: When generating a dog's face:

  • CNN: Might make one ear pointy and one floppy (inconsistent)
  • SAGAN: Notices both ears and makes them match (consistent)

Key innovations:

  1. Self-Attention Layers: Allow modeling of long-range dependencies (sketched below)
  2. Spectral Normalization: Stabilizes training for both G and D
  3. Hinge Loss: More stable than standard GAN loss
  4. Two Time-Scale Update Rule (TTUR): Different learning rates for G and D
  5. Conditional Batch Normalization: For class-conditional generation

Based on "Self-Attention Generative Adversarial Networks" by Zhang et al. (2019)

SGPT<T>

SGPT (Sentence GPT) neural network implementation using decoder-only transformer architectures.

SPLADE<T>

SPLADE (Sparse Lexical and Expansion Model) neural network implementation. Maps text to a high-dimensional sparse vector in the vocabulary space.

SelfOrganizingMap<T>

Represents a Self-Organizing Map, which is an unsupervised neural network that produces a low-dimensional representation of input data.

SiameseNetwork<T>

Implements a Siamese Neural Network for comparing pairs of inputs and determining their similarity.

SiameseNeuralNetwork<T>

Siamese Neural Network implementation for dual-encoder comparison and similarity learning.

SimCSE<T>

SimCSE (Simple Contrastive Learning of Sentence Embeddings) neural network implementation.

SparseNeuralNetwork<T>

Represents a Sparse Neural Network with efficient sparse weight matrices.

SpikingNeuralNetwork<T>

Represents a Spiking Neural Network, which is a type of neural network that more closely models biological neurons with temporal dynamics.

SpiralNet<T>

Implements the SpiralNet++ architecture for mesh-based deep learning.

StyleGAN<T>

Represents a StyleGAN (Style-Based Generator Architecture for GANs) that generates high-quality images with fine-grained control over image style at different levels.

SuperNet<T>

SuperNet implementation for gradient-based neural architecture search (DARTS). Performs differentiable architecture search by maintaining architecture parameters (alpha) and network weights simultaneously.
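
The core DARTS relaxation is small enough to sketch: the discrete choice among candidate operations on an edge is replaced by a softmax-weighted sum over all of them, which makes the architecture parameters (alpha) differentiable. A toy scalar version with illustrative names, not AiDotNet's API:

    using System;
    using System.Linq;

    // Illustrative sketch of the DARTS mixed operation; not AiDotNet's API.
    static class DartsMixedOpSketch
    {
        // Softmax-weighted sum of candidate operations: gradients flow into
        // both the network weights and the architecture parameters (alpha).
        static double MixedOp(Func<double, double>[] candidateOps, double[] alpha, double x)
        {
            double max = alpha.Max();
            double[] exp = alpha.Select(a => Math.Exp(a - max)).ToArray();
            double sum = exp.Sum();

            double result = 0;
            for (int i = 0; i < candidateOps.Length; i++)
                result += (exp[i] / sum) * candidateOps[i](x);
            return result;
        }

        static void Main()
        {
            // Three toy candidate operations on a scalar "feature".
            var ops = new Func<double, double>[]
            {
                v => v,                // identity / skip connection
                v => Math.Max(0.0, v), // ReLU
                v => 0.0,              // "zero" op (prunes this edge)
            };

            // The operation with the largest alpha dominates the mixture.
            var alpha = new[] { 2.0, 0.5, -1.0 };
            Console.WriteLine(MixedOp(ops, alpha, x: -0.7));
        }
    }

When the search finishes, the operation with the largest alpha on each edge is kept and the softmax mixture is discarded.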

TransformerArchitecture<T>

Defines the architecture configuration for a Transformer neural network.

TransformerEmbeddingNetwork<T>

A customizable Transformer-based embedding network. This serves as the high-performance foundation for modern sentence and document encoders.

Transformer<T>

Represents a Transformer neural network architecture, which is particularly effective for sequence-based tasks like natural language processing.

UNet3D<T>

Represents a 3D U-Net neural network for volumetric semantic segmentation.

UnifiedMultimodalNetwork<T>

Unified multimodal network that handles text, images, audio, and video in a single architecture with cross-modal attention and any-to-any generation.

VGGNetwork<T>

Represents a VGG (Visual Geometry Group) neural network architecture for image classification.

VariationalAutoencoder<T>

Represents a Variational Autoencoder (VAE) neural network architecture, which is used for generating new data similar to the training data and learning compressed representations.

VideoCLIPNeuralNetwork<T>

VideoCLIP neural network for video-text alignment and temporal understanding.

VisionTransformer<T>

Implements the Vision Transformer (ViT) architecture for image classification tasks.

VoxelCNN<T>

Represents a Voxel-based 3D Convolutional Neural Network for processing volumetric data.

WGANGP<T>

Represents a Wasserstein GAN with Gradient Penalty (WGAN-GP), an improved version of WGAN that uses gradient penalty instead of weight clipping to enforce the Lipschitz constraint.
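
The penalty is a soft constraint that pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated samples: GP = lambda * (||grad D(x_hat)||_2 - 1)^2, where x_hat = eps * x_real + (1 - eps) * x_fake and eps ~ U(0, 1). A minimal C# sketch, assuming the critic's gradient at x_hat comes from an autodiff engine; the helper names are illustrative, not AiDotNet's API.

    using System;

    // Illustrative sketch of the WGAN-GP penalty term; not AiDotNet's API.
    static class WganGpSketch
    {
        // Random point on the line between a real and a generated sample;
        // the Lipschitz constraint is enforced at these interpolates.
        static double[] Interpolate(double[] real, double[] fake, double epsilon)
        {
            var xHat = new double[real.Length];
            for (int i = 0; i < real.Length; i++)
                xHat[i] = epsilon * real[i] + (1.0 - epsilon) * fake[i];
            return xHat;
        }

        // lambda * (||grad D(xHat)||_2 - 1)^2; the gradient itself must come
        // from automatic differentiation of the critic.
        static double GradientPenalty(double[] criticGradientAtXHat, double lambda = 10.0)
        {
            double normSq = 0;
            foreach (double g in criticGradientAtXHat) normSq += g * g;
            double norm = Math.Sqrt(normSq);
            return lambda * (norm - 1.0) * (norm - 1.0);
        }

        static void Main()
        {
            var rng = new Random(0);
            double[] xHat = Interpolate(
                real: new[] { 1.0, 0.2 }, fake: new[] { 0.1, 0.9 }, epsilon: rng.NextDouble());
            Console.WriteLine($"x_hat = ({xHat[0]:F2}, {xHat[1]:F2})");

            // Pretend autodiff returned this gradient of the critic at x_hat.
            var grad = new[] { 0.6, 0.9 };
            Console.WriteLine($"Penalty: {GradientPenalty(grad):F4}");
        }
    }

The original paper sets lambda = 10, which is the default used here.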

WGAN<T>

Represents a Wasserstein Generative Adversarial Network (WGAN), which uses the Wasserstein distance (Earth Mover's distance) to measure the difference between the generated and real data distributions.

Word2Vec<T>

Word2Vec neural network implementation supporting both Skip-Gram and CBOW architectures.

Enums

TransformerEmbeddingNetwork<T>.PoolingStrategy

Defines the available pooling strategies for creating a single sentence embedding.