Namespace AiDotNet.NeuralNetworks
Classes
- ACGAN<T>
Represents an Auxiliary Classifier Generative Adversarial Network (AC-GAN), which extends conditional GANs by having the discriminator also predict the class label of the input.
- AttentionNetwork<T>
Represents a neural network that utilizes attention mechanisms for sequence processing.
- AudioVisualCorrespondenceNetwork<T>
Audio-visual correspondence learning network for cross-modal understanding.
- AudioVisualEventLocalizationNetwork<T>
Neural network for audio-visual event localization: identifying when and where events occur in video by jointly analyzing the audio and visual streams, with precise temporal boundaries.
- Autoencoder<T>
Represents an autoencoder neural network that can compress data into a lower-dimensional representation and reconstruct it.
- BGE<T>
BGE (BAAI General Embedding) neural network implementation. A state-of-the-art retrieval model known for its high accuracy across diverse benchmarks.
- BigGAN<T>
BigGAN implementation for large-scale high-fidelity image generation.
For Beginners: BigGAN is a state-of-the-art GAN architecture that generates extremely high-quality images by scaling up training in several ways:
  - Using very large batch sizes (256-2048 images at once)
  - Increasing model capacity (more parameters and feature maps)
  - Using class information to generate specific types of images
Think of it like training an artist:
  - Small batch = showing the artist 1-2 examples at a time
  - BigGAN batch = showing 256+ examples at once for better learning
  - Class conditioning = telling the artist exactly what to draw ("draw a cat" vs "draw something")
Key innovations:
  - Large Batch Training: Uses batch sizes of 256-2048 (vs typical 32-128)
  - Spectral Normalization: Stabilizes training for both G and D
  - Self-Attention: Helps model long-range dependencies in images
  - Class Conditioning: Uses class embeddings for controlled generation
  - Truncation Trick: Trade diversity for quality at generation time
  - Orthogonal Initialization: Better weight initialization
  - Skip Connections: Direct paths in generator architecture
Based on "Large Scale GAN Training for High Fidelity Natural Image Synthesis" by Brock et al. (2019)
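The truncation trick listed above can be illustrated with a short, language-agnostic sketch in Python. It assumes the common formulation in which latent components are drawn from a standard normal distribution and resampled whenever their magnitude exceeds a threshold. The function name and parameters are hypothetical and are not part of the AiDotNet API.

```python
import random


def truncated_normal(dim, threshold, rng):
    """Sample a latent vector from N(0, 1), resampling any component
    whose magnitude exceeds `threshold` (illustrative helper only)."""
    z = []
    for _ in range(dim):
        value = rng.gauss(0.0, 1.0)
        while abs(value) > threshold:
            value = rng.gauss(0.0, 1.0)
        z.append(value)
    return z


rng = random.Random(0)
# A larger threshold keeps more of the latent space (more diversity).
z_diverse = truncated_normal(128, threshold=2.0, rng=rng)
# A smaller threshold stays near the mode (typically higher fidelity, less variety).
z_conservative = truncated_normal(128, threshold=0.5, rng=rng)
```

Lowering the threshold is the knob that trades diversity for quality at generation time; no retraining is involved.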
- Blip2NeuralNetwork<T>
BLIP-2 (Bootstrapped Language-Image Pre-training 2) neural network for vision-language tasks.
- BlipNeuralNetwork<T>
BLIP (Bootstrapped Language-Image Pre-training) neural network for vision-language tasks.
- CapsuleNetwork<T>
Represents a Capsule Network, a type of neural network that preserves spatial relationships between features.
- ClipModelConfig
Configuration for a CLIP model variant.
- ClipModelLoader
Loads CLIP models from HuggingFace Hub or local directories.
- ClipNeuralNetwork<T>
CLIP (Contrastive Language-Image Pre-training) neural network that encodes both text and images into a shared embedding space, enabling cross-modal similarity and zero-shot classification.
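As a rough illustration of zero-shot classification in a shared embedding space, the Python sketch below picks the text prompt whose embedding has the highest cosine similarity to an image embedding. The three-dimensional vectors are invented for the example, and the helper names are hypothetical rather than taken from ClipNeuralNetwork<T>.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, text_embeddings):
    """Return the best-matching label and all similarity scores."""
    scores = {
        label: cosine_similarity(image_embedding, embedding)
        for label, embedding in text_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# Made-up embeddings; a real model would produce these from text and pixels.
text_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
}
image_embedding = [0.8, 0.2, 0.1]

best_label, scores = zero_shot_classify(image_embedding, text_embeddings)
```

No classifier is trained for the labels; adding a new class only requires embedding one more text prompt.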
- ColBERT<T>
ColBERT (Contextualized Late Interaction over BERT) neural network implementation. Uses token-level representations for high-precision document retrieval.
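Late interaction over token-level representations can be sketched as follows, assuming the usual MaxSim formulation: each query token is matched to its most similar document token, and the per-token maxima are summed. The vectors and function names are illustrative assumptions, not the ColBERT<T> API.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_tokens, doc_tokens):
    """Sum over query tokens of the maximum similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Made-up unit-length token embeddings (2 dimensions for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # contains exact matches for both query tokens
doc_b = [[0.8, 0.6], [0.6, 0.8]]              # only partial matches

score_a = late_interaction_score(query, doc_a)
score_b = late_interaction_score(query, doc_b)
```

Because the document keeps one vector per token instead of a single pooled vector, an exact match on one query term is not averaged away by the rest of the document.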
- ConditionalGAN<T>
Represents a Conditional Generative Adversarial Network (cGAN), which generates data conditioned on additional information such as class labels, attributes, or other contextual data.
- Connection<T>
Represents a connection between two nodes in a neural network, particularly used in evolving neural networks.
- ConvolutionalNeuralNetwork<T>
Represents a Convolutional Neural Network (CNN) that processes multi-dimensional data.
- CycleGAN<T>
Represents a CycleGAN for unpaired image-to-image translation.
- DCGAN<T>
Represents a Deep Convolutional Generative Adversarial Network (DCGAN), an architecture that uses convolutional and transposed convolutional layers with specific design guidelines for stable training.
- DeepBeliefNetwork<T>
Represents a Deep Belief Network, a generative graphical model composed of multiple layers of Restricted Boltzmann Machines.
- DeepBoltzmannMachine<T>
Represents a Deep Boltzmann Machine (DBM), a hierarchical generative model consisting of multiple layers of stochastic neurons.
- DeepQNetwork<T>
Represents a Deep Q-Network (DQN), a reinforcement learning algorithm that combines Q-learning with deep neural networks.
- DenseNetNetwork<T>
Implements the DenseNet (Densely Connected Convolutional Network) architecture.
- DifferentiableNeuralComputer<T>
Represents a Differentiable Neural Computer (DNC), a neural network architecture that combines neural networks with external memory resources.
- EchoStateNetwork<T>
Represents an Echo State Network (ESN), a type of recurrent neural network with a sparsely connected hidden layer called a reservoir.
- EfficientNetNetwork<T>
Implements the EfficientNet architecture with compound scaling.
- ExtremeLearningMachine<T>
Represents an Extreme Learning Machine (ELM), a type of feedforward neural network with a unique training approach.
- FastText<T>
FastText neural network implementation, an extension of Word2Vec that considers subword information.
- FeedForwardNeuralNetwork<T>
Represents a Feed-Forward Neural Network (FFNN), which processes data in one direction, from input to output, with no recurrent connections.
- FlamingoNeuralNetwork<T>
Flamingo neural network for in-context visual learning and few-shot tasks.
- GRUNeuralNetwork<T>
Represents a Gated Recurrent Unit (GRU) Neural Network for processing sequential data.
- GenerativeAdversarialNetwork<T>
Represents a Generative Adversarial Network (GAN), a deep learning architecture that consists of two neural networks (a generator and a discriminator) competing against each other in a zero-sum game.
- Genome<T>
Represents a genome in a neuroevolutionary algorithm, containing a collection of connections between nodes.
- GloVe<T>
GloVe (Global Vectors for Word Representation) neural network implementation.
- Gpt4VisionNeuralNetwork<T>
GPT-4V-style neural network that combines vision understanding with large language model capabilities.
- GraphAttentionNetwork<T>
Represents a Graph Attention Network (GAT) that uses attention mechanisms to process graph-structured data.
- GraphGenerationModel<T>
Represents a Graph Generation Model using Variational Autoencoder (VAE) architecture.
- GraphIsomorphismNetwork<T>
Represents a Graph Isomorphism Network (GIN) for powerful graph representation learning.
- GraphNeuralNetwork<T>
Represents a Graph Neural Network that can process data represented as graphs.
- GraphSAGENetwork<T>
Represents a GraphSAGE (Graph Sample and Aggregate) Network for inductive learning on graphs.
- HTMNetwork<T>
Represents a Hierarchical Temporal Memory (HTM) network, a biologically-inspired sequence learning algorithm.
- HopeNetwork<T>
Hope architecture - a self-modifying recurrent neural network variant of Titans with unbounded levels of in-context learning. Core innovation of Google's Nested Learning paradigm.
- HopfieldNetwork<T>
Represents a Hopfield Network, a recurrent neural network designed for pattern storage and retrieval.
- HyperbolicNeuralNetwork<T>
Represents a Hyperbolic Neural Network for learning hierarchical representations in Poincaré ball space.
- ImageBindNeuralNetwork<T>
ImageBind neural network for binding multiple modalities (6+) into a shared embedding space.
- InfoGAN<T>
Represents an Information Maximizing Generative Adversarial Network (InfoGAN), which learns disentangled representations in an unsupervised manner by maximizing mutual information between latent codes and generated observations.
- InstructorEmbedding<T>
Instructor/E5 (Instruction-Tuned) embedding model implementation. Uses task-specific instructions to adapt embeddings for different use cases.
- LLaVANeuralNetwork<T>
LLaVA (Large Language and Vision Assistant) neural network for visual instruction following.
- LSTMNeuralNetwork<T>
Represents a Long Short-Term Memory (LSTM) Neural Network, which is specialized for processing sequential data like text, time series, or audio.
- LiquidStateMachine<T>
Represents a Liquid State Machine (LSM), a type of reservoir computing neural network.
- MatryoshkaEmbedding<T>
Matryoshka Representation Learning (MRL) neural network implementation. Learns nested embeddings where smaller prefixes of the full vector are valid representations.
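The nested-prefix property can be illustrated with a small Python sketch: a shorter embedding is obtained by keeping the leading components of the full vector and rescaling it to unit length. This assumes embeddings are compared by cosine similarity; the function name is hypothetical and not part of MatryoshkaEmbedding<T>.

```python
import math


def truncate_embedding(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    prefix = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return prefix  # nothing to rescale
    return [x / norm for x in prefix]


full = [0.5, 0.5, 0.5, 0.5]            # made-up 4-dimensional embedding
small = truncate_embedding(full, 2)    # 2-dimensional prefix, renormalized
```

The shorter vector costs less to store and compare, which is why a single trained model can serve several index sizes.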
- MemoryNetwork<T>
Represents a Memory Network, a neural network architecture designed with explicit memory components for improved reasoning and question answering capabilities.
- MeshCNN<T>
Implements the MeshCNN architecture for processing 3D triangle meshes.
- MixtureOfExpertsNeuralNetwork<T>
Represents a Mixture-of-Experts (MoE) neural network that routes inputs through multiple specialist networks.
- MobileNetV2Network<T>
Implements the MobileNetV2 architecture for efficient mobile inference.
- MobileNetV3Network<T>
Implements the MobileNetV3 architecture for efficient mobile inference.
- NEAT<T>
Represents a NeuroEvolution of Augmenting Topologies (NEAT) algorithm implementation, which evolves neural networks through genetic algorithms.
- NeuralNetworkArchitecture<T>
Defines the structure and configuration of a neural network, including its layers, input/output dimensions, and task-specific properties.
- NeuralNetworkBase<T>
Base class for all neural network implementations in AiDotNet.
- NeuralNetwork<T>
A neural network implementation that processes data through multiple layers to make predictions.
- NeuralTuringMachine<T>
Represents a Neural Turing Machine, which is a neural network architecture that combines a neural network with external memory.
- OccupancyNeuralNetwork<T>
Represents a Neural Network specialized for occupancy detection and prediction in spaces.
- OctonionNeuralNetwork<T>
Represents an Octonion-valued Neural Network for processing data in 8-dimensional hypercomplex space.
- Pix2Pix<T>
Represents a Pix2Pix GAN for paired image-to-image translation tasks.
- ProgressiveGAN<T>
Production-ready Progressive GAN (ProGAN) implementation that generates high-resolution images by progressively growing the generator and discriminator during training.
For Beginners: Progressive GAN is a technique for training GANs that can generate very high-resolution images (e.g., 1024x1024 pixels). Instead of trying to generate high-resolution images from the start, it begins by generating small images (e.g., 4x4) and progressively adds new layers to both the generator and discriminator, doubling the resolution at each stage (4x4 → 8x8 → 16x16 → 32x32 → 64x64 → 128x128 → 256x256 → 512x512 → 1024x1024).
Key innovations:
  - Progressive Growing: Start at low resolution and gradually add layers
  - Smooth Fade-in: New layers are blended in gradually using a blending parameter (alpha)
  - Minibatch Standard Deviation: Gives the discriminator a batch-level variation statistic, which discourages mode collapse
  - Equalized Learning Rate: Scales weights at runtime by a per-layer constant for better training dynamics
  - Pixel Normalization: Normalizes each pixel's feature vector in the generator to keep signal magnitudes from escalating
Based on "Progressive Growing of GANs for Improved Quality, Stability, and Variation" by Karras et al. (2018)
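The growth schedule and the fade-in blend can be sketched in a few lines of Python. The sketch assumes the resolution doubles at every stage and that, during a transition, the output is a linear blend of the upsampled old output and the new layer's output, weighted by alpha. All names are hypothetical and unrelated to the ProgressiveGAN<T> API.

```python
def resolution_schedule(start=4, final=1024):
    """Resolutions visited when each growth step doubles width and height."""
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    return resolutions


def upsample_nearest(image):
    """Double a 2D image (list of rows) using nearest-neighbour upsampling."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(old_output, new_output, alpha):
    """Blend the upsampled old output with the new layer's output.

    alpha = 0 uses only the old (upsampled) path; alpha = 1 uses only the new layer.
    """
    upsampled = upsample_nearest(old_output)
    return [
        [(1.0 - alpha) * old + alpha * new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(upsampled, new_output)
    ]


stages = resolution_schedule()
blended = fade_in([[1.0]], [[3.0, 3.0], [3.0, 3.0]], alpha=0.5)
```

Ramping alpha from 0 to 1 over many training steps lets the new, untrained layer take over gradually instead of disrupting the layers that were already trained.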
- QuantumNeuralNetwork<T>
Represents a Quantum Neural Network, which combines quantum computing principles with neural network architecture.
- RadialBasisFunctionNetwork<T>
Represents a Radial Basis Function Network, which is a type of neural network that uses radial basis functions as activation functions.
- RecurrentNeuralNetwork<T>
Represents a Recurrent Neural Network, which is a type of neural network designed to process sequential data by maintaining an internal state.
- ResNetNetwork<T>
Represents a ResNet (Residual Network) neural network architecture for image classification.
- ResidualNeuralNetwork<T>
Represents a Residual Neural Network, which is a type of neural network that uses skip connections to address the vanishing gradient problem in deep networks.
- RestrictedBoltzmannMachine<T>
Represents a Restricted Boltzmann Machine, which is a type of neural network that learns probability distributions over its inputs.
- SAGAN<T>
Self-Attention GAN (SAGAN) implementation that uses self-attention mechanisms to model long-range dependencies in generated images.
For Beginners: Traditional CNNs in GANs only look at nearby pixels (local receptive fields). This works well for textures and local patterns, but struggles with global structure and long-range relationships (like making sure both eyes of a face look similar, or ensuring consistent geometric patterns).
Self-Attention solves this by letting each pixel "attend to" all other pixels, similar to how Transformers work in NLP. Think of it as:
  - CNN: "I can only see my immediate neighbors"
  - Self-Attention: "I can see the entire image and decide what's important"
Example: When generating a dog's face:
  - CNN: Might make one ear pointy and one floppy (inconsistent)
  - SAGAN: Notices both ears and makes them match (consistent)
Key innovations:
  - Self-Attention Layers: Allow modeling of long-range dependencies
  - Spectral Normalization: Stabilizes training for both G and D
  - Hinge Loss: More stable than standard GAN loss
  - Two Time-Scale Update Rule (TTUR): Different learning rates for G and D
  - Conditional Batch Normalization: For class-conditional generation
Based on "Self-Attention Generative Adversarial Networks" by Zhang et al. (2019)
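The idea of every position attending to every other position can be sketched in Python as below. This is a simplified illustration: it uses the raw features as queries, keys, and values (the published architecture uses learned projections), and it assumes the attended output is scaled by a factor gamma and added back to the input. The function names are hypothetical and are not the SAGAN<T> API.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, gamma):
    """features: one feature vector per spatial position.

    Each position computes attention weights over all positions, forms a
    weighted average of their features, and adds gamma times that average
    to its own features (a residual connection).
    """
    output = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        context = [
            sum(w * value[i] for w, value in zip(weights, features))
            for i in range(len(query))
        ]
        output.append([gamma * c + x for c, x in zip(context, query)])
    return output


positions = [[1.0, 0.0], [0.0, 1.0]]
unchanged = self_attention(positions, gamma=0.0)   # gamma = 0 leaves the input as-is
mixed = self_attention([[1.0, 0.0], [1.0, 0.0]], gamma=1.0)
```

With gamma at zero the layer behaves like the plain convolutional path, so a network can start from local features and learn how much global context to mix in.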
- SGPT<T>
SGPT (Sentence GPT) neural network implementation using decoder-only transformer architectures.
- SPLADE<T>
SPLADE (Sparse Lexical and Expansion Model) neural network implementation. Maps text to a high-dimensional sparse vector in the vocabulary space.
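A sparse vector in vocabulary space can be sketched as follows. The example assumes one published SPLADE variant, in which each vocabulary weight is the maximum over input tokens of log(1 + ReLU(logit)); whether SPLADE<T> uses this exact pooling is not stated here. The logits and names are invented for illustration.

```python
import math


def splade_weights(token_logits):
    """token_logits: one list of vocabulary logits per input token.

    Returns one weight per vocabulary entry: the maximum over tokens of
    log(1 + ReLU(logit)). Entries with no positive logit get weight 0.
    """
    vocab_size = len(token_logits[0])
    return [
        max(math.log1p(max(0.0, logits[j])) for logits in token_logits)
        for j in range(vocab_size)
    ]


# Two input tokens, toy vocabulary of four terms.
logits = [
    [2.0, -1.0, 0.0, -3.0],
    [-0.5, -2.0, 1.0, -1.0],
]
weights = splade_weights(logits)
# Keep only the non-zero entries, as an inverted index would.
sparse = {term_id: w for term_id, w in enumerate(weights) if w > 0.0}
```

Terms that never receive a positive logit drop out entirely, which is what makes the representation sparse enough for an inverted index.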
- SelfOrganizingMap<T>
Represents a Self-Organizing Map, which is an unsupervised neural network that produces a low-dimensional representation of input data.
- SiameseNetwork<T>
Implements a Siamese Neural Network for comparing pairs of inputs and determining their similarity.
- SiameseNeuralNetwork<T>
Siamese Neural Network implementation for dual-encoder comparison and similarity learning.
- SimCSE<T>
SimCSE (Simple Contrastive Learning of Sentence Embeddings) neural network implementation.
- SparseNeuralNetwork<T>
Represents a Sparse Neural Network with efficient sparse weight matrices.
- SpikingNeuralNetwork<T>
Represents a Spiking Neural Network, which is a type of neural network that more closely models biological neurons with temporal dynamics.
- SpiralNet<T>
Implements the SpiralNet++ architecture for mesh-based deep learning.
- StyleGAN<T>
Represents a StyleGAN (Style-Based Generator Architecture for GANs) that generates high-quality images with fine-grained control over image style at different levels.
- SuperNet<T>
SuperNet implementation for gradient-based neural architecture search (DARTS). Implements a differentiable architecture search by maintaining architecture parameters (alpha) and network weights simultaneously.
- TransformerArchitecture<T>
Defines the architecture configuration for a Transformer neural network.
- TransformerEmbeddingNetwork<T>
A customizable Transformer-based embedding network. This serves as the high-performance foundation for modern sentence and document encoders.
- Transformer<T>
Represents a Transformer neural network architecture, which is particularly effective for sequence-based tasks like natural language processing.
- UNet3D<T>
Represents a 3D U-Net neural network for volumetric semantic segmentation.
- UnifiedMultimodalNetwork<T>
Unified multimodal network that handles text, images, audio, and video in a single architecture with cross-modal attention and any-to-any generation.
- VGGNetwork<T>
Represents a VGG (Visual Geometry Group) neural network architecture for image classification.
- VariationalAutoencoder<T>
Represents a Variational Autoencoder (VAE) neural network architecture, which is used for generating new data similar to the training data and learning compressed representations.
- VideoCLIPNeuralNetwork<T>
VideoCLIP neural network for video-text alignment and temporal understanding.
- VisionTransformer<T>
Implements the Vision Transformer (ViT) architecture for image classification tasks.
- VoxelCNN<T>
Represents a Voxel-based 3D Convolutional Neural Network for processing volumetric data.
- WGANGP<T>
Represents a Wasserstein GAN with Gradient Penalty (WGAN-GP), an improved version of WGAN that uses gradient penalty instead of weight clipping to enforce the Lipschitz constraint.
- WGAN<T>
Represents a Wasserstein Generative Adversarial Network (WGAN), which uses the Wasserstein distance (Earth Mover's distance) to measure the difference between the generated and real data distributions.
- Word2Vec<T>
Word2Vec neural network implementation supporting both Skip-Gram and CBOW architectures.
Enums
- TransformerEmbeddingNetwork<T>.PoolingStrategy
Defines the available pooling strategies for creating a single sentence embedding.