Namespace AiDotNet.Interfaces

Classes

AcousticCharacteristics<T>

Acoustic characteristics of a scene.

AdagradGpuConfig

Configuration for Adagrad optimizer on GPU.

AdamGpuConfig

Configuration for Adam optimizer on GPU.

AdamWGpuConfig

Configuration for AdamW optimizer on GPU.

AudioEffectParameter<T>

Represents an adjustable parameter for an audio effect.

AudioEventResult<T>

Result of audio event detection.

AudioEvent<T>

A detected audio event.

AudioFeatureOptions

Options for audio feature extraction.

AudioFingerprint<T>

Represents an audio fingerprint.

AudioGenerationOptions<T>

Advanced options for audio generation.

AudioVisualEvent

Represents an audio-visual event with temporal boundaries.

BeatTrackingResult<T>

Result of beat tracking.

ChainResult

Represents the result of executing a prompt chain.

ChordRecognitionResult<T>

Result of chord recognition.

ChordSegment<T>

A segment of audio with a detected chord.

ChordStatistics<T>

Statistics for a chord in the recognition result.

Contradiction

Represents a detected contradiction between reasoning steps.

CritiqueResult<T>

Result of critiquing a reasoning step or chain.

DatasetMetadata

Metadata about a dataset.

DiarizationResult<T>

Result of speaker diarization.

DirectionEstimate<T>

A direction estimate without full source information.

DownbeatResult<T>

Result of downbeat detection.

EmotionResult<T>

Represents the result of emotion recognition.

EpisodeResult<T>

Result of running a single RL episode.

EventStatistics<T>

Statistics for an event type.

FewShotExample

Represents a single few-shot example with input and output.

FingerprintMatch

Represents a match between two fingerprints.

FrequencyRange<T>

A frequency range.

GeneticParameters

Parameters for configuring a genetic algorithm.

GenreClassificationResult<T>

Result of genre classification.

GenreFeatures<T>

Features extracted for genre classification (generic version).

GenrePrediction<T>

A single genre prediction with confidence.

GenreSegment<T>

A segment with genre information.

GenreTrackingResult<T>

Result of tracking genre over time.

GpuOptimizerState

Holds the optimizer state buffers for GPU-resident training.

KeyDetectionResult<T>

Result of key detection.

KeyHypothesis<T>

A key hypothesis with confidence score.

KeySegment<T>

A segment with a detected key.

KeyTrackingResult<T>

Result of key tracking over time.

LambGpuConfig

Configuration for LAMB (Layer-wise Adaptive Moments) optimizer on GPU.

LanguageResult<T>

Represents the result of language identification.

LanguageSegment<T>

Represents a time-segmented language detection result.

LarsGpuConfig

Configuration for LARS (Layer-wise Adaptive Rate Scaling) optimizer on GPU.

LocalizationResult<T>

Result of sound localization.

LossFunctionExtensions

Mesh3D<T>

Represents a 3D mesh with vertices, faces, and optional textures.

MicrophoneArrayConfig<T>

Configuration for microphone array geometry.

ModelDownloadProgress

Progress information for model downloads.

ModulationPoint<T>

A point where the key changes (modulation).

MultimodalInput<T>

Represents an input item for unified multimodal models.

MultimodalOutput<T>

Represents an output from unified multimodal models.

NagGpuConfig

Configuration for Nesterov Accelerated Gradient (NAG) optimizer on GPU.

OptimizationHistoryEntry<T>

Represents a single entry in the optimization history.

OverlapRegion<T>

Represents a region where speakers overlap.

PitchFrame<T>

Represents a single pitch detection frame.

PruningConfig

Configuration for pruning operations.

ReasoningContext

Context information for critiquing reasoning steps.

RmsPropGpuConfig

Configuration for RMSprop optimizer on GPU.

SceneClassificationResult<T>

Result of scene classification.

SceneFeatures<T>

Features extracted for scene classification (generic version).

ScenePrediction<T>

A single scene prediction with confidence.

SceneSegment<T>

A segment with scene information.

SceneTrackingResult<T>

Result of tracking scene changes over time.

SceneTransition<T>

A detected scene transition.

SeparationQuality<T>

Quality metrics for source separation.

SgdGpuConfig

Configuration for SGD (Stochastic Gradient Descent) optimizer on GPU.

SoundSource<T>

A detected sound source with position.

SoundTrackingResult<T>

Result of sound source tracking over time.

SourceSeparationResult<T>

Result of source separation.

SourceTrajectory<T>

A tracked trajectory of a sound source.

SpeakerProfile<T>

Represents an enrolled speaker profile.

SpeakerSegment<T>

Represents a speaker segment in diarization output.

SpeakerStatistics<T>

Statistics for a speaker in diarization output.

SpeakerTurn<T>

Represents a speaker turn in diarization output (legacy API compatibility).

SpeakerVerificationResult<T>

Result of a speaker verification attempt.

StepResult

Represents the result of executing a single step in a chain.

TempoHypothesis<T>

A tempo hypothesis with confidence score.

TimeSignature

Represents a musical time signature.

TimedEmotionResult<T>

Represents a timed emotion prediction.

TrajectoryPoint<T>

A single point in a source trajectory.

TranscriptionResult<T>

Represents the result of a transcription operation.

TranscriptionSegment<T>

Represents a segment of transcribed text with timing information.

VerificationResult<T>

Represents the result of external tool verification.

VoiceInfo<T>

Information about an available TTS voice.

WeightLoadResult

Result of weight loading operation.

WeightLoadValidation

Result of weight validation.

Interfaces

I2DInterpolation<T>

Defines an interface for two-dimensional interpolation algorithms.

I3DDiffusionModel<T>

Interface for 3D diffusion models that generate 3D content like point clouds, meshes, and scenes.

IActivationFunction<T>

Defines an interface for activation functions used in neural networks and other machine learning algorithms.

IActiveLearningStrategy<T>

Defines a strategy for active learning that selects the most informative samples for labeling from a pool of unlabeled data.

IAdversarialAttack<T, TInput, TOutput>

Defines the contract for adversarial attack algorithms that generate adversarial examples.

IAdversarialDefense<T, TInput, TOutput>

Defines the contract for adversarial defense mechanisms that protect models against attacks.

IAgent<T>

Defines an agent that can reason and use tools to solve complex problems. An agent combines a language model with a set of tools to autonomously work toward a goal.

IAggregationStrategy<TModel>

Defines strategies for aggregating model updates from multiple clients in federated learning.

IAiModelBuilder<T, TInput, TOutput>

Defines a builder pattern interface for creating and configuring predictive models.

IAlignmentMethod<T>

Defines the contract for AI alignment methods that ensure models behave according to human values and intentions.

IAssociativeMemory<T>

Interface for Associative Memory modules used in nested learning. Models both backpropagation and attention mechanisms as associative memory.

IAsyncTreeBasedModel<T>

Defines an interface for asynchronous tree-based machine learning models.

IAudioDiffusionModel<T>

Interface for audio diffusion models that generate sound and music.

IAudioEffect<T>

Defines the contract for audio effects processors.

IAudioEnhancer<T>

Defines the contract for audio enhancement models that improve audio quality.

IAudioEventDetector<T>

Interface for audio event detection models that identify specific sounds/events in audio.

IAudioFeatureExtractor<T>

Defines the contract for audio feature extraction algorithms.

IAudioFingerprinter<T>

Interface for audio fingerprinting algorithms.

IAudioGenerator<T>

Interface for audio generation models that create audio from text descriptions or other conditions.

IAudioVisualCorrespondenceModel<T>

Defines the contract for audio-visual correspondence learning models.

IAudioVisualEventLocalizationModel<T>

Defines the contract for audio-visual event localization models.

IAutoMLModel<T, TInput, TOutput>

Defines the contract for AutoML models that automatically search for optimal model configurations.

IAutoregressiveMultimodalModel<T>

Defines the contract for autoregressive multimodal generation models that can generate tokens from any modality in an interleaved fashion.

IAuxiliaryLossLayer<T>

Interface for neural network layers that report auxiliary losses in addition to the primary task loss. Extends IDiagnosticsProvider to provide diagnostic information about auxiliary loss computation.

IBatchIterable<TBatch>

Defines capability to iterate through data in batches.

IBatchSampler

Extended interface for samplers that support batch-level sampling.

IBeatTracker<T>

Interface for beat tracking models that detect tempo and beat positions in audio.

IBenchmark<T>

Defines the contract for reasoning benchmarks that evaluate model performance.

IBiasDetector<T>

Defines an interface for detecting bias in machine learning model predictions.

IBlip2Model<T>

Defines the contract for BLIP-2 (Bootstrapped Language-Image Pre-training 2) models.

IBlipModel<T>

Defines the contract for BLIP (Bootstrapped Language-Image Pre-training) models.

ICertifiedDefense<T, TInput, TOutput>

Defines the contract for certified defense mechanisms that provide provable robustness guarantees.

IChainStep

Defines a single step within a prompt chain.

IChain<TInput, TOutput>

Defines the contract for chains that compose multiple language model operations.

IChainableComputationGraph<T>

Interface for layers that can chain their computation graph with a provided input node.

IChatModel<T>

Defines an interface for chat-based language models that can generate responses to prompts. This interface abstracts the underlying implementation, allowing agents to work with different LLM providers. Extends ILanguageModel to provide unified language model capabilities across the AiDotNet ecosystem.

ICheckpointManager<T, TInput, TOutput>

Defines the contract for checkpoint management systems that save and restore training state.

ICheckpointableModel

Defines the contract for models that support saving and loading their internal state (checkpointing).

IChordRecognizer<T>

Interface for chord recognition models that identify musical chords in audio.

IChunkingStrategy

Defines the contract for text chunking strategies that split documents into smaller segments.

IClassifier<T>

Defines the common interface for all classification algorithms in the AiDotNet library.

IClientModel<TData, TUpdate>

Defines the functionality for a client-side model in federated learning.

IClientSelectionStrategy

Selects which clients participate in a federated learning round.

ICloneable<T>

Interface for objects that can be cloned or copied.

ICompressionMetadata<T>

Defines the contract for compression metadata that stores information needed to decompress model weights.

IConditioningModule<T>

Interface for conditioning modules that encode various inputs into embeddings for diffusion models.

IContextCompressor<T>

Defines the contract for compressing context documents to reduce token usage while preserving relevance.

IContextFlow<T>

Interface for the Context Flow mechanism, which maintains distinct information pathways and update rates for each nested optimization level. A core component of the nested learning paradigm.

IContinualLearningStrategy<T>

Defines a strategy for continual learning that helps neural networks learn multiple tasks sequentially without forgetting previously learned knowledge.

ICountable

Defines capability to report dataset size and iteration progress.

ICrossValidator<T, TInput, TOutput>

Defines the contract for cross-validation implementations in machine learning models.

IDallE3Model<T>

Defines the contract for DALL-E 3-style text-to-image generation models.

IDataLoader<T>

Base interface for all data loaders providing common data loading capabilities.

IDataPreprocessor<T, TInput, TOutput>

IDataSampler

Defines the contract for sampling indices from a dataset during batch iteration.

IDataTransformer<T, TInput, TOutput>

Defines a data transformer that can fit to data and transform it.

IDataVersionControl<T>

Defines the contract for data version control systems that track dataset changes over time.

IDatasetFactory<T, TInput, TOutput>

Factory for creating datasets.

IDataset<T, TInput, TOutput>

Interface for datasets used in active learning scenarios.

IDecisionFunctionClassifier<T>

Interface for classifiers that compute a decision function for predictions.

IDiagnosticsProvider

Interface for components that provide diagnostic information for monitoring and debugging.

IDiffusionModel<T>

Interface for diffusion-based generative models.

IDistillationStrategy<T>

Defines a strategy for computing knowledge distillation loss between student and teacher models.

IDocumentStore<T>

Defines the contract for document stores that index and retrieve vectorized documents.

IDownloadable

Defines capability to automatically download and cache datasets.

IEmbeddingModel<T>

Defines the contract for embedding models that convert text into vector representations.

IEmotionRecognizer<T>

Defines the contract for speech emotion recognition models.

IEnvironment<T>

Represents a reinforcement learning environment that an agent interacts with.

IEpisodicDataLoader<T, TInput, TOutput>

Interface for data loaders that provide episodic tasks for meta-learning.

IEpisodicDataset<T, TInput, TOutput>

Interface for episodic datasets used in meta-learning.

IEvolvable<TGene, T>

Represents an individual that can evolve through genetic operations.

IExperiment

Represents a machine learning experiment that groups related training runs.

IExperimentRun<T>

Represents a single training run within an experiment.

IExperimentTracker<T>

Defines the contract for experiment tracking systems that log machine learning experiments.

IExtendedDataset<T, TInput, TOutput>

Extended dataset interface with additional metadata and features.

IFairnessEvaluator<T>

Defines an interface for evaluating fairness in machine learning models.

IFeatureAware

Interface for models that can provide information about their feature usage.

IFeatureImportance<T>

Interface for models that can provide feature importance scores.

IFeatureSelector<T, TInput>

Defines an interface for selecting the most relevant features from a dataset.

IFederatedClientDataLoader<T, TInput, TOutput>

Represents a data loader that can provide per-client datasets for federated learning.

IFederatedHeterogeneityCorrection<T>

Applies a heterogeneity correction transform to client updates in federated learning.

IFederatedServerOptimizer<T>

Applies a server-side optimization step in federated learning (FedOpt family).

IFederatedTrainer<TModel, TData, TMetadata>

Defines the core functionality for federated learning trainers that coordinate distributed training across multiple clients.

IFewShotExampleSelector<T>

Defines the contract for selecting few-shot examples to include in prompts.

IFineTuning<T, TInput, TOutput>

Defines the contract for fine-tuning methods that adapt pre-trained models to specific tasks or preferences.

IFitDetector<T, TInput, TOutput>

Defines an interface for detecting how well a machine learning model fits the data.

IFitnessCalculator<T, TInput, TOutput>

Defines an interface for calculating how well a machine learning model performs.

IFlamingoModel<T>

Defines the contract for Flamingo-style models with in-context visual learning capabilities.

IFullModel<T, TInput, TOutput>

Represents a complete machine learning model that combines prediction capabilities with serialization and checkpointing support.

IFunctionTool

Defines the contract for tools (functions) that language models can invoke.

IGaussianProcess<T>

Defines an interface for Gaussian Process regression, a powerful probabilistic machine learning technique.

IGenerator<T>

Defines the contract for text generation models used in retrieval-augmented generation.

IGeneticAlgorithm<T, TInput, TOutput, TIndividual, TGene>

Represents a machine learning model that uses genetic algorithms or evolutionary computation while maintaining the core capabilities of a full model.

IGenreClassifier<T>

IGpt4VisionModel<T>

Defines the contract for GPT-4V-style models that combine vision understanding with large language model capabilities.

IGpuOptimizerConfig

Configuration for GPU-resident optimizer updates.

IGradientBasedOptimizer<T, TInput, TOutput>

IGradientCache<T>

Defines an interface for storing and retrieving pre-computed gradients to improve performance in machine learning models.

IGradientComputable<T, TInput, TOutput>

Base interface for models that can compute gradients explicitly without updating parameters.

IGradientModel<T>

Represents a gradient for optimization algorithms.

IGraphConvolutionLayer<T>

Defines the contract for graph convolutional layers that process graph-structured data.

IGraphDataLoader<T>

Interface for data loaders that provide graph-structured data for graph neural networks.

IGraphStore<T>

Defines the contract for graph storage backends that manage nodes and edges.

IHomomorphicEncryptionProvider<T>

Provides homomorphic encryption operations for federated learning aggregation.

IHyperparameterOptimizer<T, TInput, TOutput>

Defines the contract for hyperparameter optimization algorithms.

IImageBindModel<T>

Defines the contract for ImageBind models that bind multiple modalities (6+) into a shared embedding space.

IInputGradientComputable<T>

Interface for models that support computing gradients with respect to input data.

IInputOutputDataLoader<T, TInput, TOutput>

Interface for data loaders that provide standard input-output (X, Y) data for supervised learning.

IIntermediateActivationStrategy<T>

Defines methods for distillation strategies that utilize intermediate layer activations.

IInterpolation<T>

Defines an interface for interpolation algorithms that estimate values between known data points.

IInterpretableModel<T>

Interface for models that support interpretability features.

IJitCompilable<T>

Interface for models that can expose their computation graph for JIT compilation.

IKernelFunction<T>

Defines an interface for kernel functions that measure similarity between data points in machine learning algorithms.

IKeyDetector<T>

Interface for musical key detection models that identify the key and mode of music.

IKnowledgeDistillationTrainer<T, TInput, TOutput>

Defines the contract for knowledge distillation trainers that train student models using knowledge transferred from teacher models.

ILLaVAModel<T>

Defines the contract for LLaVA (Large Language and Vision Assistant) models.

ILanguageIdentifier<T>

Defines the contract for spoken language identification from audio.

ILanguageModel<T>

Defines the base contract for language models that can generate text responses. This interface unifies both synchronous and asynchronous text generation capabilities.

ILatentDiffusionModel<T>

Interface for latent diffusion models that operate in a compressed latent space.

ILayer<T>

Defines the contract for neural network layers in the AiDotNet framework.

ILinearRegression<T>

Defines an interface for linear regression models in machine learning, which predict outputs as a weighted sum of inputs plus an optional constant.

ILoRAAdapter<T>

Interface for LoRA (Low-Rank Adaptation) adapters that wrap existing layers with parameter-efficient adaptations.

ILoRAConfiguration<T>

Interface for configuring how LoRA (Low-Rank Adaptation) should be applied to neural network layers.

ILossFunction<T>

Interface for loss functions used in neural networks.

IMatrixDecomposition<T>

Represents a matrix decomposition that can be used to solve linear systems and invert matrices.

IMetaLearnerOptions<T>

Configuration options interface for meta-learning algorithms.

IMetaLearner<T, TInput, TOutput>

Unified interface for meta-learning algorithms that train models to quickly adapt to new tasks.

IMetaLearningTask<T, TInput, TOutput>

Represents a single meta-learning task for few-shot learning.

IModelCache<T, TInput, TOutput>

Defines a caching mechanism for storing and retrieving optimization step data during model training.

IModelCompressionStrategy<T>

Defines an interface for model compression strategies used to reduce model size while preserving accuracy.

IModelCompression<T, TMetadata>

Defines a type-safe interface for model compression used to reduce model size while preserving accuracy.

IModelEvaluator<T, TInput, TOutput>

Defines methods for evaluating machine learning models.

IModelRegistry<T, TInput, TOutput>

Defines the contract for model registry systems that manage trained model storage and versioning.

IModelSerializer

Defines methods for converting machine learning models to and from binary data for storage or transmission.

IModel<TInput, TOutput, TMetadata>

Defines the core functionality for machine learning models that can be trained on data and make predictions.

IMultiLabelClassifier<T>

Interface for multi-label classifiers that can predict multiple labels per sample.

IMultiObjectiveIndividual<T>

Interface for individuals supporting multi-objective optimization.

IMultimodalEmbedding<T>

Interface for multimodal embedding models that can encode multiple modalities (text, images, audio) into a shared embedding space.

IMusicSourceSeparator<T>

Interface for music source separation models that isolate individual instruments/vocals from a mix.

INeuralNetworkModel<T>

Defines the contract for neural network models with advanced architectural introspection capabilities.

INeuralNetwork<T>

Defines the core functionality for neural network models in the AiDotNet library.

INoisePredictor<T>

Interface for noise prediction networks used in diffusion models.

INoiseScheduler<T>

Interface for diffusion model noise schedulers that control the noise schedule during inference.

INonLinearRegression<T>

Defines the functionality for non-linear regression models in the AiDotNet library.

INormalizer<T, TInput, TOutput>

IOnnxModelDownloader

Defines the contract for downloading ONNX models from remote sources.

IOnnxModelMetadata

Metadata about an ONNX model including inputs, outputs, and opset version.

IOnnxModel<T>

Defines the contract for ONNX model wrappers that provide cross-platform model inference.

IOnnxTensorInfo

Information about an ONNX tensor (input or output).

IOptimizer<T, TInput, TOutput>

Defines the contract for optimization algorithms used in machine learning models.

IOutlierRemoval<T, TInput, TOutput>

Defines methods for detecting and removing outliers from datasets.

IParameterizable<T, TInput, TOutput>

Interface for models that have optimizable parameters.

IPipelineStep<T, TInput, TOutput>

Represents a step in a data processing pipeline.

IPitchDetector<T>

Defines the contract for pitch (fundamental frequency) detection.

IPostprocessor<T, TInput, TOutput>

Defines a postprocessor that transforms model outputs into final results.

IPredictiveModel<T, TInput, TOutput>

Defines the core functionality of a trained predictive model that can make predictions on new data.

IPrivacyAccountant

Tracks cumulative privacy loss across federated learning rounds.

IPrivacyMechanism<TModel>

Defines privacy-preserving mechanisms for federated learning to protect client data.

IProbabilisticClassifier<T>

Defines the interface for classifiers that can output probability estimates for each class.

IPromptAnalyzer

Defines the contract for analyzing prompts before sending them to language models.

IPromptChain

Defines the contract for executing multi-step prompt workflows.

IPromptCompressor

Defines the contract for compressing prompts to reduce token counts and API costs.

IPromptOptimizer<T>

Defines the contract for optimizing prompts to improve language model performance.

IPromptTemplate

Defines the contract for prompt templates used in language model interactions.

IPruningMask<T>

Represents a binary mask for pruning weights in a neural network layer.

IPruningStrategy<T>

Interface for pruning strategies that remove unimportant weights to create sparsity.

IQueryProcessor

Defines the contract for processing and transforming user queries before retrieval.

IRAGMetric<T>

Defines the contract for RAG evaluation metrics.

IRLAgent<T>

Marker interface for reinforcement learning agents that integrate with AiModelBuilder.

IRLDataLoader<T>

Interface for data loaders that provide experience data for reinforcement learning.

IRadialBasisFunction<T>

Defines a radial basis function (RBF) that measures similarity based on distance.

IReasoningStrategy<T>

Defines the contract for reasoning strategies that solve problems through structured thinking.

IRegression<T>

Defines the common interface for all regression algorithms in the AiDotNet library.

IRegularization<T, TInput, TOutput>

IReranker<T>

Defines the contract for reranking retrieved documents to improve relevance ordering.

IResettable

Defines capability to reset iteration state back to the beginning.

IRetriever<T>

Defines the contract for retrieving relevant documents based on a query.

ISafetyFilter<T>

Defines the contract for safety filters that detect and prevent harmful or inappropriate model inputs and outputs.

ISceneClassifier<T>

Interface for acoustic scene classification models that identify the environment/context of audio.

ISecondOrderGradientComputable<T, TInput, TOutput>

Extended gradient computation interface for MAML meta-learning algorithms.

ISelfSupervisedLoss<T>

Interface for self-supervised loss functions used in meta-learning.

ISequenceLossFunction<T>

Interface for sequence loss functions that operate on variable-length sequences.

IShuffleable

Defines capability to shuffle data for randomized iteration.

ISoundLocalizer<T>

Interface for sound localization models that estimate the spatial position of sound sources.

ISpeakerDiarizer<T>

Interface for speaker diarization models that segment audio by speaker ("who spoke when").

ISpeakerEmbeddingExtractor<T>

Interface for speaker embedding extraction models (d-vector/x-vector extraction).

ISpeakerVerifier<T>

Interface for speaker verification models that determine if audio matches a claimed identity.

ISpeechRecognizer<T>

Interface for speech recognition models that transcribe audio to text (ASR - Automatic Speech Recognition).

IStratifiedSampler

Interface for samplers that use class labels for stratification.

IStreamingDataLoader<T, TInput, TOutput>

Interface for streaming data loaders that process data on-demand without loading all data into memory.

IStreamingEventDetectionSession<T>

Interface for streaming event detection sessions.

IStreamingSynthesisSession<T>

Interface for streaming TTS synthesis sessions.

IStreamingTranscriptionSession<T>

Interface for streaming transcription sessions.

ITeacherModel<TInput, TOutput>

Represents a trained teacher model for knowledge distillation.

ITextToSpeech<T>

Interface for text-to-speech (TTS) models that synthesize spoken audio from text.

ITimeSeriesDecomposition<T>

Defines methods and properties for decomposing time series data into its component parts.

ITimeSeriesModel<T>

Defines the core functionality for time series prediction models.

ITokenEmbedding<T>

Exposes token embedding lookup for models that maintain a token embedding table.

ITool

Defines a tool that can be used by an agent to perform specific operations. Tools enable agents to interact with external systems, perform calculations, or access data.

ITrainingMonitor<T>

Defines the contract for training monitoring systems that track and visualize model training progress.

ITreeBasedClassifier<T>

Interface for tree-based classification algorithms.

ITreeBasedRegression<T>

Defines the core functionality for tree-based machine learning models.

IUnifiedMultimodalModel<T>

Defines the contract for unified multimodal models that handle multiple modalities in a single architecture, similar to GPT-4o, Gemini, or Meta's CM3Leon.

IVAEModel<T>

Interface for Variational Autoencoder (VAE) models used in latent diffusion.

IVectorActivationFunction<T>

Defines activation functions that operate on vectors and tensors in neural networks.

IVideoCLIPModel<T>

Defines the contract for VideoCLIP-style models that align video and text in a shared embedding space.

IVideoDiffusionModel<T>

Interface for video diffusion models that generate temporal sequences.

IVoiceActivityDetector<T>

Defines the contract for Voice Activity Detection (VAD) models.

IWaveletFunction<T>

Defines the functionality for wavelet transforms used in signal processing and data analysis.

IWeightLoadable<T>

Defines the contract for models that support loading weights by name.

IWeightedSampler<T>

Interface for samplers that use sample weights.

IWindowFunction<T>

Defines functionality for creating window functions used in signal processing and data analysis.

Enums

ArrayType

Types of microphone arrays.

ConditioningType

Types of conditioning supported by diffusion models.

DallE3ImageSize

Represents the available image sizes for DALL-E 3 generation.

DallE3Quality

Represents the quality settings for DALL-E 3 generation.

DallE3Style

Represents the style settings for DALL-E 3 generation.

FineTuningCategory

Categories of fine-tuning methods.

FineTuningMethodType

Specific fine-tuning method types.

FrameInterpolationMethod

Methods for interpolating between video frames.

GpuOptimizerType

Enumerates the types of GPU-optimized optimizers available.

LogLevel

Log levels for training messages.

ModalityType

Represents the different modality types supported by ImageBind.

ModelStage

Represents the stages a model can be in during its lifecycle.

MusicalMode

Musical mode (major or minor).

RewardModelType

Types of reward models.

SparsityPattern

Types of sparsity patterns.

SurfaceReconstructionMethod

Methods for reconstructing surfaces from point clouds.

TempoRelation

Relationship of a tempo hypothesis to the primary tempo.

VoiceGender

Gender classification for TTS voices.