Class AudioClassifierBase<T>
- Namespace
- AiDotNet.Audio.Classification
- Assembly
- AiDotNet.dll
Base class for audio classification models (genre, event detection, scene classification).
public abstract class AudioClassifierBase<T> : AudioNeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
- Inheritance
- AudioClassifierBase<T>
Remarks
Audio classification assigns labels to audio clips. This base class provides common functionality for various classification tasks, including:
- Genre classification (rock, jazz, classical)
- Audio event detection (dog bark, car horn)
- Scene classification (office, park, street)
For Beginners: Audio classification is like teaching a computer to recognize different types of sounds, similar to how you can tell the difference between a dog barking and a car horn.
This base class provides:
- Class label management
- Softmax for probability conversion
- Common feature extraction
Constructors
AudioClassifierBase(NeuralNetworkArchitecture<T>)
Initializes a new instance of the AudioClassifierBase class.
protected AudioClassifierBase(NeuralNetworkArchitecture<T> architecture)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture.
Properties
ClassLabels
Gets the list of class labels this model can classify.
public IReadOnlyList<string> ClassLabels { get; protected set; }
Property Value
- IReadOnlyList<string>
NumClasses
Gets the number of classes this model can classify.
public int NumClasses { get; }
Property Value
- int
Methods
ApplySoftmax(Tensor<T>)
Applies softmax to convert a logits tensor to probabilities.
protected Dictionary<string, T> ApplySoftmax(Tensor<T> logits)
Parameters
logits (Tensor<T>): Raw model output tensor.
Returns
- Dictionary<string, T>
Dictionary mapping class labels to probabilities.
ApplySoftmax(Vector<T>)
Applies softmax to convert logits to probabilities.
protected Dictionary<string, T> ApplySoftmax(Vector<T> logits)
Parameters
logits (Vector<T>): Raw model output (logits).
Returns
- Dictionary<string, T>
Dictionary mapping class labels to probabilities.
Remarks
For Beginners: Softmax converts raw scores into probabilities that sum to 1. For example, raw scores [2.0, 1.0, 0.5] become probabilities of approximately [0.63, 0.23, 0.14].
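The conversion described above can be sketched outside the library. This is a minimal illustration, not the AiDotNet implementation: `SoftmaxSketch` and its `Softmax` method are hypothetical names, and `double` stands in for the generic numeric type `T`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class SoftmaxSketch
{
    // Pairs each class label with the softmax probability of its logit.
    public static Dictionary<string, double> Softmax(
        IReadOnlyList<string> labels, IReadOnlyList<double> logits)
    {
        if (labels.Count != logits.Count)
            throw new ArgumentException("labels and logits must have the same length.");
        if (logits.Count == 0)
            return new Dictionary<string, double>();

        // Subtract the largest logit first so Math.Exp cannot overflow;
        // the shift cancels out in the ratio below.
        double max = logits.Max();
        double[] exps = logits.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exps.Sum();

        var result = new Dictionary<string, double>();
        for (int i = 0; i < labels.Count; i++)
            result[labels[i]] = exps[i] / sum;
        return result;
    }
}
```

Calling `SoftmaxSketch.Softmax(new[] { "rock", "jazz", "classical" }, new[] { 2.0, 1.0, 0.5 })` yields roughly 0.63, 0.23, and 0.14 for the three labels.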
ApplyThreshold(Dictionary<string, T>, T)
Applies threshold for multi-label classification.
protected IReadOnlyList<(string Label, T Probability)> ApplyThreshold(Dictionary<string, T> probabilities, T threshold)
Parameters
probabilities (Dictionary<string, T>): Class probabilities.
threshold (T): Minimum probability to consider as positive.
Returns
- IReadOnlyList<(string Label, T Probability)>
List of labels above the threshold.
Remarks
For Beginners: In multi-label classification, an audio clip can belong to multiple classes (e.g., both "speech" and "music"). This method returns all classes with probability above the threshold.
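The thresholding step can be sketched as a simple filter. This is an illustration under assumptions, not the library's code: `ThresholdSketch` is a hypothetical name, `double` stands in for `T`, and both the inclusive comparison (`>=`) and the descending sort order are choices made here that the source does not specify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class ThresholdSketch
{
    // Returns every label whose probability reaches the threshold.
    // Assumption: the comparison is inclusive and results are sorted
    // from most to least probable.
    public static IReadOnlyList<(string Label, double Probability)> AboveThreshold(
        IReadOnlyDictionary<string, double> probabilities, double threshold)
    {
        return probabilities
            .Where(kv => kv.Value >= threshold)
            .OrderByDescending(kv => kv.Value)
            .Select(kv => (Label: kv.Key, Probability: kv.Value))
            .ToList();
    }
}
```

With scores of 0.9 for "speech", 0.7 for "music", and 0.05 for "dog_bark" and a threshold of 0.5, the sketch returns "speech" and "music".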
ComputeClassWeights(Dictionary<string, int>)
Computes class weights for imbalanced datasets.
protected Dictionary<string, T> ComputeClassWeights(Dictionary<string, int> classCounts)
Parameters
classCounts (Dictionary<string, int>): Dictionary of class labels to sample counts.
Returns
- Dictionary<string, T>
Dictionary of class labels to weights (inverse frequency).
Remarks
For Beginners: If some classes have many more examples than others, the model might become biased. Class weights help balance training by giving more importance to rare classes.
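One common way to compute inverse-frequency weights is `total / (numClasses * count)`. The sketch below assumes that normalization. The source only says "inverse frequency", so the exact formula used by AiDotNet may differ. `ClassWeightSketch` is a hypothetical name and `double` stands in for `T`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class ClassWeightSketch
{
    // Assumed formula: weight = total / (numClasses * count), so a
    // perfectly balanced dataset gives every class a weight of 1.
    public static Dictionary<string, double> InverseFrequency(
        IReadOnlyDictionary<string, int> classCounts)
    {
        if (classCounts.Values.Any(c => c <= 0))
            throw new ArgumentException("Every class needs at least one sample.");

        double total = classCounts.Values.Sum();
        int numClasses = classCounts.Count;

        return classCounts.ToDictionary(
            kv => kv.Key,
            kv => total / (numClasses * (double)kv.Value));
    }
}
```

For counts of 900, 90, and 10 samples, this sketch gives weights of about 0.37, 3.70, and 33.33, so the rarest class contributes the most per example.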
GetPrediction(Dictionary<string, T>)
Gets the predicted class (highest probability).
protected (string Label, T Confidence) GetPrediction(Dictionary<string, T> probabilities)
Parameters
probabilities (Dictionary<string, T>): Class probabilities.
Returns
- (string Label, T Confidence)
Tuple of (predicted label, confidence).
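Selecting the highest-probability class is an argmax over the dictionary. The sketch below shows one way to do it. `PredictionSketch` is a hypothetical name, `double` stands in for `T`, and throwing on empty input is an assumption, since the source does not say how that case is handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class PredictionSketch
{
    // Returns the label with the largest probability and that probability.
    public static (string Label, double Confidence) Argmax(
        IReadOnlyDictionary<string, double> probabilities)
    {
        if (probabilities.Count == 0)
            throw new ArgumentException("probabilities must not be empty.");

        // Start from the first entry and keep whichever entry is larger.
        KeyValuePair<string, double> best = probabilities.First();
        foreach (var kv in probabilities)
        {
            if (kv.Value > best.Value)
                best = kv;
        }
        return (best.Key, best.Value);
    }
}
```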
GetTopK(Dictionary<string, T>, int)
Gets the top-K predictions sorted by probability.
protected IReadOnlyList<(string Label, T Probability, int Rank)> GetTopK(Dictionary<string, T> probabilities, int k)
Parameters
probabilities (Dictionary<string, T>): Class probabilities.
k (int): Number of top predictions to return.
Returns
- IReadOnlyList<(string Label, T Probability, int Rank)>
List of top predictions with their probabilities and ranks.
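Top-K selection amounts to sorting by probability and keeping the first `k` entries. The sketch below is an illustration, not the library's code: `TopKSketch` is a hypothetical name, `double` stands in for `T`, and the 1-based `Rank` is an assumption because the source does not state where ranks start.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class TopKSketch
{
    // Returns up to k entries, most probable first.
    // Assumption: Rank starts at 1 for the most probable class.
    public static IReadOnlyList<(string Label, double Probability, int Rank)> TopK(
        IReadOnlyDictionary<string, double> probabilities, int k)
    {
        if (k < 0)
            throw new ArgumentOutOfRangeException(nameof(k));

        return probabilities
            .OrderByDescending(kv => kv.Value)
            .Take(k) // returns fewer than k items when there are fewer classes
            .Select((kv, index) => (Label: kv.Key, Probability: kv.Value, Rank: index + 1))
            .ToList();
    }
}
```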