
Class AudioClassifierBase<T>

Namespace
AiDotNet.Audio.Classification
Assembly
AiDotNet.dll

Base class for audio classification models (genre, event detection, scene classification).

public abstract class AudioClassifierBase<T> : AudioNeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
AudioClassifierBase<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

Audio classification assigns labels to audio clips. This base class provides common functionality for various classification tasks, including:

  • Genre classification (rock, jazz, classical)
  • Audio event detection (dog bark, car horn)
  • Scene classification (office, park, street)

For Beginners: Audio classification is like teaching a computer to recognize different types of sounds, similar to how you can tell the difference between a dog barking and a car horn.

This base class provides:

  • Class label management
  • Softmax for probability conversion
  • Common feature extraction

Constructors

AudioClassifierBase(NeuralNetworkArchitecture<T>)

Initializes a new instance of the AudioClassifierBase class.

protected AudioClassifierBase(NeuralNetworkArchitecture<T> architecture)

Parameters

architecture NeuralNetworkArchitecture<T>

The neural network architecture.

Properties

ClassLabels

Gets the list of class labels this model can classify.

public IReadOnlyList<string> ClassLabels { get; protected set; }

Property Value

IReadOnlyList<string>

NumClasses

Gets the number of classes this model can classify.

public int NumClasses { get; }

Property Value

int

Methods

ApplySoftmax(Tensor<T>)

Applies softmax to convert a logits tensor to probabilities.

protected Dictionary<string, T> ApplySoftmax(Tensor<T> logits)

Parameters

logits Tensor<T>

Raw model output tensor.

Returns

Dictionary<string, T>

Dictionary mapping class labels to probabilities.

ApplySoftmax(Vector<T>)

Applies softmax to convert logits to probabilities.

protected Dictionary<string, T> ApplySoftmax(Vector<T> logits)

Parameters

logits Vector<T>

Raw model output (logits).

Returns

Dictionary<string, T>

Dictionary mapping class labels to probabilities.

Remarks

For Beginners: Softmax converts raw scores into probabilities that sum to 1. For example, raw scores [2.0, 1.0, 0.5] become probabilities of approximately [0.63, 0.23, 0.14].
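The softmax step described above can be sketched in plain C#. This is a minimal illustration, not the AiDotNet implementation: the class `SoftmaxSketch`, its `Softmax` method, and the use of `double` with plain lists instead of `T` and `Vector<T>` are all assumptions made to keep the example self-contained. It subtracts the largest logit before exponentiating, a common way to avoid overflow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class SoftmaxSketch
{
    public static Dictionary<string, double> Softmax(
        IReadOnlyList<string> labels, IReadOnlyList<double> logits)
    {
        if (labels.Count != logits.Count)
            throw new ArgumentException("labels and logits must have the same length.");

        var result = new Dictionary<string, double>(labels.Count);
        if (logits.Count == 0)
            return result;

        // Subtract the maximum logit so Math.Exp cannot overflow;
        // this shift does not change the resulting probabilities.
        double max = logits.Max();
        double[] exps = logits.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exps.Sum();

        // Normalize so the probabilities sum to 1, keyed by class label.
        for (int i = 0; i < labels.Count; i++)
            result[labels[i]] = exps[i] / sum;

        return result;
    }
}
```

Calling `SoftmaxSketch.Softmax(new[] { "rock", "jazz", "classical" }, new[] { 2.0, 1.0, 0.5 })` gives "rock" a probability of roughly 0.63, and the three values sum to 1.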

ApplyThreshold(Dictionary<string, T>, T)

Applies a threshold for multi-label classification.

protected IReadOnlyList<(string Label, T Probability)> ApplyThreshold(Dictionary<string, T> probabilities, T threshold)

Parameters

probabilities Dictionary<string, T>

Class probabilities.

threshold T

Minimum probability to consider as positive.

Returns

IReadOnlyList<(string Label, T Probability)>

List of labels whose probability is above the threshold, each paired with that probability.

Remarks

For Beginners: In multi-label classification, an audio clip can belong to multiple classes (e.g., both "speech" and "music"). This method returns all classes with probability above the threshold.
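A thresholding step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the library's code: `ThresholdSketch` is a hypothetical class, `double` stands in for `T`, the comparison is strict (`>`) to match the phrase "above the threshold", and sorting the result by descending probability is a choice made here for readability.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class ThresholdSketch
{
    public static IReadOnlyList<(string Label, double Probability)> ApplyThreshold(
        Dictionary<string, double> probabilities, double threshold)
    {
        return probabilities
            // Assumption: "above" means strictly greater than the threshold.
            .Where(kv => kv.Value > threshold)
            // Assumption: most confident labels first.
            .OrderByDescending(kv => kv.Value)
            .Select(kv => (Label: kv.Key, Probability: kv.Value))
            .ToList();
    }
}
```

With example probabilities of 0.7 for "speech", 0.55 for "music", and 0.1 for "noise", a threshold of 0.5 keeps "speech" and "music" and drops "noise".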

ComputeClassWeights(Dictionary<string, int>)

Computes class weights for imbalanced datasets.

protected Dictionary<string, T> ComputeClassWeights(Dictionary<string, int> classCounts)

Parameters

classCounts Dictionary<string, int>

Dictionary of class labels to sample counts.

Returns

Dictionary<string, T>

Dictionary of class labels to weights (inverse frequency).

Remarks

For Beginners: If some classes have many more examples than others, the model might become biased. Class weights help balance training by giving more importance to rare classes.
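One common way to turn sample counts into inverse-frequency weights is `total / (numClasses * count)`, so that a perfectly balanced dataset gives every class a weight of 1. The sketch below uses that formula as an assumption; the normalization AiDotNet actually applies is not stated here and may differ. `ClassWeightSketch` is a hypothetical class, `double` stands in for `T`, and giving empty classes a weight of 0 is likewise a choice made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class ClassWeightSketch
{
    public static Dictionary<string, double> Compute(
        IReadOnlyDictionary<string, int> classCounts)
    {
        // Sum as long to avoid int overflow on very large datasets.
        long total = classCounts.Values.Sum(c => (long)c);
        int numClasses = classCounts.Count;

        var weights = new Dictionary<string, double>(numClasses);
        foreach (var kv in classCounts)
        {
            // Assumed formula: total / (numClasses * count).
            // Assumption: classes with no samples get weight 0 instead of dividing by zero.
            weights[kv.Key] = kv.Value > 0
                ? (double)total / ((double)numClasses * kv.Value)
                : 0.0;
        }
        return weights;
    }
}
```

For example counts of 800 "speech", 150 "music", and 50 "noise" clips, this yields weights of about 0.42, 2.22, and 6.67, so the rarest class contributes the most per sample.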

GetPrediction(Dictionary<string, T>)

Gets the predicted class (highest probability).

protected (string Label, T Confidence) GetPrediction(Dictionary<string, T> probabilities)

Parameters

probabilities Dictionary<string, T>

Class probabilities.

Returns

(string Label, T Confidence)

Tuple of (predicted label, confidence).
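Selecting the highest-probability class is an argmax over the dictionary. The following is a minimal sketch under stated assumptions: `PredictionSketch` is a hypothetical class, `double` stands in for `T`, an empty input throws, and ties go to whichever entry is enumerated first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class PredictionSketch
{
    public static (string Label, double Confidence) GetPrediction(
        IReadOnlyDictionary<string, double> probabilities)
    {
        if (probabilities.Count == 0)
            throw new ArgumentException("At least one class probability is required.");

        string bestLabel = string.Empty;
        double best = double.NegativeInfinity;

        // Linear scan keeping the largest probability seen so far.
        foreach (var kv in probabilities)
        {
            if (kv.Value > best)
            {
                best = kv.Value;
                bestLabel = kv.Key;
            }
        }
        return (bestLabel, best);
    }
}
```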

GetTopK(Dictionary<string, T>, int)

Gets the top-K predictions sorted by probability.

protected IReadOnlyList<(string Label, T Probability, int Rank)> GetTopK(Dictionary<string, T> probabilities, int k)

Parameters

probabilities Dictionary<string, T>

Class probabilities.

k int

Number of top predictions to return.

Returns

IReadOnlyList<(string Label, T Probability, int Rank)>

List of top predictions with their probabilities and ranks.
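A top-K selection can be sketched with LINQ by sorting on probability and taking the first `k` entries. This is an illustration only: `TopKSketch` is a hypothetical class, `double` stands in for `T`, and the 1-based `Rank` is an assumption, since the rank convention is not stated here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class TopKSketch
{
    public static IReadOnlyList<(string Label, double Probability, int Rank)> GetTopK(
        Dictionary<string, double> probabilities, int k)
    {
        if (k < 0)
            throw new ArgumentOutOfRangeException(nameof(k), "k must be non-negative.");

        return probabilities
            .OrderByDescending(kv => kv.Value)
            // Take returns fewer than k items if the dictionary is smaller.
            .Take(k)
            // Assumption: ranks start at 1 for the most probable class.
            .Select((kv, i) => (Label: kv.Key, Probability: kv.Value, Rank: i + 1))
            .ToList();
    }
}
```

With example probabilities of 0.5 for "office", 0.3 for "park", and 0.2 for "street", `GetTopK(probabilities, 2)` returns "office" with rank 1 and "park" with rank 2.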