Class GenreClassifier<T>
- Namespace
- AiDotNet.Audio.Classification
- Assembly
- AiDotNet.dll
Music genre classification model.
public class GenreClassifier<T> : AudioClassifierBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IGenreClassifier<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>
Type Parameters
- T
The numeric type used for calculations.
- Inheritance
- GenreClassifier<T>
Remarks
Classifies audio into music genres using spectral features and neural network models. Supports common genres: rock, pop, hip-hop, jazz, classical, electronic, country, R&B, metal, folk.
This class supports both:
- ONNX mode: Load pre-trained models for fast inference
- Native training mode: Train from scratch using the layer architecture
For Beginners: Genre classification analyzes audio characteristics:
- Tempo and rhythm patterns (fast/slow, complex/simple beats)
- Timbre and instrumentation (acoustic vs electronic sounds)
- Harmonic content (simple vs complex chords)
- Spectral features (brightness, warmth, texture)
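The "brightness" in the list above is commonly quantified by the spectral centroid, the magnitude-weighted mean frequency of a short audio frame. The following is a minimal sketch under stated assumptions: it uses a naive DFT, and `SpectralSketch` and `SpectralCentroid` are hypothetical names for illustration only. They are not part of AiDotNet, and this page does not say which spectral features GenreClassifier<T> actually computes.

```csharp
using System;

public static class SpectralSketch
{
    // Spectral centroid (in Hz) of one audio frame, using a naive O(n^2) DFT.
    // Illustrative only: not AiDotNet's feature extractor.
    public static double SpectralCentroid(double[] frame, int sampleRate)
    {
        int n = frame.Length;
        double weightedSum = 0.0;
        double magnitudeSum = 0.0;

        // For real-valued input only bins 0..n/2 carry unique information.
        for (int k = 0; k <= n / 2; k++)
        {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.Cos(angle);
                im -= frame[t] * Math.Sin(angle);
            }

            double magnitude = Math.Sqrt(re * re + im * im);
            double frequencyHz = (double)k * sampleRate / n;
            weightedSum += frequencyHz * magnitude;
            magnitudeSum += magnitude;
        }

        // A silent frame has no defined centroid; return 0 by convention.
        return magnitudeSum > 0.0 ? weightedSum / magnitudeSum : 0.0;
    }
}
```

A higher centroid means more energy in the high frequencies, which listeners tend to describe as a brighter sound. A production implementation would normally use an FFT and a window function instead of the naive transform shown here.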
Usage:
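// Assumes architecture, modelPath, audio, features, and labels are already defined.
// ONNX mode: load a pre-trained model for fast inference
var onnxClassifier = new GenreClassifier<float>(architecture, modelPath);
var result = onnxClassifier.Classify(audio);
// Native training mode: train from scratch using the layer architecture
var trainableClassifier = new GenreClassifier<float>(architecture);
trainableClassifier.Train(features, labels);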
Constructors
GenreClassifier(GenreClassifierOptions?)
Creates a new genre classifier with legacy options only.
public GenreClassifier(GenreClassifierOptions? options = null)
Parameters
- options GenreClassifierOptions
Classification options.
GenreClassifier(NeuralNetworkArchitecture<T>, GenreClassifierOptions?, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?)
Creates a new genre classifier in native training mode.
public GenreClassifier(NeuralNetworkArchitecture<T> architecture, GenreClassifierOptions? options = null, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null)
Parameters
- architecture NeuralNetworkArchitecture<T>
Neural network architecture configuration.
- options GenreClassifierOptions
Classification options.
- optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>
Optional custom optimizer (defaults to AdamW).
GenreClassifier(NeuralNetworkArchitecture<T>, string, GenreClassifierOptions?)
Creates a new genre classifier in ONNX inference mode.
public GenreClassifier(NeuralNetworkArchitecture<T> architecture, string modelPath, GenreClassifierOptions? options = null)
Parameters
- architecture NeuralNetworkArchitecture<T>
Neural network architecture configuration.
- modelPath string
Path to the ONNX model file.
- options GenreClassifierOptions
Classification options.
Fields
StandardGenres
Standard music genres.
public static readonly string[] StandardGenres
Field Value
- string[]
Properties
Genres
Gets the supported genre labels (legacy).
public string[] Genres { get; }
Property Value
- string[]
IsOnnxMode
Gets whether the model is operating in ONNX inference mode.
public bool IsOnnxMode { get; }
Property Value
- bool
SampleRate
Gets the sample rate.
public int SampleRate { get; }
Property Value
- int
SupportedGenres
Gets the supported genre labels.
public IReadOnlyList<string> SupportedGenres { get; }
Property Value
- IReadOnlyList<string>
SupportsMultiLabel
Gets whether this model supports multi-label classification.
public bool SupportsMultiLabel { get; }
Property Value
- bool
Methods
Classify(Tensor<T>)
Classifies the genre of an audio clip.
public GenreClassificationResult<T> Classify(Tensor<T> audio)
Parameters
- audio Tensor<T>
Audio waveform tensor.
Returns
- GenreClassificationResult<T>
Classification result with predicted genre and probabilities.
ClassifyAsync(Tensor<T>, CancellationToken)
Classifies audio asynchronously.
public Task<GenreClassificationResult<T>> ClassifyAsync(Tensor<T> audio, CancellationToken cancellationToken = default)
Parameters
- audio Tensor<T>
- cancellationToken CancellationToken
Returns
- Task<GenreClassificationResult<T>>
ClassifyBatch(IEnumerable<Tensor<T>>)
Classifies multiple audio segments in batch (legacy API).
public List<GenreClassificationResult> ClassifyBatch(IEnumerable<Tensor<T>> audioSegments)
Parameters
- audioSegments IEnumerable<Tensor<T>>
Returns
- List<GenreClassificationResult>
ClassifyLegacy(Tensor<T>)
Classifies the genre of an audio clip (legacy API).
public GenreClassificationResult ClassifyLegacy(Tensor<T> audio)
Parameters
- audio Tensor<T>
Returns
- GenreClassificationResult
CreateAsync(GenreClassifierOptions?, IProgress<double>?, CancellationToken)
Creates a GenreClassifier asynchronously with model download.
public static Task<GenreClassifier<T>> CreateAsync(GenreClassifierOptions? options = null, IProgress<double>? progress = null, CancellationToken cancellationToken = default)
Parameters
- options GenreClassifierOptions
- progress IProgress<double>
- cancellationToken CancellationToken
Returns
- Task<GenreClassifier<T>>
CreateNewInstance()
Creates a new instance of this model for cloning.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
DeserializeNetworkSpecificData(BinaryReader)
Deserializes network-specific data.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
- reader BinaryReader
Dispose(bool)
Disposes managed resources.
protected override void Dispose(bool disposing)
Parameters
- disposing bool
GetGenreProbabilities(Tensor<T>)
Gets genre probabilities for all supported genres.
public IReadOnlyDictionary<string, T> GetGenreProbabilities(Tensor<T> audio)
Parameters
- audio Tensor<T>
Returns
- IReadOnlyDictionary<string, T>
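A common way to turn raw per-class scores into a probability for every label is the softmax function. The sketch below assumes that approach for illustration. This page does not state how GenreClassifier<T> normalizes its outputs, and `ProbabilitySketch` and `ToProbabilities` are hypothetical names that do not exist in AiDotNet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProbabilitySketch
{
    // Maps raw scores to a genre -> probability dictionary via softmax.
    // Illustrative only: the library's real normalization is not documented here.
    public static IReadOnlyDictionary<string, double> ToProbabilities(
        IReadOnlyList<string> genres, IReadOnlyList<double> scores)
    {
        if (genres.Count != scores.Count)
            throw new ArgumentException("genres and scores must have the same length.");
        if (genres.Count == 0)
            return new Dictionary<string, double>();

        // Subtract the maximum score before exponentiating for numerical stability.
        double max = scores.Max();
        double[] exps = scores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();

        var result = new Dictionary<string, double>(genres.Count);
        for (int i = 0; i < genres.Count; i++)
            result[genres[i]] = exps[i] / sum;
        return result;
    }
}
```

The resulting values are non-negative and sum to 1, so they can be read as a distribution over the supported genres.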
GetModelMetadata()
Gets metadata about the model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
GetTopGenres(Tensor<T>, int)
Gets top-K genre predictions.
public IReadOnlyList<GenrePrediction<T>> GetTopGenres(Tensor<T> audio, int k = 5)
Parameters
- audio Tensor<T>
- k int
Returns
- IReadOnlyList<GenrePrediction<T>>
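Top-K selection amounts to sorting the per-genre probabilities in descending order and keeping the first k entries. The sketch below shows that idea on a plain dictionary. `TopKSketch` and `TopK` are hypothetical names, and `KeyValuePair<string, double>` stands in for `GenrePrediction<T>`, whose members are not described on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopKSketch
{
    // Returns the k most probable genres, highest probability first.
    // Ties are broken by ordinal genre name so the result is deterministic.
    public static IReadOnlyList<KeyValuePair<string, double>> TopK(
        IReadOnlyDictionary<string, double> probabilities, int k = 5)
    {
        if (k < 0)
            throw new ArgumentOutOfRangeException(nameof(k));

        return probabilities
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(k)
            .ToList();
    }
}
```

If k exceeds the number of available genres, `Take` simply returns all of them.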
InitializeLayers()
Initializes the neural network layers.
protected override void InitializeLayers()
PostprocessOutput(Tensor<T>)
Postprocesses model output into the final result format.
protected override Tensor<T> PostprocessOutput(Tensor<T> modelOutput)
Parameters
- modelOutput Tensor<T>
Returns
- Tensor<T>
Predict(Tensor<T>)
Predicts output for the given input.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
- input Tensor<T>
Returns
- Tensor<T>
PreprocessAudio(Tensor<T>)
Preprocesses raw audio for model input.
protected override Tensor<T> PreprocessAudio(Tensor<T> rawAudio)
Parameters
- rawAudio Tensor<T>
Returns
- Tensor<T>
SerializeNetworkSpecificData(BinaryWriter)
Serializes network-specific data.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
- writer BinaryWriter
TrackGenreOverTime(Tensor<T>, double)
Tracks genre over time within a piece.
public GenreTrackingResult<T> TrackGenreOverTime(Tensor<T> audio, double segmentDuration = 10)
Parameters
- audio Tensor<T>
- segmentDuration double
Returns
- GenreTrackingResult<T>
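Tracking genre over time implies splitting the waveform into fixed-length segments and classifying each one. The sketch below shows only the segmentation arithmetic. It assumes that `segmentDuration` is measured in seconds and that segments do not overlap; neither is stated on this page. `SegmentSketch` and `SplitIntoSegments` are hypothetical names, not AiDotNet APIs.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentSketch
{
    // Returns (Start, Length) sample ranges for consecutive, non-overlapping segments.
    // Assumption: the duration is in seconds; the library's unit is not documented here.
    public static List<(int Start, int Length)> SplitIntoSegments(
        int totalSamples, int sampleRate, double segmentDurationSeconds = 10)
    {
        if (sampleRate <= 0)
            throw new ArgumentOutOfRangeException(nameof(sampleRate));
        if (segmentDurationSeconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(segmentDurationSeconds));

        // Convert the duration to a whole number of samples (at least one).
        int segmentLength = Math.Max(1, (int)Math.Round(segmentDurationSeconds * sampleRate));

        var segments = new List<(int Start, int Length)>();
        for (int start = 0; start < totalSamples; start += segmentLength)
        {
            // The final segment may be shorter than the others.
            segments.Add((start, Math.Min(segmentLength, totalSamples - start)));
        }
        return segments;
    }
}
```

For a 25-second clip at 16,000 samples per second with 10-second segments, this yields two full segments and one 5-second remainder. Each range could then be classified separately to see how the predicted genre changes across the piece.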
Train(Tensor<T>, Tensor<T>)
Trains the model on a single example.
public override void Train(Tensor<T> input, Tensor<T> expected)
Parameters
- input Tensor<T>
- expected Tensor<T>
UpdateParameters(Vector<T>)
Updates model parameters.
public override void UpdateParameters(Vector<T> parameters)
Parameters
- parameters Vector<T>