Class GenreClassifier<T>

Namespace
AiDotNet.Audio.Classification
Assembly
AiDotNet.dll

Music genre classification model.

public class GenreClassifier<T> : AudioClassifierBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IGenreClassifier<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
GenreClassifier<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

Classifies audio into music genres using spectral features and neural network models. Supports common genres: rock, pop, hip-hop, jazz, classical, electronic, country, R&B, metal, folk.

This class supports two modes:

  • ONNX mode: Load pre-trained models for fast inference
  • Native training mode: Train from scratch using the layer architecture

For Beginners: Genre classification analyzes audio characteristics:

  • Tempo and rhythm patterns (fast/slow, complex/simple beats)
  • Timbre and instrumentation (acoustic vs electronic sounds)
  • Harmonic content (simple vs complex chords)
  • Spectral features (brightness, warmth, texture)

Usage:

// ONNX mode: load a pre-trained model for inference
var onnxClassifier = new GenreClassifier<float>(architecture, modelPath);
var result = onnxClassifier.Classify(audio);

// Native training mode: train from scratch using the layer architecture
var trainableClassifier = new GenreClassifier<float>(architecture);
trainableClassifier.Train(features, labels);

Constructors

GenreClassifier(GenreClassifierOptions?)

Creates a new genre classifier with legacy options only.

public GenreClassifier(GenreClassifierOptions? options = null)

Parameters

options GenreClassifierOptions

Classification options.

GenreClassifier(NeuralNetworkArchitecture<T>, GenreClassifierOptions?, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?)

Creates a new genre classifier in native training mode.

public GenreClassifier(NeuralNetworkArchitecture<T> architecture, GenreClassifierOptions? options = null, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null)

Parameters

architecture NeuralNetworkArchitecture<T>

Neural network architecture configuration.

options GenreClassifierOptions

Classification options.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

Optional custom optimizer (defaults to AdamW).

GenreClassifier(NeuralNetworkArchitecture<T>, string, GenreClassifierOptions?)

Creates a new genre classifier in ONNX inference mode.

public GenreClassifier(NeuralNetworkArchitecture<T> architecture, string modelPath, GenreClassifierOptions? options = null)

Parameters

architecture NeuralNetworkArchitecture<T>

Neural network architecture configuration.

modelPath string

Path to the ONNX model file.

options GenreClassifierOptions

Classification options.

Fields

StandardGenres

Standard music genres.

public static readonly string[] StandardGenres

Field Value

string[]

Properties

Genres

Gets the supported genre labels (legacy).

public string[] Genres { get; }

Property Value

string[]

IsOnnxMode

Gets whether the model is operating in ONNX inference mode.

public bool IsOnnxMode { get; }

Property Value

bool

SampleRate

Gets the sample rate.

public int SampleRate { get; }

Property Value

int

SupportedGenres

Gets the supported genre labels.

public IReadOnlyList<string> SupportedGenres { get; }

Property Value

IReadOnlyList<string>

SupportsMultiLabel

Gets whether this model supports multi-label classification.

public bool SupportsMultiLabel { get; }

Property Value

bool

Methods

Classify(Tensor<T>)

Classifies the genre of an audio clip.

public GenreClassificationResult<T> Classify(Tensor<T> audio)

Parameters

audio Tensor<T>

Audio waveform tensor.

Returns

GenreClassificationResult<T>

Classification result with predicted genre and probabilities.

ClassifyAsync(Tensor<T>, CancellationToken)

Classifies audio asynchronously.

public Task<GenreClassificationResult<T>> ClassifyAsync(Tensor<T> audio, CancellationToken cancellationToken = default)

Parameters

audio Tensor<T>
cancellationToken CancellationToken

Returns

Task<GenreClassificationResult<T>>

ClassifyBatch(IEnumerable<Tensor<T>>)

Classifies multiple audio segments in batch (legacy API).

public List<GenreClassificationResult> ClassifyBatch(IEnumerable<Tensor<T>> audioSegments)

Parameters

audioSegments IEnumerable<Tensor<T>>

Returns

List<GenreClassificationResult>

ClassifyLegacy(Tensor<T>)

Classifies the genre of an audio clip (legacy API).

public GenreClassificationResult ClassifyLegacy(Tensor<T> audio)

Parameters

audio Tensor<T>

Returns

GenreClassificationResult

CreateAsync(GenreClassifierOptions?, IProgress<double>?, CancellationToken)

Creates a GenreClassifier asynchronously with model download.

public static Task<GenreClassifier<T>> CreateAsync(GenreClassifierOptions? options = null, IProgress<double>? progress = null, CancellationToken cancellationToken = default)

Parameters

options GenreClassifierOptions
progress IProgress<double>
cancellationToken CancellationToken

Returns

Task<GenreClassifier<T>>

CreateNewInstance()

Creates a new instance of this model for cloning.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

DeserializeNetworkSpecificData(BinaryReader)

Deserializes network-specific data.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

Dispose(bool)

Disposes managed resources.

protected override void Dispose(bool disposing)

Parameters

disposing bool

GetGenreProbabilities(Tensor<T>)

Gets genre probabilities for all supported genres.

public IReadOnlyDictionary<string, T> GetGenreProbabilities(Tensor<T> audio)

Parameters

audio Tensor<T>

Returns

IReadOnlyDictionary<string, T>
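
The returned dictionary maps each genre label to a probability. As a rough illustration of that shape, and not the library's actual implementation, the sketch below pairs a label list with a score list and picks the most likely entry. The class GenreProbabilitySketch and its methods are hypothetical names introduced for this example, and float stands in for T.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class GenreProbabilitySketch
{
    // Pairs each label with the score at the same index.
    public static IReadOnlyDictionary<string, float> ToDictionary(
        IReadOnlyList<string> labels, IReadOnlyList<float> scores)
    {
        if (labels.Count != scores.Count)
            throw new ArgumentException("labels and scores must have the same length.");

        var result = new Dictionary<string, float>(labels.Count);
        for (int i = 0; i < labels.Count; i++)
            result[labels[i]] = scores[i];
        return result;
    }

    // Returns the label with the highest score.
    public static string MostLikely(IReadOnlyDictionary<string, float> probabilities)
        => probabilities.OrderByDescending(p => p.Value).First().Key;
}
```

A caller holding the result of GetGenreProbabilities could apply the same ordering logic to find the dominant genre.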

GetModelMetadata()

Gets metadata about the model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

GetTopGenres(Tensor<T>, int)

Gets top-K genre predictions.

public IReadOnlyList<GenrePrediction<T>> GetTopGenres(Tensor<T> audio, int k = 5)

Parameters

audio Tensor<T>

Audio waveform tensor.

k int

Number of top predictions to return (defaults to 5).

Returns

IReadOnlyList<GenrePrediction<T>>
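
Top-K selection amounts to sorting the genre probabilities in descending order and keeping the first k entries. The sketch below shows that idea on a plain dictionary. It is an illustration only: TopKSketch is a hypothetical helper, float stands in for T, and KeyValuePair is used in place of GenrePrediction<T>, whose members are not documented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class TopKSketch
{
    // Returns up to k (label, probability) pairs, highest probability first.
    public static IReadOnlyList<KeyValuePair<string, float>> TopK(
        IReadOnlyDictionary<string, float> probabilities, int k = 5)
    {
        if (k < 0)
            throw new ArgumentOutOfRangeException(nameof(k));

        return probabilities
            .OrderByDescending(p => p.Value) // most likely genre first
            .Take(k)                         // fewer than k entries is fine
            .ToList();
    }
}
```

If k exceeds the number of genres, Take simply returns every entry, so the result is never longer than the input.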

InitializeLayers()

Initializes the neural network layers.

protected override void InitializeLayers()

PostprocessOutput(Tensor<T>)

Postprocesses model output into the final result format.

protected override Tensor<T> PostprocessOutput(Tensor<T> modelOutput)

Parameters

modelOutput Tensor<T>

Returns

Tensor<T>

Predict(Tensor<T>)

Predicts output for the given input.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

Returns

Tensor<T>

PreprocessAudio(Tensor<T>)

Preprocesses raw audio for model input.

protected override Tensor<T> PreprocessAudio(Tensor<T> rawAudio)

Parameters

rawAudio Tensor<T>

Returns

Tensor<T>

SerializeNetworkSpecificData(BinaryWriter)

Serializes network-specific data.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

TrackGenreOverTime(Tensor<T>, double)

Tracks genre over time within a piece.

public GenreTrackingResult<T> TrackGenreOverTime(Tensor<T> audio, double segmentDuration = 10)

Parameters

audio Tensor<T>
segmentDuration double

Returns

GenreTrackingResult<T>
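
Tracking genre over time implies splitting the waveform into segments and classifying each one. The sketch below computes segment boundaries under stated assumptions: segmentDuration is taken to be in seconds, segments do not overlap, and a shorter final segment is kept. None of this is confirmed by the reference above, and SegmentSketch is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet.
public static class SegmentSketch
{
    // Returns (start, length) sample ranges covering the whole clip.
    public static List<(int Start, int Length)> Split(
        int totalSamples, int sampleRate, double segmentDuration = 10)
    {
        if (sampleRate <= 0)
            throw new ArgumentOutOfRangeException(nameof(sampleRate));
        if (segmentDuration <= 0)
            throw new ArgumentOutOfRangeException(nameof(segmentDuration));

        // Assumption: duration is in seconds, so samples = rate * seconds.
        int segmentSamples = Math.Max(1, (int)Math.Round(sampleRate * segmentDuration));

        var ranges = new List<(int Start, int Length)>();
        for (int start = 0; start < totalSamples; start += segmentSamples)
        {
            // The last segment may be shorter than the others.
            ranges.Add((start, Math.Min(segmentSamples, totalSamples - start)));
        }
        return ranges;
    }
}
```

For a 25-second clip at an illustrative sample rate of 22050 Hz (551250 samples), this yields two full 10-second segments and one 5-second remainder.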

Train(Tensor<T>, Tensor<T>)

Trains the model on a single example.

public override void Train(Tensor<T> input, Tensor<T> expected)

Parameters

input Tensor<T>
expected Tensor<T>

UpdateParameters(Vector<T>)

Updates model parameters.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>