Class ChromaExtractor<T>

Namespace
AiDotNet.Audio.Features
Assembly
AiDotNet.dll

Extracts chromagram (pitch class profile) features from audio signals.

public class ChromaExtractor<T> : AudioFeatureExtractorBase<T>, IAudioFeatureExtractor<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
AudioFeatureExtractorBase<T>
ChromaExtractor<T>
Implements
IAudioFeatureExtractor<T>

Remarks

A chromagram represents the energy of the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) regardless of octave. It's particularly useful for music analysis tasks.

For Beginners: In Western music, there are 12 notes that repeat in each octave. A C at about 262 Hz and the C one octave above it at about 524 Hz (twice the frequency) are both "C": they belong to the same pitch class.

A chromagram collapses all octaves together, showing how much energy is in each of the 12 notes:

  • Index 0: C (do)
  • Index 1: C#/Db
  • Index 2: D (re)
  • ...and so on through B
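The octave folding described above can be sketched in a few lines. This is an illustration only, not the library's implementation: `ChromaSketch`, `FrequencyToPitchClass`, and `FoldSpectrum` are hypothetical names, and the sketch assumes equal temperament with A4 tuned to 440 Hz (MIDI note 69).

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ChromaSketch
{
    // Maps a frequency in Hz to a pitch class index (0 = C, ..., 11 = B).
    // Assumption: equal temperament with A4 = 440 Hz (MIDI note 69).
    public static int FrequencyToPitchClass(double frequencyHz)
    {
        if (frequencyHz <= 0)
            throw new ArgumentOutOfRangeException(nameof(frequencyHz));

        // Nearest MIDI note number; each octave spans 12 semitones.
        int midi = (int)Math.Round(69 + 12 * Math.Log(frequencyHz / 440.0, 2.0));

        // The double modulo keeps the result in 0..11 even for negative note numbers.
        return ((midi % 12) + 12) % 12;
    }

    // Sums the energy of each spectral component into its pitch class,
    // discarding the octave information.
    public static double[] FoldSpectrum(double[] frequenciesHz, double[] energies)
    {
        if (frequenciesHz.Length != energies.Length)
            throw new ArgumentException("Arrays must have the same length.");

        var chroma = new double[12];
        for (int i = 0; i < frequenciesHz.Length; i++)
        {
            chroma[FrequencyToPitchClass(frequenciesHz[i])] += energies[i];
        }
        return chroma;
    }
}
```

Under these assumptions, 262 Hz and 524 Hz both map to index 0 (C), while 440 Hz maps to index 9 (A).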

This is useful for:

  • Chord recognition (chords have characteristic chroma patterns)
  • Key detection (which notes are emphasized in the music)
  • Music similarity (songs in the same key tend to have similar chromagrams)
  • Cover song detection
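As a rough illustration of the chord-recognition use case, a chroma vector can be compared against a binary template marking the pitch classes of a chord. The sketch below scores major-triad templates with cosine similarity. `ChordTemplateSketch` and its methods are hypothetical names, and template matching is only one simple approach; nothing here is documented as part of ChromaExtractor<T>.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ChordTemplateSketch
{
    // Builds a 12-element template with 1.0 at the root, major third (+4 semitones),
    // and perfect fifth (+7 semitones).
    public static double[] MajorTriadTemplate(int root)
    {
        if (root < 0 || root > 11)
            throw new ArgumentOutOfRangeException(nameof(root));

        var template = new double[12];
        template[root] = 1.0;
        template[(root + 4) % 12] = 1.0;
        template[(root + 7) % 12] = 1.0;
        return template;
    }

    // Cosine similarity between two equal-length vectors; returns 0 if either is all zeros.
    public static double CosineSimilarity(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) return 0.0;
        return dot / Math.Sqrt(normA * normB);
    }

    // Returns the root (0-11) of the major triad whose template best matches the chroma vector.
    // Ties resolve to the lowest root index.
    public static int BestMajorTriadRoot(double[] chroma)
    {
        int bestRoot = 0;
        double bestScore = double.NegativeInfinity;
        for (int root = 0; root < 12; root++)
        {
            double score = CosineSimilarity(chroma, MajorTriadTemplate(root));
            if (score > bestScore)
            {
                bestScore = score;
                bestRoot = root;
            }
        }
        return bestRoot;
    }
}
```

A practical recognizer would also need minor and other chord types, plus some smoothing over time; this sketch covers major triads for a single frame only.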

Usage:

var chroma = new ChromaExtractor<float>();
var features = chroma.Extract(audioTensor);
// features.Shape = [numFrames, 12]

Constructors

ChromaExtractor(ChromaOptions?)

Initializes a new chroma feature extractor.

public ChromaExtractor(ChromaOptions? options = null)

Parameters

options ChromaOptions

Chroma extraction options.

Properties

FeatureDimension

Gets the number of features produced per frame.

public override int FeatureDimension { get; }

Property Value

int

Name

Gets the name of this feature extractor.

public override string Name { get; }

Property Value

string

Methods

Extract(Tensor<T>)

Extracts features from an audio waveform.

public override Tensor<T> Extract(Tensor<T> audio)

Parameters

audio Tensor<T>

The audio waveform as a 1D tensor [samples].

Returns

Tensor<T>

Features as a 2D tensor [frames, features].

GetDominantPitchClass(T[])

Gets the dominant pitch class for a chroma vector.

public int GetDominantPitchClass(T[] chromaFrame)

Parameters

chromaFrame T[]

A chroma vector of length 12.

Returns

int

The index of the dominant pitch class (0-11).
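One plausible reading of "dominant" is the pitch class with the largest energy in the frame. A minimal sketch under that assumption follows; it uses `double[]` rather than the generic `T[]`, and `DominantPitchClassSketch.ArgMax` is a hypothetical helper, not the library's code. How the real method handles ties or malformed input is not stated in this reference.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class DominantPitchClassSketch
{
    // Returns the index (0-11) of the largest value in a 12-element chroma frame.
    // Ties resolve to the lowest index.
    public static int ArgMax(double[] chromaFrame)
    {
        if (chromaFrame == null || chromaFrame.Length != 12)
            throw new ArgumentException("Expected a chroma vector of length 12.");

        int best = 0;
        for (int i = 1; i < chromaFrame.Length; i++)
        {
            if (chromaFrame[i] > chromaFrame[best])
            {
                best = i;
            }
        }
        return best;
    }
}
```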

GetPitchClassName(int)

Gets the pitch class name for an index (0-11).

public static string GetPitchClassName(int index)

Parameters

index int

The pitch class index.

Returns

string

The pitch class name (C, C#, D, etc.).
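The index-to-name mapping can be pictured as a simple lookup table. The sketch below uses the sharp spellings listed in the Remarks section; `PitchClassNameSketch` is a hypothetical name, and throwing on an out-of-range index is an assumption, since this reference does not say how the real method treats invalid input.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class PitchClassNameSketch
{
    // Pitch class names in index order, using sharps (0 = C, ..., 11 = B).
    private static readonly string[] Names =
    {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };

    // Returns the name for a pitch class index in the range 0-11.
    // Assumption: out-of-range indices are rejected rather than wrapped.
    public static string Name(int index)
    {
        if (index < 0 || index >= Names.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        return Names[index];
    }
}
```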