Class ConstantQTransform<T>
Constant-Q Transform (CQT) for music analysis with logarithmic frequency resolution.
public class ConstantQTransform<T>
Type Parameters
T: The numeric type used for calculations.
- Inheritance
- object
ConstantQTransform<T>
Remarks
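Unlike the FFT, which has linearly spaced frequency bins, the CQT uses logarithmic spacing: each bin's center frequency is a constant ratio, 2^(1/binsPerOctave), above the previous one. Every bin also has the same ratio of center frequency to bandwidth, which is the Q factor that gives the transform its name. This matches how humans perceive pitch: octaves are equally spaced.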
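For Beginners: The CQT is well suited to music analysis because:
- Musical notes are logarithmically spaced (each octave doubles the frequency)
- Low notes get narrow bins (in Hz) and long analysis windows; high notes get wide bins and short windows, so every note is resolved equally well on a musical scale
- With 12 bins per octave each bin corresponds to one semitone, which makes chord and key detection much easier than with the FFT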
Usage:
var cqt = new ConstantQTransform<float>(
    sampleRate: 22050,
    fMin: 32.7,         // C1
    binsPerOctave: 12,  // one bin per semitone
    numOctaves: 7);
var audio = LoadAudio("music.wav");
var cqtSpectrum = cqt.Transform(audio);
// cqtSpectrum has shape [time_frames, num_bins], where num_bins = 12 * 7 = 84
Constructors
ConstantQTransform(int, double, int, int, int, WindowType)
Creates a new Constant-Q Transform instance.
public ConstantQTransform(int sampleRate = 22050, double fMin = 32.7, int binsPerOctave = 12, int numOctaves = 7, int hopLength = 512, WindowType windowType = WindowType.Hann)
Parameters
sampleRate (int): Audio sample rate in Hz (default: 22050).
fMin (double): Minimum frequency in Hz (default: C1 = 32.7 Hz).
binsPerOctave (int): Number of bins per octave (default: 12, one per semitone).
numOctaves (int): Number of octaves to cover (default: 7).
hopLength (int): Hop length between frames, in samples (default: 512).
windowType (WindowType): Window type for analysis (default: Hann).
Properties
Frequencies
Gets the center frequencies for each bin.
public double[] Frequencies { get; }
Property Value
- double[]
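The center frequencies follow from the logarithmic spacing described in the remarks. The sketch below shows one common way to compute them, assuming bin k sits at fMin * 2^(k / binsPerOctave). `CqtFrequencySketch` and `CenterFrequencies` are hypothetical names for illustration, not part of this class, and the library's own computation may differ in detail.

```csharp
using System;

public static class CqtFrequencySketch
{
    // Hypothetical helper, not part of ConstantQTransform<T>.
    // Assumes geometric spacing: f_k = fMin * 2^(k / binsPerOctave).
    public static double[] CenterFrequencies(double fMin, int binsPerOctave, int numOctaves)
    {
        int numBins = binsPerOctave * numOctaves;
        var freqs = new double[numBins];
        for (int k = 0; k < numBins; k++)
        {
            // Each step multiplies the frequency by the same ratio.
            freqs[k] = fMin * Math.Pow(2.0, (double)k / binsPerOctave);
        }
        return freqs;
    }
}
```

With the default settings (fMin = 32.7, 12 bins per octave, 7 octaves) this yields 84 frequencies, and the bin 12 steps above the first is one octave higher, at twice fMin.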
NumBins
Gets the number of frequency bins in the CQT output.
public int NumBins { get; }
Property Value
- int
QFactor
Gets the Q factor (quality factor) for this CQT.
public double QFactor { get; }
Property Value
- double
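For orientation, a textbook definition of the Q factor is Q = 1 / (2^(1/binsPerOctave) - 1), the ratio of a bin's center frequency to its bandwidth. The sketch below uses that definition and derives the bandwidth and analysis window length from it. This is an assumption about the standard formulation: the value this property returns may include extra scaling, and all names here are hypothetical.

```csharp
using System;

public static class CqtQFactorSketch
{
    // Textbook Q factor for a given bin density (assumed, not confirmed for this library).
    public static double QFactor(int binsPerOctave)
        => 1.0 / (Math.Pow(2.0, 1.0 / binsPerOctave) - 1.0);

    // Bandwidth in Hz of a bin centered at centerHz: grows with frequency.
    public static double BandwidthHz(double centerHz, int binsPerOctave)
        => centerHz / QFactor(binsPerOctave);

    // Window length in samples needed to resolve that bandwidth: shrinks with frequency.
    public static int WindowLength(double centerHz, int binsPerOctave, int sampleRate)
        => (int)Math.Ceiling(QFactor(binsPerOctave) * sampleRate / centerHz);
}
```

Under this definition, 12 bins per octave gives Q of roughly 16.8. Doubling the center frequency doubles the bandwidth and halves the window length, which is why low notes need long windows.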
SampleRate
Gets the sample rate.
public int SampleRate { get; }
Property Value
- int
Methods
GetMidiNote(int)
Gets the MIDI note number for a given bin index.
public double GetMidiNote(int binIndex)
Parameters
binIndex (int): The CQT bin index.
Returns
- double
The corresponding MIDI note number.
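The mapping from a bin to a MIDI note can be sketched with the standard equal-temperament relation, MIDI = 69 + 12 * log2(f / 440), where A4 = 440 Hz is note 69. The helper below is a hypothetical illustration that assumes bin k has center frequency fMin * 2^(k / binsPerOctave); it is not the library's implementation.

```csharp
using System;

public static class CqtMidiSketch
{
    // Standard conversion from frequency to (fractional) MIDI note number, A4 = 440 Hz = 69.
    public static double FrequencyToMidi(double frequencyHz)
        => 69.0 + 12.0 * Math.Log(frequencyHz / 440.0, 2.0);

    // Hypothetical helper: MIDI note for a CQT bin, assuming geometric bin spacing.
    public static double BinToMidi(int binIndex, double fMin, int binsPerOctave)
    {
        double centerHz = fMin * Math.Pow(2.0, (double)binIndex / binsPerOctave);
        return FrequencyToMidi(centerHz);
    }
}
```

The result is a double because a bin need not land exactly on a note: with fMin = 32.7 Hz, bin 0 comes out very close to, but not exactly, MIDI note 24 (C1), and with more than 12 bins per octave most bins fall between semitones.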
GetNoteName(int)
Gets the note name for a given bin index.
public string GetNoteName(int binIndex)
Parameters
binIndex (int): The CQT bin index.
Returns
- string
The note name (e.g., "C4", "A#3").
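A note name can be derived from a MIDI note number by rounding to the nearest semitone and splitting it into pitch class and octave. The sketch below assumes the common convention in which MIDI note 60 is "C4" and accidentals are written as sharps, which is consistent with the examples above. `NoteNameSketch` is a hypothetical helper, and the library may round or label notes differently.

```csharp
using System;

public static class NoteNameSketch
{
    private static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    // Hypothetical helper: name of the semitone nearest to a (fractional) MIDI note.
    // Assumes MIDI 60 = C4, so octave = floor(note / 12) - 1.
    public static string FromMidi(double midiNote)
    {
        int nearest = (int)Math.Round(midiNote);
        int pitchClass = ((nearest % 12) + 12) % 12;        // stays in 0..11 for negative notes too
        int octave = (int)Math.Floor(nearest / 12.0) - 1;
        return Names[pitchClass] + octave;
    }
}
```

Under this convention, MIDI note 24 maps to "C1", matching the default fMin of 32.7 Hz.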
Transform(Tensor<T>)
Computes the Constant-Q Transform of an audio signal.
public Tensor<T> Transform(Tensor<T> audio)
Parameters
audio (Tensor<T>): Input audio waveform.
Returns
- Tensor<T>
CQT magnitude spectrogram with shape [time_frames, num_bins].
TransformComplex(Tensor<T>)
Computes the complex CQT (with phase information).
public Tensor<T> TransformComplex(Tensor<T> audio)
Parameters
audio (Tensor<T>): Input audio waveform.
Returns
- Tensor<T>
Complex CQT with shape [time_frames, num_bins, 2] where last dim is [real, imag].
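To show how the [real, imag] layout relates to a magnitude spectrogram, the sketch below converts such data to magnitudes and phases. It uses plain `double[,,]` arrays as a stand-in for `Tensor<T>`, whose indexing API is not documented in this section, so every name here is an assumption made for illustration.

```csharp
using System;

public static class ComplexCqtSketch
{
    // Hypothetical helper: converts a [frames, bins, 2] array of (real, imag)
    // pairs into a [frames, bins] array of magnitudes.
    public static double[,] Magnitude(double[,,] complexCqt)
    {
        int frames = complexCqt.GetLength(0);
        int bins = complexCqt.GetLength(1);
        var magnitude = new double[frames, bins];
        for (int t = 0; t < frames; t++)
        {
            for (int k = 0; k < bins; k++)
            {
                double re = complexCqt[t, k, 0];
                double im = complexCqt[t, k, 1];
                magnitude[t, k] = Math.Sqrt(re * re + im * im);
            }
        }
        return magnitude;
    }

    // Phase angle in radians, in the range (-pi, pi].
    public static double Phase(double re, double im) => Math.Atan2(im, re);
}
```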
TransformDb(Tensor<T>, double, double)
Computes CQT in decibels (log scale).
public Tensor<T> TransformDb(Tensor<T> audio, double refValue = 1, double minDb = -80)
Parameters
audio (Tensor<T>): Input audio waveform.
refValue (double): Reference value for dB conversion (default: 1).
minDb (double): Minimum dB value, used as a floor (default: -80).
Returns
- Tensor<T>
CQT spectrogram in decibels.
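A typical decibel conversion for a magnitude value can be sketched as follows. The sketch assumes the amplitude convention, 20 * log10(magnitude / refValue), with minDb applied as an absolute floor. This section does not state whether the library uses 20 * log10 or 10 * log10, or whether the floor is relative to the peak, so treat this as an illustration rather than a description of the method's behavior.

```csharp
using System;

public static class DbSketch
{
    // Hypothetical helper: converts one magnitude value to decibels.
    // Assumes amplitude dB (20 * log10) and an absolute floor at minDb.
    public static double ToDb(double magnitude, double refValue = 1.0, double minDb = -80.0)
    {
        // Guard against log10(0) by clamping to a tiny positive value first.
        double safe = Math.Max(magnitude, 1e-10);
        double db = 20.0 * Math.Log10(safe / refValue);
        return Math.Max(db, minDb);
    }
}
```

With refValue = 1, a magnitude of 1 maps to 0 dB, a magnitude of 10 maps to 20 dB, and silence is clamped to minDb.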
TransformPower(Tensor<T>, double)
Computes the CQT with the magnitude raised to a given exponent (by default 2, which gives the power spectrum).
public Tensor<T> TransformPower(Tensor<T> audio, double power = 2)
Parameters
audio (Tensor<T>): Input audio waveform.
power (double): Power exponent (default: 2.0 for power spectrum).
Returns
- Tensor<T>
Power CQT spectrogram.