Interface IPitchDetector<T>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Defines the contract for pitch (fundamental frequency) detection.

public interface IPitchDetector<T>

Type Parameters

T

The numeric type used for calculations.

Remarks

Pitch detection finds the fundamental frequency (F0) of periodic signals. This is essential for music analysis and speech processing.

For Beginners: Pitch is what makes a note sound "high" or "low".

Technical definition:

  • Pitch = perceived frequency of a sound
  • F0 (fundamental frequency) = the lowest frequency component of a periodic signal
  • Measured in Hz (cycles per second)

Human voice pitch ranges:

  • Bass: 80-300 Hz
  • Baritone: 100-400 Hz
  • Tenor: 130-500 Hz
  • Alto: 175-700 Hz
  • Soprano: 250-1000 Hz

Applications:

  • Auto-tune / pitch correction (T-Pain effect)
  • Music transcription (audio to sheet music)
  • Karaoke scoring
  • Speech therapy (monitoring pitch for dysphonia)
  • Voice training for singing or public speaking
  • Lie detection (pitch changes under stress)

Common algorithms:

  • YIN: Fast, accurate for monophonic audio
  • PYIN: Probabilistic YIN (handles uncertainty)
  • CREPE: Neural network approach (often the most accurate, at a higher computational cost)
  • Autocorrelation: Classic signal processing method
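To make the autocorrelation entry concrete, here is a minimal sketch of the idea on a plain double[] frame. It is not the implementation used by any IPitchDetector<T> in this library: the class name AutocorrelationSketch, the method EstimatePitch, the default pitch range, and the 0.5 periodicity threshold are all assumptions chosen for illustration.

```csharp
using System;

// Hypothetical illustration only; not part of AiDotNet.
public static class AutocorrelationSketch
{
    // Returns the estimated F0 in Hz, or null when no lag in range
    // shows clear periodicity (silence or noise).
    public static double? EstimatePitch(
        double[] frame, int sampleRate,
        double minPitch = 80.0, double maxPitch = 1000.0,
        double threshold = 0.5) // assumed periodicity threshold
    {
        // A pitch of f Hz repeats every sampleRate / f samples (the lag).
        int minLag = Math.Max(1, (int)(sampleRate / maxPitch));
        int maxLag = Math.Min(frame.Length - 1, (int)(sampleRate / minPitch));

        double energy = 0.0;
        foreach (double s in frame) energy += s * s;
        if (energy <= 0.0 || minLag >= maxLag) return null;

        int bestLag = -1;
        double best = threshold;
        for (int lag = minLag; lag <= maxLag; lag++)
        {
            // Correlate the frame with a copy of itself shifted by 'lag'.
            double sum = 0.0;
            for (int i = 0; i + lag < frame.Length; i++)
                sum += frame[i] * frame[i + lag];

            double normalized = sum / energy; // 1.0 at lag 0
            if (normalized > best) { best = normalized; bestLag = lag; }
        }

        if (bestLag < 0) return null;
        return sampleRate / (double)bestLag;
    }
}
```

The lag search range plays the same role as MinPitch and MaxPitch: a narrower range reduces both the work per frame and the risk of octave errors.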

Properties

MaxPitch

Gets or sets the maximum detectable pitch in Hz.

double MaxPitch { get; set; }

Property Value

double

MinPitch

Gets or sets the minimum detectable pitch in Hz.

double MinPitch { get; set; }

Property Value

double

SampleRate

Gets the sample rate this detector operates at.

int SampleRate { get; }

Property Value

int

Methods

DetectPitch(Tensor<T>)

Detects the pitch of an audio frame.

(bool HasPitch, T Pitch) DetectPitch(Tensor<T> audioFrame)

Parameters

audioFrame Tensor<T>

Audio frame to analyze.

Returns

(bool HasPitch, T Pitch)

A tuple containing the detected pitch in Hz. HasPitch is false if no pitch is detected (silence or noise).

DetectPitchWithConfidence(Tensor<T>)

Detects pitch with a confidence score.

(T Pitch, T Confidence)? DetectPitchWithConfidence(Tensor<T> audioFrame)

Parameters

audioFrame Tensor<T>

Audio frame to analyze.

Returns

(T Pitch, T Confidence)?

Pitch in Hz and confidence (0-1), or null if unvoiced.
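Because the return value is a nullable tuple, callers need to handle the unvoiced case before reading the pitch. The sketch below shows one way to do that for T = double. The helper name Describe and the 0.5 confidence cut-off are assumptions for illustration; the interface itself does not prescribe a threshold.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of AiDotNet.
public static class ConfidenceSketch
{
    // Turns a DetectPitchWithConfidence-style result into a short label.
    public static string Describe(
        (double Pitch, double Confidence)? result,
        double minConfidence = 0.5) // assumed cut-off
    {
        // null means the frame was unvoiced.
        if (result is null) return "unvoiced";

        var (pitch, confidence) = result.Value;
        if (confidence < minConfidence) return "uncertain";

        return pitch.ToString("F1", CultureInfo.InvariantCulture) + " Hz";
    }
}
```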

ExtractDetailedPitchContour(Tensor<T>, int)

Extracts pitch contour with voicing information.

IReadOnlyList<PitchFrame<T>> ExtractDetailedPitchContour(Tensor<T> audio, int hopSizeMs = 10)

Parameters

audio Tensor<T>

Full audio recording.

hopSizeMs int

Time step in milliseconds.

Returns

IReadOnlyList<PitchFrame<T>>

A list of PitchFrame<T> entries, each carrying pitch, confidence, and voicing (isVoiced) information.

ExtractPitchContour(Tensor<T>, int)

Extracts pitch contour from audio (F0 over time).

T[] ExtractPitchContour(Tensor<T> audio, int hopSizeMs = 10)

Parameters

audio Tensor<T>

Full audio recording.

hopSizeMs int

Time step between pitch estimates in milliseconds.

Returns

T[]

Array of pitch values (Hz), with 0 or NaN for unvoiced frames.
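The hopSizeMs parameter is given in milliseconds, so it helps to see how it relates to samples and to the time of each contour entry. The sketch below assumes that entry i corresponds to time i × hopSizeMs; whether an implementation aligns that time to the start or the centre of the analysis frame is not stated here. The names HopSketch, HopSizeInSamples, and FrameTimeSeconds are hypothetical.

```csharp
// Hypothetical helpers; not part of AiDotNet.
public static class HopSketch
{
    // Number of samples between consecutive pitch estimates.
    public static int HopSizeInSamples(int sampleRate, int hopSizeMs)
        => (int)((long)sampleRate * hopSizeMs / 1000);

    // Time in seconds of contour entry 'frameIndex', assuming entry i
    // sits at i * hopSizeMs (an assumption, see the lead-in).
    public static double FrameTimeSeconds(int frameIndex, int hopSizeMs)
        => frameIndex * hopSizeMs / 1000.0;
}
```

With the default hop of 10 ms, a 16000 Hz detector advances 160 samples per estimate, and a 44100 Hz detector advances 441.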

GetCentsDeviation(T)

Calculates the deviation in cents from the nearest note.

double GetCentsDeviation(T pitchHz)

Parameters

pitchHz T

Pitch in Hz.

Returns

double

Cents deviation (-50 to +50, where 100 cents = 1 semitone).

Remarks

Used for tuning. 0 cents = perfectly in tune; positive = sharp, negative = flat.
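One common way to compute this value, assuming twelve-tone equal temperament with A4 = 440 Hz, is to convert the pitch to a fractional MIDI note number and measure its distance from the nearest whole number. The sketch below works on double rather than T, and the names CentsSketch and CentsFromNearestNote are hypothetical; an actual implementation may differ.

```csharp
using System;

// Hypothetical illustration; not part of AiDotNet.
public static class CentsSketch
{
    public static double CentsFromNearestNote(double pitchHz, double a4Hz = 440.0)
    {
        if (!(pitchHz > 0.0) || double.IsInfinity(pitchHz))
            throw new ArgumentOutOfRangeException(nameof(pitchHz));

        // Fractional MIDI note number: 12 semitones per octave, A4 = 69.
        double midi = 69.0 + 12.0 * Math.Log(pitchHz / a4Hz) / Math.Log(2.0);

        // Distance to the nearest semitone, scaled to cents.
        double nearest = Math.Round(midi);
        return (midi - nearest) * 100.0;
    }
}
```

For example, 445 Hz is about 19.6 cents sharp of A4, while 430 Hz is flat of A4 and yields a negative value.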

MidiToPitch(double)

Converts MIDI note number to pitch in Hz.

T MidiToPitch(double midiNote)

Parameters

midiNote double

MIDI note number.

Returns

T

Pitch in Hz.
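Under the usual equal-temperament convention (A4 = MIDI note 69 = 440 Hz), the conversion is f = 440 × 2^((m − 69) / 12). The sketch below applies that formula with double in place of T; the names MidiToPitchSketch and ToHz are hypothetical, and the conversion from double to T that a generic implementation needs is left out.

```csharp
using System;

// Hypothetical illustration; not part of AiDotNet.
public static class MidiToPitchSketch
{
    // Each semitone multiplies the frequency by 2^(1/12).
    public static double ToHz(double midiNote, double a4Hz = 440.0)
        => a4Hz * Math.Pow(2.0, (midiNote - 69.0) / 12.0);
}
```

Note 69 maps to 440 Hz, note 81 (one octave higher) to 880 Hz, and note 60 (middle C) to roughly 261.63 Hz. Fractional note numbers are valid and fall between semitones.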

PitchToMidi(T)

Converts pitch in Hz to MIDI note number.

double PitchToMidi(T pitchHz)

Parameters

pitchHz T

Pitch in Hz.

Returns

double

MIDI note number (69 = A4 = 440 Hz).

Remarks

For Beginners: MIDI notes are numbered 0-127. Middle C is 60, A4 (440 Hz) is 69. Each note is one semitone apart.
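The relationship described above corresponds to m = 69 + 12 × log2(f / 440) in equal temperament. The sketch below applies it with double in place of T; the names PitchToMidiSketch and ToMidi are hypothetical, and the rejection of non-positive input is an assumption about how invalid pitches might be handled.

```csharp
using System;

// Hypothetical illustration; not part of AiDotNet.
public static class PitchToMidiSketch
{
    public static double ToMidi(double pitchHz, double a4Hz = 440.0)
    {
        // The logarithm is undefined for zero or negative frequencies.
        if (!(pitchHz > 0.0) || double.IsInfinity(pitchHz))
            throw new ArgumentOutOfRangeException(nameof(pitchHz));

        // 12 semitones per doubling of frequency, anchored at A4 = 69.
        return 69.0 + 12.0 * Math.Log(pitchHz / a4Hz) / Math.Log(2.0);
    }
}
```

The result is fractional for pitches that fall between notes; rounding it gives the nearest MIDI note.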

PitchToNoteName(T)

Gets the note name for a pitch.

string PitchToNoteName(T pitchHz)

Parameters

pitchHz T

Pitch in Hz.

Returns

string

Note name like "A4", "C#5", "Bb3".
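A note name can be derived from the rounded MIDI note number: the remainder modulo 12 selects the pitch class, and the octave is floor(midi / 12) − 1, so that MIDI 60 is C4. The sketch below assumes equal temperament with A4 = 440 Hz and always spells accidentals as sharps; how an implementation chooses between sharps and flats (for example "A#3" versus "Bb3") is not specified here. The names NoteNameSketch and ToNoteName are hypothetical.

```csharp
using System;

// Hypothetical illustration; not part of AiDotNet.
public static class NoteNameSketch
{
    // Sharps only; an implementation may prefer flats for some notes.
    private static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    public static string ToNoteName(double pitchHz)
    {
        if (!(pitchHz > 0.0) || double.IsInfinity(pitchHz))
            throw new ArgumentOutOfRangeException(nameof(pitchHz));

        // Nearest MIDI note number (A4 = 69 = 440 Hz).
        int midi = (int)Math.Round(
            69.0 + 12.0 * Math.Log(pitchHz / 440.0) / Math.Log(2.0));

        // Keep the pitch class in 0..11 even for negative note numbers.
        int pitchClass = ((midi % 12) + 12) % 12;
        int octave = (int)Math.Floor(midi / 12.0) - 1;

        return Names[pitchClass] + octave;
    }
}
```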