Interface IPitchDetector<T>
Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll
Defines the contract for pitch (fundamental frequency) detection.
public interface IPitchDetector<T>
Type Parameters
T: The numeric type used for calculations.
Remarks
Pitch detection finds the fundamental frequency (F0) of periodic signals. This is essential for music analysis and speech processing.
For Beginners: Pitch is what makes a note sound "high" or "low".
Technical definition:
- Pitch = perceived frequency of a sound
- F0 (fundamental frequency) = the lowest frequency component of a periodic signal
- Measured in Hz (cycles per second)
Human voice pitch ranges:
- Bass: 80-300 Hz
- Baritone: 100-400 Hz
- Tenor: 130-500 Hz
- Alto: 175-700 Hz
- Soprano: 250-1000 Hz
Applications:
- Auto-tune / pitch correction (T-Pain effect)
- Music transcription (audio to sheet music)
- Karaoke scoring
- Speech therapy (monitoring pitch for dysphonia)
- Voice training for singing or public speaking
- Stress and lie-detection research (pitch tends to change under stress)
Common algorithms:
- YIN: Fast, accurate for monophonic audio
- PYIN: Probabilistic YIN (handles uncertainty)
- CREPE: Neural network approach (often the most accurate, at a higher computational cost)
- Autocorrelation: Classic signal processing method
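To make the classic autocorrelation method concrete, here is a minimal sketch that works on a plain `double[]` frame rather than `Tensor<T>`. The class name, method name, default search range, and threshold are illustrative assumptions, not part of AiDotNet. Real detectors usually add windowing and peak interpolation.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class AutocorrelationSketch
{
    // Returns an F0 estimate in Hz, or null when no lag in the search range
    // correlates strongly enough (treated here as "no pitch").
    public static double? EstimatePitch(double[] frame, int sampleRate,
        double minPitch = 50.0, double maxPitch = 1000.0, double threshold = 0.5)
    {
        // A lag is a candidate period in samples: short lags mean high pitch.
        int minLag = Math.Max(1, (int)(sampleRate / maxPitch));
        int maxLag = Math.Min(frame.Length - 1, (int)(sampleRate / minPitch));

        double energy = 0.0;
        foreach (double s in frame) energy += s * s;
        if (energy <= 0.0 || minLag > maxLag) return null; // silence or frame too short

        int bestLag = -1;
        double bestScore = threshold;
        for (int lag = minLag; lag <= maxLag; lag++)
        {
            double sum = 0.0;
            for (int i = 0; i + lag < frame.Length; i++)
                sum += frame[i] * frame[i + lag];

            // Normalize by the frame energy so the score is roughly in [-1, 1].
            double score = sum / energy;
            if (score > bestScore) { bestScore = score; bestLag = lag; }
        }

        return bestLag > 0 ? sampleRate / (double)bestLag : (double?)null;
    }
}
```

The lag bounds follow directly from the pitch range: a pitch of `f` Hz has a period of `sampleRate / f` samples, which is the role `MinPitch` and `MaxPitch` play in limiting the search.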
Properties
MaxPitch
Gets or sets the maximum detectable pitch in Hz.
double MaxPitch { get; set; }
Property Value
- double
MinPitch
Gets or sets the minimum detectable pitch in Hz.
double MinPitch { get; set; }
Property Value
- double
SampleRate
Gets the sample rate this detector operates at.
int SampleRate { get; }
Property Value
- int
Methods
DetectPitch(Tensor<T>)
Detects the pitch of an audio frame.
(bool HasPitch, T Pitch) DetectPitch(Tensor<T> audioFrame)
Parameters
- audioFrame (Tensor<T>): Audio frame to analyze.
Returns
- (bool HasPitch, T Pitch)
Result with detected pitch in Hz, or HasPitch=false if no pitch detected (silence/noise).
DetectPitchWithConfidence(Tensor<T>)
Detects pitch with confidence score.
(T Pitch, T Confidence)? DetectPitchWithConfidence(Tensor<T> audioFrame)
Parameters
- audioFrame (Tensor<T>): Audio frame to analyze.
Returns
- (T Pitch, T Confidence)?
Pitch in Hz and confidence (0-1), or null if unvoiced.
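Because the return type is a nullable tuple, callers need to handle the null (unvoiced) case before reading the fields. The sketch below shows one way to do that on a `(double Pitch, double Confidence)?` value. The class name, method name, and the 0.8 cutoff are assumptions made for illustration, and no detector is called here.

```csharp
using System.Globalization;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class ConfidenceResultSketch
{
    // Turns a nullable (pitch, confidence) result into a short label.
    public static string Describe((double Pitch, double Confidence)? result,
        double minConfidence = 0.8)
    {
        // null means the frame was unvoiced.
        if (result is null) return "unvoiced";

        var (pitch, confidence) = result.Value;

        // Low-confidence estimates are reported separately rather than trusted.
        return confidence >= minConfidence
            ? pitch.ToString("F1", CultureInfo.InvariantCulture) + " Hz"
            : "uncertain";
    }
}
```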
ExtractDetailedPitchContour(Tensor<T>, int)
Extracts pitch contour with voicing information.
IReadOnlyList<PitchFrame<T>> ExtractDetailedPitchContour(Tensor<T> audio, int hopSizeMs = 10)
Parameters
- audio (Tensor<T>): Full audio recording.
- hopSizeMs (int): Time step in milliseconds.
Returns
- IReadOnlyList<PitchFrame<T>>
Read-only list of PitchFrame<T> entries, one per time step, each carrying pitch, confidence, and voicing (isVoiced) information.
ExtractPitchContour(Tensor<T>, int)
Extracts pitch contour from audio (F0 over time).
T[] ExtractPitchContour(Tensor<T> audio, int hopSizeMs = 10)
Parameters
- audio (Tensor<T>): Full audio recording.
- hopSizeMs (int): Time step between pitch estimates in milliseconds.
Returns
- T[]
Array of pitch values (Hz), with 0 or NaN for unvoiced frames.
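The relationship between `hopSizeMs`, the sample rate, and the length of the returned contour can be sketched as follows. The helper names and the frame-counting rule (complete frames only, no padding) are assumptions. An actual implementation may pad the signal and return a different number of frames.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class HopFramingSketch
{
    // Converts a hop size in milliseconds to whole samples (at least 1).
    public static int HopSamples(int sampleRate, int hopSizeMs) =>
        Math.Max(1, sampleRate * hopSizeMs / 1000);

    // Counts the complete frames of frameLength samples that fit in the audio
    // when the window advances by hopSamples each step (no padding assumed).
    public static int FrameCount(int totalSamples, int frameLength, int hopSamples) =>
        totalSamples < frameLength ? 0 : 1 + (totalSamples - frameLength) / hopSamples;
}
```

With the default 10 ms hop at 16 kHz, the window advances 160 samples per estimate, so one second of audio analyzed with 1024-sample frames yields 94 complete frames under these assumptions.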
GetCentsDeviation(T)
Calculates cents deviation from nearest note.
double GetCentsDeviation(T pitchHz)
Parameters
- pitchHz (T): Pitch in Hz.
Returns
- double
Cents deviation (-50 to +50, where 100 cents = 1 semitone).
Remarks
Used for tuning. 0 cents = perfectly in tune. Positive = sharp, Negative = flat.
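The usual way to compute this is to convert the pitch to a fractional MIDI note number, round to the nearest whole note, and scale the difference by 100. The sketch below assumes equal temperament with A4 = 440 Hz and uses `double` instead of `T`. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class CentsSketch
{
    // Returns how far pitchHz is from the nearest equal-tempered note, in cents.
    public static double CentsFromNearestNote(double pitchHz, double a4Hz = 440.0)
    {
        if (pitchHz <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(pitchHz), "Pitch must be positive.");

        // Fractional MIDI note number: 12 semitones per octave, 69 = A4.
        double midi = 69.0 + 12.0 * Math.Log(pitchHz / a4Hz, 2.0);
        double nearest = Math.Round(midi, MidpointRounding.AwayFromZero);

        // 1 semitone = 100 cents; positive means sharp, negative means flat.
        return 100.0 * (midi - nearest);
    }
}
```

For example, 445 Hz comes out a little under 20 cents sharp of A4, and 435 Hz a little under 20 cents flat.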
MidiToPitch(double)
Converts MIDI note number to pitch in Hz.
T MidiToPitch(double midiNote)
Parameters
- midiNote (double): MIDI note number.
Returns
- T
Pitch in Hz.
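Under equal temperament with A4 = 440 Hz, the conversion is f = 440 × 2^((m − 69) / 12). A minimal sketch using `double` in place of `T`, with a hypothetical class name:

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MidiToPitchSketch
{
    // Converts a (possibly fractional) MIDI note number to a frequency in Hz.
    public static double ToHz(double midiNote)
    {
        // Every 12 semitones above or below A4 (note 69) doubles or halves the frequency.
        return 440.0 * Math.Pow(2.0, (midiNote - 69.0) / 12.0);
    }
}
```

Note 69 maps to 440 Hz, note 81 (one octave higher) to 880 Hz, and note 60 (middle C) to roughly 261.63 Hz.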
PitchToMidi(T)
Converts pitch in Hz to MIDI note number.
double PitchToMidi(T pitchHz)
Parameters
- pitchHz (T): Pitch in Hz.
Returns
- double
MIDI note number (69 = A4 = 440 Hz).
Remarks
For Beginners: MIDI notes are numbered 0-127. Middle C is 60, A4 (440 Hz) is 69. Each note is one semitone apart.
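The inverse mapping is m = 69 + 12 × log2(f / 440). The sketch below also shows rounding to the nearest whole note and clamping to the 0-127 range. Whether the library clamps or rounds is not stated here, so treat both steps, and the names used, as assumptions.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class PitchToMidiSketch
{
    // Converts a frequency in Hz to a fractional MIDI note number.
    public static double ToMidi(double pitchHz)
    {
        if (pitchHz <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(pitchHz), "Pitch must be positive.");
        return 69.0 + 12.0 * Math.Log(pitchHz / 440.0, 2.0);
    }

    // Rounds to the nearest whole note and clamps to the valid MIDI range 0-127.
    public static int ToNearestNote(double pitchHz)
    {
        int note = (int)Math.Round(ToMidi(pitchHz), MidpointRounding.AwayFromZero);
        return Math.Min(127, Math.Max(0, note));
    }
}
```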
PitchToNoteName(T)
Gets the note name for a pitch.
string PitchToNoteName(T pitchHz)
Parameters
- pitchHz (T): Pitch in Hz.
Returns
- string
Note name like "A4", "C#5", "Bb3".
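A common way to derive the name is to round the pitch to the nearest MIDI note, take the note number modulo 12 for the pitch class, and compute the octave so that note 60 is C4. The sketch below always spells accidentals as sharps, so it returns "A#3" where the interface might return "Bb3". How the library chooses between sharps and flats is not specified here, and the names used are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class NoteNameSketch
{
    private static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    // Returns a name such as "A4" for the equal-tempered note nearest to pitchHz.
    public static string FromPitch(double pitchHz)
    {
        if (pitchHz <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(pitchHz), "Pitch must be positive.");

        int midi = (int)Math.Round(
            69.0 + 12.0 * Math.Log(pitchHz / 440.0, 2.0),
            MidpointRounding.AwayFromZero);

        // Pitch class 0-11, kept non-negative even for very low pitches.
        int pitchClass = ((midi % 12) + 12) % 12;

        // Scientific pitch notation: MIDI 60 is C4, so octave = floor(midi / 12) - 1.
        int octave = (int)Math.Floor(midi / 12.0) - 1;

        return Names[pitchClass] + octave;
    }
}
```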