Interface IAudioEnhancer<T>
Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll
Defines the contract for audio enhancement models that improve audio quality.
public interface IAudioEnhancer<T>
Type Parameters
T: The numeric type used for calculations.
Remarks
Audio enhancement encompasses various techniques to improve audio quality:
- Noise Reduction: Remove background noise while preserving speech/music
- Speech Enhancement: Improve speech intelligibility and quality
- Dereverberation: Remove room echo and reverb artifacts
- Echo Cancellation: Remove acoustic echo in communication systems
- Bandwidth Extension: Extend frequency range of narrowband audio
For Beginners: Audio enhancement is like photo editing for sound!
Common use cases:
- Cleaning up podcast recordings (removing AC hum, keyboard clicks)
- Improving phone call quality (reducing background noise)
- Restoring old recordings (removing tape hiss, crackle)
- Video conferencing (echo cancellation, noise suppression)
- Hearing aids (speech enhancement in noisy environments)
How it works (simplified):
- Analyze the audio to identify "noise" vs "signal"
- Create a filter that reduces noise while keeping the signal
- Apply the filter to produce cleaner audio
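The three steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not AiDotNet code: the function names (`mean_power`, `reduce_noise`), the frame size, and the use of a single broadband gain per frame are all assumptions chosen for brevity. Real enhancers typically work per frequency band or with a learned model.

```python
def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def reduce_noise(audio, noise_only, frame_size=4):
    """Broadband noise reduction using one Wiener-style gain per frame.

    Hypothetical helper for illustration; not part of IAudioEnhancer<T>.
    """
    # Step 1: analyze a noise-only clip to learn how strong the noise is.
    noise_power = mean_power(noise_only)
    enhanced = []
    for start in range(0, len(audio), frame_size):
        frame = audio[start:start + frame_size]
        frame_power = mean_power(frame)
        # Step 2: build the "filter": a gain near 0 for frames that look
        # like noise and near 1 for frames dominated by signal.
        if frame_power > 0.0:
            gain = max(0.0, 1.0 - noise_power / frame_power)
        else:
            gain = 0.0
        # Step 3: apply the gain to produce the cleaner frame.
        enhanced.extend(gain * x for x in frame)
    return enhanced


# Toy input: one frame of low-level noise followed by one frame of "speech".
noise = [0.01, -0.01, 0.01, -0.01]
speech = [0.5, -0.4, 0.6, -0.5]
cleaned = reduce_noise(noise + speech, noise_only=noise)
```

In this toy input the noise-only frame is attenuated to silence, while the louder frame passes through almost unchanged.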
Modern approaches use neural networks trained to learn what clean audio should sound like, and these often outperform traditional signal-processing methods.
Properties
EnhancementStrength
Gets or sets the enhancement strength (0.0 = no enhancement, 1.0 = maximum).
double EnhancementStrength { get; set; }
Property Value
- double
Remarks
Higher values provide more noise reduction but may introduce artifacts. Start with 0.5-0.7 for natural-sounding results.
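The interface does not specify how the strength value is applied internally. One common interpretation, shown here purely as an assumption, is a linear mix between the untouched input and the fully enhanced signal. The helper name `apply_strength` is hypothetical and written in plain Python for illustration.

```python
def apply_strength(original, fully_enhanced, strength):
    """Blend original and enhanced audio (assumed dry/wet interpretation).

    strength = 0.0 returns the original samples unchanged;
    strength = 1.0 returns the fully enhanced samples.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return [
        (1.0 - strength) * dry + strength * wet
        for dry, wet in zip(original, fully_enhanced)
    ]


# Halfway between the noisy input and a (toy) fully silenced output.
blended = apply_strength([1.0, 2.0], [0.0, 0.0], 0.5)
```

Under this assumption, intermediate values trade residual noise against processing artifacts, which is consistent with the advice to start around 0.5-0.7.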
LatencySamples
Gets the processing latency in samples.
int LatencySamples { get; }
Property Value
- int
Remarks
This matters for real-time applications: lower latency means a faster response, but potentially lower-quality enhancement.
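Because the latency is reported in samples, converting it to time requires the enhancer's sample rate. The sketch below shows the arithmetic in plain Python. The function name `latency_ms` and the example values (512 samples at 16000 Hz) are illustrative assumptions, not values taken from any particular enhancer.

```python
def latency_ms(latency_samples, sample_rate):
    """Convert a latency in samples to milliseconds."""
    return 1000.0 * latency_samples / sample_rate


# Example: 512 samples of latency at a 16 kHz sample rate.
delay = latency_ms(512, 16000)
print(delay)  # 32.0
```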
NumChannels
Gets the number of audio channels supported.
int NumChannels { get; }
Property Value
- int
SampleRate
Gets the sample rate this enhancer operates at.
int SampleRate { get; }
Property Value
- int
Methods
Enhance(Tensor<T>)
Enhances audio quality by reducing noise and artifacts.
Tensor<T> Enhance(Tensor<T> audio)
Parameters
audio (Tensor<T>): Input audio tensor with shape [channels, samples] or [samples].
Returns
- Tensor<T>
Enhanced audio tensor with the same shape as input.
EnhanceWithReference(Tensor<T>, Tensor<T>)
Enhances audio with a reference signal for echo cancellation.
Tensor<T> EnhanceWithReference(Tensor<T> audio, Tensor<T> reference)
Parameters
audio (Tensor<T>): Input audio (microphone signal).
reference (Tensor<T>): Reference audio (speaker playback signal).
Returns
- Tensor<T>
Enhanced audio with echo removed.
Remarks
For Beginners: This is for video calls!
The problem: Your microphone picks up sound from your speakers, creating an echo for the other person.
Solution: We know what's playing from the speakers (reference), so we can subtract it from what the microphone picks up.
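In practice the echo reaching the microphone is a delayed and scaled copy of the reference, so it has to be estimated before it can be subtracted. The sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a standard technique for this problem. It is written in plain Python for illustration only: the interface does not say which algorithm an implementation uses, and the function name `cancel_echo` and its parameters are assumptions.

```python
import random


def cancel_echo(mic, reference, taps=4, step=0.5, eps=1e-8):
    """Remove the echo of `reference` from `mic` with an NLMS filter."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    output = []
    for mic_sample, ref_sample in zip(mic, reference):
        history = [ref_sample] + history[:-1]
        # Predict the echo from recent speaker playback.
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        # What remains after subtraction is the echo-free signal estimate.
        error = mic_sample - echo_estimate
        output.append(error)
        # NLMS update: step size normalized by the reference energy.
        norm = sum(x * x for x in history) + eps
        weights = [w + step * error * x / norm
                   for w, x in zip(weights, history)]
    return output


# Toy scenario: the microphone hears only the speaker signal,
# delayed by one sample and attenuated to 60%.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(500)]
mic = [0.0] + [0.6 * r for r in reference[:-1]]
residual = cancel_echo(mic, reference)
```

Once the filter has adapted, the residual in this toy scenario is close to silence. With a real talker present, the talker's voice would remain in the output while the echo is removed.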
EstimateNoiseProfile(Tensor<T>)
Estimates the noise profile from a segment of audio.
void EstimateNoiseProfile(Tensor<T> noiseOnlyAudio)
Parameters
noiseOnlyAudio (Tensor<T>): Audio containing only noise (no signal).
Remarks
For Beginners: Some enhancers work better if you tell them what the noise sounds like. Record a few seconds of "silence" (just the background noise) and pass it here.
ProcessChunk(Tensor<T>)
Processes audio in real-time streaming mode.
Tensor<T> ProcessChunk(Tensor<T> audioChunk)
Parameters
audioChunk (Tensor<T>): A small chunk of audio for real-time processing.
Returns
- Tensor<T>
Enhanced audio chunk. The output may lag the input by the enhancer's processing latency (see LatencySamples).
Remarks
Intended for real-time applications such as video calls. The enhancer maintains internal state between calls so that consecutive chunks join without discontinuities.
ResetState()
Resets internal state for streaming mode.
void ResetState()
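The relationship between chunked processing and resetting state can be illustrated with a toy stateful processor. The sketch below is plain Python, not an AiDotNet implementation: the class `StreamingSmoother` and its one-pole low-pass filter are hypothetical stand-ins for a real enhancer, chosen because the carried-over state is a single number.

```python
class StreamingSmoother:
    """Hypothetical stateful processor: a one-pole low-pass filter."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.previous_output = 0.0  # state carried between chunks

    def process_chunk(self, chunk):
        out = []
        for x in chunk:
            self.previous_output = (
                self.alpha * x + (1.0 - self.alpha) * self.previous_output
            )
            out.append(self.previous_output)
        return out

    def reset_state(self):
        """Forget the previous stream before starting a new one."""
        self.previous_output = 0.0


signal = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0]

# Processing the whole signal at once...
whole = StreamingSmoother().process_chunk(signal)

# ...matches processing it in two chunks, because state is kept in between.
stream = StreamingSmoother()
chunked = stream.process_chunk(signal[:3]) + stream.process_chunk(signal[3:])

# Without a reset, leftover state from the old stream leaks into the new one.
leaked = stream.process_chunk(signal)

# After a reset, the same instance behaves like a fresh one.
stream.reset_state()
restarted = stream.process_chunk(signal)
```

The same pattern applies to a real enhancer: keep one instance per stream, feed it chunks in order, and reset it when an unrelated stream begins.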