Interface IAudioEnhancer<T>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Defines the contract for audio enhancement models that improve audio quality.

public interface IAudioEnhancer<T>

Type Parameters

T

The numeric type used for calculations.

Remarks

Audio enhancement encompasses various techniques to improve audio quality:

  • Noise Reduction: Remove background noise while preserving speech/music
  • Speech Enhancement: Improve speech intelligibility and quality
  • Dereverberation: Remove room echo and reverb artifacts
  • Echo Cancellation: Remove acoustic echo in communication systems
  • Bandwidth Extension: Extend frequency range of narrowband audio

For Beginners: Audio enhancement is like photo editing for sound!

Common use cases:

  • Cleaning up podcast recordings (removing AC hum, keyboard clicks)
  • Improving phone call quality (reducing background noise)
  • Restoring old recordings (removing tape hiss, crackle)
  • Video conferencing (echo cancellation, noise suppression)
  • Hearing aids (speech enhancement in noisy environments)

How it works (simplified):

  1. Analyze the audio to identify "noise" vs "signal"
  2. Create a filter that reduces noise while keeping the signal
  3. Apply the filter to produce cleaner audio

Modern approaches use neural networks that learn what clean audio should sound like. They often outperform traditional signal-processing methods, particularly when the noise changes over time.
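The three steps above can be sketched with a deliberately simple, time-domain example. This is an illustration only, not how any AiDotNet enhancer is implemented: the class and method names (`SimpleNoiseGate`, `EstimateNoisePower`, `Enhance`) are hypothetical, and it applies one Wiener-style gain per frame, whereas practical enhancers usually work per frequency band or use a learned model.

```csharp
using System;

// Hypothetical illustration class; not part of AiDotNet.
public static class SimpleNoiseGate
{
    // Step 1: analyze a noise-only segment to estimate the average noise power.
    public static double EstimateNoisePower(double[] noiseOnly)
    {
        double sum = 0.0;
        foreach (double x in noiseOnly) sum += x * x;
        return noiseOnly.Length == 0 ? 0.0 : sum / noiseOnly.Length;
    }

    // Steps 2 and 3: build a per-frame gain (the "filter") and apply it.
    public static double[] Enhance(double[] audio, double noisePower, int frameSize)
    {
        if (frameSize <= 0) throw new ArgumentOutOfRangeException(nameof(frameSize));

        var output = new double[audio.Length];
        for (int start = 0; start < audio.Length; start += frameSize)
        {
            int end = Math.Min(start + frameSize, audio.Length);

            // Average power of this frame.
            double framePower = 0.0;
            for (int i = start; i < end; i++) framePower += audio[i] * audio[i];
            framePower /= (end - start);

            // Wiener-style gain: close to 0 when the frame is mostly noise,
            // close to 1 when the signal dominates.
            double gain = framePower > 0.0
                ? Math.Max(0.0, 1.0 - noisePower / framePower)
                : 0.0;

            for (int i = start; i < end; i++) output[i] = gain * audio[i];
        }
        return output;
    }
}
```

Frames whose power is at or below the estimated noise power are silenced, while louder frames pass through almost unchanged.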

Properties

EnhancementStrength

Gets or sets the enhancement strength (0.0 = no enhancement, 1.0 = maximum).

double EnhancementStrength { get; set; }

Property Value

double

Remarks

Higher values remove more noise but may introduce audible artifacts. Values of 0.5 to 0.7 are a good starting point for natural-sounding results.

LatencySamples

Gets the processing latency in samples.

int LatencySamples { get; }

Property Value

int

Remarks

Latency matters for real-time applications. Lower latency gives a faster response but may reduce enhancement quality, because the enhancer has less surrounding audio to work with.
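Because the latency is reported in samples, it is often convenient to convert it to milliseconds using SampleRate. A minimal sketch follows; the helper name `LatencyMath.LatencyMilliseconds` is hypothetical and not part of AiDotNet.

```csharp
using System;

// Hypothetical helper; not part of AiDotNet.
public static class LatencyMath
{
    // Converts a latency expressed in samples to milliseconds.
    public static double LatencyMilliseconds(int latencySamples, int sampleRate)
    {
        if (sampleRate <= 0) throw new ArgumentOutOfRangeException(nameof(sampleRate));
        return latencySamples * 1000.0 / sampleRate;
    }
}
```

With an enhancer instance this could be called as `LatencyMath.LatencyMilliseconds(enhancer.LatencySamples, enhancer.SampleRate)`. For example, 480 samples at 48000 Hz corresponds to 10 ms.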

NumChannels

Gets the number of audio channels supported.

int NumChannels { get; }

Property Value

int

SampleRate

Gets the sample rate this enhancer operates at.

int SampleRate { get; }

Property Value

int

Methods

Enhance(Tensor<T>)

Enhances audio quality by reducing noise and artifacts.

Tensor<T> Enhance(Tensor<T> audio)

Parameters

audio Tensor<T>

Input audio tensor with shape [channels, samples] or [samples].

Returns

Tensor<T>

Enhanced audio tensor with the same shape as input.
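A typical call might look like the sketch below. It uses only members documented on this page. The helper `EnhanceExample.Denoise` is hypothetical, and the `AiDotNet.LinearAlgebra` namespace for `Tensor<T>` is an assumption that may differ between AiDotNet versions.

```csharp
using System;
using AiDotNet.Interfaces;
using AiDotNet.LinearAlgebra; // Assumption: namespace providing Tensor<T>; adjust for your AiDotNet version.

// Hypothetical helper; not part of AiDotNet.
public static class EnhanceExample
{
    public static Tensor<float> Denoise(
        IAudioEnhancer<float> enhancer,
        Tensor<float> audio,
        double strength = 0.6)
    {
        if (enhancer is null) throw new ArgumentNullException(nameof(enhancer));
        if (audio is null) throw new ArgumentNullException(nameof(audio));

        // Keep the strength inside the documented 0.0 to 1.0 range.
        enhancer.EnhancementStrength = Math.Max(0.0, Math.Min(1.0, strength));

        // The result has the same shape as the input: [channels, samples] or [samples].
        return enhancer.Enhance(audio);
    }
}
```

The caller is expected to supply audio at the enhancer's SampleRate and with a channel count it supports (see NumChannels).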

EnhanceWithReference(Tensor<T>, Tensor<T>)

Enhances audio with a reference signal for echo cancellation.

Tensor<T> EnhanceWithReference(Tensor<T> audio, Tensor<T> reference)

Parameters

audio Tensor<T>

Input audio (microphone signal).

reference Tensor<T>

Reference audio (speaker playback signal).

Returns

Tensor<T>

Enhanced audio with echo removed.

Remarks

For Beginners: This is for video calls!

The problem: Your microphone picks up sound from your speakers, creating an echo for the other person.

The solution: We know what's playing from the speakers (the reference), so the enhancer can estimate how that sound reaches the microphone and subtract it from what the microphone picks up.
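The idea of subtracting the reference can be illustrated with a classic adaptive filter. The sketch below uses normalized least mean squares (NLMS); it is not necessarily what any AiDotNet implementation does, and the names `SimpleEchoCanceller` and `CancelEcho`, along with the `taps` and `mu` defaults, are hypothetical.

```csharp
using System;

// Hypothetical illustration class; not part of AiDotNet.
public static class SimpleEchoCanceller
{
    // mic:       microphone signal (near-end sound plus echo of the reference)
    // reference: speaker playback signal
    // taps:      length of the adaptive filter that models the echo path
    // mu:        NLMS step size (0 < mu < 2)
    public static double[] CancelEcho(double[] mic, double[] reference, int taps = 32, double mu = 0.5)
    {
        var weights = new double[taps];
        var output = new double[mic.Length];
        const double epsilon = 1e-8; // avoids division by zero on silent input

        for (int n = 0; n < mic.Length; n++)
        {
            // Estimate the echo from the most recent reference samples.
            double echoEstimate = 0.0;
            double energy = epsilon;
            for (int k = 0; k < taps; k++)
            {
                int idx = n - k;
                double x = (idx >= 0 && idx < reference.Length) ? reference[idx] : 0.0;
                echoEstimate += weights[k] * x;
                energy += x * x;
            }

            // Subtract the estimated echo from the microphone signal.
            double error = mic[n] - echoEstimate;
            output[n] = error;

            // Adapt the filter so that the next estimate is closer.
            for (int k = 0; k < taps; k++)
            {
                int idx = n - k;
                double x = (idx >= 0 && idx < reference.Length) ? reference[idx] : 0.0;
                weights[k] += mu * error * x / energy;
            }
        }
        return output;
    }
}
```

With an IAudioEnhancer<T> implementation, the corresponding call is `enhancer.EnhanceWithReference(audio, reference)`, passing the microphone signal first and the speaker playback second.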

EstimateNoiseProfile(Tensor<T>)

Estimates the noise profile from a segment of audio.

void EstimateNoiseProfile(Tensor<T> noiseOnlyAudio)

Parameters

noiseOnlyAudio Tensor<T>

Audio containing only noise (no signal).

Remarks

For Beginners: Some enhancers work better if you tell them what the noise sounds like. Record a few seconds of "silence" (just the background noise) and pass it here.
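A possible calling pattern is sketched below. It assumes that the estimated profile is kept on the enhancer instance and used by later Enhance calls, which this page implies but does not state outright. The helper `NoiseProfileExample.DenoiseWithProfile` is hypothetical, and the `AiDotNet.LinearAlgebra` namespace for `Tensor<T>` is an assumption.

```csharp
using System;
using AiDotNet.Interfaces;
using AiDotNet.LinearAlgebra; // Assumption: namespace providing Tensor<T>; adjust for your AiDotNet version.

// Hypothetical helper; not part of AiDotNet.
public static class NoiseProfileExample
{
    public static Tensor<float> DenoiseWithProfile(
        IAudioEnhancer<float> enhancer,
        Tensor<float> noiseOnlyAudio,
        Tensor<float> recording)
    {
        if (enhancer is null) throw new ArgumentNullException(nameof(enhancer));

        // First, let the enhancer learn what the background noise sounds like.
        enhancer.EstimateNoiseProfile(noiseOnlyAudio);

        // Then enhance the actual recording.
        return enhancer.Enhance(recording);
    }
}
```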

ProcessChunk(Tensor<T>)

Processes audio in real-time streaming mode.

Tensor<T> ProcessChunk(Tensor<T> audioChunk)

Parameters

audioChunk Tensor<T>

A small chunk of audio for real-time processing.

Returns

Tensor<T>

Enhanced audio chunk. The output may lag behind the input (see LatencySamples).

Remarks

Intended for real-time applications such as video calls. The enhancer maintains internal state between calls so that consecutive chunks are processed as one continuous stream.

ResetState()

Resets internal state for streaming mode.

void ResetState()
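ProcessChunk and ResetState are typically used together. The sketch below resets the state once at the start of a stream and then feeds chunks in order. The helper `StreamingExample.EnhanceStream` is hypothetical, and the `AiDotNet.LinearAlgebra` namespace for `Tensor<T>` is an assumption.

```csharp
using System.Collections.Generic;
using AiDotNet.Interfaces;
using AiDotNet.LinearAlgebra; // Assumption: namespace providing Tensor<T>; adjust for your AiDotNet version.

// Hypothetical helper; not part of AiDotNet.
public static class StreamingExample
{
    // Lazily enhances a sequence of chunks. Because this is an iterator method,
    // nothing runs (including ResetState) until the result is enumerated.
    public static IEnumerable<Tensor<float>> EnhanceStream(
        IAudioEnhancer<float> enhancer,
        IEnumerable<Tensor<float>> chunks)
    {
        // Start from a clean state so that audio from a previous stream
        // does not influence this one.
        enhancer.ResetState();

        foreach (Tensor<float> chunk in chunks)
        {
            // Chunks must be passed in order; the enhancer carries state between calls.
            yield return enhancer.ProcessChunk(chunk);
        }
    }
}
```

A single enhancer instance should serve one stream at a time, since its internal state is shared across ProcessChunk calls.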