Class AudioEnhancerBase<T>

Namespace: AiDotNet.Audio.Enhancement

Assembly: AiDotNet.dll

Base class for algorithmic (non-neural-network) audio enhancers.

public abstract class AudioEnhancerBase<T> : IAudioEnhancer<T>

Type Parameters

T: The numeric type used for calculations.

Inheritance: object → AudioEnhancerBase<T>

Implements: IAudioEnhancer<T>

Derived: SpectralSubtractionEnhancer<T>

Inherited Members: object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

Provides functionality common to all audio enhancers, including:

Frame-based processing with overlap-add
Streaming mode with state management
STFT-based analysis/synthesis

Constructors

AudioEnhancerBase(int, int, int, int, double)

Initializes a new instance of the AudioEnhancerBase<T> class.

protected AudioEnhancerBase(int sampleRate = 16000, int numChannels = 1, int fftSize = 512, int hopSize = 128, double enhancementStrength = 0.7)

Parameters

sampleRate int: Audio sample rate in Hz.
numChannels int: Number of audio channels.
fftSize int: FFT size for spectral analysis.
hopSize int: Hop size between frames.
enhancementStrength double: Enhancement strength, from 0.0 (no enhancement) to 1.0 (maximum).
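The default arguments imply a particular frame duration, hop duration, and overlap. The sketch below works them out. FrameTiming and its methods are hypothetical helpers written for this illustration; they are not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type): timing figures implied by
// sampleRate, fftSize and hopSize.
public static class FrameTiming
{
    // Duration of `samples` samples at `sampleRate` Hz, in milliseconds.
    public static double ToMilliseconds(int samples, int sampleRate)
        => 1000.0 * samples / sampleRate;

    // Fraction of each frame that is shared with the following frame.
    public static double OverlapFraction(int fftSize, int hopSize)
        => 1.0 - (double)hopSize / fftSize;

    // Number of complete frames that fit in `totalSamples` samples.
    public static int FrameCount(int totalSamples, int fftSize, int hopSize)
        => totalSamples < fftSize ? 0 : 1 + (totalSamples - fftSize) / hopSize;
}
```

With the defaults (16000 Hz, fftSize 512, hopSize 128), each frame spans 32 ms, frames advance by 8 ms, consecutive frames overlap by 75%, and one second of audio holds 122 complete frames.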

Fields

NumOps

Numeric operations for type T.

protected readonly INumericOperations<T> NumOps

Field Value

INumericOperations<T>

_bufferPosition

Current position in input buffer.

protected int _bufferPosition

Field Value

int

_fftSize

FFT size for spectral analysis.

protected readonly int _fftSize

Field Value

int

_hopSize

Hop size between frames.

protected readonly int _hopSize

Field Value

int

_inputBuffer

Input buffer for streaming mode.

protected T[]? _inputBuffer

Field Value

T[]

_noiseProfile

Estimated noise profile for spectral subtraction.

protected T[]? _noiseProfile

Field Value

T[]

_outputBuffer

Output buffer for overlap-add.

protected T[]? _outputBuffer

Field Value

T[]

_window

Window function coefficients.

protected readonly T[] _window

Field Value

T[]

Properties

EnhancementStrength

Gets or sets the enhancement strength (0.0 = no enhancement, 1.0 = maximum).

public double EnhancementStrength { get; set; }

Property Value

double

Remarks

Higher values provide more noise reduction but may introduce artifacts. Start with 0.5-0.7 for natural-sounding results.

LatencySamples

Gets the processing latency in samples.

public int LatencySamples { get; }

Property Value

int

Remarks

Important for real-time applications. Lower latency means faster response but potentially lower quality enhancement.

NumChannels

Gets the number of audio channels supported.

public int NumChannels { get; protected set; }

Property Value

int

SampleRate

Gets the audio sample rate in Hz.

public int SampleRate { get; protected set; }

Property Value

int

Methods

ComputeFFT(T[])

Computes the FFT of an audio frame using the FftSharp library (an O(N log N) algorithm).

protected virtual (T[] Magnitudes, T[] Phases) ComputeFFT(T[] frame)

Parameters

frame T[]: The audio frame to transform.

Returns

(T[] Magnitudes, T[] Phases): The magnitude and phase spectra of the frame.

ComputeIFFT(T[], T[])

Computes the inverse FFT using the FftSharp library.

protected virtual T[] ComputeIFFT(T[] magnitudes, T[] phases)

Parameters

magnitudes T[]: Magnitude spectrum.
phases T[]: Phase spectrum.

Returns

T[]
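ComputeFFT and ComputeIFFT exchange data as separate magnitude and phase arrays. The sketch below illustrates that polar representation and its round trip. It is not the library's implementation: it uses a naive O(N²) DFT over double instead of FftSharp, and DftSketch is a hypothetical name. It also returns all N bins, which may differ from the number of bins the library returns.

```csharp
using System;
using System.Numerics;

// Hypothetical illustration (not AiDotNet code): magnitude/phase analysis
// and synthesis with a naive O(N^2) DFT.
public static class DftSketch
{
    // Forward transform: one magnitude and one phase per frequency bin.
    public static (double[] Magnitudes, double[] Phases) Analyze(double[] frame)
    {
        int n = frame.Length;
        var magnitudes = new double[n];
        var phases = new double[n];
        for (int k = 0; k < n; k++)
        {
            Complex sum = Complex.Zero;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                sum += frame[t] * Complex.FromPolarCoordinates(1.0, angle);
            }
            magnitudes[k] = sum.Magnitude;
            phases[k] = sum.Phase;
        }
        return (magnitudes, phases);
    }

    // Inverse transform: rebuild each bin from polar form, keep the real part.
    public static double[] Synthesize(double[] magnitudes, double[] phases)
    {
        int n = magnitudes.Length;
        var output = new double[n];
        for (int t = 0; t < n; t++)
        {
            Complex sum = Complex.Zero;
            for (int k = 0; k < n; k++)
            {
                double angle = 2.0 * Math.PI * k * t / n;
                sum += Complex.FromPolarCoordinates(magnitudes[k], phases[k] + angle);
            }
            output[t] = sum.Real / n;
        }
        return output;
    }
}
```

Keeping the phases untouched while modifying only the magnitudes is what lets an enhancer change the spectrum and still resynthesize a time-domain frame.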

CreateHannWindow(int)

Creates a Hann window of the specified size.

protected T[] CreateHannWindow(int size)

Parameters

size int: Window length in samples.

Returns

T[]
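For reference, a Hann window can be computed as in the sketch below. This assumes the periodic variant, w[n] = 0.5 · (1 − cos(2πn / size)); the library may instead use the symmetric variant, which divides by size − 1. HannSketch is a hypothetical name and the code works on double rather than on T.

```csharp
using System;

// Hypothetical illustration (not AiDotNet code): periodic Hann window.
public static class HannSketch
{
    public static double[] Create(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var window = new double[size];
        for (int n = 0; n < size; n++)
        {
            // Rises from 0 at n = 0 to 1 at n = size / 2, then falls again.
            window[n] = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * n / size));
        }
        return window;
    }
}
```

The periodic form is a common choice for overlap-add processing because shifted copies of the window sum to a constant when the hop size is a quarter of the window length.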

Enhance(Tensor<T>)

Enhances audio quality by reducing noise and artifacts.

public virtual Tensor<T> Enhance(Tensor<T> audio)

Parameters

audio Tensor<T>: Input audio tensor with shape [channels, samples] or [samples].

Returns

Tensor<T>: Enhanced audio tensor with the same shape as input.

EnhanceWithReference(Tensor<T>, Tensor<T>)

Enhances audio with a reference signal for echo cancellation.

public virtual Tensor<T> EnhanceWithReference(Tensor<T> audio, Tensor<T> reference)

Parameters

audio Tensor<T>: Input audio (microphone signal).
reference Tensor<T>: Reference audio (speaker playback signal).

Returns

Tensor<T>: Enhanced audio with echo removed.

Remarks

For Beginners: This is for video calls!

The problem: Your microphone picks up sound from your speakers, creating an echo for the other person.

The solution: The enhancer knows what is playing from the speakers (the reference), so it can remove that signal from what the microphone picks up.
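In practice the echo reaches the microphone delayed and attenuated, so the reference cannot simply be subtracted sample by sample. One common technique is a normalized LMS (NLMS) adaptive filter, sketched below. This is an illustration of the general idea only; the source does not state which algorithm EnhanceWithReference uses. EchoCancelSketch and the parameters taps, mu and eps are assumptions made for this example.

```csharp
using System;

// Hypothetical illustration (not AiDotNet code): NLMS echo cancellation.
public static class EchoCancelSketch
{
    // Returns mic minus the adaptively estimated echo of `reference`.
    public static double[] Nlms(double[] mic, double[] reference,
                                int taps = 32, double mu = 0.5, double eps = 1e-8)
    {
        if (mic.Length != reference.Length)
            throw new ArgumentException("mic and reference must have equal length");

        var weights = new double[taps];      // estimated echo path
        var output = new double[mic.Length];

        for (int n = 0; n < mic.Length; n++)
        {
            // Predict the echo from the most recent reference samples.
            double estimate = 0.0;
            double energy = eps;             // eps avoids division by zero
            for (int i = 0; i < taps && i <= n; i++)
            {
                double x = reference[n - i];
                estimate += weights[i] * x;
                energy += x * x;
            }

            // What remains after removing the predicted echo.
            double error = mic[n] - estimate;
            output[n] = error;

            // Normalized gradient step toward a better echo-path estimate.
            for (int i = 0; i < taps && i <= n; i++)
                weights[i] += mu * error * reference[n - i] / energy;
        }
        return output;
    }
}
```

The filter needs some audio to converge, so the first samples of the output still contain echo.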

EstimateNoiseProfile(Tensor<T>)

Estimates the noise profile from a segment of audio.

public virtual void EstimateNoiseProfile(Tensor<T> noiseOnlyAudio)

Parameters

noiseOnlyAudio Tensor<T>: Audio containing only noise (no signal).

Remarks

For Beginners: Some enhancers work better if you tell them what the noise sounds like. Record a few seconds of "silence" (just the background noise) and pass it here.

EstimateNoiseSpectrum(T[])

Estimates noise spectrum from noise-only audio.

protected T[] EstimateNoiseSpectrum(T[] noiseAudio)

Parameters

noiseAudio T[]: Noise-only audio samples.

Returns

T[]
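A common way to estimate a noise spectrum is to average the magnitude spectra of several noise-only frames, bin by bin. The sketch below shows that averaging step under the assumption that per-frame magnitudes are already available; whether the library averages in exactly this way is not stated in the source. NoiseProfileSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration (not AiDotNet code): per-bin mean magnitude.
public static class NoiseProfileSketch
{
    public static double[] AverageMagnitudes(IReadOnlyList<double[]> frameMagnitudes)
    {
        if (frameMagnitudes.Count == 0)
            throw new ArgumentException("at least one frame is required");

        int bins = frameMagnitudes[0].Length;
        var mean = new double[bins];

        foreach (double[] frame in frameMagnitudes)
        {
            if (frame.Length != bins)
                throw new ArgumentException("all frames must have the same number of bins");
            for (int k = 0; k < bins; k++)
                mean[k] += frame[k];
        }

        for (int k = 0; k < bins; k++)
            mean[k] /= frameMagnitudes.Count;

        return mean;
    }
}
```

Averaging over more frames gives a smoother estimate, which is why a few seconds of background noise works better than a single frame.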

ProcessChunk(Tensor<T>)

Processes audio in real-time streaming mode.

public virtual Tensor<T> ProcessChunk(Tensor<T> audioChunk)

Parameters

audioChunk Tensor<T>: A small chunk of audio for real-time processing.

Returns

Tensor<T>: Enhanced audio chunk (may have latency).

Remarks

For real-time applications like video calls. The enhancer maintains internal state between calls for continuity.
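The state kept between calls typically includes a buffer of samples that have arrived but do not yet fill a complete block. The sketch below shows one minimal way to manage such a buffer. It is loosely modelled on the _inputBuffer and _bufferPosition fields, but ChunkBufferSketch and its members are hypothetical and do not reflect the library's actual logic.

```csharp
using System;

// Hypothetical illustration (not AiDotNet code): accumulate incoming
// samples and hand out fixed-size blocks as soon as they are complete.
public sealed class ChunkBufferSketch
{
    private readonly double[] _buffer;
    private int _position;

    public ChunkBufferSketch(int blockSize)
    {
        if (blockSize <= 0) throw new ArgumentOutOfRangeException(nameof(blockSize));
        _buffer = new double[blockSize];
    }

    // Samples waiting for the current block to fill up.
    public int Pending => _position;

    // Appends a chunk; calls onBlock once per completed block.
    // Returns the number of blocks emitted by this call.
    public int Push(double[] chunk, Action<double[]> onBlock)
    {
        int emitted = 0;
        foreach (double sample in chunk)
        {
            _buffer[_position++] = sample;
            if (_position == _buffer.Length)
            {
                onBlock((double[])_buffer.Clone());
                _position = 0;
                emitted++;
            }
        }
        return emitted;
    }

    // Discards buffered samples, for example when a new stream starts.
    public void Reset()
    {
        Array.Clear(_buffer, 0, _buffer.Length);
        _position = 0;
    }
}
```

Because output can only be produced once a block is complete, chunks smaller than the block size add latency.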

ProcessOverlapAdd(T[])

Processes audio using the overlap-add method.

protected T[] ProcessOverlapAdd(T[] input)

Parameters

input T[]: The audio samples to process.

Returns

T[]
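Overlap-add splits the signal into overlapping windowed frames, processes each frame, and sums the results back at their original positions. The sketch below shows the general technique with a periodic Hann analysis window and normalization by the accumulated window weight. It is not the library's implementation: OverlapAddSketch and the processFrame callback are hypothetical, and a real enhancer would process each frame in the frequency domain and may also apply a synthesis window.

```csharp
using System;

// Hypothetical illustration (not AiDotNet code): windowed overlap-add.
public static class OverlapAddSketch
{
    public static double[] Process(double[] input, int frameSize, int hopSize,
                                   Func<double[], double[]> processFrame)
    {
        // Periodic Hann analysis window.
        var window = new double[frameSize];
        for (int i = 0; i < frameSize; i++)
            window[i] = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / frameSize));

        var output = new double[input.Length];
        var weight = new double[input.Length];

        for (int start = 0; start + frameSize <= input.Length; start += hopSize)
        {
            // Cut out and window one frame.
            var frame = new double[frameSize];
            for (int i = 0; i < frameSize; i++)
                frame[i] = input[start + i] * window[i];

            double[] processed = processFrame(frame);

            // Add the processed frame back and track the window weight.
            for (int i = 0; i < frameSize; i++)
            {
                output[start + i] += processed[i];
                weight[start + i] += window[i];
            }
        }

        // Undo the window weighting wherever at least one frame contributed.
        for (int i = 0; i < output.Length; i++)
            if (weight[i] > 1e-12) output[i] /= weight[i];

        return output;
    }
}
```

With an identity processFrame, the output matches the input at every sample covered by a nonzero window weight, which is a convenient sanity check before plugging in real processing.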

ProcessSpectralFrame(T[], T[])

Processes a single spectral frame.

protected abstract T[] ProcessSpectralFrame(T[] magnitudes, T[] phases)

Parameters

magnitudes T[]: Magnitude spectrum.
phases T[]: Phase spectrum.

Returns

T[]: Enhanced magnitude spectrum.
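As an illustration of what an override might compute, the sketch below applies basic magnitude spectral subtraction: subtract a scaled noise profile from each bin and clamp the result to a small floor. This is an assumption about the general technique, not the code of SpectralSubtractionEnhancer<T>. The class name, the strength scaling and the floorFraction parameter are hypothetical.

```csharp
using System;

// Hypothetical illustration (not AiDotNet code): magnitude spectral subtraction.
public static class SpectralSubtractionSketch
{
    public static double[] Subtract(double[] magnitudes, double[] noiseProfile,
                                    double strength, double floorFraction = 0.05)
    {
        if (magnitudes.Length != noiseProfile.Length)
            throw new ArgumentException("spectrum and noise profile must have equal length");

        var result = new double[magnitudes.Length];
        for (int k = 0; k < magnitudes.Length; k++)
        {
            // Remove the expected noise contribution from this bin.
            double reduced = magnitudes[k] - strength * noiseProfile[k];

            // Keep a fraction of the original magnitude so bins never go
            // negative; this also limits "musical noise" artifacts.
            double floor = floorFraction * magnitudes[k];

            result[k] = Math.Max(reduced, floor);
        }
        return result;
    }
}
```

With strength 0.0 the spectrum passes through unchanged; larger values remove more noise and push more bins down to the floor, which is where audible artifacts tend to come from.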

ProcessStreamingChunk(T[])

Processes a streaming chunk of audio.

protected T[] ProcessStreamingChunk(T[] chunk)

Parameters

chunk T[]: The audio samples in the incoming chunk.

Returns

T[]

ResetState()

Resets internal state for streaming mode.

public virtual void ResetState()
