Class AudioHelper<T>

Namespace
AiDotNet.Helpers
Assembly
AiDotNet.dll

Helper class for loading and saving audio as tensors.

public static class AudioHelper<T>

Type Parameters

T

The numeric type for tensor values.

Inheritance
AudioHelper<T>

Remarks

Supports common audio formats without external dependencies:

- WAV: Uncompressed PCM audio (most common for ML)
- RAW: Raw PCM samples with specified parameters

For Beginners: This class converts audio files into tensors that neural networks can consume. Audio is loaded as a [channels, samples] or [batch, channels, samples] tensor, and values are normalized to the [-1, 1] range by default.
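To make the [channels, samples] layout concrete, here is a minimal standalone sketch of how interleaved 16-bit PCM could be rearranged and scaled. It does not use AiDotNet: `PcmLayoutSketch` and `Deinterleave` are hypothetical names, a plain `float[,]` stands in for `Tensor<T>`, and the divisor 32768 is an assumed convention, since the scaling used by `AudioHelper<T>` is not documented here.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class PcmLayoutSketch
{
    // Rearranges interleaved 16-bit PCM (ch0, ch1, ch0, ch1, ...) into a
    // [channels, samples] array and scales each value into [-1, 1).
    public static float[,] Deinterleave(short[] interleaved, int channels)
    {
        if (channels <= 0)
            throw new ArgumentOutOfRangeException(nameof(channels));
        if (interleaved.Length % channels != 0)
            throw new ArgumentException("Sample count must be a multiple of the channel count.", nameof(interleaved));

        int samples = interleaved.Length / channels;
        var result = new float[channels, samples];
        for (int s = 0; s < samples; s++)
        {
            for (int c = 0; c < channels; c++)
            {
                // Assumed convention: dividing by 32768 maps short.MinValue to -1
                // and short.MaxValue to just under 1.
                result[c, s] = interleaved[s * channels + c] / 32768f;
            }
        }
        return result;
    }
}
```

For a stereo input of four values, the result is a 2 x 2 array whose first row holds the left channel and whose second row holds the right channel.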

Methods

LoadAudio(string, bool, int?)

Loads an audio file and returns it as a tensor with metadata.

public static AudioHelper<T>.AudioLoadResult LoadAudio(string filePath, bool normalize = true, int? targetSampleRate = null)

Parameters

filePath string

Path to the audio file.

normalize bool

Whether to normalize sample values to the [-1, 1] range.

targetSampleRate int?

Optional target sample rate for resampling.

Returns

AudioHelper<T>.AudioLoadResult

Audio tensor and metadata.

Exceptions

FileNotFoundException

If the file does not exist.

NotSupportedException

If the audio format is not supported.

LoadRaw(string, int, int, int, bool)

Loads raw PCM audio data.

public static AudioHelper<T>.AudioLoadResult LoadRaw(string filePath, int sampleRate, int channels = 1, int bitsPerSample = 16, bool normalize = true)

Parameters

filePath string

Path to the raw audio file.

sampleRate int

Sample rate in Hz.

channels int

Number of channels.

bitsPerSample int

Bits per sample (8, 16, 24, 32).

normalize bool

Whether to normalize to [-1, 1].

Returns

AudioHelper<T>.AudioLoadResult

Audio tensor and metadata.
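The bitsPerSample parameter determines how raw bytes are interpreted. The following standalone sketch shows one plausible decoding for each listed depth. It is an illustration under stated assumptions, not the library's implementation: `RawPcmSketch` and `Decode` are hypothetical names, samples are assumed little-endian, 8-bit data is assumed unsigned (the usual PCM convention), and 32-bit data is assumed to be signed integers rather than floats.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class RawPcmSketch
{
    // Decodes little-endian PCM bytes into floats in roughly [-1, 1].
    public static float[] Decode(byte[] data, int bitsPerSample)
    {
        if (bitsPerSample != 8 && bitsPerSample != 16 && bitsPerSample != 24 && bitsPerSample != 32)
            throw new NotSupportedException("Expected 8, 16, 24, or 32 bits per sample.");

        int bytesPerSample = bitsPerSample / 8;
        int count = data.Length / bytesPerSample;
        var samples = new float[count];

        for (int i = 0; i < count; i++)
        {
            int o = i * bytesPerSample;
            switch (bitsPerSample)
            {
                case 8:
                    // Assumption: 8-bit PCM is unsigned and centered at 128.
                    samples[i] = (data[o] - 128) / 128f;
                    break;
                case 16:
                    samples[i] = unchecked((short)(data[o] | (data[o + 1] << 8))) / 32768f;
                    break;
                case 24:
                    // Place the 3 bytes in the top of an int, then shift right
                    // arithmetically so the sign bit is extended.
                    int v24 = (data[o] << 8) | (data[o + 1] << 16) | (data[o + 2] << 24);
                    samples[i] = (v24 >> 8) / 8388608f;
                    break;
                default:
                    // Assumption: 32-bit samples are signed integers.
                    int v32 = data[o] | (data[o + 1] << 8) | (data[o + 2] << 16) | (data[o + 3] << 24);
                    samples[i] = v32 / 2147483648f;
                    break;
            }
        }
        return samples;
    }
}
```

Because a raw file carries no header, a wrong sampleRate, channels, or bitsPerSample value will not raise an error in a decoder like this one; it will simply produce distorted audio.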

LoadWav(string, bool)

Loads a WAV audio file.

public static AudioHelper<T>.AudioLoadResult LoadWav(string filePath, bool normalize = true)

Parameters

filePath string

Path to the WAV file.

normalize bool

Whether to normalize to [-1, 1].

Returns

AudioHelper<T>.AudioLoadResult

Audio tensor and metadata.
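For readers curious about what loading a WAV file involves, the sketch below walks the RIFF chunk structure and extracts the format fields and the raw sample bytes. It is a minimal illustration that assumes a seekable stream and uncompressed PCM (format tag 1); `WavHeaderSketch` and `Read` are hypothetical names, and the actual parsing in `LoadWav` may differ.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of AiDotNet.
public static class WavHeaderSketch
{
    // Reads the "fmt " and "data" chunks of a PCM WAV stream.
    // Requires a seekable stream because it uses Position, Length, and Seek.
    public static (int Channels, int SampleRate, int BitsPerSample, byte[] Data) Read(Stream stream)
    {
        using var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true);

        if (ReadTag(reader) != "RIFF") throw new InvalidDataException("Missing RIFF header.");
        reader.ReadUInt32(); // overall RIFF size, not needed here
        if (ReadTag(reader) != "WAVE") throw new InvalidDataException("Not a WAVE file.");

        int channels = 0, sampleRate = 0, bits = 0;
        while (stream.Position + 8 <= stream.Length)
        {
            string id = ReadTag(reader);
            int size = (int)reader.ReadUInt32();

            if (id == "fmt ")
            {
                if (size < 16) throw new InvalidDataException("fmt chunk too short.");
                if (reader.ReadUInt16() != 1)
                    throw new NotSupportedException("Only uncompressed PCM is handled in this sketch.");
                channels = reader.ReadUInt16();
                sampleRate = (int)reader.ReadUInt32();
                reader.ReadUInt32(); // byte rate
                reader.ReadUInt16(); // block align
                bits = reader.ReadUInt16();
                // Skip any extension bytes plus the pad byte of an odd-sized chunk.
                stream.Seek(size - 16 + (size & 1), SeekOrigin.Current);
            }
            else if (id == "data")
            {
                if (channels == 0) throw new InvalidDataException("data chunk appeared before fmt chunk.");
                return (channels, sampleRate, bits, reader.ReadBytes(size));
            }
            else
            {
                // Unknown chunk (for example LIST): skip it, honoring even-byte padding.
                stream.Seek(size + (size & 1), SeekOrigin.Current);
            }
        }
        throw new InvalidDataException("No data chunk found.");
    }

    private static string ReadTag(BinaryReader reader) =>
        Encoding.ASCII.GetString(reader.ReadBytes(4));
}
```

The returned bytes are still interleaved PCM; converting them to a normalized tensor is a separate step that depends on the reported bit depth and channel count.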

Resample(Tensor<T>, int, int)

Resamples audio to a different sample rate using linear interpolation.

public static Tensor<T> Resample(Tensor<T> audio, int sourceSampleRate, int targetSampleRate)

Parameters

audio Tensor<T>

Input audio tensor [1, channels, samples].

sourceSampleRate int

Original sample rate.

targetSampleRate int

Target sample rate.

Returns

Tensor<T>

Resampled audio tensor.
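Linear interpolation resampling can be sketched for a single channel as follows. This standalone example is not the library's code: `ResampleSketch` and `Linear` are hypothetical names, a `float[]` stands in for one channel of the tensor, and the rounding of the output length is an assumption.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ResampleSketch
{
    // Resamples one channel by linearly interpolating between neighboring samples.
    public static float[] Linear(float[] input, int sourceRate, int targetRate)
    {
        if (sourceRate <= 0 || targetRate <= 0)
            throw new ArgumentOutOfRangeException(nameof(sourceRate), "Sample rates must be positive.");
        if (input.Length == 0 || sourceRate == targetRate)
            return (float[])input.Clone();

        // Assumption: the output length is truncated rather than rounded.
        int outLength = (int)((long)input.Length * targetRate / sourceRate);
        var output = new float[outLength];
        double step = (double)sourceRate / targetRate;

        for (int i = 0; i < outLength; i++)
        {
            double pos = i * step;                        // fractional position in the input
            int i0 = Math.Min((int)pos, input.Length - 1);
            int i1 = Math.Min(i0 + 1, input.Length - 1);  // clamp at the final sample
            double frac = pos - i0;
            output[i] = (float)(input[i0] * (1 - frac) + input[i1] * frac);
        }
        return output;
    }
}
```

Linear interpolation is cheap but applies no low-pass filter, so downsampling content with energy above half the target rate can introduce aliasing.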

SaveWav(Tensor<T>, string, int, int, bool)

Saves an audio tensor as a WAV file.

public static void SaveWav(Tensor<T> audio, string filePath, int sampleRate, int bitsPerSample = 16, bool denormalize = true)

Parameters

audio Tensor<T>

Audio tensor [channels, samples] or [1, channels, samples].

filePath string

Output file path.

sampleRate int

Sample rate in Hz.

bitsPerSample int

Bits per sample (16 or 32).

denormalize bool

Whether to denormalize from [-1, 1].
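As an illustration of what denormalizing and writing a WAV file entails, the sketch below clamps values to [-1, 1], scales them to 16-bit integers, and prepends a canonical 44-byte PCM header. It is a standalone example under assumptions: `WavWriteSketch` and `Encode16` are hypothetical names, the input is already interleaved, and the scale factor 32767 is an assumed convention that may not match `SaveWav`.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of AiDotNet.
public static class WavWriteSketch
{
    // Encodes interleaved samples in [-1, 1] as a 16-bit PCM WAV byte array.
    public static byte[] Encode16(float[] interleaved, int sampleRate, int channels)
    {
        const int bitsPerSample = 16;
        int blockAlign = channels * bitsPerSample / 8;
        int dataSize = interleaved.Length * 2;

        using var ms = new MemoryStream();
        using var w = new BinaryWriter(ms, Encoding.ASCII);

        w.Write(Encoding.ASCII.GetBytes("RIFF"));
        w.Write(36 + dataSize);                 // bytes remaining after this field
        w.Write(Encoding.ASCII.GetBytes("WAVE"));

        w.Write(Encoding.ASCII.GetBytes("fmt "));
        w.Write(16);                            // fmt chunk size for plain PCM
        w.Write((short)1);                      // format tag 1 = uncompressed PCM
        w.Write((short)channels);
        w.Write(sampleRate);
        w.Write(sampleRate * blockAlign);       // byte rate
        w.Write((short)blockAlign);
        w.Write((short)bitsPerSample);

        w.Write(Encoding.ASCII.GetBytes("data"));
        w.Write(dataSize);
        foreach (float s in interleaved)
        {
            // Clamp first so out-of-range values saturate instead of wrapping.
            float clamped = Math.Max(-1f, Math.Min(1f, s));
            w.Write((short)Math.Round(clamped * 32767f));
        }

        w.Flush();
        return ms.ToArray();
    }
}
```

Clamping before the cast matters: without it, a value such as 2.0 would exceed the 16-bit range and could wrap to a large negative sample, which is audible as a click.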

ToMono(Tensor<T>)

Converts stereo audio to mono by averaging channels.

public static Tensor<T> ToMono(Tensor<T> audio)

Parameters

audio Tensor<T>

Stereo audio tensor [1, 2, samples].

Returns

Tensor<T>

Mono audio tensor [1, 1, samples].
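Averaging channels amounts to taking the mean across the channel axis at each sample position. The standalone sketch below shows the idea on a [channels, samples] array; `MonoSketch` and `Average` are hypothetical names, and a `float[,]` stands in for the batched `Tensor<T>` that `ToMono` accepts.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class MonoSketch
{
    // Averages all channels of a [channels, samples] array into one channel.
    public static float[] Average(float[,] audio)
    {
        int channels = audio.GetLength(0);
        int samples = audio.GetLength(1);
        if (channels == 0)
            throw new ArgumentException("Audio must have at least one channel.", nameof(audio));

        var mono = new float[samples];
        for (int s = 0; s < samples; s++)
        {
            float sum = 0f;
            for (int c = 0; c < channels; c++)
                sum += audio[c, s];
            mono[s] = sum / channels; // the mean keeps the result within the input range
        }
        return mono;
    }
}
```

Averaging, rather than summing, keeps the output within [-1, 1] whenever both input channels are within that range.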