Class AudioHelper<T>
Helper class for loading and saving audio as tensors.
public static class AudioHelper<T>
Type Parameters
T: The numeric type for tensor values.
- Inheritance
AudioHelper<T>
Remarks
Supports common audio formats without external dependencies:
- WAV: Uncompressed PCM audio (most common for ML)
- RAW: Raw PCM samples with specified parameters
For Beginners: This class converts audio files into tensors for neural networks. Audio is loaded as a [channels, samples] or [batch, channels, samples] tensor. Values are normalized to the [-1, 1] range by default.
Methods
LoadAudio(string, bool, int?)
Loads an audio file and returns it as a tensor with metadata.
public static AudioHelper<T>.AudioLoadResult LoadAudio(string filePath, bool normalize = true, int? targetSampleRate = null)
Parameters
filePath (string): Path to the audio file.
normalize (bool): Whether to normalize to the [-1, 1] range.
targetSampleRate (int?): Optional target sample rate for resampling.
Returns
- AudioHelper<T>.AudioLoadResult
Audio tensor and metadata.
Exceptions
- FileNotFoundException
If the file does not exist.
- NotSupportedException
If the audio format is not supported.
LoadRaw(string, int, int, int, bool)
Loads raw PCM audio data.
public static AudioHelper<T>.AudioLoadResult LoadRaw(string filePath, int sampleRate, int channels = 1, int bitsPerSample = 16, bool normalize = true)
Parameters
filePath (string): Path to the raw audio file.
sampleRate (int): Sample rate in Hz.
channels (int): Number of channels.
bitsPerSample (int): Bits per sample (8, 16, 24, or 32).
normalize (bool): Whether to normalize to the [-1, 1] range.
Returns
- AudioHelper<T>.AudioLoadResult
Audio tensor and metadata.
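As an illustration of what loading raw PCM with normalization involves, here is a minimal sketch that decodes 16-bit samples into per-channel float arrays. It is not the library's implementation: the class and method names (`PcmDecodeSketch`, `DecodePcm16`) are hypothetical, and the little-endian byte order, interleaved channel layout, and divisor of 32768 are assumptions.

```csharp
public static class PcmDecodeSketch
{
    // Hypothetical helper: decodes interleaved 16-bit PCM into [channels][samples].
    // Assumes little-endian byte order and interleaved frames (L, R, L, R, ...).
    public static float[][] DecodePcm16(byte[] data, int channels, bool normalize = true)
    {
        int frameCount = data.Length / (2 * channels);
        var result = new float[channels][];
        for (int c = 0; c < channels; c++)
        {
            result[c] = new float[frameCount];
        }

        for (int f = 0; f < frameCount; f++)
        {
            for (int c = 0; c < channels; c++)
            {
                int offset = (f * channels + c) * 2;
                // Combine low and high bytes into a signed 16-bit sample.
                short sample = (short)(data[offset] | (data[offset + 1] << 8));
                // Assumed scaling: dividing by 32768 maps short.MinValue to -1.0.
                result[c][f] = normalize ? sample / 32768f : sample;
            }
        }
        return result;
    }
}
```

With this scaling the largest positive sample (32767) maps to slightly below 1.0; other conventions divide by 32767 instead.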
LoadWav(string, bool)
Loads a WAV audio file.
public static AudioHelper<T>.AudioLoadResult LoadWav(string filePath, bool normalize = true)
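Parameters
filePath (string): Path to the WAV file.
normalize (bool): Whether to normalize to the [-1, 1] range.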
Returns
- AudioHelper<T>.AudioLoadResult
Audio tensor and metadata.
Resample(Tensor<T>, int, int)
Resamples audio to a different sample rate using linear interpolation.
public static Tensor<T> Resample(Tensor<T> audio, int sourceSampleRate, int targetSampleRate)
Parameters
audio (Tensor<T>): Input audio tensor with shape [1, channels, samples].
sourceSampleRate (int): Original sample rate in Hz.
targetSampleRate (int): Target sample rate in Hz.
Returns
- Tensor<T>
Resampled audio tensor.
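Linear-interpolation resampling can be sketched on a plain float array as follows. This is a rough illustration for a single channel, not the library's code: `ResampleSketch.ResampleLinear` is a hypothetical name, and the output-length rounding (truncation) and edge handling (repeating the last sample) are assumptions.

```csharp
using System;

public static class ResampleSketch
{
    // Hypothetical helper: resamples one channel by linear interpolation.
    public static float[] ResampleLinear(float[] input, int sourceRate, int targetRate)
    {
        if (input.Length == 0 || sourceRate == targetRate)
        {
            return (float[])input.Clone();
        }

        // Assumed rounding: output length is truncated toward zero.
        int outputLength = (int)((long)input.Length * targetRate / sourceRate);
        var output = new float[outputLength];
        double step = (double)sourceRate / targetRate;

        for (int i = 0; i < outputLength; i++)
        {
            // Fractional position of this output sample in the input signal.
            double position = i * step;
            int left = (int)position;
            // Clamp so the last output samples reuse the final input sample.
            int right = Math.Min(left + 1, input.Length - 1);
            double fraction = position - left;
            output[i] = (float)(input[left] * (1.0 - fraction) + input[right] * fraction);
        }
        return output;
    }
}
```

Linear interpolation is cheap but does not low-pass filter the signal, so downsampling this way can introduce aliasing.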
SaveWav(Tensor<T>, string, int, int, bool)
Saves audio tensor as a WAV file.
public static void SaveWav(Tensor<T> audio, string filePath, int sampleRate, int bitsPerSample = 16, bool denormalize = true)
Parameters
audio (Tensor<T>): Audio tensor with shape [channels, samples] or [1, channels, samples].
filePath (string): Output file path.
sampleRate (int): Sample rate in Hz.
bitsPerSample (int): Bits per sample (16 or 32).
denormalize (bool): Whether to denormalize from the [-1, 1] range.
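To show what denormalizing from [-1, 1] might look like for 16-bit output, here is a small sketch. The name `DenormalizeSketch.ToPcm16`, the clamping of out-of-range values, and the scale factor of 32767 are all assumptions. The library may scale or round differently.

```csharp
using System;

public static class DenormalizeSketch
{
    // Hypothetical helper: converts normalized floats to signed 16-bit PCM.
    public static short[] ToPcm16(float[] samples)
    {
        var pcm = new short[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            // Clamp first so out-of-range values cannot overflow the short range.
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            // Assumed scaling: multiply by 32767 so +1.0 maps to short.MaxValue.
            pcm[i] = (short)Math.Round(clamped * 32767f);
        }
        return pcm;
    }
}
```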
ToMono(Tensor<T>)
Converts stereo audio to mono by averaging channels.
public static Tensor<T> ToMono(Tensor<T> audio)
Parameters
audio (Tensor<T>): Stereo audio tensor with shape [1, 2, samples].
Returns
- Tensor<T>
Mono audio tensor [1, 1, samples].
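The channel-averaging step can be sketched on plain arrays as shown below. `MonoSketch.AverageChannels` is a hypothetical name, and the jagged-array layout [channels][samples] stands in for the tensor. The sketch assumes at least one channel and that all channels have the same length.

```csharp
public static class MonoSketch
{
    // Hypothetical helper: averages all channels sample by sample.
    public static float[] AverageChannels(float[][] channels)
    {
        int sampleCount = channels[0].Length;
        var mono = new float[sampleCount];
        for (int s = 0; s < sampleCount; s++)
        {
            float sum = 0f;
            for (int c = 0; c < channels.Length; c++)
            {
                sum += channels[c][s];
            }
            // Dividing by the channel count keeps the result within [-1, 1].
            mono[s] = sum / channels.Length;
        }
        return mono;
    }
}
```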