Class OnnxModel<T>

Namespace: AiDotNet.Onnx
Assembly: AiDotNet.dll

A wrapper for ONNX models that provides easy-to-use inference with AiDotNet Tensor types.

public class OnnxModel<T> : IOnnxModel<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
object → OnnxModel<T>

Implements
IOnnxModel<T>, IDisposable

Remarks

This class wraps the ONNX Runtime InferenceSession and provides:

  • Automatic tensor conversion between AiDotNet and ONNX formats
  • Support for multiple execution providers (CPU, CUDA, TensorRT, DirectML)
  • Multi-input/multi-output model support
  • Warm-up and async inference

For Beginners: Use this class to run pre-trained ONNX models:

// Load a model
var model = new OnnxModel<float>("model.onnx");

// Run inference
var input = new Tensor<float>([1, 3, 224, 224]);
var output = model.Run(input);

// Don't forget to dispose
model.Dispose();
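
The selected execution provider can be inspected after loading. A minimal sketch using only the members documented on this page (the file name is illustrative):

// Load with default options and check which provider was selected
using var model = new OnnxModel<float>("model.onnx");
if (model.IsLoaded)
{
    Console.WriteLine($"Running on: {model.ExecutionProvider}");
    model.WarmUp(); // stabilize first-inference latency
}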

Constructors

OnnxModel(byte[], OnnxModelOptions?)

Creates a new OnnxModel from a byte array.

public OnnxModel(byte[] modelBytes, OnnxModelOptions? options = null)

Parameters

modelBytes byte[]

The ONNX model as a byte array.

options OnnxModelOptions

Optional configuration options.
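
This overload is useful when the model is not stored as a file on disk, for example when it comes from an embedded resource or a database. A hedged sketch (loading the bytes from a file here purely for illustration):

// Obtain the model bytes from any source, then construct the model
byte[] modelBytes = File.ReadAllBytes("model.onnx");
using var model = new OnnxModel<float>(modelBytes);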

OnnxModel(string, OnnxModelOptions?)

Creates a new OnnxModel from a file path.

public OnnxModel(string modelPath, OnnxModelOptions? options = null)

Parameters

modelPath string

Path to the ONNX model file.

options OnnxModelOptions

Optional configuration options.

Properties

ExecutionProvider

Gets the execution provider currently being used (CPU, CUDA, TensorRT, DirectML).

public string ExecutionProvider { get; }

Property Value

string

IsLoaded

Gets whether the model has been successfully loaded and is ready for inference.

public bool IsLoaded { get; }

Property Value

bool

Metadata

Gets the metadata about the loaded ONNX model.

public IOnnxModelMetadata Metadata { get; }

Property Value

IOnnxModelMetadata

Methods

CreateAsync(string, OnnxModelOptions?, IProgress<double>?, CancellationToken)

Creates an OnnxModel asynchronously, optionally downloading from a URL.

public static Task<OnnxModel<T>> CreateAsync(string modelPath, OnnxModelOptions? options = null, IProgress<double>? progress = null, CancellationToken cancellationToken = default)

Parameters

modelPath string

Local path or URL to the ONNX model.

options OnnxModelOptions

Optional configuration options.

progress IProgress<double>

Optional download progress reporter.

cancellationToken CancellationToken

Cancellation token.

Returns

Task<OnnxModel<T>>

The loaded OnnxModel.
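
For example, a sketch that downloads a model with progress reporting and a timeout (the URL is illustrative, and it is assumed here that progress is reported as a fraction in [0, 1]):

// Inside an async method: download with progress and cancellation support
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));
var progress = new Progress<double>(p => Console.WriteLine($"Downloaded {p:P0}")); // assumes 0..1
using var model = await OnnxModel<float>.CreateAsync(
    "https://example.com/model.onnx", options: null, progress, cts.Token);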

Dispose()

Disposes the ONNX session and releases resources.

public void Dispose()

Dispose(bool)

Disposes managed and unmanaged resources.

protected virtual void Dispose(bool disposing)

Parameters

disposing bool

True if called from Dispose(); false if called from the finalizer.
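
Subclasses that hold additional resources should override this method and follow the standard dispose pattern. A minimal sketch (MyOnnxWrapper and its _scratch field are hypothetical):

public class MyOnnxWrapper : OnnxModel<float>
{
    private readonly MemoryStream _scratch = new();

    public MyOnnxWrapper(string path) : base(path) { }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _scratch.Dispose(); // release managed resources only when disposing
        }
        base.Dispose(disposing); // always let the base release the ONNX session
    }
}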

Run(Tensor<T>)

Runs inference with a single input tensor.

public Tensor<T> Run(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor.

Returns

Tensor<T>

The output tensor from the model.
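
This overload is a convenience for single-input, single-output models. A hedged sketch for a typical image classifier (the shape and preprocessing are illustrative):

// 1 x 3 x 224 x 224 input, as in the example in the Remarks
var input = new Tensor<float>([1, 3, 224, 224]);
// ... fill input with preprocessed pixel data ...
Tensor<float> output = model.Run(input);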

Run(IReadOnlyDictionary<string, Tensor<T>>)

Runs inference with named inputs.

public IReadOnlyDictionary<string, Tensor<T>> Run(IReadOnlyDictionary<string, Tensor<T>> inputs)

Parameters

inputs IReadOnlyDictionary<string, Tensor<T>>

Dictionary mapping input names to tensors.

Returns

IReadOnlyDictionary<string, Tensor<T>>

Dictionary mapping output names to tensors.
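
Use this overload for models that take several named inputs. A hedged sketch (the names are illustrative and must match the model's actual input and output names):

var inputs = new Dictionary<string, Tensor<float>>
{
    ["image"] = imageTensor,
    ["mask"] = maskTensor,
};
var outputs = model.Run(inputs);
Tensor<float> logits = outputs["logits"]; // key must match a model output name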

Run(IReadOnlyDictionary<string, Tensor<T>>, IEnumerable<string>)

Runs inference with specific output names.

public IReadOnlyDictionary<string, Tensor<T>> Run(IReadOnlyDictionary<string, Tensor<T>> inputs, IEnumerable<string> outputNames)

Parameters

inputs IReadOnlyDictionary<string, Tensor<T>>

The input tensors.

outputNames IEnumerable<string>

The names of outputs to retrieve.

Returns

IReadOnlyDictionary<string, Tensor<T>>

Dictionary of requested outputs.
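
Requesting only the outputs you need avoids materializing tensors you would discard. A hedged sketch (the output name is illustrative):

// Fetch only "logits"; other model outputs are not returned
var outputs = model.Run(inputs, new[] { "logits" });
var logits = outputs["logits"];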

RunAsync(Tensor<T>, CancellationToken)

Runs inference asynchronously with a single input tensor.

public Task<Tensor<T>> RunAsync(Tensor<T> input, CancellationToken cancellationToken = default)

Parameters

input Tensor<T>

The input tensor.

cancellationToken CancellationToken

Cancellation token.

Returns

Task<Tensor<T>>

The output tensor from the model.
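
For example, a sketch that runs inference without blocking the calling thread and enforces a timeout (the shape is illustrative):

// Inside an async method: cancel automatically after 10 seconds
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
var input = new Tensor<float>([1, 3, 224, 224]);
Tensor<float> output = await model.RunAsync(input, cts.Token);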

RunAsync(IReadOnlyDictionary<string, Tensor<T>>, CancellationToken)

Runs inference asynchronously with named inputs.

public Task<IReadOnlyDictionary<string, Tensor<T>>> RunAsync(IReadOnlyDictionary<string, Tensor<T>> inputs, CancellationToken cancellationToken = default)

Parameters

inputs IReadOnlyDictionary<string, Tensor<T>>

Dictionary mapping input names to tensors.

cancellationToken CancellationToken

Cancellation token.

Returns

Task<IReadOnlyDictionary<string, Tensor<T>>>

Dictionary mapping output names to tensors.

RunWithLongInput(string, long[])

Runs inference with long integer inputs (useful for token IDs).

public Tensor<T> RunWithLongInput(string inputName, long[] tokenIds)

Parameters

inputName string

The input tensor name.

tokenIds long[]

The token IDs to process.

Returns

Tensor<T>

The output tensor.
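
Many NLP models take token IDs as a 64-bit integer tensor while producing floating-point outputs; this helper bridges that gap. A hedged sketch (the input name and token IDs are illustrative and depend on your tokenizer and model):

// Token IDs from a tokenizer; "input_ids" must match the model's input name
long[] tokenIds = { 101, 7592, 2088, 102 };
Tensor<float> embeddings = model.RunWithLongInput("input_ids", tokenIds);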

WarmUp()

Warms up the model by running a single inference with dummy data. This helps ensure consistent inference times by initializing lazy resources.

public void WarmUp()
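
For example, a sketch that warms the model up before measuring steady-state latency (the Stopwatch timing is illustrative, and 'input' is assumed to be a tensor prepared as in the earlier examples):

model.WarmUp(); // the first run pays one-time initialization costs

var sw = System.Diagnostics.Stopwatch.StartNew();
var output = model.Run(input);
sw.Stop();
Console.WriteLine($"Inference took {sw.ElapsedMilliseconds} ms");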

WarmUpAsync(CancellationToken)

Warms up the model asynchronously.

public Task WarmUpAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

Task