Class OnnxModel<T>
A wrapper for ONNX models that provides easy-to-use inference with AiDotNet Tensor types.
public class OnnxModel<T> : IOnnxModel<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
Inheritance
object → OnnxModel<T>
Implements
IOnnxModel<T>, IDisposable
Remarks
This class wraps the ONNX Runtime InferenceSession and provides:
- Automatic tensor conversion between AiDotNet and ONNX formats
- Support for multiple execution providers (CPU, CUDA, TensorRT, DirectML)
- Multi-input/multi-output model support
- Warm-up and async inference
For Beginners: Use this class to run pre-trained ONNX models:

```csharp
// Load a model
var model = new OnnxModel<float>("model.onnx");

// Run inference
var input = new Tensor<float>([1, 3, 224, 224]);
var output = model.Run(input);

// Don't forget to dispose
model.Dispose();
```
Constructors
OnnxModel(byte[], OnnxModelOptions?)
Creates a new OnnxModel from a byte array.
public OnnxModel(byte[] modelBytes, OnnxModelOptions? options = null)
Parameters
modelBytes (byte[]): The ONNX model as a byte array.
options (OnnxModelOptions?): Optional configuration options.
OnnxModel(string, OnnxModelOptions?)
Creates a new OnnxModel from a file path.
public OnnxModel(string modelPath, OnnxModelOptions? options = null)
Parameters
modelPath (string): Path to the ONNX model file.
options (OnnxModelOptions?): Optional configuration options.
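Both constructors perform the same load; the byte-array overload is useful when the model ships as an embedded resource or arrives over the network. A minimal sketch (the file name is illustrative):

```csharp
// From a file path, with default options.
var fromFile = new OnnxModel<float>("model.onnx");

// From a byte array, e.g. a model bundled as an embedded resource.
byte[] modelBytes = System.IO.File.ReadAllBytes("model.onnx");
var fromBytes = new OnnxModel<float>(modelBytes);
```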
Properties
ExecutionProvider
Gets the execution provider currently being used (CPU, CUDA, TensorRT, DirectML).
public string ExecutionProvider { get; }
Property Value
string
IsLoaded
Gets whether the model has been successfully loaded and is ready for inference.
public bool IsLoaded { get; }
Property Value
bool
Metadata
Gets the metadata about the loaded ONNX model.
public IOnnxModelMetadata Metadata { get; }
Property Value
IOnnxModelMetadata
Methods
CreateAsync(string, OnnxModelOptions?, IProgress<double>?, CancellationToken)
Creates an OnnxModel asynchronously, optionally downloading from a URL.
public static Task<OnnxModel<T>> CreateAsync(string modelPath, OnnxModelOptions? options = null, IProgress<double>? progress = null, CancellationToken cancellationToken = default)
Parameters
modelPath (string): Local path or URL to the ONNX model.
options (OnnxModelOptions?): Optional configuration options.
progress (IProgress<double>?): Optional download progress reporter.
cancellationToken (CancellationToken): Cancellation token.
Returns
- Task<OnnxModel<T>>
A task that completes with the loaded model.
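Because CreateAsync accepts a URL, it can download a model before loading it. A hedged sketch (the URL is a placeholder, and the progress value is assumed to be a 0-to-1 fraction):

```csharp
var progress = new Progress<double>(p => Console.WriteLine($"Download: {p:P0}"));
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));

var model = await OnnxModel<float>.CreateAsync(
    "https://example.com/model.onnx", // placeholder URL
    options: null,
    progress,
    cts.Token);
```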
Dispose()
Disposes the ONNX session and releases resources.
public void Dispose()
Dispose(bool)
Disposes managed and unmanaged resources.
protected virtual void Dispose(bool disposing)
Parameters
disposing (bool): True if called from Dispose(); false if called from the finalizer.
Run(Tensor<T>)
Runs inference with a single input tensor.
public Tensor<T> Run(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor.
Returns
- Tensor<T>
The output tensor from the model.
Run(IReadOnlyDictionary<string, Tensor<T>>)
Runs inference with named inputs.
public IReadOnlyDictionary<string, Tensor<T>> Run(IReadOnlyDictionary<string, Tensor<T>> inputs)
Parameters
inputs (IReadOnlyDictionary<string, Tensor<T>>): Dictionary mapping input names to tensors.
Returns
- IReadOnlyDictionary<string, Tensor<T>>
Dictionary mapping output names to tensors.
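For models with several inputs, pass a dictionary keyed by the input names baked into the ONNX graph. A sketch; the names "input_ids", "attention_mask", and "logits" below are illustrative and depend on the specific model:

```csharp
var inputs = new Dictionary<string, Tensor<float>>
{
    ["input_ids"] = idsTensor,        // keys must match the graph's input names
    ["attention_mask"] = maskTensor,
};

IReadOnlyDictionary<string, Tensor<float>> outputs = model.Run(inputs);
Tensor<float> logits = outputs["logits"];
```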
Run(IReadOnlyDictionary<string, Tensor<T>>, IEnumerable<string>)
Runs inference with specific output names.
public IReadOnlyDictionary<string, Tensor<T>> Run(IReadOnlyDictionary<string, Tensor<T>> inputs, IEnumerable<string> outputNames)
Parameters
inputs (IReadOnlyDictionary<string, Tensor<T>>): The input tensors.
outputNames (IEnumerable<string>): The names of the outputs to retrieve.
Returns
- IReadOnlyDictionary<string, Tensor<T>>
Dictionary of requested outputs.
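When a model exposes several outputs but only some are needed, requesting them by name avoids converting the rest. A one-line sketch; the output name is illustrative:

```csharp
var selected = model.Run(inputs, new[] { "logits" });
```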
RunAsync(Tensor<T>, CancellationToken)
Runs inference asynchronously with a single input tensor.
public Task<Tensor<T>> RunAsync(Tensor<T> input, CancellationToken cancellationToken = default)
Parameters
input (Tensor<T>): The input tensor.
cancellationToken (CancellationToken): Cancellation token.
Returns
- Task<Tensor<T>>
The output tensor from the model.
RunAsync(IReadOnlyDictionary<string, Tensor<T>>, CancellationToken)
Runs inference asynchronously with named inputs.
public Task<IReadOnlyDictionary<string, Tensor<T>>> RunAsync(IReadOnlyDictionary<string, Tensor<T>> inputs, CancellationToken cancellationToken = default)
Parameters
inputs (IReadOnlyDictionary<string, Tensor<T>>): Dictionary mapping input names to tensors.
cancellationToken (CancellationToken): Cancellation token.
Returns
- Task<IReadOnlyDictionary<string, Tensor<T>>>
Dictionary mapping output names to tensors.
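The async overloads suit servers and UI code, where a slow inference call must not block the caller. A sketch assuming a loaded model and a prepared input dictionary:

```csharp
// Time-box the call via the cancellation token.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var outputs = await model.RunAsync(inputs, cts.Token);
```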
RunWithLongInput(string, long[])
Runs inference with long integer inputs (useful for token IDs).
public Tensor<T> RunWithLongInput(string inputName, long[] tokenIds)
Parameters
inputName (string): The name of the model input to bind.
tokenIds (long[]): The token IDs to feed to the model.
Returns
- Tensor<T>
The output tensor.
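Token IDs produced by a tokenizer are typically 64-bit integers, which this helper feeds directly without routing them through T. A sketch; the input name and IDs below are illustrative:

```csharp
long[] tokenIds = { 101, 2023, 2003, 102 };
Tensor<float> output = model.RunWithLongInput("input_ids", tokenIds);
```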
WarmUp()
Warms up the model by running a single inference with dummy data. This helps ensure consistent inference times by initializing lazy resources.
public void WarmUp()
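Calling WarmUp() once after loading moves one-time initialization (provider setup, graph optimization, allocations) out of the first real request. A sketch:

```csharp
var model = new OnnxModel<float>("model.onnx");
model.WarmUp(); // pay the one-time cost here, not on the first real inference
```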
WarmUpAsync(CancellationToken)
Warms up the model asynchronously.
public Task WarmUpAsync(CancellationToken cancellationToken = default)
Parameters
cancellationToken (CancellationToken): Cancellation token.