Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
The ONNX-MLIR project comes with an executable `onnx-mlir` capable of compiling ONNX models to a shared library. In this documentation, we demonstrate how to interact programmatically with the compiled shared library using ONNX-MLIR's Runtime API.
- `OMTensor` is the data structure used to describe the runtime information (rank, shape, data type, etc.) associated with a tensor input or output.
- `OMTensorList` is the data structure used to hold a list of pointers to `OMTensor` so that they can be passed into and out of the compiled model as inputs and outputs.
- `OMEntryPoint` is the data structure used to return all entry point names in a model. These entry point names are the symbols of the inference functions in the model.
- `OMSignature` is the data structure used to return the output signature of a given entry point as a JSON string.
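The entry points and signatures can be queried at runtime. A minimal sketch, assuming a model library compiled by `onnx-mlir` and its `OnnxMlirRuntime.h` header; the query helpers `omQueryEntryPoints`, `omInputSignature`, and `omOutputSignature` are assumed to be linked in from the compiled model:

```c
#include <stdio.h>
#include "OnnxMlirRuntime.h"

int main() {
  int64_t n;
  // Retrieve the array of entry point names exported by the model.
  const char **entryPoints = omQueryEntryPoints(&n);
  for (int64_t i = 0; i < n; i++) {
    printf("entry point: %s\n", entryPoints[i]);
    // Each signature is returned as a JSON string.
    printf("  input signature:  %s\n", omInputSignature(entryPoints[i]));
    printf("  output signature: %s\n", omOutputSignature(entryPoints[i]));
  }
  return 0;
}
```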
All compiled models expose the exact same C function signature, equivalent to:
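For the default entry point, that signature is the `run_main_graph` declaration below (models defining multiple entry points export one such function per entry point name):

```c
/* The compiled model's default entry point: a list of input tensors in,
   a list of output tensors out. */
OMTensorList *run_main_graph(OMTensorList *);
```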
Intuitively, the model takes a list of tensors as input and returns a list of tensors as output.
API
We demonstrate using the API functions to run a simple ONNX model consisting of a single add operation. To create such an ONNX model, one can use a short Python script.
To compile the above model, run `onnx-mlir add.onnx`, and a binary library `add.so` should appear. We can use the following C code to call into the compiled function computing the sum of two inputs:
Compile with `gcc main.c add.so -o add`, and an executable `add` should appear. Run it, and the output should be:
Exactly as it should be.
In general, if a caller creates a tensor object (`omTensorCreate`), they are responsible for deallocating the data buffer separately after the tensor is destroyed. If onnx-mlir creates the tensor (`run_main_graph`), then the tensor object owns the data buffer, and it is freed automatically when the tensor is destroyed.
This default behavior can be changed. When creating a tensor, a user may use `omTensorCreateWithOwnership` to explicitly set data buffer ownership. Additionally, after a tensor is created, `omTensorSetOwning` can be used to change the ownership setting.
When `omTensorDestroy` is called, if the ownership flag is set to "true", then the destruction of the tensor will also free any associated data buffer memory. If the ownership flag is set to "false", then the user is responsible for freeing the data buffer memory after destroying the tensor.
For tensor list objects, when `omTensorListDestroy` is called, `omTensorDestroy` is called on all tensors in the list. The data buffer of each tensor is freed according to that tensor's ownership setting.
To destroy an `OMTensorList` without automatically destroying the tensors it contains, use `omTensorListDestroyShallow`.
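The ownership rules above can be sketched in a minimal, model-independent example; it assumes only `OnnxMlirRuntime.h` and the `omTensorCreateWithOwnership` / `omTensorDestroy` calls described in this section:

```c
#include <stdlib.h>
#include "OnnxMlirRuntime.h"

int main() {
  int64_t shape[] = {2};
  /* Heap-allocated data buffer for the tensor. */
  float *data = malloc(2 * sizeof(float));
  data[0] = 1.f;
  data[1] = 2.f;
  /* owning = true: the tensor takes ownership of `data`, so
     omTensorDestroy also frees the buffer. */
  OMTensor *t = omTensorCreateWithOwnership(data, shape, /*rank=*/1,
                                            ONNX_TYPE_FLOAT, /*owning=*/1);
  omTensorDestroy(t); /* also frees `data`; no separate free() needed */
  return 0;
}
```

With `owning` set to false instead, the caller would have to call `free(data)` themselves after `omTensorDestroy(t)`.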
For a full reference to the available C Runtime API, refer to `include/onnx-mlir/Runtime/OMTensor.h` and `include/onnx-mlir/Runtime/OMTensorList.h`.