Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

This project is maintained by onnx

Using PyRuntime

onnx-mlir includes a runtime utility for running ONNX models that have been compiled into a shared library with onnx-mlir --EmitLib. The runtime is implemented in C++ by the ExecutionSession class (src/Runtime/ExecutionSession.hpp) and has an associated Python binding generated with the pybind library.

PyRuntime Module

With pybind, a C/C++ binary can be imported directly by the Python interpreter. For onnx-mlir, this binary is generated from PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and built as a shared library at build/lib/PyRuntime.cpython-<target>.so.
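
The <target> part of the file name depends on the Python interpreter used for the build. As a minimal sketch, assuming the pybind build uses the interpreter's default extension-module suffix, the expected file name can be derived from the standard library. The helper name expected_pyruntime_filename is hypothetical and is not part of onnx-mlir.

```python
import sysconfig


def expected_pyruntime_filename(module_name: str = "PyRuntime") -> str:
    """Return the extension-module file name this interpreter would look for."""
    # EXT_SUFFIX is, for example, '.cpython-310-x86_64-linux-gnu.so' on Linux.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return module_name + suffix


print(expected_pyruntime_filename())
```

If the printed name does not match the file in build/lib, the library was probably built against a different Python interpreter than the one running the script.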

Using PyRuntime

The module above can be imported normally by the Python interpreter as long as its directory is on your PYTHONPATH. Alternatively, create a symbolic link to it in your working directory.

cd <working directory>
ln -s <path to PyRuntime>

You can then import it with:

from PyRuntime import ExecutionSession
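
A further option, not part of the original instructions, is to extend sys.path at run time before importing. The following is only a sketch: it assumes onnx-mlir was built in ./build, so that the module sits in build/lib, and the helper name add_pyruntime_dir is hypothetical.

```python
import os
import sys


def add_pyruntime_dir(lib_dir: str) -> str:
    """Put the directory that contains the PyRuntime library on sys.path."""
    lib_dir = os.path.abspath(lib_dir)
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir


# Assumption: the build tree is ./build, so the module lives in build/lib.
add_pyruntime_dir(os.path.join("build", "lib"))

try:
    from PyRuntime import ExecutionSession
except ImportError:
    # The module was not found, or was built for a different interpreter.
    ExecutionSession = None
```

Unlike PYTHONPATH or a symbolic link, this keeps the configuration inside the script, at the cost of hard-coding a build location.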

The complete interface to ExecutionSession can be seen in the sources mentioned above. However, the constructor and the run method are enough to perform inference.

def __init__(self, path: str, entry_point: str):
        Args:
            path: relative or absolute path to your .so model.
            entry_point: entry point function generated by onnx-mlir for running inference.
                Use '_dyn_entry_point_main_graph'.

def run(self, input: ndarray) -> List[ndarray]:
        Args:
            input: your model input tensor as a NumPy array.

        Returns:
            A list of NumPy arrays, the outputs of your model.
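
To illustrate how the constructor and run fit together, the sketch below wraps the run call in a small helper. The function name infer and the conversion of the input to a C-contiguous float32 array are assumptions made for illustration (a model may expect a different element type). Only the ExecutionSession constructor and the run call come from the interface described here.

```python
from typing import Any, List

import numpy as np


def infer(session: Any, data, dtype=np.float32) -> List[np.ndarray]:
    """Run one inference on an ExecutionSession-like object.

    `session` only needs a run(input) method that returns a list of arrays.
    """
    # Assumption: the model takes a single dense tensor of type `dtype`.
    tensor = np.ascontiguousarray(data, dtype=dtype)
    return list(session.run(tensor))


# Usage with a real session (requires PyRuntime and a compiled model.so):
#   from PyRuntime import ExecutionSession
#   session = ExecutionSession("model.so", "_dyn_entry_point_main_graph")
#   outputs = infer(session, np.ones((1, 1, 28, 28)))
```

Because the helper only relies on a run method, it can be exercised with a stand-in object when no compiled model is available.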

Example: PyRuntime and LeNet

  import numpy as np
  from PyRuntime import ExecutionSession

  model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir
  session = ExecutionSession(model, "_dyn_entry_point_main_graph")
  input = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
  outputs = session.run(input)

  for output in outputs: