onnx-mlir has a runtime utility to run ONNX models compiled as a shared library by `onnx-mlir --EmitLib`. The runtime is implemented in C++ by the `ExecutionSession` class (src/Runtime/ExecutionSession.hpp) and has an associated Python binding generated by pybind.

Using pybind, a C/C++ binary can be imported directly by the Python interpreter. For onnx-mlir, such a binary is generated from `PyExecutionSession` (src/Runtime/PyExecutionSession.hpp) and built as a shared library named `PyRuntime`.
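
As a concrete illustration, compiling a model into such a shared library might look like this (a minimal sketch; `model.onnx` stands in for your own ONNX model file):

```
# Emit a shared library from an ONNX model; this produces model.so.
onnx-mlir --EmitLib model.onnx
```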

The `PyRuntime` module can be imported normally by the Python interpreter as long as it is on your PYTHONPATH. An alternative is to create a symbolic link to it in your working directory:

```
cd <working directory>
ln -s <path to PyRuntime>
python3
```
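
If you prefer the PYTHONPATH route, the following sketch may serve (the placeholder is the directory containing the PyRuntime shared library, which depends on your build configuration):

```
export PYTHONPATH=<path to PyRuntime directory>:$PYTHONPATH
python3
```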

Then, you can use it as follows:

```python
from PyRuntime import ExecutionSession
```

The complete interface to `ExecutionSession` can be seen in the sources mentioned above. However, using the constructor and the `run` method is enough to perform inferences.

```python
def __init__(self, path: str, entry_point: str):
    """
    Args:
        path: relative or absolute path to your .so model.
        entry_point: function generated by onnx-mlir to call inferences.
            Use '_dyn_entry_point_main_graph'.
    """

def run(self, input: ndarray) -> List[ndarray]:
    """
    Args:
        input: your model input tensor as a NumPy array.
    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
```

## Example: PyRuntime and LeNet

```python
import numpy as np
from PyRuntime import ExecutionSession

model = 'model.so'  # LeNet from ONNX Zoo compiled with onnx-mlir.
session = ExecutionSession(model, "_dyn_entry_point_main_graph")
input = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input)

for output in outputs:
    print(output.shape)
```
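
Since LeNet is a digit classifier, its output is typically a single array of ten class scores; assuming that shape (an assumption, not stated above), the predicted digit can be read off with `np.argmax`:

```python
# Assuming outputs[0] holds the class scores with shape (1, 10),
# the predicted digit is the index of the highest score.
print("predicted digit:", np.argmax(outputs[0]))
```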