onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

A guideline on adding a new custom accelerator

In general, onnx-mlir handles custom accelerators as plugins that can be turned on or off both when building onnx-mlir and when compiling a model. This is handled mainly through cmake, and this document outlines the procedure.

In addition to this document, the NNPA accelerator, which has already been deployed in onnx-mlir, can serve as an example.

1. Code folder

In onnx-mlir, all code for an accelerator should be put inside a separate folder under src/Accelerators. Thus, the first step to support an accelerator is to create a folder for it inside src/Accelerators.

The folder name will be used as the accelerator name in onnx-mlir. In particular, it is used to

  1. instruct cmake to build the code inside the accelerator folder,
  2. compile a model for the accelerator when using the onnx-mlir command, and
  3. enable passes related to the accelerator when using the onnx-mlir-opt command.

The folder content is flexible and depends on each accelerator. However, we recommend following the same structure as the root folder of onnx-mlir as much as possible. This helps maintain consistency across the whole project.

1.1 Build accelerators in onnx-mlir

To build accelerators in onnx-mlir, use the cmake variable ONNX_MLIR_ACCELERATORS when building onnx-mlir. ONNX_MLIR_ACCELERATORS accepts a comma-separated list of accelerator names. For example,

$ cd build
$ cmake .. -DONNX_MLIR_ACCELERATORS=accel1,accel2

1.2 Compile a model to run with selected accelerators

The compiler command onnx-mlir has an option, --maccel, to compile a model for selected accelerators. --maccel accepts a comma-separated list of accelerator names. For example,

$ onnx-mlir --maccel=accel1,accel2 model.onnx

Only built accelerators can be used with --maccel.
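
The rule above can be illustrated with a minimal sketch of how a comma-separated accelerator list might be split and checked against the accelerators that were built. This is not the onnx-mlir implementation: the helper names splitAcceleratorList and checkRequestedAccelerators are hypothetical, and the list of built accelerators is passed in explicitly as an assumption.

```cpp
#include <algorithm>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of onnx-mlir): split a comma-separated value
// such as "accel1,accel2" into accelerator names, skipping empty entries.
std::vector<std::string> splitAcceleratorList(const std::string &value) {
  std::vector<std::string> names;
  std::stringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ','))
    if (!name.empty())
      names.push_back(name);
  return names;
}

// Hypothetical check: every requested accelerator must appear in the list of
// accelerators enabled at build time (e.g. via ONNX_MLIR_ACCELERATORS).
// Throws std::invalid_argument naming the first accelerator that is missing.
void checkRequestedAccelerators(const std::vector<std::string> &built,
    const std::vector<std::string> &requested) {
  for (const std::string &name : requested)
    if (std::find(built.begin(), built.end(), name) == built.end())
      throw std::invalid_argument("accelerator not built: " + name);
}
```

With a build configured for accel1 only, a request for "accel1,accel2" would be rejected in this sketch because accel2 is not in the built list.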

1.3 Run passes related to selected accelerators

Passes defined by an accelerator can be run or tested with the onnx-mlir-opt command by using the option --maccel, which works like --maccel in onnx-mlir (see Sec. 1.2). For example, to call a pass --optimize-data-layout defined by accelerator accel1:

$ onnx-mlir-opt --maccel=accel1 --optimize-data-layout model.mlir

Only built accelerators can be used with --maccel.

2. Code integration

Writing code in MLIR typically involves designing dialects and passes, and supporting an accelerator is no different. Thus, integrating accelerator code into onnx-mlir amounts to registering its dialects and passes in onnx-mlir.

We provide a base class onnx_mlir::accel::Accelerator from which users can derive their own class and implement hooks to register dialects and passes.

//===--------------------------------------------------------------------===//
// Hooks for onnx-mlir driver
//===--------------------------------------------------------------------===//

/// Add the transformations necessary to support the accelerator.
virtual void addPasses(mlir::OwningOpRef<mlir::ModuleOp> &module,
    mlir::PassManager &pm,
    onnx_mlir::EmissionTargetType &emissionTarget) const = 0;

//===--------------------------------------------------------------------===//
// Hooks for onnx-mlir-opt driver
//===--------------------------------------------------------------------===//

/// Register the MLIR dialects required to support an accelerator.
virtual void registerDialects(mlir::DialectRegistry &registry) const = 0;

/// Register accelerator transformation passes to make available as
/// command line options.
virtual void registerPasses(int optLevel) const = 0;

//===--------------------------------------------------------------------===//
// Hooks for onnx-to-krnl pass
//===--------------------------------------------------------------------===//

/// Convert TensorType to MemRefType.
/// Accelerators may have special versions of TensorType. If not, override this
/// method and return nullptr.
virtual mlir::MemRefType convertTensorTypeToMemRefType(
    const mlir::TensorType tensorType) const = 0;

/// Define conversion target to be used with ONNXToKrnl.
virtual void conversionTargetONNXToKrnl(
    mlir::ConversionTarget &target) const = 0;

/// Define rewrite patterns to be used with ONNXToKrnl.
virtual void rewritePatternONNXToKrnl(mlir::RewritePatternSet &patterns,
    mlir::TypeConverter &typeConverter, mlir::MLIRContext *ctx) const = 0;

//===--------------------------------------------------------------------===//
// Hooks for krnl-to-llvm pass
//===--------------------------------------------------------------------===//

/// Define conversion target to be used with KrnlToLLVM.
virtual void conversionTargetKrnlToLLVM(
    mlir::ConversionTarget &target) const = 0;

/// Define rewrite patterns to be used with KrnlToLLVM.
virtual void rewritePatternKrnlToLLVM(mlir::RewritePatternSet &patterns,
    mlir::LLVMTypeConverter &typeConverter, mlir::MLIRContext *ctx) const = 0;
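
The hooks above follow a plugin pattern: the driver holds a list of accelerators and asks each one to contribute its dialects and passes. The sketch below illustrates only that pattern. It deliberately uses stand-in types (PassPipeline and DialectSet) instead of the real MLIR classes so that it compiles on its own, and the names AcceleratorSketch, Accel1, configure, and the dialect names accel1-high and accel1-low are hypothetical.

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-ins for mlir::PassManager and mlir::DialectRegistry. These are NOT
// the real MLIR classes; they only record names.
struct PassPipeline {
  std::vector<std::string> passes;
  void addPass(const std::string &name) { passes.push_back(name); }
};
struct DialectSet {
  std::vector<std::string> dialects;
  void insert(const std::string &name) { dialects.push_back(name); }
};

// Simplified base class mirroring two of the hooks listed above.
class AcceleratorSketch {
public:
  virtual ~AcceleratorSketch() = default;
  virtual void addPasses(PassPipeline &pm) const = 0;
  virtual void registerDialects(DialectSet &registry) const = 0;
};

// Hypothetical accelerator "accel1" implementing the hooks.
class Accel1 : public AcceleratorSketch {
public:
  void addPasses(PassPipeline &pm) const override {
    pm.addPass("optimize-data-layout");
  }
  void registerDialects(DialectSet &registry) const override {
    registry.insert("accel1-high"); // hypothetical dialect name
    registry.insert("accel1-low");  // hypothetical dialect name
  }
};

// Driver side: let every selected accelerator register dialects and passes.
void configure(const std::vector<std::unique_ptr<AcceleratorSketch>> &accels,
    PassPipeline &pm, DialectSet &registry) {
  for (const auto &accel : accels) {
    accel->registerDialects(registry);
    accel->addPasses(pm);
  }
}
```

A real accelerator would instead derive from onnx_mlir::accel::Accelerator and implement every pure virtual hook with the signatures shown in the listing.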

Although onnx-mlir has many passes, we provide hooks for only two of them: onnx-to-krnl and krnl-to-llvm. The reason is that, in principle, they are the first and the last passes in onnx-mlir. Pass onnx-to-krnl is where we decide which ONNX operators run on the host (by lowering them to the Krnl dialect) and which run on an accelerator (by lowering them to a dialect defined for the accelerator). Pass krnl-to-llvm is where we lower Krnl and accelerator operators to the LLVM dialect, e.g. to generate assembly code or simply to call external APIs for the accelerator. Any number of accelerator-specific dialects and passes can sit between onnx-to-krnl and krnl-to-llvm.

For example, for the NNPA accelerator, we define the ZHigh dialect to be used in onnx-to-krnl and the ZLow dialect to be used in krnl-to-llvm.
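
The host-versus-accelerator decision made in onnx-to-krnl can be pictured with a toy partitioning function. This is only a conceptual sketch: in onnx-mlir the decision is expressed through the conversion target and rewrite pattern hooks listed earlier, not through a set of strings. The Partition struct, the partitionOps function, and the set of supported operators are all assumptions made for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: operator names grouped by where they would run.
struct Partition {
  std::vector<std::string> accelerator; // lowered to an accelerator dialect
  std::vector<std::string> host;        // lowered to the Krnl dialect
};

// Hypothetical helper: operators the accelerator claims go to the
// accelerator; everything else stays on the host. Input order is preserved.
Partition partitionOps(const std::vector<std::string> &ops,
    const std::set<std::string> &supportedByAccelerator) {
  Partition result;
  for (const std::string &op : ops) {
    if (supportedByAccelerator.count(op))
      result.accelerator.push_back(op);
    else
      result.host.push_back(op);
  }
  return result;
}
```

For instance, if a hypothetical accelerator supports only onnx.MatMul and onnx.Relu, a model that also contains onnx.Gather would have that operator left on the host.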

3. Testing

Tests for accelerators should be put inside the folder test. In particular,