External Data¶
Loading an ONNX Model with External Data¶
[Default] If the external data is under the same directory of the model, simply use
onnx.load()
import onnx
onnx_model = onnx.load("path/to/the/model.onnx")
If the external data is under another directory, use
load_external_data_for_model()
to specify the directory path and load after usingonnx.load()
import onnx
from onnx.external_data_helper import load_external_data_for_model
onnx_model = onnx.load("path/to/the/model.onnx", load_external_data=False)
load_external_data_for_model(onnx_model, "data/directory/path/")
# Then the onnx_model has loaded the external data from the specific directory
Converting an ONNX Model to External Data¶
import onnx
from onnx.external_data_helper import convert_model_to_external_data
onnx_model = ... # Your model in memory as ModelProto
convert_model_to_external_data(onnx_model, all_tensors_to_one_file=True, location="filename", size_threshold=1024, convert_attribute=False)
# Must be followed by save_model to save the converted model to a specific path
onnx.save_model(onnx_model, "path/to/save/the/model.onnx")
# Then the onnx_model has converted raw data as external data and saved to specific directory
Converting and Saving an ONNX Model to External Data¶
import onnx
onnx_model = ... # Your model in memory as ModelProto
onnx.save_model(onnx_model, "path/to/save/the/model.onnx", save_as_external_data=True, all_tensors_to_one_file=True, location="filename", size_threshold=1024, convert_attribute=False)
# Then the onnx_model has converted raw data as external data and saved to specific directory
onnx.checker for Models with External Data¶
Models with External Data (<2GB)¶
Current checker supports checking models with external data. Specify either loaded onnx model or model path to the checker.
Large models >2GB¶
However, for those models larger than 2GB, please use the model path for onnx.checker and the external data needs to be under the same directory.
import onnx
onnx.checker.check_model("path/to/the/model.onnx")
# onnx.checker.check_model(loaded_onnx_model) will fail if given >2GB model
TensorProto: data_location and external_data fields¶
There are two fields related to the external data in TensorProto message type.
data_location field¶
data_location
field stores the location of data for this tensor. Value MUST be one of:
MESSAGE
- data stored in type-specific fields inside the protobuf message.RAW
- data stored in raw_data field.EXTERNAL
- data stored in an external location as described by external_data field.value
not set - legacy value. Assume data is stored in raw_data (if set) otherwise in message.
external_data field¶
external_data
field stores key-value pairs of strings describing data location
Recognized keys are:
"location"
(required) - file path relative to the filesystem directory where the ONNX protobuf model was stored. Up-directory path components such as … are disallowed and should be stripped when parsing."offset"
(optional) - position of byte at which stored data begins. Integer stored as string. Offset values SHOULD be multiples of the page size (usually 4kb) to enable mmap support. On Windows, offset values SHOULD be multiples of the VirtualAlloc allocation granularity (usually 64kb) to enable memory mapping."length"
(optional) - number of bytes containing data. Integer stored as string."checksum"
(optional) - SHA1 digest of file specified in under ‘location’ key.
After an ONNX file is loaded, all external_data
fields may be updated with an additional key ("basepath")
, which stores the path to the directory from which he ONNX model file was loaded.
External data files¶
Data stored in external data files will be in the same binary bytes string format as is used by the raw_data
field in current ONNX implementations.
Reference https://github.com/onnx/onnx/pull/678