save_safetensors
- onnx_ir.save_safetensors(model, path, /, *, format=None, size_threshold_bytes=256, max_shard_size_bytes=None, callback=None)
Save an ONNX model to a file with external data in a safetensors file.
The model object is unmodified after this operation.
When sharding is enabled, multiple safetensors files will be created with names like “model-00001-of-00003.safetensors”, and an index file “model.safetensors.index.json” will be created to map tensors to their respective shard files. The shards will be created only if the total size of tensors exceeds the specified max_shard_size_bytes.
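The naming pattern quoted above can be illustrated with a small sketch. This is not the library's implementation: the helper names `shard_filenames` and `index_filename` are hypothetical, and the base name "model" is assumed from the example file names in the text.

```python
def shard_filenames(base_name: str, num_shards: int) -> list[str]:
    """Build shard names such as "model-00001-of-00003.safetensors".

    Hypothetical helper for illustration; it only mirrors the pattern
    quoted in the description above.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [
        f"{base_name}-{i:05d}-of-{num_shards:05d}.safetensors"
        for i in range(1, num_shards + 1)
    ]


def index_filename(base_name: str) -> str:
    """Build the index file name, e.g. "model.safetensors.index.json"."""
    return f"{base_name}.safetensors.index.json"


for name in shard_filenames("model", 3):
    print(name)
print(index_filename("model"))
```

Shard numbers start at 1 and are zero-padded to five digits, so the files sort correctly in a directory listing.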
Note
Because the safetensors data format uses a key-value mapping to store tensors, all initializer names in the model (across subgraphs) must be unique. Externalizing tensor attributes in nodes to safetensors files is currently not supported. If you have tensors from Constant nodes that you want to externalize, consider converting them to initializers first with LiftConstantsToInitializersPass.
Example:
    import onnx_ir as ir

    model = ir.load("model.onnx")
    # Save model with tensors of at least 100 bytes to safetensors external data,
    # sharding files larger than 5GB.
    ir.save_safetensors(
        model,
        "model.onnx",
        size_threshold_bytes=100,
        max_shard_size_bytes=int(5 * 1000**3),  # Shard safetensors files larger than 5GB
    )
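The uniqueness requirement in the note above can be checked before saving. The sketch below is a plain-Python helper, not part of onnx_ir: `find_duplicate_names` is a hypothetical name, and it assumes you have already collected the initializer names from the main graph and all subgraphs into a single iterable.

```python
from collections import Counter
from collections.abc import Iterable


def find_duplicate_names(names: Iterable[str]) -> list[str]:
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical initializer names gathered from a model and its subgraphs.
collected = ["weight", "bias", "weight", "scale"]
duplicates = find_duplicate_names(collected)
if duplicates:
    print(f"Duplicate initializer names: {duplicates}")
```

Running a check like this first lets you rename the conflicting initializers yourself before the save is attempted.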
Tip
A simple progress bar can be implemented by passing a callback function as follows:
    import onnx_ir as ir
    import tqdm

    with tqdm.tqdm() as pbar:

        def callback(tensor: ir.TensorProtocol, metadata: ir.external_data.CallbackInfo) -> None:
            # Set the total on the first call; tqdm leaves it as None until then.
            # This avoids a nonlocal flag, which is a SyntaxError at module level.
            if pbar.total is None:
                pbar.total = metadata.total
            pbar.update()
            pbar.set_description(f"Saving {metadata.filename}: {tensor.name} ({tensor.dtype}, {tensor.shape})")

        ir.save_safetensors(
            ...,
            callback=callback,
        )
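A callback does not have to drive a progress bar. As a minimal sketch, the callback below records which tensors were written to which file. It reads only `tensor.name` and `metadata.filename`, both of which appear in the tip above. `make_recording_callback` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the real tensor and `CallbackInfo` objects so the snippet runs without a model.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Callable


def make_recording_callback() -> tuple[Callable[[Any, Any], None], dict[str, list[str]]]:
    """Return a callback and the dict it fills: filename -> tensor names."""
    files: dict[str, list[str]] = defaultdict(list)

    def callback(tensor: Any, metadata: Any) -> None:
        # Group tensor names by the file they were saved to.
        files[str(metadata.filename)].append(tensor.name)

    return callback, files


callback, files = make_recording_callback()

# Stand-in objects (assumptions for the demo, not real onnx_ir types).
callback(SimpleNamespace(name="w1"), SimpleNamespace(filename="model-00001-of-00002.safetensors"))
callback(SimpleNamespace(name="w2"), SimpleNamespace(filename="model-00002-of-00002.safetensors"))

for filename, tensor_names in files.items():
    print(filename, tensor_names)
```

In real use, the returned `callback` would be passed as the `callback` argument of `ir.save_safetensors`, and `files` inspected after the call returns.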
Added in version 0.1.15.
- Parameters:
model (Model) – ONNX model to save.
path (str | PathLike) – Path to the ONNX model file. E.g. “model.onnx”.
format (str | None) – The format of the file (e.g. protobuf, textproto, json, etc.). If None, the format is inferred from the file extension.
size_threshold_bytes (int) – Save to external data if the tensor size in bytes is not smaller than this threshold.
max_shard_size_bytes (int | None) – Maximum size in bytes (as an int) of a safetensors file before it is sharded. If None, no sharding is performed.
callback (Callable[[TensorProtocol, CallbackInfo], None] | None) – A callback function that is called after each tensor is saved. The callback must have signature Callable[[ir.TensorProtocol, ir.external_data.CallbackInfo], None], where the first argument is the tensor being saved and the second contains metadata such as filename and progress.
- Raises:
ValueError – If duplicate initializer names are found in the model.
- Return type:
None
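The boundary behaviour of size_threshold_bytes ("not smaller than this threshold") can be made concrete with a small sketch. `is_externalized` is a hypothetical helper that only restates the documented rule; it is not how the library computes tensor sizes. The default of 256 is taken from the signature above.

```python
def is_externalized(tensor_nbytes: int, size_threshold_bytes: int = 256) -> bool:
    """Return True when a tensor of this size would go to external data.

    Restates the documented rule: the tensor is externalized when its size
    in bytes is not smaller than the threshold (size >= threshold).
    """
    return tensor_nbytes >= size_threshold_bytes


# A tensor exactly at the threshold is externalized; one byte less is not.
print(is_externalized(256))
print(is_externalized(255))
```

Under this rule, a threshold of 0 would externalize every tensor, including empty ones.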