save_safetensors

onnx_ir.save_safetensors(model, path, /, *, format=None, size_threshold_bytes=256, max_shard_size_bytes=None, callback=None)

Save an ONNX model to a file with external data in a safetensors file.

The model object is unmodified after this operation.

When sharding is enabled, multiple safetensors files are created with names like “model-00001-of-00003.safetensors”, along with an index file “model.safetensors.index.json” that maps tensors to their respective shard files. Shards are created only if the total size of the tensors exceeds the specified max_shard_size_bytes.
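The naming scheme described above can be sketched as follows. This is a minimal illustration rather than the library's implementation: `shard_filenames` and `index_filename` are hypothetical helpers, and the five-digit zero padding is inferred from the example name “model-00001-of-00003.safetensors”.

```python
def shard_filenames(stem: str, num_shards: int) -> list[str]:
    """Return shard names such as 'model-00001-of-00003.safetensors'.

    Hypothetical helper: the pattern is inferred from the example in the text.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [
        f"{stem}-{i:05d}-of-{num_shards:05d}.safetensors"
        for i in range(1, num_shards + 1)
    ]


def index_filename(stem: str) -> str:
    """Return the name of the index file that maps tensors to shard files."""
    return f"{stem}.safetensors.index.json"


print(shard_filenames("model", 3))
print(index_filename("model"))
```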

Note

Because the safetensors data format uses a key-value mapping to store tensors, all initializer names in the model (across subgraphs) must be unique. Externalizing tensor attributes in nodes to safetensors files is currently not supported. If you have tensors from Constant nodes that you want to externalize, consider converting them to initializers first with LiftConstantsToInitializersPass.
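Since duplicate names cause a ValueError, it can help to check for them before saving. The sketch below shows the uniqueness condition on a plain list of names. `find_duplicate_names` is a hypothetical helper, not part of onnx_ir, and collecting the initializer names from the model and its subgraphs is left to the caller.

```python
from collections import Counter
from collections.abc import Iterable


def find_duplicate_names(names: Iterable[str]) -> list[str]:
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Illustrative names only; a real list would come from the model's initializers.
names = ["encoder.weight", "encoder.bias", "decoder.weight", "encoder.weight"]
duplicates = find_duplicate_names(names)
if duplicates:
    print(f"Duplicate initializer names: {duplicates}")
```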

Example:

import onnx_ir as ir

model = ir.load("model.onnx")

# Save model with tensors of at least 100 bytes to safetensors external data,
# sharding files larger than 5 GB.
ir.save_safetensors(
    model,
    "model.onnx",
    size_threshold_bytes=100,
    max_shard_size_bytes=int(5 * 1000**3),  # Shard safetensors files larger than 5GB
)

Tip

A simple progress bar can be implemented by passing a callback function as follows:

import onnx_ir as ir
import tqdm

with tqdm.tqdm() as pbar:
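    def callback(tensor: ir.TensorProtocol, metadata: ir.external_data.CallbackInfo) -> None:
        # Set the total on the first call; tqdm leaves it as None until then.
        # Checking pbar.total avoids `nonlocal`, which is a SyntaxError at module level.
        if pbar.total is None:
            pbar.total = metadata.total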

        pbar.update()
        pbar.set_description(f"Saving {metadata.filename}: {tensor.name} ({tensor.dtype}, {tensor.shape})")

    ir.save_safetensors(
        ...,
        callback=callback,
    )

Added in version 0.1.15.

Parameters:
  • model (Model) – ONNX model to save.

  • path (str | PathLike) – Path to the ONNX model file. E.g. “model.onnx”.

  • format (str | None) – The format of the file (e.g. protobuf, textproto, json). If None, the format is inferred from the file extension.

  • size_threshold_bytes (int) – Tensors whose size in bytes is at least this threshold are saved to external data.

  • max_shard_size_bytes (int | None) – Maximum size in bytes (as int) of a safetensors file before it is sharded. If None, no sharding is performed.

  • callback (Callable[[TensorProtocol, CallbackInfo], None] | None) – A callback function that is called after each tensor is saved. The callback must have signature Callable[[ir.TensorProtocol, ir.external_data.CallbackInfo], None], where the first argument is the tensor being saved and the second contains metadata such as filename and progress.
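The two size parameters reduce to simple comparisons: a tensor goes to external data when its size is not smaller than size_threshold_bytes, and sharding happens only when the total tensor size exceeds max_shard_size_bytes. The sketch below spells out those boundary conditions. The helper names are hypothetical and this is not the library's code.

```python
from __future__ import annotations


def is_externalized(tensor_nbytes: int, size_threshold_bytes: int = 256) -> bool:
    """True if a tensor of this size would be written to external data."""
    # "Not smaller than" the threshold, so the comparison is inclusive.
    return tensor_nbytes >= size_threshold_bytes


def is_sharded(total_nbytes: int, max_shard_size_bytes: int | None = None) -> bool:
    """True if the safetensors output would be split into shards."""
    if max_shard_size_bytes is None:
        return False  # None disables sharding entirely.
    # Shards are created only when the total size exceeds the limit.
    return total_nbytes > max_shard_size_bytes


print(is_externalized(256), is_externalized(255))
print(is_sharded(6 * 1000**3, 5 * 1000**3), is_sharded(6 * 1000**3))
```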

Raises:

ValueError – If duplicate initializer names are found in the model.

Return type:

None