Append ONNX nodes to the converted model

This example shows how to append ONNX nodes to the converted model to produce the desired output. In this case, it keeps only the second column of the output probabilities.

To be completely accurate, most of the code was generated using an LLM and then modified to accommodate the latest changes.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnx

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
clr = LogisticRegression(max_iter=500)
clr.fit(X_train, y_train)

LogisticRegression(max_iter=500)


model_to_convert refers to the scikit-learn classifier to convert.

model_to_convert = clr  # model to convert
X_test = X_test[:1]  # data used to test or train, one row is enough

Set the output filename for the modified ONNX model

output_filename = "output_file.onnx"  # Replace with your desired output filename

Step 1: Convert the model to ONNX format, disabling the output of class labels.

Define the input type for the ONNX model. The input type is a float tensor with shape [None, X_test.shape[1]], where None indicates that the number of input samples is flexible and X_test.shape[1] is the number of features per sample. A “tensor” is essentially a multi-dimensional array, commonly used in machine learning to represent data. A “float tensor” specifically contains floating-point numbers, that is, numbers with a fractional part.

initial_type = [("float_input", FloatTensorType([None, X_test.shape[1]]))]

Convert the model to ONNX format.

- target_opset={"": 18, "ai.onnx.ml": 3} specifies the operator set versions to use: 18 for the default ONNX domain and 3 for the ai.onnx.ml domain.
- options={…} sets parameters for the conversion:
  - “zipmap”: False ensures that the output is a raw array of probabilities instead of a list of dictionaries.
  - “output_class_labels”: False keeps the class labels from being added as an extra output.

ONNX (Open Neural Network Exchange) is an open format for representing machine learning models. It allows interoperability between different machine learning frameworks, enabling the use of models across various platforms.

onx = convert_sklearn(
    model_to_convert,
    initial_types=initial_type,
    target_opset={"": 18, "ai.onnx.ml": 3},
    options={
        id(model_to_convert): {"zipmap": False, "output_class_labels": False}
    },  # Ensures the output is only probabilities, not labels
)

Step 2: Load the ONNX model for further modifications.

Load the ONNX model from its serialized string representation. An ONNX file is essentially a serialized representation of a machine learning model that can be shared and used across different systems.

onnx_model = onnx.load_model_from_string(onx.SerializeToString())

The converted model is assumed to expose the label as its first output and the probability tensor as its second. Extract the name of the output tensor holding the probabilities: if there are multiple outputs, select the second one; otherwise, select the first.

prob_output_name = (
    onnx_model.graph.output[1].name
    if len(onnx_model.graph.output) > 1
    else onnx_model.graph.output[0].name
)

Add a Gather node to extract only the probability of the positive class (index 1). First, create a tensor that specifies the index to gather (index 1), which represents the positive class.

indices = onnx.helper.make_tensor(
    "indices", onnx.TensorProto.INT64, (1,), [1]
)  # Index 1 to gather positive class

Create a “Gather” node in the ONNX graph to extract the probability of the positive class.

inputs: [prob_output_name, “indices”] specifies the inputs to this node (probability tensor and index tensor).
outputs: [“positive_class_prob”] specifies the name of the output of this node.
axis=1 indicates gathering along the columns of the probability tensor, where each column holds the probability of one class.

A “Gather” node is used to extract specific elements from a tensor. Here, it extracts the probability for the positive class.

gather_node = onnx.helper.make_node(
    "Gather",
    inputs=[prob_output_name, "indices"],
    outputs=["positive_class_prob"],
    axis=1,  # Gather along columns (axis 1)
)
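For intuition, Gather along an axis behaves like numpy.take along that axis. The sketch below uses made-up probabilities (not output of the model above) to show why a one-element index tensor keeps the result two-dimensional.

```python
import numpy as np

# Made-up probabilities for two samples and three classes.
probabilities = np.array(
    [[0.7, 0.2, 0.1],
     [0.1, 0.3, 0.6]],
    dtype=np.float32,
)

# A 1-D index tensor of shape (1,), like the "indices" initializer.
indices = np.array([1], dtype=np.int64)

# Gather(axis=1) with 1-D indices keeps the column axis: shape (2, 1).
selected = np.take(probabilities, indices, axis=1)

# A scalar index would drop the column axis instead: shape (2,).
scalar_selected = np.take(probabilities, np.int64(1), axis=1)

print(selected.shape, scalar_selected.shape)
```

This is why the new graph output is later declared with shape [None, 1] rather than [None].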

Add the Gather node to the ONNX graph

onnx_model.graph.node.append(gather_node)

input: "probabilities"
input: "indices"
output: "positive_class_prob"
op_type: "Gather"
attribute {
  name: "axis"
  i: 1
  type: INT
}

Add the tensor initializer for indices (needed for the Gather node). Initializers in ONNX define constant tensors that are used in the computation.

onnx_model.graph.initializer.append(indices)

dims: 1
data_type: 7
int64_data: 1
name: "indices"

Remove the existing outputs and add only the new output for the positive class probability. First, clear the existing output definitions so they can be replaced with the new output.

del onnx_model.graph.output[:]

Define the new output for the positive class probability. Create a new output tensor specification with the name “positive_class_prob”.

positive_class_output = onnx.helper.make_tensor_value_info(
    "positive_class_prob", onnx.TensorProto.FLOAT, [None, 1]
)
onnx_model.graph.output.append(positive_class_output)

name: "positive_class_prob"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
      }
      dim {
        dim_value: 1
      }
    }
  }
}

Step 3: Save the modified ONNX model.

Save the modified ONNX model to the specified output filename. The resulting ONNX file can then be loaded and used in different environments that support ONNX, such as inference servers or other machine learning frameworks.

onnx.save(onnx_model, output_filename)

The model can be printed as follows.

print(onnx.printer.to_text(onnx_model))

<
   ir_version: 8,
   opset_import: ["ai.onnx.ml" : 1, "" : 18, "" : 18],
   producer_name: "skl2onnx",
   producer_version: "1.20.0",
   domain: "ai.onnx",
   model_version: 0,
   doc_string: ""
>
ce422d5209a64a689ea77bed1467b796 (float[?,4] float_input) => (float[?,1] positive_class_prob)
   <int64[1] indices =  {1}>
{
   [LinearClassifier] label, probability_tensor = ai.onnx.ml.LinearClassifier <classlabels_ints: ints = [0, 1, 2], coefficients: floats = [-0.438737, 0.820108, -2.31364, -0.985893, 0.401269, -0.301364, -0.148288, -0.811027, 0.0374686, -0.518744, 2.46193, 1.79692], intercepts: floats = [9.60193, 2.34478, -11.9467], multi_class: int = 1, post_transform: string = "SOFTMAX"> (float_input)
   [Normalizer] probabilities = ai.onnx.ml.Normalizer <norm: string = "L1"> (probability_tensor)
   positive_class_prob = Gather <axis: int = 1> (probabilities, indices)
}
