.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_convert_zipmap.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_convert_zipmap.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_convert_zipmap.py:


.. _l-rf-example-zipmap:

Probabilities as a vector or as a ZipMap
========================================

A classifier usually returns a matrix of probabilities.
By default, *sklearn-onnx* converts that matrix
into a list of dictionaries where each probability is mapped
to its class id or name, which preserves the class names.
This conversion increases the prediction time and is not
always needed. Let's see how to disable it
on the Iris example.
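
To make the two representations concrete, the sketch below mimics in plain
Python what the final ZipMap step does conceptually: it pairs every row of a
probability matrix with the class labels. The helper ``zipmap_like`` and the
sample values are hypothetical and only serve as an illustration; this is not
the code run by *sklearn-onnx* or onnxruntime.

.. code-block:: Python

    import numpy


    def zipmap_like(probabilities, classes):
        """Turn an (n_samples, n_classes) matrix into a list of dictionaries."""
        # One dictionary per row: class label -> probability.
        return [
            {label: float(p) for label, p in zip(classes, row)}
            for row in probabilities
        ]


    # Made-up probabilities for two observations and three classes.
    probas = numpy.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]], dtype=numpy.float32)
    as_dicts = zipmap_like(probas, [0, 1, 2])
    print(as_dicts[0])

Every dictionary repeats the class labels, which is the extra work
the rest of this example removes.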

Train a model and convert it
++++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 22-45

.. code-block:: Python


    from timeit import repeat
    import numpy
    import sklearn
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    import onnxruntime as rt
    import onnx
    import skl2onnx
    from skl2onnx.common.data_types import FloatTensorType
    from skl2onnx import convert_sklearn
    from sklearn.linear_model import LogisticRegression

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    clr = LogisticRegression(max_iter=500)
    clr.fit(X_train, y_train)
    print(clr)

    initial_type = [("float_input", FloatTensorType([None, 4]))]
    onx = convert_sklearn(clr, initial_types=initial_type, target_opset=12)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    LogisticRegression(max_iter=500)


.. GENERATED FROM PYTHON SOURCE LINES 46-51

Output type
+++++++++++

Let's confirm with onnxruntime that the probabilities
are returned as a list of dictionaries.

.. GENERATED FROM PYTHON SOURCE LINES 51-58

.. code-block:: Python


    sess = rt.InferenceSession(onx.SerializeToString(), providers=["CPUExecutionProvider"])
    res = sess.run(None, {"float_input": X_test.astype(numpy.float32)})
    print(res[1][:2])
    print("probabilities type:", type(res[1]))
    print("type for the first observations:", type(res[1][0]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [{0: 0.001847358187660575, 1: 0.694525957107544, 2: 0.3036267161369324}, {0: 5.262259037408512e-06, 1: 0.027987273409962654, 2: 0.9720074534416199}]
    probabilities type: <class 'list'>
    type for the first observations: <class 'dict'>
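
If a dense array is needed later on, the list of dictionaries has to be
converted back. A minimal sketch, assuming every dictionary holds the same
keys; the helper ``dicts_to_matrix`` is hypothetical and the values are
rounded from the output above.

.. code-block:: Python

    import numpy


    def dicts_to_matrix(rows, classes):
        """Stack a list of {class: probability} dictionaries into a matrix."""
        # The order of the columns follows the order of `classes`.
        return numpy.array(
            [[row[c] for c in classes] for row in rows], dtype=numpy.float32
        )


    # Rounded values taken from the output above.
    rows = [
        {0: 0.0018, 1: 0.6945, 2: 0.3036},
        {0: 0.0000, 1: 0.0280, 2: 0.9720},
    ]
    matrix = dicts_to_matrix(rows, classes=[0, 1, 2])
    print(matrix.shape)
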


.. GENERATED FROM PYTHON SOURCE LINES 59-63

Without ZipMap
++++++++++++++

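Let's remove the ZipMap operator by setting the option
``zipmap`` to ``False``. The probabilities are then returned
as a single matrix.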

.. GENERATED FROM PYTHON SOURCE LINES 63-78

.. code-block:: Python


    initial_type = [("float_input", FloatTensorType([None, 4]))]
    options = {id(clr): {"zipmap": False}}
    onx2 = convert_sklearn(
        clr, initial_types=initial_type, options=options, target_opset=12
    )

    sess2 = rt.InferenceSession(
        onx2.SerializeToString(), providers=["CPUExecutionProvider"]
    )
    res2 = sess2.run(None, {"float_input": X_test.astype(numpy.float32)})
    print(res2[1][:2])
    print("probabilities type:", type(res2[1]))
    print("type for the first observations:", type(res2[1][0]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1.8473582e-03 6.9452596e-01 3.0362672e-01]
     [5.2622590e-06 2.7987273e-02 9.7200745e-01]]
    probabilities type: <class 'numpy.ndarray'>
    type for the first observations: <class 'numpy.ndarray'>
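
Without ZipMap the probabilities come back as a plain matrix, so the class
names are no longer attached to the values. The sketch below shows one way
to recover them, assuming the columns follow the order of the fitted
estimator's ``classes_`` attribute. The array ``classes`` is a stand-in for
``clr.classes_`` and the probabilities are copied from the output above.

.. code-block:: Python

    import numpy

    # Stand-in for clr.classes_ (the Iris targets are 0, 1, 2).
    classes = numpy.array([0, 1, 2])

    # First two rows of probabilities printed above.
    probas = numpy.array(
        [
            [1.8473582e-03, 6.9452596e-01, 3.0362672e-01],
            [5.2622590e-06, 2.7987273e-02, 9.7200745e-01],
        ],
        dtype=numpy.float32,
    )

    # Index of the most probable class for every row, mapped back to a label.
    best = probas.argmax(axis=1)
    labels = classes[best]
    print(labels)
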


.. GENERATED FROM PYTHON SOURCE LINES 79-85

One output per class
++++++++++++++++++++

The option ``zipmap="columns"`` removes the final ZipMap operator
and splits the probabilities into columns. The final model produces
one output for the label and one output per class.

.. GENERATED FROM PYTHON SOURCE LINES 85-103

.. code-block:: Python


    options = {id(clr): {"zipmap": "columns"}}
    onx3 = convert_sklearn(
        clr, initial_types=initial_type, options=options, target_opset=12
    )

    sess3 = rt.InferenceSession(
        onx3.SerializeToString(), providers=["CPUExecutionProvider"]
    )
    res3 = sess3.run(None, {"float_input": X_test.astype(numpy.float32)})
    for i, out in enumerate(sess3.get_outputs()):
        print(
            "output: '{}' shape={} values={}...".format(
                out.name, res3[i].shape, res3[i][:2]
            )
        )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    output: 'output_label' shape=(38,) values=[1 2]...
    output: 'i0' shape=(38,) values=[1.8473582e-03 5.2622590e-06]...
    output: 'i1' shape=(38,) values=[0.69452596 0.02798727]...
    output: 'i2' shape=(38,) values=[0.30362672 0.97200745]...
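
With one output per class, the full probability matrix can still be rebuilt
when needed. A minimal sketch using ``numpy.column_stack`` on the first two
values of each output printed above; the variable names are only
illustrative.

.. code-block:: Python

    import numpy

    # First two values of the outputs 'i0', 'i1' and 'i2' printed above.
    i0 = numpy.array([1.8473582e-03, 5.2622590e-06], dtype=numpy.float32)
    i1 = numpy.array([0.69452596, 0.02798727], dtype=numpy.float32)
    i2 = numpy.array([0.30362672, 0.97200745], dtype=numpy.float32)

    # Each 1-D output becomes one column of the (n_samples, n_classes) matrix.
    probas = numpy.column_stack([i0, i1, i2])
    print(probas.shape)
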


.. GENERATED FROM PYTHON SOURCE LINES 104-106

Let's compare prediction time
+++++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 106-125

.. code-block:: Python


    X32 = X_test.astype(numpy.float32)

    print("Time with ZipMap:")
    print(repeat(lambda: sess.run(None, {"float_input": X32}), number=100, repeat=10))

    print("Time without ZipMap:")
    print(repeat(lambda: sess2.run(None, {"float_input": X32}), number=100, repeat=10))

    print("Time without ZipMap but with columns:")
    print(repeat(lambda: sess3.run(None, {"float_input": X32}), number=100, repeat=10))

    # Prediction is noticeably faster without ZipMap
    # on this example.
    # The gain is even larger when the classes
    # are described with strings rather than integers,
    # because the final result (a list of dictionaries) may copy
    # the same information many times with onnxruntime.


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Time with ZipMap:
    [0.006723120997776277, 0.0033048049990611617, 0.002702183999645058, 0.00259514600111288, 0.002283028999954695, 0.002864047000912251, 0.001961175999895204, 0.0019503219991747756, 0.0020229100009601098, 0.0034202739989268593]
    Time without ZipMap:
    [0.00287627999932738, 0.002934403000836028, 0.002290534997882787, 0.0010132009992958046, 0.0009289869994972833, 0.0010291920007148292, 0.0008970989983936306, 0.0009155870029644575, 0.000863607998326188, 0.0012434320015017875]
    Time without ZipMap but with columns:
    [0.0062263910003821366, 0.005719844000850571, 0.005481091000547167, 0.005109049001475796, 0.002306667000084417, 0.0021617760030494537, 0.0019600490013544913, 0.001912022999022156, 0.0018453459997544996, 0.0017368619992339518]
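
The raw lists are hard to compare at a glance. A common way to summarise
``timeit.repeat`` results is to keep the minimum of each list, since higher
values mostly reflect interference from other processes. The sketch below
applies this to the timings printed above (copied verbatim; they vary from
run to run) and divides by ``number=100`` to get a time per call.

.. code-block:: Python

    # Timings copied from the output above (seconds for 100 calls each).
    with_zipmap = [
        0.006723120997776277, 0.0033048049990611617, 0.002702183999645058,
        0.00259514600111288, 0.002283028999954695, 0.002864047000912251,
        0.001961175999895204, 0.0019503219991747756, 0.0020229100009601098,
        0.0034202739989268593,
    ]
    without_zipmap = [
        0.00287627999932738, 0.002934403000836028, 0.002290534997882787,
        0.0010132009992958046, 0.0009289869994972833, 0.0010291920007148292,
        0.0008970989983936306, 0.0009155870029644575, 0.000863607998326188,
        0.0012434320015017875,
    ]
    with_columns = [
        0.0062263910003821366, 0.005719844000850571, 0.005481091000547167,
        0.005109049001475796, 0.002306667000084417, 0.0021617760030494537,
        0.0019600490013544913, 0.001912022999022156, 0.0018453459997544996,
        0.0017368619992339518,
    ]

    number = 100  # calls per measurement, as in the example above

    for name, times in [
        ("with ZipMap", with_zipmap),
        ("without ZipMap", without_zipmap),
        ("columns", with_columns),
    ]:
        # Best measurement divided by the number of calls it covers.
        best = min(times) / number
        print("{}: {:.1f} microseconds per call".format(name, best * 1e6))

    # Ratio of the best timings with and without ZipMap.
    speedup = min(with_zipmap) / min(without_zipmap)
    print("speed-up without ZipMap: {:.2f}x".format(speedup))
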


.. GENERATED FROM PYTHON SOURCE LINES 126-127

**Versions used for this example**

.. GENERATED FROM PYTHON SOURCE LINES 127-133

.. code-block:: Python


    print("numpy:", numpy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("onnx: ", onnx.__version__)
    print("onnxruntime: ", rt.__version__)
    print("skl2onnx: ", skl2onnx.__version__)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    numpy: 2.2.0
    scikit-learn: 1.6.0
    onnx:  1.18.0
    onnxruntime:  1.21.0+cu126
    skl2onnx:  1.18.0


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.199 seconds)


.. _sphx_glr_download_auto_examples_plot_convert_zipmap.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_convert_zipmap.ipynb <plot_convert_zipmap.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_convert_zipmap.py <plot_convert_zipmap.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_convert_zipmap.zip <plot_convert_zipmap.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_