Benchmark a pipeline
====================

The following example checks every step of a pipeline, compares the
predictions of scikit-learn with those of the converted ONNX model,
and benchmarks both.

Create a pipeline
+++++++++++++++++

We reuse the pipeline implemented in the example
`Pipelining: chaining a PCA and a logistic regression
<https://scikit-learn.org/stable/auto_examples/compose/plot_digits_pipe.html>`_.
There is one change because the `ONNX-ML Imputer
<https://github.com/onnx/onnx/blob/master/docs/
Operators-ml.md#ai.onnx.ml.Imputer>`_
does not handle string types. Such a step cannot be part of the final ONNX
pipeline and must be removed. Look for the comment starting with ``---``
below.

.. code-block:: Python

    import skl2onnx
    import onnx
    import sklearn
    import numpy
    from skl2onnx.helpers import collect_intermediate_steps
    from timeit import timeit
    from skl2onnx.helpers import compare_objects
    import onnxruntime as rt
    from onnxconverter_common.data_types import FloatTensorType
    from skl2onnx import convert_sklearn
    import numpy as np
    import pandas as pd
    from sklearn import datasets
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    logistic = LogisticRegression()
    pca = PCA()
    pipe = Pipeline(steps=[("pca", pca), ("logistic", logistic)])

    digits = datasets.load_digits()
    X_digits = digits.data[:1000]
    y_digits = digits.target[:1000]

    pipe.fit(X_digits, y_digits)

.. code-block:: none

    Pipeline(steps=[('pca', PCA()), ('logistic', LogisticRegression())])

Conversion to ONNX
++++++++++++++++++

.. code-block:: Python

    initial_types = [("input", FloatTensorType((None, X_digits.shape[1])))]
    model_onnx = convert_sklearn(pipe, initial_types=initial_types, target_opset=12)

    sess = rt.InferenceSession(
        model_onnx.SerializeToString(), providers=["CPUExecutionProvider"]
    )

    print("skl predict_proba")
    print(pipe.predict_proba(X_digits[:2]))
    onx_pred = sess.run(None, {"input": X_digits[:2].astype(np.float32)})[1]
    df = pd.DataFrame(onx_pred)
    print("onnx predict_proba")
    print(df.values)

.. code-block:: none

    skl predict_proba
    [[9.99998530e-01 7.81608916e-19 4.87445989e-10 1.79842282e-08
      3.58700554e-10 1.18138025e-06 4.14411051e-08 1.48275027e-07
      2.50162860e-08 5.51240034e-08]
     [1.37889361e-14 9.99999324e-01 9.17867392e-11 8.30390364e-13
      2.57277805e-07 8.84035071e-12 5.11781429e-11 2.83346408e-11
      4.18965301e-07 1.32796353e-13]]
    onnx predict_proba
    [[9.99998569e-01 7.81611026e-19 4.87444585e-10 1.79842026e-08
      3.58700042e-10 1.18137689e-06 4.14409520e-08 1.48274751e-07
      2.50162131e-08 5.51239410e-08]
     [1.37888807e-14 9.99999344e-01 9.17865159e-11 8.30387679e-13
      2.57277748e-07 8.84032951e-12 5.11779785e-11 2.83345725e-11
      4.18964021e-07 1.32796280e-13]]
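
The small differences between the two outputs are consistent with the ONNX
graph computing in single precision: the input is declared as
``FloatTensorType`` and cast with ``astype(np.float32)``, whereas
scikit-learn presumably works in double precision on this data. As a rough
illustration using only the standard library (the helper ``to_float32`` is
hypothetical and not part of any package used above), rounding one of the
printed probabilities to float32 already shifts it by a comparable amount:

.. code-block:: Python

    import struct


    def to_float32(x):
        """Round a Python float (double precision) to the nearest float32."""
        return struct.unpack("f", struct.pack("f", x))[0]


    # First probability printed by scikit-learn above.
    value = 9.99998530e-01
    rounded = to_float32(value)

    print("double :", repr(value))
    print("float32:", repr(rounded))
    print("rounding error:", abs(value - rounded))
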

Comparing outputs
+++++++++++++++++

.. code-block:: Python

    compare_objects(pipe.predict_proba(X_digits[:2]), onx_pred)
    # No exception so they are the same.
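
``compare_objects`` only signals a mismatch by raising an exception. To see
how close the outputs actually are, the gap can be measured directly. The
sketch below does this for the first row of each output printed earlier,
using only the standard library. The ``1e-6`` tolerance is an arbitrary
choice for illustration and is not claimed to be the one
``compare_objects`` uses.

.. code-block:: Python

    import math

    # First row of each predict_proba output printed above.
    skl_row = [9.99998530e-01, 7.81608916e-19, 4.87445989e-10, 1.79842282e-08,
               3.58700554e-10, 1.18138025e-06, 4.14411051e-08, 1.48275027e-07,
               2.50162860e-08, 5.51240034e-08]
    onnx_row = [9.99998569e-01, 7.81611026e-19, 4.87444585e-10, 1.79842026e-08,
                3.58700042e-10, 1.18137689e-06, 4.14409520e-08, 1.48274751e-07,
                2.50162131e-08, 5.51239410e-08]

    # Largest element-wise gap between the two rows.
    max_abs_diff = max(abs(a - b) for a, b in zip(skl_row, onnx_row))
    print("max absolute difference:", max_abs_diff)

    # Illustrative tolerance check (assumed threshold, see the text above).
    all_close = all(
        math.isclose(a, b, abs_tol=1e-6) for a, b in zip(skl_row, onnx_row)
    )
    print("all close:", all_close)
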

Benchmarks
++++++++++

.. code-block:: Python

    print("scikit-learn")
    print(timeit("pipe.predict_proba(X_digits[:1])", number=10000, globals=globals()))

    print("onnxruntime")
    print(
        timeit(
            "sess.run(None, {'input': X_digits[:1].astype(np.float32)})[1]",
            number=10000,
            globals=globals(),
        )
    )

.. code-block:: none

    scikit-learn
    2.0426334000003408
    onnxruntime
    0.2637577000004967
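
Both timings are totals over ``number=10000`` calls. A short calculation on
the figures printed above turns them into per-call latencies and a speed-up
ratio. The numbers are copied from this particular run and will differ on
other machines.

.. code-block:: Python

    number = 10000
    skl_total = 2.0426334000003408  # seconds, scikit-learn
    ort_total = 0.2637577000004967  # seconds, onnxruntime

    # Convert total seconds into microseconds per call.
    skl_per_call_us = skl_total / number * 1e6
    ort_per_call_us = ort_total / number * 1e6
    speedup = skl_total / ort_total

    print(f"scikit-learn: {skl_per_call_us:.1f} us per call")
    print(f"onnxruntime:  {ort_per_call_us:.1f} us per call")
    print(f"speed-up:     {speedup:.1f}x")
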

Intermediate steps
++++++++++++++++++

Let's imagine the final output is wrong and we need to find out which
component of the pipeline is failing. The following method
modifies the scikit-learn pipeline to capture the intermediate outputs
and produces a smaller ONNX graph for every operator.

.. code-block:: Python

    steps = collect_intermediate_steps(pipe, "pipeline", initial_types)
    assert len(steps) == 2

    # Run the modified pipeline once so that every step records its outputs.
    pipe.predict_proba(X_digits[:2])
    for i, step in enumerate(steps):
        onnx_step = step["onnx_step"]
        sess = rt.InferenceSession(
            onnx_step.SerializeToString(), providers=["CPUExecutionProvider"]
        )
        onnx_outputs = sess.run(None, {"input": X_digits[:2].astype(np.float32)})
        skl_outputs = step["model"]._debug.outputs
        if "transform" in skl_outputs:
            compare_objects(skl_outputs["transform"], onnx_outputs[0])
            print("benchmark", step["model"].__class__)
            print("scikit-learn")
            print(
                timeit(
                    "step['model'].transform(X_digits[:1])",
                    number=10000,
                    globals=globals()
                )
            )
        else:
            compare_objects(skl_outputs["predict_proba"], onnx_outputs[1])
            print("benchmark", step["model"].__class__)
            print("scikit-learn")
            print(
                timeit(
                    "step['model'].predict_proba(X_digits[:1])",
                    number=10000,
                    globals=globals(),
                )
            )
        print("onnxruntime")
        print(
            timeit(
                "sess.run(None, {'input': X_digits[:1].astype(np.float32)})",
                number=10000,
                globals=globals(),
            )
        )

.. code-block:: none

    benchmark <class 'sklearn.decomposition._pca.PCA'>
    scikit-learn
    0.8991796999998769
    onnxruntime
    0.25503109999954177
    benchmark <class 'sklearn.linear_model._logistic.LogisticRegression'>
    scikit-learn
    1.1041783999990002
    onnxruntime
    0.1891211000001931
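
The idea of capturing intermediate outputs can be sketched without any ONNX
dependency. The code below is an illustration only and is not how
``collect_intermediate_steps`` is implemented: ``Doubler``,
``record_outputs`` and the ``recorded`` attribute are hypothetical names. It
wraps a method so that each call stores its result on the object, similar in
spirit to the ``_debug.outputs`` dictionary read in the loop above.

.. code-block:: Python

    class Doubler:
        """Hypothetical stand-in for a pipeline step with a transform method."""

        def transform(self, values):
            return [2 * v for v in values]


    def record_outputs(step, method_name):
        """Wrap step.<method_name> so each call stores its result in step.recorded."""
        original = getattr(step, method_name)
        step.recorded = {}

        def wrapper(*args, **kwargs):
            result = original(*args, **kwargs)
            # Keep the latest output under the name of the method that produced it.
            step.recorded[method_name] = result
            return result

        setattr(step, method_name, wrapper)
        return step


    step = record_outputs(Doubler(), "transform")
    out = step.transform([1, 2, 3])
    print(out)
    print(step.recorded)

Once the wrapped method has been called, ``step.recorded`` holds the output
that a converted graph for the same step could be compared against.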

**Versions used for this example**

.. code-block:: Python

    print("numpy:", numpy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("onnx: ", onnx.__version__)
    print("onnxruntime: ", rt.__version__)
    print("skl2onnx: ", skl2onnx.__version__)

.. code-block:: none

    numpy: 1.26.4
    scikit-learn: 1.6.dev0
    onnx: 1.17.0
    onnxruntime: 1.18.0+cu118
    skl2onnx: 1.17.0

**Total running time of the script:** (0 minutes 4.925 seconds)