.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_pipeline_xgboost.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_pipeline_xgboost.py:


Convert a pipeline with an XGBoost model
========================================

.. index:: XGBoost

*sklearn-onnx* only converts *scikit-learn* models into *ONNX*,
but many libraries implement the *scikit-learn* API so that their models
can be included in a *scikit-learn* pipeline. This example considers
a pipeline including an *XGBoost* model. *sklearn-onnx* can convert
the whole pipeline as long as it knows the converter associated with
an *XGBClassifier*. Let's see how to do it.

Train an XGBoost classifier
+++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 20-77

.. code-block:: Python


    import os
    import numpy
    import matplotlib.pyplot as plt
    import onnx
    from onnx.tools.net_drawer import GetPydotGraph, GetOpNodeProducer
    import onnxruntime as rt
    import sklearn
    from sklearn.datasets import load_iris
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    import xgboost
    from xgboost import XGBClassifier
    import skl2onnx
    from skl2onnx.common.data_types import FloatTensorType
    from skl2onnx import convert_sklearn, update_registered_converter
    from skl2onnx.common.shape_calculator import (
        calculate_linear_classifier_output_shapes,
    )
    import onnxmltools
    from onnxmltools.convert.xgboost.operator_converters.XGBoost import (
        convert_xgboost,
    )
    import onnxmltools.convert.common.data_types

    data = load_iris()
    X = data.data[:, :2]
    y = data.target

    ind = numpy.arange(X.shape[0])
    numpy.random.shuffle(ind)
    X = X[ind, :].copy()
    y = y[ind].copy()

    pipe = Pipeline([("scaler", StandardScaler()), ("xgb", XGBClassifier(n_estimators=3))])
    pipe.fit(X, y)

    # The conversion fails, but this is expected.

    try:
        convert_sklearn(
            pipe,
            "pipeline_xgboost",
            [("input", FloatTensorType([None, 2]))],
            target_opset={"": 12, "ai.onnx.ml": 2},
        )
    except Exception as e:
        print(e)

    # The error message is expected to say that no converter
    # was found for XGBoost models. By default, *sklearn-onnx*
    # only handles models from *scikit-learn*, but it can
    # be extended to any model following the *scikit-learn*
    # API as long as the module knows a converter
    # for every model used in the pipeline. That's why
    # we need to register a converter.





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    'super' object has no attribute '__sklearn_tags__'




.. GENERATED FROM PYTHON SOURCE LINES 78-89

Register the converter for XGBClassifier
++++++++++++++++++++++++++++++++++++++++

The converter is implemented in *onnxmltools*:
`onnxmltools...XGBoost.py `_,
and so is the shape calculator:
`onnxmltools...Classifier.py `_.

.. GENERATED FROM PYTHON SOURCE LINES 91-92

The converter and the shape calculator were imported at the top of the script.

.. GENERATED FROM PYTHON SOURCE LINES 94-95

Let's register the new converter.

.. GENERATED FROM PYTHON SOURCE LINES 95-103

.. code-block:: Python

    update_registered_converter(
        XGBClassifier,
        "XGBoostXGBClassifier",
        calculate_linear_classifier_output_shapes,
        convert_xgboost,
        options={"nocl": [True, False], "zipmap": [True, False, "columns"]},
    )








.. GENERATED FROM PYTHON SOURCE LINES 104-106

Convert again
+++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 106-118

.. code-block:: Python


    model_onnx = convert_sklearn(
        pipe,
        "pipeline_xgboost",
        [("input", FloatTensorType([None, 2]))],
        target_opset={"": 12, "ai.onnx.ml": 2},
    )

    # And save.
    with open("pipeline_xgboost.onnx", "wb") as f:
        f.write(model_onnx.SerializeToString())



.. rst-class:: sphx-glr-script-out

.. code-block:: pytb

    Traceback (most recent call last):
      File "/home/xadupre/github/sklearn-onnx/docs/examples/plot_pipeline_xgboost.py", line 107, in
        model_onnx = convert_sklearn(
                     ^^^^^^^^^^^^^^^^
      File "/home/xadupre/github/sklearn-onnx/skl2onnx/convert.py", line 192, in convert_sklearn
        topology = parse_sklearn_model(
                   ^^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/github/sklearn-onnx/skl2onnx/_parse.py", line 847, in parse_sklearn_model
        outputs = parse_sklearn(
                  ^^^^^^^^^^^^^^
      File "/home/xadupre/github/sklearn-onnx/skl2onnx/_parse.py", line 757, in parse_sklearn
        res = _parse_sklearn(scope, model, inputs, custom_parsers=custom_parsers)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/github/sklearn-onnx/skl2onnx/_parse.py", line 688, in _parse_sklearn
        outputs = sklearn_parsers_map[tmodel](
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/github/sklearn-onnx/skl2onnx/_parse.py", line 295, in _parse_sklearn_pipeline
        ) and is_classifier(step[1]):
              ^^^^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/vv/this312/lib/python3.12/site-packages/sklearn/base.py", line 1237, in is_classifier
        return get_tags(estimator).estimator_type == "classifier"
               ^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/vv/this312/lib/python3.12/site-packages/sklearn/utils/_tags.py", line 405, in get_tags
        sklearn_tags_provider[klass] = klass.__sklearn_tags__(estimator)  # type: ignore[attr-defined]
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/xadupre/vv/this312/lib/python3.12/site-packages/sklearn/base.py", line 540, in __sklearn_tags__
        tags = super().__sklearn_tags__()
               ^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: 'super' object has no attribute '__sklearn_tags__'




.. GENERATED FROM PYTHON SOURCE LINES 119-123

Compare the predictions
+++++++++++++++++++++++

Predictions with XGBoost.

.. GENERATED FROM PYTHON SOURCE LINES 123-127

.. code-block:: Python


    print("predict", pipe.predict(X[:5]))
    print("predict_proba", pipe.predict_proba(X[:1]))


.. GENERATED FROM PYTHON SOURCE LINES 128-129

Predictions with onnxruntime.

.. GENERATED FROM PYTHON SOURCE LINES 129-135

.. code-block:: Python


    sess = rt.InferenceSession("pipeline_xgboost.onnx", providers=["CPUExecutionProvider"])
    pred_onx = sess.run(None, {"input": X[:5].astype(numpy.float32)})
    print("predict", pred_onx[0])
    print("predict_proba", pred_onx[1][:1])


.. GENERATED FROM PYTHON SOURCE LINES 136-138

Display the ONNX graph
++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 138-156

.. code-block:: Python


    pydot_graph = GetPydotGraph(
        model_onnx.graph,
        name=model_onnx.graph.name,
        rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"
        ),
    )
    pydot_graph.write_dot("pipeline.dot")

    os.system("dot -O -Gdpi=300 -Tpng pipeline.dot")

    image = plt.imread("pipeline.dot.png")
    fig, ax = plt.subplots(figsize=(40, 20))
    ax.imshow(image)
    ax.axis("off")


.. GENERATED FROM PYTHON SOURCE LINES 157-158

**Versions used for this example**

.. GENERATED FROM PYTHON SOURCE LINES 158-166

.. code-block:: Python


    print("numpy:", numpy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("onnx: ", onnx.__version__)
    print("onnxruntime: ", rt.__version__)
    print("skl2onnx: ", skl2onnx.__version__)
    print("onnxmltools: ", onnxmltools.__version__)
    print("xgboost: ", xgboost.__version__)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.161 seconds)


.. _sphx_glr_download_auto_examples_plot_pipeline_xgboost.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_pipeline_xgboost.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_pipeline_xgboost.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_pipeline_xgboost.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
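
The two sets of predictions printed in *Compare the predictions* can also
be compared numerically instead of by eye. The following is a minimal sketch
that is not part of the original example. It assumes the converted classifier
keeps the default ZipMap output, so that the second value returned by
*onnxruntime* is a list of dictionaries mapping each class label to a
probability. The helper names ``zipmap_to_rows`` and ``max_abs_diff`` and the
sample values are hypothetical and only illustrate the idea.

.. code-block:: Python

    def zipmap_to_rows(probabilities, classes):
        """Turn a list of {label: probability} dicts into rows ordered by `classes`."""
        return [[float(row[c]) for c in classes] for row in probabilities]


    def max_abs_diff(expected, got):
        """Largest absolute difference between two equally shaped lists of rows."""
        return max(
            abs(e - g)
            for row_e, row_g in zip(expected, got)
            for e, g in zip(row_e, row_g)
        )


    # Hypothetical values standing in for pipe.predict_proba(X[:2])
    # and for pred_onx[1][:2]; they are not outputs of this example.
    skl_proba = [[0.2, 0.5, 0.3], [0.7, 0.2, 0.1]]
    onx_proba = [{0: 0.2, 1: 0.5, 2: 0.3}, {0: 0.7, 1: 0.2, 2: 0.1}]

    rows = zipmap_to_rows(onx_proba, classes=[0, 1, 2])
    diff = max_abs_diff(skl_proba, rows)
    print("max abs diff:", diff)

    # The tolerance is an assumption; adjust it to the model at hand.
    assert diff < 1e-5

With the real objects, ``skl_proba`` would come from
``pipe.predict_proba(X[:5]).tolist()``, ``onx_proba`` from ``pred_onx[1]``,
and ``classes`` from ``list(pipe.classes_)``. The ONNX model receives
*float32* inputs while *scikit-learn* works in *float64*, so small
discrepancies are to be expected and an exact equality check would likely
be too strict.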