.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_tutorial/plot_woe_transformer.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_tutorial_plot_woe_transformer.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_tutorial_plot_woe_transformer.py:


.. _example-woe-transformer:

Converter for WOE
=================

WOE stands for Weight of Evidence. The transformer checks whether
a feature X falls into each of a series of regions (intervals).
The result is the label of every interval containing the feature.

.. index:: WOE, WOETransformer

A simple example
++++++++++++++++

X is a vector made of the first ten non-negative integers (0 to 9).
Class :class:`WOETransformer <skl2onnx.sklapi.WOETransformer>`
checks whether each of them belongs to one of two intervals,
`]1, 3[` (open on both sides) and `[5, 7]` (closed on both sides).
The first interval is associated with weight 55 and the second one
with weight 107.

.. GENERATED FROM PYTHON SOURCE LINES 25-48

.. code-block:: default

    import os
    import numpy as np
    import pandas as pd
    from onnx.tools.net_drawer import GetPydotGraph, GetOpNodeProducer
    from onnxruntime import InferenceSession
    import matplotlib.pyplot as plt
    from skl2onnx import to_onnx
    from skl2onnx.sklapi import WOETransformer

    # automatically registers the converter for WOETransformer
    import skl2onnx.sklapi.register  # noqa

    # one column holding the values 0.0 to 9.0
    X = np.arange(10).astype(np.float32).reshape((-1, 1))

    # each tuple reads (lower, upper, lower_closed, upper_closed)
    intervals = [[(1.0, 3.0, False, False), (5.0, 7.0, True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({"X": X.ravel(), "woe": prd.ravel()})
    df

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         X    woe
    0  0.0    0.0
    1  1.0    0.0
    2  2.0   55.0
    3  3.0    0.0
    4  4.0    0.0
    5  5.0  107.0
    6  6.0  107.0
    7  7.0  107.0
    8  8.0    0.0
    9  9.0    0.0
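
The interval convention can be mimicked in a few lines of plain Python.
The sketch below is not the implementation used by ``WOETransformer``. It
only assumes, from the example above, that each tuple reads
``(lower, upper, lower_closed, upper_closed)`` and that a value outside every
interval gets 0. The helper names ``in_interval`` and ``woe_column`` are
hypothetical.

```python
def in_interval(x, interval):
    """Return True if x lies in (low, high, low_closed, high_closed)."""
    low, high, low_closed, high_closed = interval
    above = x >= low if low_closed else x > low
    below = x <= high if high_closed else x < high
    return above and below


def woe_column(values, intervals, weights):
    """Single-column output: weight of the first matching interval, else 0."""
    out = []
    for x in values:
        w = 0.0
        for interval, weight in zip(intervals, weights):
            if in_interval(x, interval):
                w = float(weight)
                break
        out.append(w)
    return out


# Assumed layout: ]1, 3[ with weight 55 and [5, 7] with weight 107.
intervals = [(1.0, 3.0, False, False), (5.0, 7.0, True, True)]
weights = [55, 107]
result = woe_column([float(i) for i in range(10)], intervals, weights)
print(result)
```

The bounds 1 and 3 are excluded because the first interval is open, while 5
and 7 are included because the second one is closed.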


.. GENERATED FROM PYTHON SOURCE LINES 49-54

One Hot
+++++++

With ``onehot=False``, the transformer outputs a single column holding
the weights. With ``onehot=True``, it returns one column per interval.

.. GENERATED FROM PYTHON SOURCE LINES 54-63

.. code-block:: default

    woe2 = WOETransformer(intervals, onehot=True, weights=weights)
    woe2.fit(X)
    prd = woe2.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ["I1", "I2"]
    df["X"] = X
    df

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         I1     I2    X
    0   0.0    0.0  0.0
    1   0.0    0.0  1.0
    2  55.0    0.0  2.0
    3   0.0    0.0  3.0
    4   0.0    0.0  4.0
    5   0.0  107.0  5.0
    6   0.0  107.0  6.0
    7   0.0  107.0  7.0
    8   0.0    0.0  8.0
    9   0.0    0.0  9.0
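
One way to picture the one-hot layout, again as a plain-Python sketch rather
than the library's code: each interval gets its own column, and a row holds
the interval's weight where the value matches and 0 elsewhere.
``onehot_rows`` is a hypothetical helper, and the tuple layout
``(lower, upper, lower_closed, upper_closed)`` is assumed from the example.

```python
def onehot_rows(values, intervals, weights):
    """One column per interval: the weight if the value is inside, else 0."""
    rows = []
    for x in values:
        row = []
        for (low, high, low_closed, high_closed), weight in zip(intervals, weights):
            above = x >= low if low_closed else x > low
            below = x <= high if high_closed else x < high
            row.append(float(weight) if above and below else 0.0)
        rows.append(row)
    return rows


# Assumed layout: ]1, 3[ with weight 55 and [5, 7] with weight 107.
intervals = [(1.0, 3.0, False, False), (5.0, 7.0, True, True)]
rows = onehot_rows([float(i) for i in range(10)], intervals, [55, 107])
for x, row in enumerate(rows):
    print(x, row)
```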


.. GENERATED FROM PYTHON SOURCE LINES 64-66

With ``onehot=True``, weights can be omitted. Each column is then a binary
indicator: 1 if the value falls in the interval, 0 otherwise.

.. GENERATED FROM PYTHON SOURCE LINES 66-75

.. code-block:: default

    woe = WOETransformer(intervals, onehot=True)
    woe.fit(X)
    prd = woe.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ["I1", "I2"]
    df["X"] = X
    df

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

        I1   I2    X
    0  0.0  0.0  0.0
    1  0.0  0.0  1.0
    2  1.0  0.0  2.0
    3  0.0  0.0  3.0
    4  0.0  0.0  4.0
    5  0.0  1.0  5.0
    6  0.0  1.0  6.0
    7  0.0  1.0  7.0
    8  0.0  0.0  8.0
    9  0.0  0.0  9.0
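
The three tables are linked by simple arithmetic, which the following check
illustrates with values copied from the outputs above. It assumes
non-overlapping intervals, as in this example, and says nothing about how
the transformer is implemented.

```python
# Binary indicator columns (I1, I2) copied from the table above, rows X = 0..9.
binary = [
    (0, 0), (0, 0), (1, 0), (0, 0), (0, 0),
    (0, 1), (0, 1), (0, 1), (0, 0), (0, 0),
]
weights = (55, 107)

# Scaling each indicator by its weight gives the weighted one-hot columns.
weighted = [tuple(b * w for b, w in zip(row, weights)) for row in binary]

# Summing each row gives the single-column output obtained with onehot=False.
single = [sum(row) for row in weighted]

print(weighted[2], weighted[5])
print(single)
```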


.. GENERATED FROM PYTHON SOURCE LINES 76-82

Conversion to ONNX
++++++++++++++++++

*skl2onnx* implements a converter for both cases.

onehot=False

.. GENERATED FROM PYTHON SOURCE LINES 82-86

.. code-block:: default

    # X is passed so that the input type can be inferred from it
    onx1 = to_onnx(woe1, X)
    sess = InferenceSession(onx1.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.]
     [  0.]
     [ 55.]
     [  0.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [  0.]
     [  0.]]

.. GENERATED FROM PYTHON SOURCE LINES 87-88

onehot=True

.. GENERATED FROM PYTHON SOURCE LINES 88-93

.. code-block:: default

    onx2 = to_onnx(woe2, X)
    sess = InferenceSession(onx2.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.   0.]
     [  0.   0.]
     [ 55.   0.]
     [  0.   0.]
     [  0.   0.]
     [  0. 107.]
     [  0. 107.]
     [  0. 107.]
     [  0.   0.]
     [  0.   0.]]

.. GENERATED FROM PYTHON SOURCE LINES 94-98

ONNX Graphs
+++++++++++

onehot=False

.. GENERATED FROM PYTHON SOURCE LINES 98-116

.. code-block:: default

    pydot_graph = GetPydotGraph(
        onx1.graph,
        name=onx1.graph.name,
        rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"
        ),
    )
    pydot_graph.write_dot("woe1.dot")

    # requires the Graphviz "dot" executable to be available
    os.system("dot -O -Gdpi=300 -Tpng woe1.dot")

    image = plt.imread("woe1.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis("off")

.. image-sg:: /auto_tutorial/images/sphx_glr_plot_woe_transformer_001.png
   :alt: plot woe transformer
   :srcset: /auto_tutorial/images/sphx_glr_plot_woe_transformer_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (-0.5, 2674.5, 3321.5, -0.5)

.. GENERATED FROM PYTHON SOURCE LINES 117-118

onehot=True

.. GENERATED FROM PYTHON SOURCE LINES 118-136

.. code-block:: default

    pydot_graph = GetPydotGraph(
        onx2.graph,
        name=onx2.graph.name,
        rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"
        ),
    )
    pydot_graph.write_dot("woe2.dot")

    # requires the Graphviz "dot" executable to be available
    os.system("dot -O -Gdpi=300 -Tpng woe2.dot")

    image = plt.imread("woe2.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis("off")

.. image-sg:: /auto_tutorial/images/sphx_glr_plot_woe_transformer_002.png
   :alt: plot woe transformer
   :srcset: /auto_tutorial/images/sphx_glr_plot_woe_transformer_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (-0.5, 2743.5, 5696.5, -0.5)

.. GENERATED FROM PYTHON SOURCE LINES 137-142

Half-line
+++++++++

An interval may have only one finite bound; the other one can be infinite.

.. GENERATED FROM PYTHON SOURCE LINES 142-152

.. code-block:: default

    # ]-inf, 3] with weight 55 and [5, +inf[ with weight 107
    intervals = [[(-np.inf, 3.0, True, True), (5.0, np.inf, True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({"X": X.ravel(), "woe": prd.ravel()})
    df

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         X    woe
    0  0.0   55.0
    1  1.0   55.0
    2  2.0   55.0
    3  3.0   55.0
    4  4.0    0.0
    5  5.0  107.0
    6  6.0  107.0
    7  7.0  107.0
    8  8.0  107.0
    9  9.0  107.0
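
Infinite bounds need no special handling in a comparison-based membership
test, since every finite float compares greater than ``-inf`` and less than
``inf``. The sketch below uses plain Python with ``math.inf`` and a
hypothetical helper ``woe_value``. It reproduces the table above but is not
the library's implementation.

```python
import math


def woe_value(x, intervals, weights):
    """Weight of the first interval containing x, or 0.0 if none does."""
    for (low, high, low_closed, high_closed), weight in zip(intervals, weights):
        above = x >= low if low_closed else x > low
        below = x <= high if high_closed else x < high
        if above and below:
            return float(weight)
    return 0.0


# Assumed layout: (lower, upper, lower_closed, upper_closed).
intervals = [(-math.inf, 3.0, True, True), (5.0, math.inf, True, True)]
weights = [55, 107]
half_line = [woe_value(float(i), intervals, weights) for i in range(10)]
print(half_line)
```

Only the value 4 falls between the two half-lines, so it is the only one
mapped to 0.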


.. GENERATED FROM PYTHON SOURCE LINES 153-154

The conversion to ONNX uses the same instruction.

.. GENERATED FROM PYTHON SOURCE LINES 154-158

.. code-block:: default

    onxinf = to_onnx(woe1, X)
    sess = InferenceSession(onxinf.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 55.]
     [ 55.]
     [ 55.]
     [ 55.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [107.]
     [107.]]

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.955 seconds)

.. _sphx_glr_download_auto_tutorial_plot_woe_transformer.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_woe_transformer.py <plot_woe_transformer.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_woe_transformer.ipynb <plot_woe_transformer.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_