.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_tutorial/plot_woe_transformer.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_tutorial_plot_woe_transformer.py:

.. _example-woe-transformer:

Converter for WOE
=================

WOE stands for Weights of Evidence. The transformation checks
whether a feature X falls into a series of regions (intervals).
The result is the label of every interval that contains the feature.

.. index:: WOE, WOETransformer

A simple example
++++++++++++++++

X is a vector made of the first ten integers (0 to 9).
Class :class:`WOETransformer` checks whether each of them
belongs to one of two intervals, `]1, 3[` (open on both sides)
and `[5, 7]` (closed on both sides). The first interval is
associated with weight 55 and the second one with weight 107.

.. GENERATED FROM PYTHON SOURCE LINES 25-49

.. code-block:: Python

    import os
    import numpy as np
    import pandas as pd
    from onnx.tools.net_drawer import GetPydotGraph, GetOpNodeProducer
    from onnxruntime import InferenceSession
    import matplotlib.pyplot as plt
    from skl2onnx import to_onnx
    from skl2onnx.sklapi import WOETransformer

    # automatically registers the converter for WOETransformer
    import skl2onnx.sklapi.register  # noqa: F401

    X = np.arange(10).astype(np.float32).reshape((-1, 1))

    # one list of intervals per feature: ]1, 3[ and [5, 7]
    intervals = [[(1.0, 3.0, False, False), (5.0, 7.0, True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({"X": X.ravel(), "woe": prd.ravel()})
    df

.. csv-table::
    :header: "", "X", "woe"

    0, 0.0, 0.0
    1, 1.0, 0.0
    2, 2.0, 55.0
    3, 3.0, 0.0
    4, 4.0, 0.0
    5, 5.0, 107.0
    6, 6.0, 107.0
    7, 7.0, 107.0
    8, 8.0, 0.0
    9, 9.0, 0.0
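The interval test behind the table above can be sketched in plain Python. This is an illustration only, not the code of ``WOETransformer``: the helper names ``in_interval`` and ``woe_value`` are hypothetical, and the sketch assumes that each interval tuple reads ``(low, high, left_closed, right_closed)``, as the intervals of the example suggest, and that a value outside every interval maps to 0.

```python
def in_interval(x, interval):
    """Return True if x lies in (low, high, left_closed, right_closed)."""
    low, high, left_closed, right_closed = interval
    above = x >= low if left_closed else x > low
    below = x <= high if right_closed else x < high
    return above and below


def woe_value(x, intervals, weights):
    """Hypothetical helper: weight of the first interval containing x, else 0."""
    for interval, weight in zip(intervals, weights):
        if in_interval(x, interval):
            return float(weight)
    return 0.0


# Same intervals and weights as in the example, for a single feature.
intervals = [(1.0, 3.0, False, False), (5.0, 7.0, True, True)]
weights = [55, 107]

values = [woe_value(float(x), intervals, weights) for x in range(10)]
print(values)
```

The open bounds explain why 1 and 3 map to 0 while 5 and 7, which sit on closed bounds, map to 107.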

.. GENERATED FROM PYTHON SOURCE LINES 50-55

One Hot
+++++++

With ``onehot=False``, the transformer outputs a single column
holding the weights. With ``onehot=True``, it returns one column
per interval.

.. GENERATED FROM PYTHON SOURCE LINES 55-64

.. code-block:: Python

    woe2 = WOETransformer(intervals, onehot=True, weights=weights)
    woe2.fit(X)
    prd = woe2.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ["I1", "I2"]
    df["X"] = X
    df

.. csv-table::
    :header: "", "I1", "I2", "X"

    0, 0.0, 0.0, 0.0
    1, 0.0, 0.0, 1.0
    2, 55.0, 0.0, 2.0
    3, 0.0, 0.0, 3.0
    4, 0.0, 0.0, 4.0
    5, 0.0, 107.0, 5.0
    6, 0.0, 107.0, 6.0
    7, 0.0, 107.0, 7.0
    8, 0.0, 0.0, 8.0
    9, 0.0, 0.0, 9.0

.. GENERATED FROM PYTHON SOURCE LINES 65-67

In one-hot mode, the weights can be omitted.
The output is then binary.

.. GENERATED FROM PYTHON SOURCE LINES 67-76

.. code-block:: Python

    woe = WOETransformer(intervals, onehot=True)
    woe.fit(X)
    prd = woe.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ["I1", "I2"]
    df["X"] = X
    df

.. csv-table::
    :header: "", "I1", "I2", "X"

    0, 0.0, 0.0, 0.0
    1, 0.0, 0.0, 1.0
    2, 1.0, 0.0, 2.0
    3, 0.0, 0.0, 3.0
    4, 0.0, 0.0, 4.0
    5, 0.0, 1.0, 5.0
    6, 0.0, 1.0, 6.0
    7, 0.0, 1.0, 7.0
    8, 0.0, 0.0, 8.0
    9, 0.0, 0.0, 9.0
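The one-hot layout, with and without weights, can be mimicked with a few lines of plain Python. The sketch below is an assumption about the semantics shown in the two tables, not the implementation of ``WOETransformer``; ``woe_onehot`` is a hypothetical name, and omitted weights are assumed to behave like a weight of 1 for every interval.

```python
def in_interval(x, interval):
    """Return True if x lies in (low, high, left_closed, right_closed)."""
    low, high, left_closed, right_closed = interval
    above = x >= low if left_closed else x > low
    below = x <= high if right_closed else x < high
    return above and below


def woe_onehot(x, intervals, weights=None):
    """Hypothetical helper: one output per interval for a single value x."""
    if weights is None:
        # Assumption: without weights, every interval counts for 1 (binary output).
        weights = [1.0] * len(intervals)
    return [
        float(weight) if in_interval(x, interval) else 0.0
        for interval, weight in zip(intervals, weights)
    ]


intervals = [(1.0, 3.0, False, False), (5.0, 7.0, True, True)]

weighted_row = woe_onehot(2.0, intervals, weights=[55, 107])
binary_row = woe_onehot(6.0, intervals)
outside_row = woe_onehot(4.0, intervals)
print(weighted_row, binary_row, outside_row)
```

Each row has as many entries as there are intervals, and a value that falls outside every interval produces a row of zeros.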

.. GENERATED FROM PYTHON SOURCE LINES 77-83

Conversion to ONNX
++++++++++++++++++

*skl2onnx* implements a converter for all cases.

onehot=False

.. GENERATED FROM PYTHON SOURCE LINES 83-87

.. code-block:: Python

    onx1 = to_onnx(woe1, X)
    sess = InferenceSession(onx1.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.]
     [  0.]
     [ 55.]
     [  0.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [  0.]
     [  0.]]

.. GENERATED FROM PYTHON SOURCE LINES 88-89

onehot=True

.. GENERATED FROM PYTHON SOURCE LINES 89-94

.. code-block:: Python

    onx2 = to_onnx(woe2, X)
    sess = InferenceSession(onx2.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.   0.]
     [  0.   0.]
     [ 55.   0.]
     [  0.   0.]
     [  0.   0.]
     [  0. 107.]
     [  0. 107.]
     [  0. 107.]
     [  0.   0.]
     [  0.   0.]]

.. GENERATED FROM PYTHON SOURCE LINES 95-99

ONNX Graphs
+++++++++++

onehot=False

.. GENERATED FROM PYTHON SOURCE LINES 99-117

.. code-block:: Python

    pydot_graph = GetPydotGraph(
        onx1.graph,
        name=onx1.graph.name,
        rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"
        ),
    )
    pydot_graph.write_dot("woe1.dot")

    os.system("dot -O -Gdpi=300 -Tpng woe1.dot")

    image = plt.imread("woe1.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis("off")

.. image-sg:: /auto_tutorial/images/sphx_glr_plot_woe_transformer_001.png
   :alt: plot woe transformer
   :srcset: /auto_tutorial/images/sphx_glr_plot_woe_transformer_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (np.float64(-0.5), np.float64(2674.5), np.float64(3321.5), np.float64(-0.5))

.. GENERATED FROM PYTHON SOURCE LINES 118-119

onehot=True

.. GENERATED FROM PYTHON SOURCE LINES 119-137

.. code-block:: Python

    pydot_graph = GetPydotGraph(
        onx2.graph,
        name=onx2.graph.name,
        rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"
        ),
    )
    pydot_graph.write_dot("woe2.dot")

    os.system("dot -O -Gdpi=300 -Tpng woe2.dot")

    image = plt.imread("woe2.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis("off")

.. image-sg:: /auto_tutorial/images/sphx_glr_plot_woe_transformer_002.png
   :alt: plot woe transformer
   :srcset: /auto_tutorial/images/sphx_glr_plot_woe_transformer_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (np.float64(-0.5), np.float64(2743.5), np.float64(5696.5), np.float64(-0.5))

.. GENERATED FROM PYTHON SOURCE LINES 138-143

Half-line
+++++++++

An interval may have only one finite bound;
the other one can be infinite.

.. GENERATED FROM PYTHON SOURCE LINES 143-153

.. code-block:: Python

    intervals = [[(-np.inf, 3.0, True, True), (5.0, np.inf, True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({"X": X.ravel(), "woe": prd.ravel()})
    df

.. csv-table::
    :header: "", "X", "woe"

    0, 0.0, 55.0
    1, 1.0, 55.0
    2, 2.0, 55.0
    3, 3.0, 55.0
    4, 4.0, 0.0
    5, 5.0, 107.0
    6, 6.0, 107.0
    7, 7.0, 107.0
    8, 8.0, 107.0
    9, 9.0, 107.0
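Half-lines need no special handling when the bounds are compared directly, because any finite value is greater than ``-inf`` and lower than ``inf``. The plain-Python sketch below illustrates this with ``math.inf``; it is not the code of ``WOETransformer``, and the helper ``woe_value`` is a hypothetical name.

```python
import math


def woe_value(x, intervals, weights):
    """Hypothetical helper: weight of the first interval containing x, else 0."""
    for (low, high, left_closed, right_closed), weight in zip(intervals, weights):
        above = x >= low if left_closed else x > low
        below = x <= high if right_closed else x < high
        if above and below:
            return float(weight)
    return 0.0


# Same half-lines as in the example: ]-inf, 3] and [5, +inf[.
# Whether the infinite side is flagged closed or open does not matter
# for finite inputs.
half_lines = [(-math.inf, 3.0, True, True), (5.0, math.inf, True, True)]
weights = [55, 107]

half_line_values = [woe_value(float(x), half_lines, weights) for x in range(10)]
print(half_line_values)
```

Only 4 falls in the gap between the two half-lines, so it is the only value mapped to 0.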

.. GENERATED FROM PYTHON SOURCE LINES 154-155

The conversion to ONNX uses the same instruction.

.. GENERATED FROM PYTHON SOURCE LINES 155-159

.. code-block:: Python

    onxinf = to_onnx(woe1, X)
    sess = InferenceSession(onxinf.SerializeToString(), providers=["CPUExecutionProvider"])
    print(sess.run(None, {"X": X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 55.]
     [ 55.]
     [ 55.]
     [ 55.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [107.]
     [107.]]

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.590 seconds)

.. _sphx_glr_download_auto_tutorial_plot_woe_transformer.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_woe_transformer.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_woe_transformer.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_woe_transformer.zip `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
