ai.onnx.ml - LabelEncoder
LabelEncoder - 4 (ai.onnx.ml)
Version
name: LabelEncoder (GitHub)
domain: ai.onnx.ml
since_version: 4
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 4 of domain ai.onnx.ml.
Summary
Maps each element in the input tensor to another value.
The mapping is determined by two parallel attributes, ‘keys_*’ and ‘values_*’. The i-th value in the specified ‘keys_*’ attribute is mapped to the i-th value in the specified ‘values_*’ attribute, so the two attributes must have the same length. It follows that the input’s element type must be identical to the element type of the specified ‘keys_*’ attribute, while the output’s element type is identical to that of the specified ‘values_*’ attribute. If an input element cannot be found in the specified ‘keys_*’ attribute, the ‘default_*’ attribute that matches the specified ‘values_*’ attribute is used as its output value. The type of the ‘default_*’ attribute must match the ‘values_*’ attribute chosen.
As an example, consider mapping a string tensor to an integer tensor. Assume ‘keys_strings’ is [“Amy”, “Sally”], ‘values_int64s’ is [5, 6], and ‘default_int64’ is -1. The input [“Dori”, “Amy”, “Amy”, “Sally”, “Sally”] is then mapped to [-1, 5, 5, 6, 6].
Since this operator is a one-to-one mapping, its input and output shapes are the same. Note that only one of the ‘keys_*’ attributes and only one of the ‘values_*’ attributes can be set.
Float keys with value NaN match any input NaN value regardless of bit value. If a key is repeated, the last key takes precedence.
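The lookup rules above can be sketched in plain Python. This is a minimal illustration, not the ONNX reference implementation: the function name `label_encode` and its parameters are hypothetical, and it operates on flat Python lists rather than tensors.

```python
import math


def label_encode(xs, keys, values, default):
    """Map each element of xs through the parallel keys/values lists.

    Hypothetical helper for illustration only; it mirrors the rules in the
    summary for a flat list of inputs.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")

    table = {}
    has_nan_key = False
    nan_value = None
    for k, v in zip(keys, values):
        if isinstance(k, float) and math.isnan(k):
            # Every NaN key is treated as the same key, whatever its bits.
            has_nan_key = True
            nan_value = v
        else:
            # Later assignments overwrite earlier ones: the last key wins.
            table[k] = v

    out = []
    for x in xs:
        if isinstance(x, float) and math.isnan(x):
            out.append(nan_value if has_nan_key else default)
        else:
            out.append(table.get(x, default))
    return out


encoded = label_encode(
    ["Dori", "Amy", "Amy", "Sally", "Sally"],
    keys=["Amy", "Sally"],
    values=[5, 6],
    default=-1,
)
print(encoded)  # [-1, 5, 5, 6, 6]
```

The output has one element per input element, which is why the operator preserves the input shape.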
Attributes
default_float - FLOAT (default is '-0.0'):
A float.
default_int64 - INT (default is '-1'):
An integer.
default_string - STRING (default is '_Unused'):
A string.
default_tensor - TENSOR :
A default tensor. {“_Unused”} if values_* has string type, {-1} if values_* has integral type, and {-0.f} if values_* has float type.
keys_floats - FLOATS :
A list of floats.
keys_int64s - INTS :
A list of ints.
keys_strings - STRINGS :
A list of strings.
keys_tensor - TENSOR :
Keys encoded as a 1D tensor. One and only one of the ‘keys_*’ attributes should be set.
values_floats - FLOATS :
A list of floats.
values_int64s - INTS :
A list of ints.
values_strings - STRINGS :
A list of strings.
values_tensor - TENSOR :
Values encoded as a 1D tensor. One and only one of the ‘values_*’ attributes should be set.
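As a rough sketch of the constraints listed above, the following check finds which ‘keys_*’ and ‘values_*’ attributes are set and rejects invalid combinations. The function `check_label_encoder_attrs` and the plain-dict representation of the attributes are assumptions made for illustration; ONNX's own checker is not shown here.

```python
KEY_ATTRS = ("keys_floats", "keys_int64s", "keys_strings", "keys_tensor")
VALUE_ATTRS = ("values_floats", "values_int64s", "values_strings", "values_tensor")


def check_label_encoder_attrs(attrs):
    """Return the (keys_name, values_name) pair that is set in attrs.

    attrs is a hypothetical dict mapping attribute names to lists of elements.
    """
    keys_set = [name for name in KEY_ATTRS if name in attrs]
    values_set = [name for name in VALUE_ATTRS if name in attrs]
    if len(keys_set) != 1:
        raise ValueError(f"exactly one keys_* attribute must be set, got {keys_set}")
    if len(values_set) != 1:
        raise ValueError(f"exactly one values_* attribute must be set, got {values_set}")

    keys_name, values_name = keys_set[0], values_set[0]
    if len(attrs[keys_name]) != len(attrs[values_name]):
        raise ValueError("keys_* and values_* must have the same length")
    return keys_name, values_name


chosen = check_label_encoder_attrs(
    {"keys_strings": ["a", "b", "c"], "values_int64s": [0, 1, 2], "default_int64": 42}
)
print(chosen)  # ('keys_strings', 'values_int64s')
```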
Inputs
X (heterogeneous) - T1:
Input data. It must have the same element type as the keys_* attribute set.
Outputs
Y (heterogeneous) - T2:
Output data. This tensor’s element type is based on the values_* attribute set.
Type Constraints
T1 in ( tensor(double), tensor(float), tensor(int16), tensor(int32), tensor(int64), tensor(string) ):
The input type is a tensor of any shape.
T2 in ( tensor(double), tensor(float), tensor(int16), tensor(int32), tensor(int64), tensor(string) ):
Output type is determined by the specified ‘values_*’ attribute.
Examples
_string_int_label_encoder
import numpy as np
import onnx
from onnx.backend.test.case.node import expect  # ONNX backend test helper

# Unknown keys ("d", "g") map to the explicit default, 42.
node = onnx.helper.make_node(
    "LabelEncoder",
    inputs=["X"],
    outputs=["Y"],
    domain="ai.onnx.ml",
    keys_strings=["a", "b", "c"],
    values_int64s=[0, 1, 2],
    default_int64=42,
)
x = np.array(["a", "b", "d", "c", "g"]).astype(object)
y = np.array([0, 1, 42, 2, 42]).astype(np.int64)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_ai_onnx_ml_label_encoder_string_int",
)

# With default_int64 omitted, unknown keys map to the attribute default, -1.
node = onnx.helper.make_node(
    "LabelEncoder",
    inputs=["X"],
    outputs=["Y"],
    domain="ai.onnx.ml",
    keys_strings=["a", "b", "c"],
    values_int64s=[0, 1, 2],
)
x = np.array(["a", "b", "d", "c", "g"]).astype(object)
y = np.array([0, 1, -1, 2, -1]).astype(np.int64)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_ai_onnx_ml_label_encoder_string_int_no_default",
)
_tensor_based_label_encoder
import numpy as np
import onnx
from onnx.helper import make_tensor
from onnx.backend.test.case.node import expect  # ONNX backend test helper

tensor_keys = make_tensor(
    "keys_tensor", onnx.TensorProto.STRING, (3,), ["a", "b", "c"]
)
repeated_string_keys = ["a", "b", "c"]
x = np.array(["a", "b", "d", "c", "g"]).astype(object)
y = np.array([0, 1, 42, 2, 42]).astype(np.int16)

# Keys, values, and default are all supplied as tensors.
node = onnx.helper.make_node(
    "LabelEncoder",
    inputs=["X"],
    outputs=["Y"],
    domain="ai.onnx.ml",
    keys_tensor=tensor_keys,
    values_tensor=make_tensor(
        "values_tensor", onnx.TensorProto.INT16, (3,), [0, 1, 2]
    ),
    default_tensor=make_tensor(
        "default_tensor", onnx.TensorProto.INT16, (1,), [42]
    ),
)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_ai_onnx_ml_label_encoder_tensor_mapping",
)

# Keys are supplied as a string list; values and default remain tensors.
node = onnx.helper.make_node(
    "LabelEncoder",
    inputs=["X"],
    outputs=["Y"],
    domain="ai.onnx.ml",
    keys_strings=repeated_string_keys,
    values_tensor=make_tensor(
        "values_tensor", onnx.TensorProto.INT16, (3,), [0, 1, 2]
    ),
    default_tensor=make_tensor(
        "default_tensor", onnx.TensorProto.INT16, (1,), [42]
    ),
)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_ai_onnx_ml_label_encoder_tensor_value_only_mapping",
)
LabelEncoder - 2 (ai.onnx.ml)
Version
name: LabelEncoder (GitHub)
domain: ai.onnx.ml
since_version: 2
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 2 of domain ai.onnx.ml.
Summary
Maps each element in the input tensor to another value.
The mapping is determined by two parallel attributes, ‘keys_*’ and ‘values_*’. The i-th value in the specified ‘keys_*’ attribute is mapped to the i-th value in the specified ‘values_*’ attribute. It follows that the input’s element type must be identical to the element type of the specified ‘keys_*’ attribute, while the output’s element type is identical to that of the specified ‘values_*’ attribute. If an input element cannot be found in the specified ‘keys_*’ attribute, the ‘default_*’ attribute that matches the specified ‘values_*’ attribute is used as its output value.
As an example, consider mapping a string tensor to an integer tensor. Assume ‘keys_strings’ is [“Amy”, “Sally”], ‘values_int64s’ is [5, 6], and ‘default_int64’ is -1. The input [“Dori”, “Amy”, “Amy”, “Sally”, “Sally”] is then mapped to [-1, 5, 5, 6, 6].
Since this operator is a one-to-one mapping, its input and output shapes are the same. Note that only one of the ‘keys_*’ attributes and only one of the ‘values_*’ attributes can be set.
For key look-up, bit-wise comparison is used, so even a float NaN can be mapped to a value in the ‘values_*’ attribute.
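To illustrate the bit-wise comparison mentioned above: two NaN values never compare equal with `==`, but a lookup keyed on the raw bit pattern can still match them. The sketch below uses Python's `struct` module; the helper names `float_bits` and `lookup_bitwise` are hypothetical, and it assumes keys are stored as 32-bit floats.

```python
import struct


def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def lookup_bitwise(xs, keys, values, default):
    """Look up each x by comparing bit patterns rather than float equality."""
    table = {float_bits(k): v for k, v in zip(keys, values)}
    return [table.get(float_bits(x), default) for x in xs]


nan = float("nan")
print(nan == nan)  # False: ordinary comparison can never match a NaN key

result = lookup_bitwise(
    [nan, 1.5, 2.0], keys=[1.5, nan], values=[10.0, 20.0], default=-0.0
)
print(result)  # [20.0, 10.0, -0.0]
```

Under this scheme a NaN input matches only a NaN key with the same bit pattern, which is the behaviour that version 4 relaxes.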
Attributes
default_float - FLOAT (default is '-0.0'):
A float.
default_int64 - INT (default is '-1'):
An integer.
default_string - STRING (default is '_Unused'):
A string.
keys_floats - FLOATS :
A list of floats.
keys_int64s - INTS :
A list of ints.
keys_strings - STRINGS :
A list of strings. One and only one of the ‘keys_*’ attributes should be set.
values_floats - FLOATS :
A list of floats.
values_int64s - INTS :
A list of ints.
values_strings - STRINGS :
A list of strings. One and only one of the ‘values_*’ attributes should be set.
Inputs
X (heterogeneous) - T1:
Input data. It can be either tensor or scalar.
Outputs
Y (heterogeneous) - T2:
Output data.
Type Constraints
T1 in ( tensor(float), tensor(int64), tensor(string) ):
The input type is a tensor of any shape.
T2 in ( tensor(float), tensor(int64), tensor(string) ):
Output type is determined by the specified ‘values_*’ attribute.
LabelEncoder - 1 (ai.onnx.ml)
Version
name: LabelEncoder (GitHub)
domain: ai.onnx.ml
since_version: 1
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 1 of domain ai.onnx.ml.
Summary
Converts strings to integers and vice versa.
If the string default value is set, it will convert integers to strings. If the int default value is set, it will convert strings to integers.
Each operator converts either integers to strings or strings to integers, depending on which default value attribute is provided. Only one default value attribute should be defined.
When converting from integers to strings, the string is fetched from the ‘classes_strings’ list by simple indexing.
When converting from strings to integers, the string is looked up in the list, and the index at which it is found is used as the converted value.
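The two directions described above can be sketched as follows. This is one illustrative reading of the summary, not the reference implementation: `encode_v1` and its parameters are hypothetical names, passing `None` stands in for "attribute not provided", and returning the first matching index for a duplicated label is an assumption the summary does not spell out.

```python
def encode_v1(xs, classes_strings, default_int64=None, default_string=None):
    """Sketch of LabelEncoder-1: the default that is set picks the direction."""
    if (default_int64 is None) == (default_string is None):
        raise ValueError("exactly one default_* attribute must be defined")

    if default_int64 is not None:
        # Strings -> integers: the output is the index of the string in the list.
        index = {}
        for i, label in enumerate(classes_strings):
            index.setdefault(label, i)  # assumption: first occurrence wins
        return [index.get(x, default_int64) for x in xs]

    # Integers -> strings: simple indexing, falling back to the default
    # when the integer is not a valid position in the list.
    return [
        classes_strings[x] if 0 <= x < len(classes_strings) else default_string
        for x in xs
    ]


classes = ["cat", "dog", "bird"]
to_ints = encode_v1(["dog", "fish"], classes, default_int64=-1)
to_strs = encode_v1([2, 7], classes, default_string="_Unused")
print(to_ints)  # [1, -1]
print(to_strs)  # ['bird', '_Unused']
```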
Attributes
classes_strings - STRINGS :
A list of labels.
default_int64 - INT (default is '-1'):
An integer to use when an input string value is not found in the map. One and only one of the ‘default_*’ attributes must be defined.
default_string - STRING (default is '_Unused'):
A string to use when an input integer value is not found in the map. One and only one of the ‘default_*’ attributes must be defined.
Inputs
X (heterogeneous) - T1:
Input data.
Outputs
Y (heterogeneous) - T2:
Output data. If strings are input, the output values are integers, and vice versa.
Type Constraints
T1 in ( tensor(int64), tensor(string) ):
The input type must be a tensor of integers or strings, of any shape.
T2 in ( tensor(int64), tensor(string) ):
The output type will be a tensor of strings or integers, and will have the same shape as the input.