(l-onnx-doc-DequantizeLinear)=

# DequantizeLinear

(l-onnx-op-dequantizelinear-23)=

## DequantizeLinear - 23

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `23`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 23**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`.

`x_scale` and `x_zero_point` must have the same shape, which determines the quantization's granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor of rank identical to the input for blocked quantization. See QuantizeLinear for details on quantization granularity.

`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 type quantization, but the dequantization formula remains the same for consistency, and `x_scale` still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
* **block_size - INT** (default is `'0'`): (Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `y_scale` of shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input `x`. For per-tensor/per-layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input `x`. Shape must match `x_scale`. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full-precision output tensor. It has the same shape as input `x`.

### Type Constraints

* **T1** in ( `tensor(float4e2m1)`, `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int16)`, `tensor(int32)`, `tensor(int4)`, `tensor(int8)`, `tensor(uint16)`, `tensor(uint4)`, `tensor(uint8)` ): The type of the inputs 'x_zero_point' and 'x'.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_21_23
```

(l-onnx-op-dequantizelinear-21)=

## DequantizeLinear - 21

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `21`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 21**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`.
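To make the blocked granularity described under `block_size` concrete, here is a rough NumPy sketch of the computation. It is illustrative only and not part of the specification: all values are hypothetical, and it assumes the blocked axis divides evenly into blocks.

```python
import numpy as np

# Blocked dequantization sketch: x has shape (4, 2), axis=0, block_size=2,
# so x_scale has shape (2, 2) and each scale covers a block of 2 rows.
# All values here are made up for illustration.
x = np.array([[2, 4], [6, 8], [1, 3], [5, 7]], dtype=np.int8)
x_scale = np.array([[0.5, 0.5], [0.25, 0.25]], dtype=np.float32)
x_zero_point = np.zeros((2, 2), dtype=np.int8)

axis, block_size = 0, 2
# Replicate every scale/zero point block_size times along `axis`,
# then apply y = (x - x_zero_point) * x_scale elementwise.
scale_full = np.repeat(x_scale, block_size, axis=axis)
zp_full = np.repeat(x_zero_point, block_size, axis=axis)
y = (x.astype(np.float32) - zp_full.astype(np.float32)) * scale_full
```

With these values the first two rows are scaled by 0.5 and the last two by 0.25, so `y` is `[[1, 2], [3, 4], [0.25, 0.75], [1.25, 1.75]]`, with the same shape as `x`.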

`x_scale` and `x_zero_point` must have the same shape, which determines the quantization's granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor of rank identical to the input for blocked quantization. See QuantizeLinear for details on quantization granularity.

`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 type quantization, but the dequantization formula remains the same for consistency, and `x_scale` still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
* **block_size - INT** (default is `'0'`): (Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `y_scale` of shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input `x`. For per-tensor/per-layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input `x`. Shape must match `x_scale`. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full-precision output tensor. It has the same shape as input `x`.

### Type Constraints

* **T1** in ( `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int16)`, `tensor(int32)`, `tensor(int4)`, `tensor(int8)`, `tensor(uint16)`, `tensor(uint4)`, `tensor(uint8)` ): The type of the inputs 'x_zero_point' and 'x'.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_19_23
text_diff_DequantizeLinear_19_21
```

(l-onnx-op-dequantizelinear-19)=

## DequantizeLinear - 19

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `19`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 19**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have the same shape, and can be either a scalar for per-tensor/per-layer quantization, or a 1-D tensor for per-axis quantization. `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used in the case of float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz quantization, but the dequantization formula remains the same for consistency, and 'x_scale' still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used only for per-axis quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
When the rank of the input is 1, per-tensor quantization is applied, rendering the axis unnecessary in this scenario.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input 'x'. It can be a scalar, which means per-tensor/per-layer dequantization, or a 1-D tensor for per-axis dequantization.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input 'x'. Shape must match x_scale. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T1** in ( `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit integer or float8 tensors, or a 32-bit integer tensor.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_13_23
text_diff_DequantizeLinear_13_21
text_diff_DequantizeLinear_13_19
```

(l-onnx-op-dequantizelinear-13)=

## DequantizeLinear - 13

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `13`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 13**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have the same shape, and can be either a scalar for per-tensor/per-layer quantization, or a 1-D tensor for per-axis quantization. `x_zero_point` and `x` must have the same type.
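As an illustration of per-axis granularity (not part of the specification; all values are hypothetical), the 1-D scale and zero point are broadcast along the chosen axis:

```python
import numpy as np

# Per-axis dequantization sketch: x has shape (2, 3) and axis=1, so
# x_scale and x_zero_point are 1-D tensors of length 3 (hypothetical values).
x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.int8)
x_scale = np.array([0.1, 0.2, 0.5], dtype=np.float32)
x_zero_point = np.array([0, 10, 20], dtype=np.int8)

axis = 1
# Reshape the 1-D scale/zero point so they broadcast along `axis`.
shape = [1] * x.ndim
shape[axis] = -1
y = (x.astype(np.float32)
     - x_zero_point.reshape(shape).astype(np.float32)) * x_scale.reshape(shape)
```

Each column `j` uses `x_scale[j]` and `x_zero_point[j]`, so `y` is `[[1, 2, 5], [4, 8, 20]]`.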
`x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0).

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Ignored for per-tensor quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **tensor(float)**: Scale for input 'x'. It can be a scalar, which means per-tensor/per-layer dequantization, or a 1-D tensor for per-axis dequantization.
- **x_zero_point** (optional, heterogeneous) - **T**: Zero point for input 'x'. Shape must match x_scale. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **tensor(float)**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T** in ( `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit/32-bit integer tensors.

```{toctree}
text_diff_DequantizeLinear_10_23
text_diff_DequantizeLinear_10_21
text_diff_DequantizeLinear_10_19
text_diff_DequantizeLinear_10_13
```

(l-onnx-op-dequantizelinear-10)=

## DequantizeLinear - 10

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `10`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 10**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. 'x_scale' and 'x_zero_point' are both scalars. 'x_zero_point' and 'x' must have the same type.
'x' and 'y' must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0).

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **tensor(float)**: Scale for input 'x'. It is a scalar, which means per-tensor/per-layer quantization.
- **x_zero_point** (optional, heterogeneous) - **T**: Zero point for input 'x'. It is a scalar, which means per-tensor/per-layer quantization. It is optional; 0 is the default value when it is not specified.

### Outputs

- **y** (heterogeneous) - **tensor(float)**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T** in ( `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit/32-bit integer tensors.
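For the `int32` case mentioned above, where the zero point is implicitly 0, dequantization reduces to a single multiply. A minimal NumPy sketch with made-up values:

```python
import numpy as np

# int32 dequantization: no zero point is given, so it is taken to be 0
# and y = x * x_scale. Values are hypothetical.
x = np.array([-100, 0, 250], dtype=np.int32)
x_scale = np.float32(0.01)
y = x.astype(np.float32) * x_scale
```

This yields approximately `[-1.0, 0.0, 2.5]` as `float32`, the type of `x_scale`.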