(l-onnx-doc-DequantizeLinear)=

# DequantizeLinear

(l-onnx-op-dequantizelinear-23)=

## DequantizeLinear - 23

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `23`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 23**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`.

`x_scale` and `x_zero_point` must have the same shape, which determines the quantization's granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor of rank identical to the input for blocked quantization. See QuantizeLinear for details on quantization granularity.

`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 type quantization, but the dequantization formula remains the same for consistency, and `x_scale` still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
* **block_size - INT** (default is `'0'`): (Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `y_scale` of shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input `x`. For per-tensor/per-layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input `x`. Shape must match `x_scale`. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full-precision output tensor. It has the same shape as input `x`.

### Type Constraints

* **T1** in ( `tensor(float4e2m1)`, `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int16)`, `tensor(int32)`, `tensor(int4)`, `tensor(int8)`, `tensor(uint16)`, `tensor(uint4)`, `tensor(uint8)` ): The type of the inputs 'x_zero_point' and 'x'.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_21_23
```

(l-onnx-op-dequantizelinear-21)=

## DequantizeLinear - 21

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `21`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 21**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`.
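To make the blocked granularity described under `block_size` concrete, here is a rough NumPy sketch of the computation. It is illustrative only and not part of the specification: all values are hypothetical, and it assumes the blocked axis divides evenly into blocks.

```python
import numpy as np

# Blocked dequantization sketch: x has shape (4, 2), axis=0, block_size=2,
# so x_scale has shape (2, 2) and each scale covers a block of 2 rows.
# All values here are made up for illustration.
x = np.array([[2, 4], [6, 8], [1, 3], [5, 7]], dtype=np.int8)
x_scale = np.array([[0.5, 0.5], [0.25, 0.25]], dtype=np.float32)
x_zero_point = np.zeros((2, 2), dtype=np.int8)

axis, block_size = 0, 2
# Replicate every scale/zero point block_size times along `axis`,
# then apply y = (x - x_zero_point) * x_scale elementwise.
scale_full = np.repeat(x_scale, block_size, axis=axis)
zp_full = np.repeat(x_zero_point, block_size, axis=axis)
y = (x.astype(np.float32) - zp_full.astype(np.float32)) * scale_full
```

With these values the first two rows are scaled by 0.5 and the last two by 0.25, so `y` is `[[1, 2], [3, 4], [0.25, 0.75], [1.25, 1.75]]`, with the same shape as `x`.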

`x_scale` and `x_zero_point` must have the same shape, which determines the quantization's granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor of rank identical to the input for blocked quantization. See QuantizeLinear for details on quantization granularity.

`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 type quantization, but the dequantization formula remains the same for consistency, and `x_scale` still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
* **block_size - INT** (default is `'0'`): (Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `y_scale` of shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input `x`. For per-tensor/per-layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input `x`. Shape must match `x_scale`. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full-precision output tensor. It has the same shape as input `x`.

### Type Constraints

* **T1** in ( `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int16)`, `tensor(int32)`, `tensor(int4)`, `tensor(int8)`, `tensor(uint16)`, `tensor(uint4)`, `tensor(uint8)` ): The type of the inputs 'x_zero_point' and 'x'.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_19_23
text_diff_DequantizeLinear_19_21
```

(l-onnx-op-dequantizelinear-19)=

## DequantizeLinear - 19

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `19`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 19**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have the same shape, and can be either a scalar for per-tensor/per-layer quantization, or a 1-D tensor for per-axis quantization. `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used in the case of float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz quantization, but the dequantization formula remains the same for consistency, and 'x_scale' still determines the output type.

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Used only for per-axis quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
When the rank of the input is 1, per-tensor quantization is applied, rendering the axis unnecessary in this scenario.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T1**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **T2**: Scale for input 'x'. It can be a scalar, which means per-tensor/per-layer dequantization, or a 1-D tensor for per-axis dequantization.
- **x_zero_point** (optional, heterogeneous) - **T1**: Zero point for input 'x'. Shape must match x_scale. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **T2**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T1** in ( `tensor(float8e4m3fn)`, `tensor(float8e4m3fnuz)`, `tensor(float8e5m2)`, `tensor(float8e5m2fnuz)`, `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit integer or float8 tensors, or a 32-bit integer tensor.
* **T2** in ( `tensor(bfloat16)`, `tensor(float)`, `tensor(float16)` ): 'x_scale' determines the output type.

```{toctree}
text_diff_DequantizeLinear_13_23
text_diff_DequantizeLinear_13_21
text_diff_DequantizeLinear_13_19
```

(l-onnx-op-dequantizelinear-13)=

## DequantizeLinear - 13

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `13`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 13**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have the same shape, and can be either a scalar for per-tensor/per-layer quantization, or a 1-D tensor for per-axis quantization. `x_zero_point` and `x` must have the same type.
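As an illustration of per-axis granularity (not part of the specification; all values are hypothetical), the 1-D scale and zero point are broadcast along the chosen axis:

```python
import numpy as np

# Per-axis dequantization sketch: x has shape (2, 3) and axis=1, so
# x_scale and x_zero_point are 1-D tensors of length 3 (hypothetical values).
x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.int8)
x_scale = np.array([0.1, 0.2, 0.5], dtype=np.float32)
x_zero_point = np.array([0, 10, 20], dtype=np.int8)

axis = 1
# Reshape the 1-D scale/zero point so they broadcast along `axis`.
shape = [1] * x.ndim
shape[axis] = -1
y = (x.astype(np.float32)
     - x_zero_point.reshape(shape).astype(np.float32)) * x_scale.reshape(shape)
```

Each column `j` uses `x_scale[j]` and `x_zero_point[j]`, so `y` is `[[1, 2, 5], [4, 8, 20]]`.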
`x` and `y` must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0).

### Attributes

* **axis - INT** (default is `'1'`): (Optional) The axis of the dequantizing dimension of the input tensor. Ignored for per-tensor quantization. A negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **tensor(float)**: Scale for input 'x'. It can be a scalar, which means per-tensor/per-layer dequantization, or a 1-D tensor for per-axis dequantization.
- **x_zero_point** (optional, heterogeneous) - **T**: Zero point for input 'x'. Shape must match x_scale. It is optional; the zero point is 0 when it is not specified.

### Outputs

- **y** (heterogeneous) - **tensor(float)**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T** in ( `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit/32-bit integer tensors.

```{toctree}
text_diff_DequantizeLinear_10_23
text_diff_DequantizeLinear_10_21
text_diff_DequantizeLinear_10_19
text_diff_DequantizeLinear_10_13
```

(l-onnx-op-dequantizelinear-10)=

## DequantizeLinear - 10

### Version

- **name**: [DequantizeLinear (GitHub)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#DequantizeLinear)
- **domain**: `main`
- **since_version**: `10`
- **function**: `False`
- **support_level**: `SupportType.COMMON`
- **shape inference**: `True`

This version of the operator has been available **since version 10**.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. 'x_scale' and 'x_zero_point' are both scalars. 'x_zero_point' and 'x' must have the same type.
'x' and 'y' must have the same shape. When dequantizing `int32`, there is no zero point (the zero point is assumed to be 0).

### Inputs

Between 2 and 3 inputs.

- **x** (heterogeneous) - **T**: N-D quantized input tensor to be de-quantized.
- **x_scale** (heterogeneous) - **tensor(float)**: Scale for input 'x'. It is a scalar, which means per-tensor/per-layer quantization.
- **x_zero_point** (optional, heterogeneous) - **T**: Zero point for input 'x'. It is a scalar, which means per-tensor/per-layer quantization. It is optional; 0 is the default value when it is not specified.

### Outputs

- **y** (heterogeneous) - **tensor(float)**: N-D full precision output tensor. It has the same shape as input 'x'.

### Type Constraints

* **T** in ( `tensor(int32)`, `tensor(int8)`, `tensor(uint8)` ): Constrain 'x_zero_point' and 'x' to 8-bit/32-bit integer tensors.
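For the `int32` case mentioned above, where the zero point is implicitly 0, dequantization reduces to a single multiply. A minimal NumPy sketch with made-up values:

```python
import numpy as np

# int32 dequantization: no zero point is given, so it is taken to be 0
# and y = x * x_scale. Values are hypothetical.
x = np.array([-100, 0, 250], dtype=np.int32)
x_scale = np.float32(0.01)
y = x.astype(np.float32) * x_scale
```

This yields approximately `[-1.0, 0.0, 2.5]` as `float32`, the type of `x_scale`.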