# DequantizeLinear

## DequantizeLinear - 23

### Version

• domain: main

• since_version: 23

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 23.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point must have the same shape, which determines the quantization granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor with the same rank as the input for blocked quantization. See QuantizeLinear for details on quantization granularity.

x_zero_point and x must have the same type. x and y must have the same shape. When dequantizing int32, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 quantization, but the dequantization formula remains the same for consistency, and x_scale still determines the output type.
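As an illustration of the formula above, the following NumPy sketch dequantizes a uint8 tensor with a scalar scale and zero point. The helper name `dequantize_per_tensor` and the sample values are hypothetical, and widening to int32 before the subtraction is a choice made for this sketch, not something the specification text above prescribes.

```python
import numpy as np

def dequantize_per_tensor(x, x_scale, x_zero_point=None):
    """Sketch of y = (x - x_zero_point) * x_scale with scalar parameters."""
    x_scale = np.asarray(x_scale)
    # A missing zero point is treated as 0.
    zp = 0 if x_zero_point is None else np.asarray(x_zero_point).astype(np.int32)
    # Widen to int32 so the subtraction cannot wrap around for 8-bit inputs.
    diff = x.astype(np.int32) - zp
    # The result takes the dtype of x_scale.
    return diff.astype(x_scale.dtype) * x_scale

x = np.array([0, 3, 128, 255], dtype=np.uint8)
x_scale = np.float32(2.0)
x_zero_point = np.uint8(128)
y = dequantize_per_tensor(x, x_scale, x_zero_point)
```

Here `y` has the same shape as `x` and the dtype of `x_scale`.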

### Attributes

• axis - INT (default is '1'):

(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).

• block_size - INT (default is '0'):

(Optional) The size of the quantization block (the number of times each scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given x shape (D0, ..., Di, ..., Dn), x_scale shape (S0, ..., Si, ..., Sn) and axis=i, the accepted range is [ceil(Di/Si), ceil(Di/(Si-1))-1].
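The interaction of axis and block_size can be sketched in NumPy as follows. The helper names `block_size_range` and `dequantize_blocked` are hypothetical, the range helper assumes Si > 1, and the sketch maps each position along the axis to its block by integer division, which is one plausible reading of "number of times every scale is replicated".

```python
import numpy as np

def block_size_range(d_i, s_i):
    """Accepted [low, high] block sizes for x dim d_i and scale dim s_i (assumes s_i > 1)."""
    def ceil_div(a, b):
        return -(-a // b)
    return ceil_div(d_i, s_i), ceil_div(d_i, s_i - 1) - 1

def dequantize_blocked(x, x_scale, axis, block_size, x_zero_point=None):
    """Sketch of blocked dequantization along `axis`."""
    x_scale = np.asarray(x_scale)
    # Position j along `axis` uses scale entry j // block_size.
    idx = np.arange(x.shape[axis]) // block_size
    scale_full = np.take(x_scale, idx, axis=axis)
    diff = x.astype(np.int32)
    if x_zero_point is not None:
        zp = np.asarray(x_zero_point).astype(np.int32)
        diff = diff - np.take(zp, idx, axis=axis)
    return diff.astype(x_scale.dtype) * scale_full

x = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]], dtype=np.int8)
x_scale = np.array([[1.0, 2.0, 4.0], [0.5, 1.0, 2.0]], dtype=np.float32)
low, high = block_size_range(x.shape[1], x_scale.shape[1])
y = dequantize_blocked(x, x_scale, axis=1, block_size=low)
```

With Di = 5 and Si = 3, the only accepted block size under the stated range is 2, so the last scale along the axis covers a single element.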

### Inputs

Between 2 and 3 inputs.

• x (heterogeneous) - T1:

N-D quantized input tensor to be de-quantized.

• x_scale (heterogeneous) - T2:

Scale for input x. For per-tensor/layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.

• x_zero_point (optional, heterogeneous) - T1:

Zero point for input x. Shape must match x_scale. It’s optional. Zero point is 0 when it’s not specified.

### Outputs

• y (heterogeneous) - T2:

N-D full-precision output tensor. It has the same shape as input x.

### Type Constraints

• T1 in ( tensor(float4e2m1), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(int16), tensor(int32), tensor(int4), tensor(int8), tensor(uint16), tensor(uint4), tensor(uint8) ):

The type of the inputs ‘x_zero_point’ and ‘x’.

• T2 in ( tensor(bfloat16), tensor(float), tensor(float16) ):

‘x_scale’ determines the output type.

## DequantizeLinear - 21

### Version

• domain: main

• since_version: 21

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 21.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point must have the same shape, which determines the quantization granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or a tensor with the same rank as the input for blocked quantization. See QuantizeLinear for details on quantization granularity. x_zero_point and x must have the same type. x and y must have the same shape. When dequantizing int32, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8 quantization, but the dequantization formula remains the same for consistency, and x_scale still determines the output type.

### Attributes

• axis - INT (default is '1'):

(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).

• block_size - INT (default is '0'):

(Optional) The size of the quantization block (the number of times each scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given x shape (D0, ..., Di, ..., Dn), x_scale shape (S0, ..., Si, ..., Sn) and axis=i, the accepted range is [ceil(Di/Si), ceil(Di/(Si-1))-1].

### Inputs

Between 2 and 3 inputs.

• x (heterogeneous) - T1:

N-D quantized input tensor to be de-quantized.

• x_scale (heterogeneous) - T2:

Scale for input x. For per-tensor/layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.

• x_zero_point (optional, heterogeneous) - T1:

Zero point for input x. Shape must match x_scale. It’s optional. Zero point is 0 when it’s not specified.

### Outputs

• y (heterogeneous) - T2:

N-D full-precision output tensor. It has the same shape as input x.

### Type Constraints

• T1 in ( tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(int16), tensor(int32), tensor(int4), tensor(int8), tensor(uint16), tensor(uint4), tensor(uint8) ):

The type of the inputs ‘x_zero_point’ and ‘x’.

• T2 in ( tensor(bfloat16), tensor(float), tensor(float16) ):

‘x_scale’ determines the output type.

## DequantizeLinear - 19

### Version

• domain: main

• since_version: 19

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 19.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point must have the same shape, and can be either a scalar for per-tensor/per-layer quantization or a 1-D tensor for per-axis quantization. x_zero_point and x must have the same type. x and y must have the same shape. When dequantizing int32, there is no zero point (the zero point is assumed to be 0). A zero point is usually not used for float8e4m3fn, float8e4m3fnuz, float8e5m2, or float8e5m2fnuz quantization, but the dequantization formula remains the same for consistency, and ‘x_scale’ still determines the output type.

### Attributes

• axis - INT (default is '1'):

(Optional) The axis of the dequantizing dimension of the input tensor. Used only for per-axis quantization. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input). When the rank of the input is 1, per-tensor quantization is applied, rendering the axis unnecessary in this scenario.
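To make the role of axis concrete, here is a minimal NumPy sketch of per-axis dequantization. The helper name `dequantize_per_axis` and the sample values are hypothetical. The 1-D scale and zero point are reshaped so that they broadcast along the chosen axis, and a negative axis is normalised with a modulo, which is valid for values in [-r, r-1].

```python
import numpy as np

def dequantize_per_axis(x, x_scale, x_zero_point=None, axis=1):
    """Sketch of per-axis dequantization with 1-D scale and zero point."""
    x_scale = np.asarray(x_scale)
    axis = axis % x.ndim  # map a negative axis to its non-negative equivalent
    # Shape such as (1, C, 1, ...) so the 1-D parameters broadcast along `axis`.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    zp = 0
    if x_zero_point is not None:
        zp = np.asarray(x_zero_point).astype(np.int32).reshape(shape)
    diff = x.astype(np.int32) - zp
    return diff.astype(x_scale.dtype) * x_scale.reshape(shape)

x = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.uint8)
x_scale = np.array([1.0, 0.5, 2.0], dtype=np.float32)
x_zero_point = np.array([0, 1, 2], dtype=np.uint8)
y = dequantize_per_axis(x, x_scale, x_zero_point, axis=1)
y_neg = dequantize_per_axis(x, x_scale, x_zero_point, axis=-1)
```

For this 2-D input, axis=1 and axis=-1 select the same dimension, so both calls produce the same result.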

### Inputs

Between 2 and 3 inputs.

• x (heterogeneous) - T1:

N-D quantized input tensor to be de-quantized.

• x_scale (heterogeneous) - T2:

Scale for input ‘x’. It can be a scalar, which means a per-tensor/layer dequantization, or a 1-D tensor for per-axis dequantization.

• x_zero_point (optional, heterogeneous) - T1:

Zero point for input ‘x’. Shape must match x_scale. It’s optional. Zero point is 0 when it’s not specified.

### Outputs

• y (heterogeneous) - T2:

N-D full-precision output tensor. It has the same shape as input ‘x’.

### Type Constraints

• T1 in ( tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(int32), tensor(int8), tensor(uint8) ):

Constrain ‘x_zero_point’ and ‘x’ to 8-bit integer or float tensors, or 32-bit integer tensors.

• T2 in ( tensor(bfloat16), tensor(float), tensor(float16) ):

‘x_scale’ determines the output type.

## DequantizeLinear - 13

### Version

• domain: main

• since_version: 13

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 13.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point must have the same shape, and can be either a scalar for per-tensor/per-layer quantization or a 1-D tensor for per-axis quantization. x_zero_point and x must have the same type. x and y must have the same shape. When dequantizing int32, there is no zero point (the zero point is assumed to be 0).

### Attributes

• axis - INT (default is '1'):

(Optional) The axis of the dequantizing dimension of the input tensor. Ignored for per-tensor quantization. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).

### Inputs

Between 2 and 3 inputs.

• x (heterogeneous) - T:

N-D quantized input tensor to be de-quantized.

• x_scale (heterogeneous) - tensor(float):

Scale for input ‘x’. It can be a scalar, which means a per-tensor/layer dequantization, or a 1-D tensor for per-axis dequantization.

• x_zero_point (optional, heterogeneous) - T:

Zero point for input ‘x’. Shape must match x_scale. It’s optional. Zero point is 0 when it’s not specified.

### Outputs

• y (heterogeneous) - tensor(float):

N-D full-precision output tensor. It has the same shape as input ‘x’.

### Type Constraints

• T in ( tensor(int32), tensor(int8), tensor(uint8) ):

Constrain ‘x_zero_point’ and ‘x’ to 8-bit or 32-bit integer tensors.

## DequantizeLinear - 10

### Version

• domain: main

• since_version: 10

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 10.

### Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. ‘x_scale’ and ‘x_zero_point’ are both scalars. ‘x_zero_point’ and ‘x’ must have the same type. ‘x’ and ‘y’ must have the same shape. When dequantizing int32, there is no zero point (the zero point is assumed to be 0).
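For the int32 case described above, the formula reduces to y = x * x_scale because the zero point is taken to be 0. The NumPy lines below are a small sketch with made-up sample values. The cast to float32 reflects that this version's output type is tensor(float).

```python
import numpy as np

# int32 input with the zero point omitted, so it is treated as 0.
x = np.array([-1000, 0, 250000], dtype=np.int32)
x_scale = np.float32(0.5)  # scalar scale: per-tensor dequantization

# y = (x - 0) * x_scale, computed in float32.
y = x.astype(np.float32) * x_scale
```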

### Inputs

Between 2 and 3 inputs.

• x (heterogeneous) - T:

N-D quantized input tensor to be de-quantized.

• x_scale (heterogeneous) - tensor(float):

Scale for input ‘x’. It’s a scalar, which means a per-tensor/layer quantization.

• x_zero_point (optional, heterogeneous) - T:

Zero point for input ‘x’. It’s a scalar, which means a per-tensor/layer quantization. It’s optional. 0 is the default value when it’s not specified.

### Outputs

• y (heterogeneous) - tensor(float):

N-D full-precision output tensor. It has the same shape as input ‘x’.

### Type Constraints

• T in ( tensor(int32), tensor(int8), tensor(uint8) ):

Constrain ‘x_zero_point’ and ‘x’ to 8-bit or 32-bit integer tensors.