DequantizeLinear - 19 vs 23
The next section compares an older version of the operator with a newer one after both definitions are converted into markdown text. Green means an addition to the newer version, red means a deletion. Anything else is unchanged.
DequantizeLinear19 → DequantizeLinear23
RENAMED
@@ -1 +1 @@
|
|
1
|
-
The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
|
1
|
+
The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
|
2
|
-
The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point
|
2
|
+
full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point
|
3
|
+
must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization,
|
3
|
-
|
4
|
+
a 1-D tensor for per-axis quantization, or have a rank identical to the input for blocked quantization.
|
5
|
+
See QuantizeLinear for details on quantization granularity.
|
6
|
+
|
4
|
-
x_zero_point and x must have same type. x and y must have same shape. In the case of dequantizing
|
7
|
+
x_zero_point and x must have the same type. x and y must have the same shape. In the case of dequantizing
|
5
|
-
there's no zero point (zero point is supposed to be 0).
|
8
|
+
int32, there's no zero point (zero point is supposed to be 0).
|
6
|
-
zero-point is usually not used in the case of
|
9
|
+
zero-point is usually not used when quantizing to float8 and 4-bit types, but the dequantization formula remains the same
|
7
|
-
|
10
|
+
for consistency. The output type is determined by the attribute output_dtype. If output_dtype is not supplied then the output type
|
11
|
+
is the same as x_scale. The output type also determines the precision of the multiplication operation.
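
The formula above can be illustrated with a minimal NumPy sketch for the per-tensor and per-axis cases. The function name `dequantize_linear` and its argument handling are assumptions made for illustration and are not the ONNX reference implementation; blocked quantization and the `output_dtype` attribute are deliberately left out here.

```python
import numpy as np

def dequantize_linear(x, x_scale, x_zero_point=None, axis=1):
    """Sketch of y = (x - x_zero_point) * x_scale (per-tensor or per-axis)."""
    x = np.asarray(x)
    x_scale = np.asarray(x_scale)
    if x_zero_point is None:
        # A missing zero point is treated as 0, with the same shape as x_scale.
        x_zero_point = np.zeros(x_scale.shape, dtype=x.dtype)
    x_zero_point = np.asarray(x_zero_point)

    if x_scale.ndim == 1:
        # Per-axis: reshape scale and zero point so they broadcast along `axis`.
        shape = [1] * x.ndim
        shape[axis] = x_scale.shape[0]
        x_scale = x_scale.reshape(shape)
        x_zero_point = x_zero_point.reshape(shape)

    # Widen before subtracting so that uint8/int8 values do not wrap around.
    diff = x.astype(np.int32) - x_zero_point.astype(np.int32)
    # Without output_dtype, the result takes the type of x_scale.
    return diff.astype(x_scale.dtype) * x_scale

# Per-tensor: scalar scale and zero point.
y_tensor = dequantize_linear(
    np.array([0, 3, 128, 255], dtype=np.uint8), np.float32(2.0), np.uint8(128)
)

# Per-axis along axis 0: one scale and zero point per row.
y_axis = dequantize_linear(
    np.array([[0, 1, 2], [3, 4, 5]], dtype=np.int8),
    np.array([1.0, 0.5], dtype=np.float32),
    np.array([0, 1], dtype=np.int8),
    axis=0,
)
```

The subtraction is done in `int32` only to keep the sketch safe for 8-bit inputs; the specification itself states only the formula.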
|
8
12
|
### Attributes
|
9
13
|
* **axis - INT** (default is '1'):
|
10
|
-
(Optional) The axis of the dequantizing dimension of the input tensor. Used
|
14
|
+
(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).
|
15
|
+
|
16
|
+
* **block_size - INT** (default is '0'):
|
17
|
+
|
18
|
+
(Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given x shape (D0, ..., Di, ..., Dn), x_scale shape (S0, ..., Si, ..., Sn) and axis=i, the accepted range is [ceil(Di/Si), ceil(Di/(Si-1))-1]
|
19
|
+
|
20
|
+
* **output_dtype - INT** (default is '0'):
|
21
|
+
|
22
|
+
(Optional) The output data type. If not supplied, the output data type is inferred from x_scale data type (T2)
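
To make the `block_size` attribute more concrete, the following is a hedged NumPy sketch of blocked dequantization. The helper names `block_size_range` and `dequantize_blocked` are hypothetical. The sketch assumes that "replicated" means each scale is repeated `block_size` times along `axis` and the result is trimmed to `Di`, and it treats the upper bound as undefined when `Si` is 1, because the stated formula would then divide by zero.

```python
import numpy as np

def block_size_range(dim, scale_dim):
    """Accepted [low, high] for block_size, given Di and Si along the axis."""
    low = -(-dim // scale_dim)  # ceil(Di / Si) in integer arithmetic
    if scale_dim == 1:
        # Si - 1 == 0: the upper-bound formula is undefined (assumption: no bound).
        return low, None
    high = -(-dim // (scale_dim - 1)) - 1  # ceil(Di / (Si - 1)) - 1
    return low, high

def dequantize_blocked(x, x_scale, block_size, x_zero_point=None, axis=1):
    """Sketch: expand per-block scales to the shape of x, then apply the formula."""
    x = np.asarray(x)
    x_scale = np.asarray(x_scale)
    if x_zero_point is None:
        x_zero_point = np.zeros(x_scale.shape, dtype=x.dtype)
    x_zero_point = np.asarray(x_zero_point)

    dim = x.shape[axis]
    low, high = block_size_range(dim, x_scale.shape[axis])
    if block_size < low or (high is not None and block_size > high):
        raise ValueError(f"block_size {block_size} outside accepted range")

    # Replicate every scale block_size times along the axis, then trim to Di.
    keep = np.arange(dim)
    scale_full = np.take(np.repeat(x_scale, block_size, axis=axis), keep, axis=axis)
    zp_full = np.take(np.repeat(x_zero_point, block_size, axis=axis), keep, axis=axis)

    diff = x.astype(np.int32) - zp_full.astype(np.int32)
    return diff.astype(x_scale.dtype) * scale_full

x = np.arange(12, dtype=np.int8).reshape(2, 6)
scale = np.array([[1.0, 2.0, 3.0], [0.5, 1.0, 2.0]], dtype=np.float32)
y_blocked = dequantize_blocked(x, scale, block_size=2, axis=1)
```

With x of shape (2, 6) and a scale of shape (2, 3), the only accepted block size along axis 1 is 2, so each scale covers two consecutive elements of a row.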
|
11
23
|
### Inputs
|
12
24
|
Between 2 and 3 inputs.
|
13
25
|
- **x** (heterogeneous) - **T1**:
|
14
26
|
N-D quantized input tensor to be de-quantized.
|
15
27
|
- **x_scale** (heterogeneous) - **T2**:
|
16
|
-
Scale for input
|
28
|
+
Scale for input x. For per-tensor/layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for one dimension in which blocking is performed.
|
17
29
|
- **x_zero_point** (optional, heterogeneous) - **T1**:
|
18
|
-
Zero point for input
|
30
|
+
Zero point for input x. Shape must match x_scale. It's optional. Zero point is 0 when it's not specified.
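
The shape rules for `x_scale` can be summarized with a small classifier. This is a sketch only: the function `quantization_granularity` is hypothetical, and it assumes that a positive `block_size` is what selects blocked quantization, with the scale shape checked afterwards.

```python
def quantization_granularity(x_shape, scale_shape, axis=1, block_size=0):
    """Classify the granularity implied by the x_scale shape (illustrative only)."""
    rank = len(x_shape)
    if block_size > 0:
        # Blocked: same rank as x, and equal dims everywhere except on `axis`.
        if len(scale_shape) != rank:
            raise ValueError("blocked quantization needs rank(x_scale) == rank(x)")
        blocked_axis = axis % rank  # normalize a negative axis
        for i, (d, s) in enumerate(zip(x_shape, scale_shape)):
            if i != blocked_axis and d != s:
                raise ValueError(f"dimension {i} of x_scale must match x")
        return "blocked"
    if len(scale_shape) == 0:
        return "per-tensor"
    if len(scale_shape) == 1:
        if scale_shape[0] != x_shape[axis]:
            raise ValueError("1-D x_scale must have the length of the quantization axis")
        return "per-axis"
    raise ValueError("unsupported x_scale shape for this sketch")

kinds = [
    quantization_granularity((2, 3), ()),
    quantization_granularity((2, 3), (3,), axis=1),
    quantization_granularity((2, 6), (2, 3), axis=-1, block_size=2),
]
```

Because `x_zero_point` must have the same shape as `x_scale`, the same check applies to it whenever it is supplied.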
|
19
31
|
### Outputs
|
20
|
-
- **y** (heterogeneous) - **
|
32
|
+
- **y** (heterogeneous) - **T3**:
|
21
|
-
N-D full precision output tensor. It has same shape as input
|
33
|
+
N-D full precision output tensor. It has the same shape as input x. The data type is specified by the output_dtype attribute or, in its absence, the type of x_scale.
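
The rule for the output type can be sketched as follows. The integer values 1 and 10 are the `TensorProto.DataType` codes for FLOAT and FLOAT16; the helper names are hypothetical, bfloat16 is omitted because NumPy has no native bfloat16 type, and the zero point is left out for brevity.

```python
import numpy as np

# TensorProto.DataType codes from the ONNX protobuf enum.
ONNX_FLOAT = 1
ONNX_FLOAT16 = 10

# Partial mapping for this sketch only.
_ONNX_TO_NUMPY = {ONNX_FLOAT: np.float32, ONNX_FLOAT16: np.float16}

def resolve_output_dtype(x_scale, output_dtype=0):
    """Return the output type: output_dtype if supplied, else the type of x_scale."""
    if output_dtype == 0:  # 0 is the attribute default, i.e. not supplied
        return np.asarray(x_scale).dtype
    return np.dtype(_ONNX_TO_NUMPY[output_dtype])

def dequantize_as(x, x_scale, output_dtype=0):
    """Multiply in the resolved output precision (zero point omitted)."""
    out = resolve_output_dtype(x_scale, output_dtype)
    return np.asarray(x).astype(out) * np.asarray(x_scale).astype(out)

x = np.array([2, 4], dtype=np.int8)
y_default = dequantize_as(x, np.float16(0.5))              # follows x_scale
y_float32 = dequantize_as(x, np.float16(0.5), ONNX_FLOAT)  # follows output_dtype
```

Casting both operands to the resolved type before multiplying reflects the statement that the output type also determines the precision of the multiplication.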
|
22
34
|
### Type Constraints
|
23
|
-
* **T1** in ( tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(int32), tensor(int8), tensor(uint8) ):
|
35
|
+
* **T1** in ( tensor(float4e2m1), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(int16), tensor(int32), tensor(int4), tensor(int8), tensor(uint16), tensor(uint4), tensor(uint8) ):
|
24
|
-
|
36
|
+
The type of the inputs 'x_zero_point' and 'x'.
|
25
37
|
* **T2** in ( tensor(bfloat16), tensor(float), tensor(float16) ):
|
38
|
+
* **T3** in ( tensor(bfloat16), tensor(float), tensor(float16) ):
|
39
|
+
|
26
|
-
|
40
|
+
The type of the output 'y'.
|