# MatMulInteger

## MatMulInteger - 10

### Version

• domain: main

• since_version: 10

• function: False

• support_level: SupportType.COMMON

• shape inference: True

This version of the operator has been available since version 10 of the default ONNX operator set.

### Summary

Matrix product that behaves like numpy.matmul. Each individual product MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.

### Inputs

Between 2 and 4 inputs.

• A (heterogeneous) - T1:

N-dimensional matrix A

• B (heterogeneous) - T2:

N-dimensional matrix B

• a_zero_point (optional, heterogeneous) - T1:

Zero point tensor for input ‘A’. It is optional and defaults to 0. It can be a scalar or an N-D tensor: a scalar means per-tensor quantization, whereas an N-D tensor means per-row quantization. If the input is 2-D with shape [M, K], the zero point tensor may be an M-element vector [zp_1, zp_2, …, zp_M]. If the input is an N-D tensor with shape [D1, D2, M, K], the zero point tensor may have shape [D1, D2, M, 1].

• b_zero_point (optional, heterogeneous) - T2:

Zero point tensor for input ‘B’. It is optional and defaults to 0. It can be a scalar or an N-D tensor: a scalar means per-tensor quantization, whereas an N-D tensor means per-column quantization. If the input is 2-D with shape [K, N], the zero point tensor may be an N-element vector [zp_1, zp_2, …, zp_N]. If the input is an N-D tensor with shape [D1, D2, K, N], the zero point tensor may have shape [D1, D2, 1, N].
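The per-row and per-column zero point shapes listed for the inputs can be illustrated with NumPy broadcasting. The helpers `shift_a` and `shift_b` are hypothetical names, and treating a 1-D `a_zero_point` as one value per row (by reshaping it to a column) is an assumption of this sketch, not something a particular runtime is guaranteed to do.

```python
import numpy as np

def shift_a(a, a_zero_point):
    """Subtract A's zero point; a 1-D [M] vector is treated as one value per row."""
    zp = np.asarray(a_zero_point, dtype=np.int32)
    if zp.ndim == 1:
        zp = zp.reshape(-1, 1)  # [M] -> [M, 1] so it broadcasts across K
    return a.astype(np.int32) - zp

def shift_b(b, b_zero_point):
    """Subtract B's zero point; a 1-D [N] vector already broadcasts across K."""
    return b.astype(np.int32) - np.asarray(b_zero_point, dtype=np.int32)

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)    # shape [M=2, K=3]
B = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.uint8)  # shape [K=3, N=2]
a_zp = np.array([1, 2], dtype=np.uint8)                 # one zero point per row of A
b_zp = np.array([0, 1], dtype=np.uint8)                 # one zero point per column of B

Y = np.matmul(shift_a(A, a_zp), shift_b(B, b_zp))
print(Y.tolist())  # [[13, 13], [31, 31]]

# Batched case: A has shape [D1, D2, M, K], its zero point has shape [D1, D2, M, 1].
A4 = np.zeros((2, 3, 4, 5), dtype=np.uint8)
zp4 = np.ones((2, 3, 4, 1), dtype=np.uint8)
print(shift_a(A4, zp4).shape)  # (2, 3, 4, 5)
```

The trailing dimension of 1 in `[D1, D2, M, 1]` (and the 1 before `N` in `[D1, D2, 1, N]`) is what lets the zero point broadcast along the reduction axis K.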

### Outputs

• Y (heterogeneous) - T3:

Result of the matrix multiplication of A and B.
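Because the operator is described as behaving like numpy.matmul, the shape of Y can be previewed with NumPy itself. The sketch below uses arbitrary example shapes and omits zero points; it shows only how batch dimensions broadcast and which dimensions form the matrix product.

```python
import numpy as np

# A: batch dims (2, 1), matrices of shape [4, 3]; B: batch dim (5,), matrices of shape [3, 2].
A = np.ones((2, 1, 4, 3), dtype=np.int8)
B = np.ones((5, 3, 2), dtype=np.uint8)

# Widen both operands to int32 so the result matches the int32 output type.
Y = np.matmul(A.astype(np.int32), B.astype(np.int32))

print(Y.shape)  # (2, 5, 4, 2): batch dims broadcast to (2, 5), matrices are [4, 2]
print(Y.dtype)  # int32
```

The two inputs here deliberately use different 8-bit types, since T1 and T2 are constrained independently.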

### Type Constraints

• T1 in ( tensor(int8), tensor(uint8) ):

Constrain input A data type to 8-bit integer tensor.

• T2 in ( tensor(int8), tensor(uint8) ):

Constrain input B data type to 8-bit integer tensor.

• T3 in ( tensor(int32) ):

Constrain output Y data type to 32-bit integer tensor.
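The combination of 8-bit inputs and a 32-bit output gives a rough feel for when accumulation could overflow. The arithmetic below is a worst-case sketch, assuming each operand is shifted by a zero point of the same 8-bit type before multiplication; all names are illustrative.

```python
INT32_MAX = 2**31 - 1

# Value ranges of the two permitted 8-bit input types.
ranges = {"int8": (-128, 127), "uint8": (0, 255)}

def shifted_bound(lo, hi):
    """Largest |x - zp| when x and zp share the same 8-bit type."""
    return hi - lo

# After zero point subtraction, every operand lies in [-255, 255].
max_operand = max(shifted_bound(lo, hi) for lo, hi in ranges.values())

# A single product is therefore at most 255 * 255 in magnitude, far below the int32 limit.
max_product = max_operand * max_operand

# Number of worst-case products that can be summed without exceeding INT32_MAX.
max_safe_terms = INT32_MAX // max_product

print(max_operand)     # 255
print(max_product)     # 65025
print(max_safe_terms)  # 33025
```

Under these assumptions a single product always fits in 32 bits, and the sum over the shared dimension K can only overflow once K exceeds 33025 with worst-case values.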