Supported Operators#

The following table lists the ONNX operators supported in AMD Vitis™ AI 6.2. The operators are categorized by their support for different quantization types: FP32 (automatically converted to BF16 within the compiler), FP16, and INT8. A “Y” indicates that Vitis AI provides broad coverage for that operator with that quantization type in CNN models; some specific configurations of the operator might still not be fully supported.

Operator                FP32  FP16  INT8
----------------------  ----  ----  ----
Abs                     Y     Y     Y
Add                     Y     Y     Y
And                     Y     Y     Y
Attention               Y     N/A   N/A
AveragePool             Y     Y     Y
BatchNormalization      Y     Y     Y
BitwiseAnd              Y     Y     Y
BitwiseNot              Y     Y     Y
BitwiseOr               Y     Y     Y
BitwiseXor              Y     Y     Y
Cast                    Y     Y     Y
Ceil                    Y     Y     Y
Clip                    Y     Y     Y
Concat                  Y     Y     Y
ConstantOfShape         Y     Y     Y
ConvTranspose           Y     N/A   Y
Cos                     Y     Y     Y
CumSum                  Y     Y     N/A
DepthToSpace            Y     Y     Y
Div                     Y     Y     Y
Einsum                  Y     Y     Y
Elu                     Y     Y     Y
Equal                   Y     Y     Y
Erf                     Y     Y     Y
Exp                     Y     Y     Y
Expand                  Y     Y     Y
Flatten                 Y     Y     Y
Floor                   Y     Y     Y
Gather                  Y     Y     Y
Gelu                    Y     Y     Y
GlobalAveragePool       Y     Y     Y
Greater                 Y     Y     Y
GreaterOrEqual          Y     Y     Y
GridSample              Y     Y     N/A
HardSigmoid             Y     Y     Y
HardSwish               Y     Y     Y
Identity                Y     Y     Y
InstanceNormalization   Y     Y     Y
LayerNormalization      Y     Y     Y
LeakyRelu               Y     Y     Y
Less                    Y     Y     Y
LessOrEqual             Y     Y     Y
Log                     Y     Y     Y
LogSoftmax              Y     Y     Y
Max                     Y     Y     Y
MaxPool                 Y     Y     Y
Min                     Y     Y     Y
Mish                    Y     Y     Y
Mul                     Y     Y     Y
Neg                     Y     Y     Y
Not                     Y     Y     Y
Or                      Y     Y     Y
PRelu                   Y     Y     N/A
Pad                     Y     Y     Y
Pow                     Y     Y     Y
QuickGelu               Y     Y     Y
RMSLayerNormalization   Y     Y     Y
Range                   Y     Y     Y
Reciprocal              Y     Y     Y
ReduceL1                Y     Y     Y
ReduceL2                Y     Y     N/A
ReduceLogSum            Y     Y     Y
ReduceLogSumExp         Y     Y     Y
ReduceMax               Y     Y     Y
ReduceMean              Y     Y     Y
ReduceMin               Y     Y     Y
ReduceSum               Y     Y     Y
ReduceSumSquare         Y     Y     Y
Relu                    Y     Y     Y
Reshape                 Y     Y     Y
Resize                  Y     Y     Y
Round                   Y     Y     Y
Scaler                  Y     Y     Y
ScatterND               Y     Y     Y
Selu                    Y     Y     N/A
Shape                   Y     Y     Y
SiLU                    Y     Y     Y
Sigmoid                 Y     Y     Y
Sign                    Y     Y     Y
Sin                     Y     Y     Y
Slice                   Y     Y     Y
Softmax                 Y     Y     Y
Softplus                Y     Y     N/A
SpaceToDepth            Y     Y     Y
Split                   Y     Y     Y
Sqrt                    Y     Y     Y
Squeeze                 Y     Y     Y
Sub                     Y     Y     Y
Sum                     Y     Y     Y
Tanh                    Y     Y     Y
ThresholdedRelu         Y     Y     N/A
Tile                    Y     Y     Y
TopK                    Y     Y     N/A
Transpose               Y     Y     Y
Unsqueeze               Y     Y     Y
Upsample                Y     Y     Y
Xor                     Y     Y     Y

Note

N/A: Not applicable (the operator is not supported in ONNX for that quantization type)
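The table can be used as a quick pre-compilation lookup. The following is a minimal sketch, not part of Vitis AI: `LISTED_OPS`, `NOT_APPLICABLE`, `table_entry`, and `not_covered` are hypothetical names, and the data is transcribed by hand from the table above. A “Y” result still only means broad coverage, so specific configurations can fall back to CPU.

```python
# Operators listed in the table above (hand-transcribed, not read from any Vitis AI API).
LISTED_OPS = set("""
Abs Add And Attention AveragePool BatchNormalization BitwiseAnd BitwiseNot
BitwiseOr BitwiseXor Cast Ceil Clip Concat ConstantOfShape ConvTranspose Cos
CumSum DepthToSpace Div Einsum Elu Equal Erf Exp Expand Flatten Floor Gather
Gelu GlobalAveragePool Greater GreaterOrEqual GridSample HardSigmoid HardSwish
Identity InstanceNormalization LayerNormalization LeakyRelu Less LessOrEqual
Log LogSoftmax Max MaxPool Min Mish Mul Neg Not Or PRelu Pad Pow QuickGelu
RMSLayerNormalization Range Reciprocal ReduceL1 ReduceL2 ReduceLogSum
ReduceLogSumExp ReduceMax ReduceMean ReduceMin ReduceSum ReduceSumSquare Relu
Reshape Resize Round Scaler ScatterND Selu Shape SiLU Sigmoid Sign Sin Slice
Softmax Softplus SpaceToDepth Split Sqrt Squeeze Sub Sum Tanh ThresholdedRelu
Tile TopK Transpose Unsqueeze Upsample Xor
""".split())

# Cells marked N/A in the table; every other listed cell is "Y".
NOT_APPLICABLE = {
    "Attention": {"FP16", "INT8"},
    "ConvTranspose": {"FP16"},
    "CumSum": {"INT8"},
    "GridSample": {"INT8"},
    "PRelu": {"INT8"},
    "ReduceL2": {"INT8"},
    "Selu": {"INT8"},
    "Softplus": {"INT8"},
    "ThresholdedRelu": {"INT8"},
    "TopK": {"INT8"},
}


def table_entry(op, qtype):
    """Return "Y" or "N/A" as shown in the table, or None if the operator is not listed."""
    if qtype not in ("FP32", "FP16", "INT8"):
        raise ValueError(f"unknown quantization type: {qtype!r}")
    if op not in LISTED_OPS:
        return None
    return "N/A" if qtype in NOT_APPLICABLE.get(op, ()) else "Y"


def not_covered(op_types, qtype):
    """Return the sorted operator types whose table entry is not "Y" for qtype."""
    return sorted({op for op in op_types if table_entry(op, qtype) != "Y"})
```

For example, `not_covered(["Relu", "TopK", "Softmax"], "INT8")` returns `["TopK"]`.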

Operator Shape Support Matrix#

The information below covers a subset of the operators listed above and gives more detail on the shapes and configurations supported for each one.

AveragePool#

Implements average pooling over a 2D input.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: bf16, fp16, int8

ksize_h, ksize_w: Pooling kernel height and width (1 to 20, step size 1)

stride_h, stride_w: Kernel strides for height and width (1 to 20, step size 1)
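As a rough illustration, the sketch below checks a pooling configuration against the base ranges above and computes the output extent along one axis. It assumes no padding and floor rounding, which is standard pooling arithmetic rather than something stated in this document, and the function names are hypothetical. Values outside the base ranges are not necessarily rejected, because the compiler can tile and pad at compile time.

```python
def pool_in_base_range(ksize_h, ksize_w, stride_h, stride_w):
    """True if all four values lie in the 1..20 base range listed above."""
    return all(1 <= v <= 20 for v in (ksize_h, ksize_w, stride_h, stride_w))


def pool_out_size(in_size, ksize, stride):
    """Output extent along one axis, assuming no padding and floor rounding."""
    if in_size < ksize:
        raise ValueError("pooling window is larger than the input")
    return (in_size - ksize) // stride + 1
```

For a 224-wide input with `ksize_w = 2` and `stride_w = 2`, `pool_out_size(224, 2, 2)` gives 112.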

Concat#

Implements the concatenation operator.

Supported Features:

Data Types: bf16, fp16, int8/uint8

concat_axis: Specifies the axis along which concatenation is performed (allowed: 0 to 3, step size 1)

Conv2D (FP32, FP16)#

Provides a 2D convolution operator in FP32 (automatically converted to BF16 within the compiler) and FP16 formats.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: Input, weights, and output must be in fp32 (bf16) or fp16 format.

Kernel Size:

Width: 1, 2, 3, 4, 5, 7, 15, 16, 128

Height: 1, 2, 3, 4, 5, 7, 9, 15

If kernel height is 3, kernel width must be either 3 or 15.

If kernel width is 15, kernel height must be 3.

If kernel height is 4, kernel width must be 16.

1xN or Nx1 kernels can be automatically converted to NxN if N is a supported width.

Dilation: Dilation is supported.

Groups: Groups are supported.

Stride:

Height: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Width: 1, 2, 4

If stride width is 4, stride height must also be 4 (and vice versa).
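The kernel and stride rules above interact, so it can help to encode them. The following is a minimal sketch under one reading of this section: a 1xN or Nx1 kernel is accepted when the equivalent NxN kernel satisfies the rules. The function names are hypothetical and this is not the compiler's actual validation logic; these are base shapes, so a reported violation only suggests that the layer might need compiler transforms or might fall back to CPU.

```python
KERNEL_WIDTHS = {1, 2, 3, 4, 5, 7, 15, 16, 128}
KERNEL_HEIGHTS = {1, 2, 3, 4, 5, 7, 9, 15}
STRIDE_HEIGHTS = set(range(1, 16))  # 1..15
STRIDE_WIDTHS = {1, 2, 4}


def _kernel_violations(h, w):
    """Kernel-size rules from the section above, applied literally."""
    found = []
    if w not in KERNEL_WIDTHS:
        found.append(f"kernel width {w} is not a listed width")
    if h not in KERNEL_HEIGHTS:
        found.append(f"kernel height {h} is not a listed height")
    if h == 3 and w not in (3, 15):
        found.append("kernel height 3 requires kernel width 3 or 15")
    if w == 15 and h != 3:
        found.append("kernel width 15 requires kernel height 3")
    if h == 4 and w != 16:
        found.append("kernel height 4 requires kernel width 16")
    return found


def _stride_violations(sh, sw):
    """Stride rules from the section above."""
    found = []
    if sh not in STRIDE_HEIGHTS:
        found.append(f"stride height {sh} is not a listed value")
    if sw not in STRIDE_WIDTHS:
        found.append(f"stride width {sw} is not a listed value")
    if (sw == 4) != (sh == 4):
        found.append("stride 4 must be used for both height and width")
    return found


def conv2d_fp_violations(kernel_h, kernel_w, stride_h, stride_w):
    """Return the rules broken by an FP32/FP16 Conv2D configuration."""
    kernel = _kernel_violations(kernel_h, kernel_w)
    # Assumption: a 1xN / Nx1 kernel passes if the NxN kernel passes.
    if kernel and 1 in (kernel_h, kernel_w):
        n = max(kernel_h, kernel_w)
        if n in KERNEL_WIDTHS and not _kernel_violations(n, n):
            kernel = []
    return kernel + _stride_violations(stride_h, stride_w)
```

An empty list means the configuration matches a base shape, for example `conv2d_fp_violations(3, 3, 1, 1)`.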

Conv2D (INT8)#

Provides a 2D convolution operator in INT8 format, supporting various activation functions.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: Input, weights, and output must be in int8 format.

Kernel Size:

Width: 1 to 16 (step size 1)

Height: 1 to 16 (step size 1)

Dilation: Dilation is supported.

Groups: Groups are supported.

Stride:

Height: 1 to 16 (step size 1)

Width: 1 or 2
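The INT8 ranges are simple enough to check in a few lines. This is an illustrative sketch only; `conv2d_int8_out_of_range` is a hypothetical helper, not a Vitis AI function, and the ranges are base shapes rather than hard limits.

```python
def conv2d_int8_out_of_range(kernel_h, kernel_w, stride_h, stride_w):
    """Return the names of parameters outside the INT8 base ranges listed above."""
    checks = {
        "kernel_h": (kernel_h, range(1, 17)),  # 1..16
        "kernel_w": (kernel_w, range(1, 17)),  # 1..16
        "stride_h": (stride_h, range(1, 17)),  # 1..16
        "stride_w": (stride_w, (1, 2)),        # 1 or 2
    }
    return [name for name, (value, allowed) in checks.items() if value not in allowed]
```

For example, a 3x3 kernel with stride 4 in both directions yields `["stride_w"]`.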

DepthToSpace#

Rearranges data from the depth (channel) dimension into spatial blocks.

Input Tensor: (N, C, H, W) -> Output Tensor: (N, C//(B*B), H*B, W*B)

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16, int8, uint8

Mode: DCR, CRD

Block Size: 2, 3, 4, 8
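The shape relation above can be written as a small helper. This is a sketch with a hypothetical function name; it only computes the output shape and checks the listed modes and block sizes, and it does not perform the data rearrangement itself.

```python
SUPPORTED_MODES = ("DCR", "CRD")
SUPPORTED_BLOCK_SIZES = (2, 3, 4, 8)


def depth_to_space_shape(shape, block_size, mode="DCR"):
    """Map (N, C, H, W) to (N, C // (B*B), H*B, W*B) for block size B."""
    n, c, h, w = shape
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"mode {mode!r} is not one of {SUPPORTED_MODES}")
    if block_size not in SUPPORTED_BLOCK_SIZES:
        raise ValueError(f"block size {block_size} is not one of {SUPPORTED_BLOCK_SIZES}")
    if c % (block_size * block_size) != 0:
        raise ValueError("channel count must be divisible by block_size squared")
    return (n, c // (block_size * block_size), h * block_size, w * block_size)
```

With a block size of 2, an input of shape (1, 64, 56, 56) becomes (1, 16, 112, 112).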

Gemm (FP32 (BF16), FP16)#

General Matrix Multiplication operator supporting matrix product, bias addition, and optional transposition of weights.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16

Computation:

C(M, N) = A(M, K) x B(K, N) + bias

Ranges:

M: 8 to 96 (step 8)

K: 64 to 128 (step 16); extended K: 1 to 65535 (step 1, by iteration)

N: 16 to 96 (step 16)

Bias Handling:

When used, bias is padded to a multiple of 32 and concatenated with the weights. Data layout: {bias_padded, weight}

Gemm (INT8)#

General Matrix Multiplication operator supporting matrix product, bias addition, and optional transposition of weights.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: int8

Computation:

C(M, N) = A(M, K) x B(K, N) + bias

Ranges:

M: 16 to 240 (step 16)

K: 40 to 248 (step 8); extended K: 1 to 65535 (step 1, by iteration)

N: 16 to 240 (step 16)

Bias Handling:

When used, bias is padded to a multiple of 32 and concatenated with the weights. Data layout: {bias_padded, weight}
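The M, K, and N ranges of the two Gemm variants form a grid of base subvolume shapes. The sketch below tests whether a shape lies on that grid and rounds a bias length up to the next multiple of 32. The names are hypothetical, the extended-K iteration is not modeled, and the unit of the bias padding is assumed to be elements because this document does not state it.

```python
# (min, max, step) per dimension, copied from the two Gemm sections above.
GEMM_BASE_RANGES = {
    "fp": {"M": (8, 96, 8), "K": (64, 128, 16), "N": (16, 96, 16)},
    "int8": {"M": (16, 240, 16), "K": (40, 248, 8), "N": (16, 240, 16)},
}


def on_base_grid(value, lo, hi, step):
    """True if value is one of lo, lo + step, ..., hi."""
    return lo <= value <= hi and (value - lo) % step == 0


def is_base_shape(variant, m, k, n):
    """True if (M, K, N) is a base subvolume shape for "fp" or "int8"."""
    ranges = GEMM_BASE_RANGES[variant]
    dims = (("M", m), ("K", k), ("N", n))
    return all(on_base_grid(value, *ranges[name]) for name, value in dims)


def padded_bias_length(n, multiple=32):
    """Round a bias length up to the next multiple (assumed to count elements)."""
    return -(-n // multiple) * multiple
```

For example, `is_base_shape("fp", 32, 64, 48)` is true, and a bias of 48 values is padded to 64.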

GlobalAveragePool#

Implements 2D global average pooling.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16, int8

stride: 1 to 20, step size 1

kernel size: 1 to 20, step size 1

MaxPool#

Implements 2D max pooling.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16, int8

stride: 1 to 20, step size 1

kernel size: 1 to 20, step size 1

Pad#

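Implements 2D padding. The operator fills the edges of a tensor with the specified padding value. Supported edge modes: TOP, BOT, LEFT, RIGHT, BOTRIGHT, TOPLEFT. These are distinct from the padding modes (constant, reflect, edge, wrap) listed under Supported Padding Modes.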

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features#

  • Data Types: bf16, fp16, int8, uint8

  • pad_left, pad_right, pad_top, pad_bot: Positions padded on each edge (0 to 255, step size 1)

Supported Padding Modes#

Mode      Supported  Description
--------  ---------  -----------
constant  Yes        Pads with a constant value (for example, zero padding or a specified fill value)
reflect   Yes        Pads by reflecting the input tensor values at the border
edge      No         Not supported
wrap      No         Not supported
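As an illustration of the features and padding modes above, the sketch below computes the padded height and width and rejects the two modes marked as not supported. The function names are hypothetical. The 0 to 255 range is treated as a base range to report, not as a hard limit, in line with the Important note in this section.

```python
# Padding modes from the table above: True means supported.
PAD_MODE_SUPPORTED = {"constant": True, "reflect": True, "edge": False, "wrap": False}


def pads_in_base_range(pad_left, pad_right, pad_top, pad_bot):
    """True if every pad amount lies in the 0..255 base range."""
    return all(0 <= p <= 255 for p in (pad_left, pad_right, pad_top, pad_bot))


def padded_hw(height, width, pad_left, pad_right, pad_top, pad_bot, mode="constant"):
    """Return (height, width) after 2D padding, for a supported padding mode."""
    if not PAD_MODE_SUPPORTED.get(mode, False):
        raise ValueError(f"padding mode {mode!r} is not supported")
    if min(pad_left, pad_right, pad_top, pad_bot) < 0:
        raise ValueError("pad amounts must be non-negative")
    return (height + pad_top + pad_bot, width + pad_left + pad_right)
```

Padding a 224x224 input by one position on every edge gives `padded_hw(224, 224, 1, 1, 1, 1) == (226, 226)`.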

Resize#

Implements tensor resizing with configurable interpolation.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16, int8

Interpolation modes:

mode: 0 (nearest), 1 (linear)

nearest_mode: 0 (floor), 1 (round_prefer_ceil), 2 (round_prefer_floor; default)

Scale Factor: Must be less than 32
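The three nearest_mode values differ only in how they treat a source coordinate that falls between two pixels. The sketch below uses the integer codes listed above and the rounding behavior that the ONNX Resize operator defines for those names; the function names are hypothetical and the hardware kernel's exact arithmetic is not described in this document.

```python
import math

# Integer codes as listed in this section.
NEAREST_MODES = {0: "floor", 1: "round_prefer_ceil", 2: "round_prefer_floor"}


def nearest_index(x, nearest_mode=2):
    """Map a fractional source coordinate to an integer index."""
    name = NEAREST_MODES[nearest_mode]
    if name == "floor":
        return math.floor(x)
    if name == "round_prefer_ceil":
        return math.floor(x + 0.5)  # ties round up
    return math.ceil(x - 0.5)       # round_prefer_floor: ties round down


def scale_in_range(scale):
    """True if a positive scale factor is below the stated bound of 32."""
    return 0 < scale < 32
```

The modes only disagree on ties and on the floor case: for `x = 2.5` they return 2, 3, and 2 for codes 0, 1, and 2 respectively.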

Slice#

Implements slicing of a tensor.

Important

The following values describe the hardware kernel’s internal subvolume dimensions operating in L1 memory. They represent base shapes, not upper or lower limits on the kernel shapes accepted by the Vitis AI compiler. Building from these base values, the compiler might add support for a broader range of shapes through automatic padding, tiling, and other transforms applied at compile time.

The Vitis AI compiler automatically handles:
  • Tiling of large tensors across L3 → L2 → L1 memory

  • Padding of dimensions to meet hardware alignment

  • Decomposition and recomposition of operators

  • Layer fusion

Supported Features:

Data Types: fp32 (bf16), fp16, int8

Slice indices:

height_start, width_start, depth_start, height_end, width_end, depth_end (0 to 2048, step size 1)

height_step, width_step, depth_step (1 to 2048, step size 1)
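For one axis, the start, end, and step values determine the output extent in the same way as a Python range. The following sketch assumes an exclusive end index, as in the ONNX Slice operator; the function names are hypothetical, and the 0 to 2048 bounds are reported as base ranges rather than enforced as limits.

```python
def slice_in_base_range(start, end, step):
    """True if the indices lie in the base ranges listed above."""
    return 0 <= start <= 2048 and 0 <= end <= 2048 and 1 <= step <= 2048


def slice_extent(start, end, step):
    """Number of elements selected along one axis (end is exclusive)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return len(range(start, end, step))
```

For example, `slice_extent(0, 10, 3)` is 4, because indices 0, 3, 6, and 9 are selected.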

Split#

Implements tensor splitting along a specified axis.

Supported Features:

Data Types: fp32 (bf16), fp16, int8, uint8

split_axis: Axis to split on (allowed values: 0 to 2, step size 1)

CPU Because#

Supported operators might still fall back to CPU for specific instances. The compiler displays a diagnostic message explaining why a specific layer was rejected and mapped to CPU. Two common reasons are:

  • Unsupported configuration: The operator is supported in general, but the specific configuration of that operator in the model is not. For example, a Conv2D operator with a 17x17 kernel or a large stride falls back to CPU because it is outside the supported kernel size and stride configurations for Conv2D. The diagnostic message looks like this:

YAML constraints failed [MLLIB]: [0: 'Conv2d': 'while specializing value:
'config.stride_w' configuration value is set to
'Allowed(data_type=uint8, values=[1, 2], default_value={})' by the library,
but parameterization is trying to override with '4', which is incompatible'
  • Memory limitations: The operator is supported, but the required memory is not available on the NPU for the specific instance. The compiler cannot tile the tensor so that the input activations, output activations, and weights all fit simultaneously in the L1 data memory. In such cases, the operator is executed on the CPU and the compiler emits a message like this:

Unwrapping layer with 'Conv2DBf16' kernel(s) that cannot be implemented
due to 'Insufficient L1 buffer space for IFM, OFM and WTS [WTS_FM_NOT_FIT_L1MEM]'

These operators are listed in the “CPU Because” table in the AI Analyzer Partitioning Summary page, along with the reason for CPU fallback (Partitioning).

Note

The Vitis AI compiler does not create a log file for the messages it displays. To save them, redirect the compiler output, both stdout and stderr, to a file using the following command:

python compile.py 2>&1 | tee vaicompiler.log

This command still displays the messages in the terminal while also saving them to vaicompiler.log for later reference.
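Once the messages are saved, the fallback diagnostics can be picked out of the log. The sketch below simply looks for the two message prefixes quoted earlier in this section. It assumes those prefixes are stable across messages, which this document does not guarantee, and `fallback_lines` and `scan_log` are hypothetical helpers, not part of Vitis AI.

```python
# Prefixes taken from the two example messages shown in this section.
FALLBACK_MARKERS = ("YAML constraints failed", "Unwrapping layer with")


def fallback_lines(log_text):
    """Return the log lines that contain one of the known fallback markers."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in FALLBACK_MARKERS)
    ]


def scan_log(path="vaicompiler.log"):
    """Read a saved compiler log and return its fallback-related lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return fallback_lines(handle.read())
```

Calling `scan_log()` after the `tee` command above returns only the first line of each multi-line message; the remaining lines follow it in the log file.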