VART (Vitis AI Runtime) ML inference API namespace.

Classes
struct	NpuTensorInfo
	Metadata structure describing a tensor used in VART.

class	NpuTensor
	This class represents a tensor in the VART API.

struct	QuantParameters
	Struct representing quantization parameters for a tensor.

struct	JobHandle
	Struct representing a job handle for asynchronous execution.

class	Runner
	Abstract base class for executing model inference operations.

class	RunnerFactory
	Factory class for creating Runner instances.

Enumerations
enum class	DataType { UNKNOWN, BOOLEAN, INT8, UINT8, INT16, UINT16, BF16, FP16, INT32, UINT32, FLOAT32, INT64, UINT64 }
	Enumerates the supported data types for tensors in the VART API.

enum class	MemoryLayout { UNKNOWN, NC, NCH, NHC, NHW, NWC, NHWC, NCHW, NHWC4, NHWC8, NC4HW4, NC8HW8, HCWNC4, HCWNC8, HCWNC16, NHW16C4WC, NHW16WC4C, GENERIC }
	Enumerates the supported memory layouts for tensors in the VART API.

enum class	MemoryType { UNKNOWN, XRT_BO, DMA_FD, USER_POINTER_CMA, USER_POINTER_NON_CMA }
	Enumerates the various memory types utilized for tensors in the VART API.

enum class	TensorDirection { INPUT, OUTPUT }
	Enumerates the supported tensor directions in the VART API.

enum class	TensorType { CPU, HW }
	Specifies the tensor types supported in the VART API.

enum class	RunnerType { VAIML }
	Enumerates the types of runner implementations supported.

enum class	RoundingMode { UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO }
	Enumerates the rounding modes used in quantization.

enum class	StatusCode { SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT, OUT_OF_MEMORY, RUNTIME_ERROR, JOB_PENDING, INVALID_JOB_ID, RESOURCE_UNAVAILABLE }
	Enumerates the status codes used in the VART API.

Detailed Description

VART (Vitis AI Runtime) ML inference API namespace.

Provides ML inference APIs for AMD NPU hardware, including tensor management (NpuTensor), synchronous and asynchronous inference execution (Runner), and runner instantiation (RunnerFactory).

Enumeration Type Documentation

◆ DataType

enum class vart::DataType

Enumerates the supported data types for tensors in the VART API.

This enum defines the various data types that can be used to represent tensor elements. It includes integer and floating-point formats, as well as specialized types such as BF16.

The set of data types actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.

Enumerator
UNKNOWN	Unknown data type.
BOOLEAN	Boolean type.
INT8	8-bit signed integer.
UINT8	8-bit unsigned integer.
INT16	16-bit signed integer.
UINT16	16-bit unsigned integer.
BF16	16-bit Brain Floating Point format.
FP16	16-bit floating point.
INT32	32-bit signed integer.
UINT32	32-bit unsigned integer.
FLOAT32	32-bit floating point.
INT64	64-bit signed integer.
UINT64	64-bit unsigned integer.
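
When sizing a buffer for a tensor it helps to know how many bytes each element occupies. The following is a minimal sketch, not part of the VART API: the `sketch` namespace, the local copy of the enumerators, and the helpers `element_size_bytes` and `packed_buffer_bytes` are all hypothetical, and the one-byte width for BOOLEAN is an assumption, since the table above does not state it.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Local mirror of the enumerators listed above, so the example compiles on its
// own. Real code would use vart::DataType from the VART headers instead.
enum class DataType { UNKNOWN, BOOLEAN, INT8, UINT8, INT16, UINT16, BF16, FP16,
                      INT32, UINT32, FLOAT32, INT64, UINT64 };

// Storage size of one element in bytes, or 0 when the type is UNKNOWN.
// Assumption: BOOLEAN occupies one byte; the documentation does not say.
constexpr std::size_t element_size_bytes(DataType type) {
    switch (type) {
        case DataType::BOOLEAN:
        case DataType::INT8:
        case DataType::UINT8:   return 1;
        case DataType::INT16:
        case DataType::UINT16:
        case DataType::BF16:
        case DataType::FP16:    return 2;
        case DataType::INT32:
        case DataType::UINT32:
        case DataType::FLOAT32: return 4;
        case DataType::INT64:
        case DataType::UINT64:  return 8;
        case DataType::UNKNOWN: return 0;
    }
    return 0;  // not reached for valid enumerators
}

// Bytes needed for a densely packed buffer holding `element_count` elements.
inline std::size_t packed_buffer_bytes(DataType type, std::size_t element_count) {
    const std::size_t width = element_size_bytes(type);
    assert(width != 0 && "element width is unknown for this data type");
    return width * element_count;
}

}  // namespace sketch
```

A hardware tensor may carry padding beyond the packed size, so a helper like this is only a lower bound unless the runner reports the layout as densely packed.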

◆ MemoryLayout

enum class vart::MemoryLayout

Enumerates the supported memory layouts for tensors in the VART API.

This enum defines the various memory layouts that can be used to represent tensor data. It includes formats such as NHWC, NCHW, and others that specify how tensor dimensions are organized in memory.

The set of memory layouts actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.

Enumerator
UNKNOWN	Unknown memory layout.
NC	Model batch, Channels (packed format).
NCH	Model batch, Channels (packed format), Height.
NHC	Model batch, Height, Channels (packed format).
NHW	Model batch, Height, Width.
NWC	Model batch, Width, Channels (packed format).
NHWC	Model batch, Height, Width, Channels (packed format).
NCHW	Model batch, Channels, Height, Width (planar format).
NHWC4	Model batch, Height, Width, Channel groups of 4 (e.g. RGBA).
NHWC8	Model batch, Height, Width, Channel groups of 8.
NC4HW4	Model batch, Channels / 4, Height, Width, Channel groups of 4.
NC8HW8	Model batch, Channels / 8, Height, Width, Channel groups of 8.
HCWNC4	Height, Channels / 4, Width, N = 1, Channel groups of 4.
HCWNC8	Height, Channels / 8, Width, N = 1, Channel groups of 8.
HCWNC16	Height, Channels / 16, Width, N = 1, Channel groups of 16.
NHW16C4WC	Model batch, Height, Width / 16, Channels / 4, Width groups of 16, Channel groups of 4.
NHW16WC4C	Model batch, Height, Width / 16, Width groups of 16, Channels / 4, Channel groups of 4.
GENERIC	Generic layout. See NpuTensorInfo::memory_layout_order for more info.
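
To make the layout names concrete, the sketch below computes the linear element offset of a logical coordinate (n, c, h, w) under three of the layouts listed above. It is an illustration only: the `Shape` struct and the `offset_*` functions are hypothetical and not part of VART, offsets are in elements rather than bytes, and the NC4HW4 version assumes the channel count is padded up to a multiple of 4, which the table does not state.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Logical tensor extents: batch, channels, height, width.
struct Shape { std::size_t n, c, h, w; };

// True when the coordinate lies inside the logical extents.
inline bool in_bounds(const Shape& s, std::size_t n, std::size_t c,
                      std::size_t h, std::size_t w) {
    return n < s.n && c < s.c && h < s.h && w < s.w;
}

// NCHW (planar): width varies fastest, then height, then channels, then batch.
inline std::size_t offset_nchw(const Shape& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(in_bounds(s, n, c, h, w));
    return ((n * s.c + c) * s.h + h) * s.w + w;
}

// NHWC (packed): channels vary fastest, then width, then height, then batch.
inline std::size_t offset_nhwc(const Shape& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(in_bounds(s, n, c, h, w));
    return ((n * s.h + h) * s.w + w) * s.c + c;
}

// NC4HW4: dimensions ordered [N, C/4, H, W, 4]. Channels are split into groups
// of 4, and the position within the group varies fastest.
// Assumption: C is padded up to the next multiple of 4.
inline std::size_t offset_nc4hw4(const Shape& s, std::size_t n, std::size_t c,
                                 std::size_t h, std::size_t w) {
    assert(in_bounds(s, n, c, h, w));
    const std::size_t groups = (s.c + 3) / 4;
    return (((n * groups + c / 4) * s.h + h) * s.w + w) * 4 + c % 4;
}

}  // namespace sketch
```

For a shape of N=1, C=8, H=2, W=3, the coordinate (n=0, c=6, h=0, w=1) lands at element 37 in NCHW, 14 in NHWC, and 30 in NC4HW4, which shows why data written for one layout cannot be read back under another without a conversion.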

◆ MemoryType

enum class vart::MemoryType

Enumerates the various memory types utilized for tensors in the VART API.

This enumeration specifies the locations where tensor data is stored.

Enumerator
UNKNOWN	Memory type is not specified or recognized.
XRT_BO	Buffer object associated with XRT.
DMA_FD	File descriptor used for Direct Memory Access (DMA).
USER_POINTER_CMA	User-provided pointer to a contiguous physical memory block.
USER_POINTER_NON_CMA	User-provided pointer without contiguous memory guarantee (e.g. new, malloc).

◆ RoundingMode

enum class vart::RoundingMode

Enumerates the rounding modes used in quantization.

This enum defines the different rounding modes that can be applied during quantization, such as rounding to nearest even or truncating towards zero.

Enumerator
UNKNOWN	Unknown rounding mode.
ROUND_TO_NEAREST_EVEN	Round to nearest even value.
ROUND_TOWARD_ZERO	Truncate towards zero (no rounding).
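
The two modes differ only in how the fractional part is discarded. The sketch below illustrates that difference on plain doubles. It is not the VART implementation: the `sketch` namespace, the local enum copy, and `round_with_mode` are hypothetical, and how VART combines rounding with the scale and zero point in QuantParameters is not shown here.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Local mirror of the enumerators listed above; real code would use
// vart::RoundingMode from the VART headers.
enum class RoundingMode { UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO };

// Rounds `value` to an integral value according to `mode`.
inline double round_with_mode(double value, RoundingMode mode) {
    assert(mode != RoundingMode::UNKNOWN && "rounding mode must be specified");
    if (mode == RoundingMode::ROUND_TOWARD_ZERO) {
        // Drop the fractional part: 2.9 -> 2, -2.9 -> -2.
        return std::trunc(value);
    }
    // std::remainder(value, 1.0) equals value minus the nearest integer, with
    // ties resolved to the even integer. Subtracting it therefore yields
    // round-half-to-even independent of the floating-point environment.
    return value - std::remainder(value, 1.0);
}

}  // namespace sketch
```

With ROUND_TO_NEAREST_EVEN, 2.5 becomes 2 and 3.5 becomes 4, so ties do not drift in one direction. With ROUND_TOWARD_ZERO, both 2.9 and 2.1 become 2.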

◆ RunnerType

enum class vart::RunnerType

Enumerates the types of runner implementations supported.

This enumeration identifies the runner implementation used to execute model inference.

Enumerator
VAIML	VAIML-based runner implementation.

◆ StatusCode

enum class vart::StatusCode

Enumerates the status codes used in the VART API.

This enum defines the various status codes that can be returned by VART functions, indicating the success or failure of an operation.

Enumerator
SUCCESS	Operation completed successfully.
FAILURE	Operation failed.
INVALID_INPUT	Invalid input parameters.
INVALID_OUTPUT	Invalid output parameters.
OUT_OF_MEMORY	Memory allocation failed.
RUNTIME_ERROR	Runtime error occurred.
JOB_PENDING	Job is still pending.
INVALID_JOB_ID	Provided job ID is invalid.
RESOURCE_UNAVAILABLE	Required resource is unavailable. Transient; retry the operation.
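
Because RESOURCE_UNAVAILABLE is described as transient, callers may want to retry a bounded number of times before giving up. The following is a minimal sketch of that pattern under stated assumptions: the `sketch` namespace, the local enum copy, and `retry_while_unavailable` are hypothetical, and the operation is passed in as a generic callable rather than a specific VART call, since no function signatures appear in this section.

```cpp
#include <cassert>
#include <functional>

namespace sketch {

// Local mirror of the enumerators listed above; real code would use
// vart::StatusCode from the VART headers.
enum class StatusCode { SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT,
                        OUT_OF_MEMORY, RUNTIME_ERROR, JOB_PENDING,
                        INVALID_JOB_ID, RESOURCE_UNAVAILABLE };

// Calls `operation` until it returns anything other than RESOURCE_UNAVAILABLE
// or until `max_attempts` calls have been made, and returns the last status.
// Every other status, including other failures, is returned immediately.
inline StatusCode retry_while_unavailable(
        const std::function<StatusCode()>& operation, int max_attempts) {
    assert(max_attempts > 0 && "at least one attempt is required");
    StatusCode status = StatusCode::RESOURCE_UNAVAILABLE;
    for (int attempt = 0;
         attempt < max_attempts && status == StatusCode::RESOURCE_UNAVAILABLE;
         ++attempt) {
        status = operation();
    }
    return status;
}

}  // namespace sketch
```

The sketch retries immediately. Production code would likely add a delay or backoff between attempts, and would treat JOB_PENDING separately, since it reports an unfinished job rather than a failure.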

◆ TensorDirection

enum class vart::TensorDirection

Enumerates the supported tensor directions in the VART API.

This enum indicates whether a tensor is an input to or an output of model inference.

Enumerator
INPUT	Input tensor direction.
OUTPUT	Output tensor direction.

◆ TensorType

enum class vart::TensorType

Specifies the tensor types supported in the VART API.

A tensor is described either by its CPU metadata, taken from the ONNX model, or by its HW metadata, which is specific to AMD hardware.

Note: AMD AI engines use hardware-specific data formats and memory layouts. As a result, the HW tensor layout and format typically differ from the CPU tensor representation defined by the ONNX model.

Enumerator
CPU	Tensor metadata from the ONNX model, as defined for standard CPU execution.
HW	AMD hardware-specific tensor metadata, formatted for direct execution on AMD AI engines.
