VART-ML  0.3.0
vart Namespace Reference

VART (Vitis AI Runtime) ML inference API namespace.

Classes

struct  NpuTensorInfo
 Metadata structure describing a tensor used in VART.
 
class  NpuTensor
 This class represents a tensor in the VART API.
 
struct  QuantParameters
 Struct representing quantization parameters for a tensor.
 
struct  JobHandle
 Struct representing a job handle for asynchronous execution.
 
class  Runner
 Abstract base class for executing model inference operations.
 
class  RunnerFactory
 Factory class for creating Runner instances.
 

Enumerations

enum class  DataType {
  UNKNOWN, BOOLEAN, INT8, UINT8,
  INT16, UINT16, BF16, FP16,
  INT32, UINT32, FLOAT32, INT64,
  UINT64
}
 Enumerates the supported data types for tensors in the VART API.
 
enum class  MemoryLayout {
  UNKNOWN, NC, NCH, NHC,
  NHW, NWC, NHWC, NCHW,
  NHWC4, NHWC8, NC4HW4, NC8HW8,
  HCWNC4, HCWNC8, HCWNC16, NHW16C4WC,
  NHW16WC4C, GENERIC
}
 Enumerates the supported memory layouts for tensors in the VART API.
 
enum class  MemoryType {
  UNKNOWN, XRT_BO, DMA_FD, USER_POINTER_CMA,
  USER_POINTER_NON_CMA
}
 Enumerates the various memory types utilized for tensors in the VART API.
 
enum class  TensorDirection { INPUT, OUTPUT }
 Enumerates the supported tensor directions in the VART API.
 
enum class  TensorType { CPU, HW }
 Specifies the tensor types supported in the VART API.
 
enum class  RunnerType { VAIML }
 Enumerates the types of runner implementations supported.
 
enum class  RoundingMode { UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO }
 Enumerates the rounding modes used in quantization.
 
enum class  StatusCode {
  SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT,
  OUT_OF_MEMORY, RUNTIME_ERROR, JOB_PENDING, INVALID_JOB_ID,
  RESOURCE_UNAVAILABLE
}
 Enumerates the status codes used in the VART API.
 

Detailed Description

VART (Vitis AI Runtime) ML inference API namespace.

Provides ML inference APIs for AMD NPU hardware, including tensor management (NpuTensor), synchronous and asynchronous inference execution (Runner), and runner instantiation (RunnerFactory).

Enumeration Type Documentation

◆ DataType

enum class vart::DataType

Enumerates the supported data types for tensors in the VART API.

This enum defines the various data types that can be used to represent tensor elements. It includes integer and floating-point formats, as well as specialized types such as BF16.

The set of data types actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.

Enumerator
UNKNOWN 

Unknown data type.

BOOLEAN 

Boolean type.

INT8 

8-bit signed integer.

UINT8 

8-bit unsigned integer.

INT16 

16-bit signed integer.

UINT16 

16-bit unsigned integer.

BF16 

16-bit Brain Floating Point format.

FP16 

16-bit floating point.

INT32 

32-bit signed integer.

UINT32 

32-bit unsigned integer.

FLOAT32 

32-bit floating point.

INT64 

64-bit signed integer.

UINT64 

64-bit unsigned integer.
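
When sizing a buffer for a tensor, it helps to know how many bytes one element of each data type occupies. The following is a minimal sketch, not part of the VART API: the `DataType` enum is a local stand-in that mirrors the enumerators listed above, `element_size_bytes` is a hypothetical helper, and the one-byte size for `BOOLEAN` is an assumption that the source does not confirm.

```cpp
#include <cstddef>

// Local stand-in mirroring the enumerators documented above. Real code would
// use vart::DataType from the VART headers instead.
enum class DataType {
  UNKNOWN, BOOLEAN, INT8, UINT8, INT16, UINT16, BF16, FP16,
  INT32, UINT32, FLOAT32, INT64, UINT64
};

// Hypothetical helper: size of one tensor element in bytes, or 0 for UNKNOWN.
// Assumption: BOOLEAN is stored as one byte per element.
inline std::size_t element_size_bytes(DataType type) {
  switch (type) {
    case DataType::BOOLEAN:
    case DataType::INT8:
    case DataType::UINT8:
      return 1;
    case DataType::INT16:
    case DataType::UINT16:
    case DataType::BF16:
    case DataType::FP16:
      return 2;
    case DataType::INT32:
    case DataType::UINT32:
    case DataType::FLOAT32:
      return 4;
    case DataType::INT64:
    case DataType::UINT64:
      return 8;
    case DataType::UNKNOWN:
      return 0;
  }
  return 0;  // Only reached for values outside the enumerator list.
}
```

For example, a densely packed FLOAT32 tensor with 1 x 224 x 224 x 3 elements would need `1 * 224 * 224 * 3 * element_size_bytes(DataType::FLOAT32)` bytes, before any padding that a hardware layout might add.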

◆ MemoryLayout

enum class vart::MemoryLayout

Enumerates the supported memory layouts for tensors in the VART API.

This enum defines the various memory layouts that can be used to represent tensor data. It includes formats such as NHWC, NCHW, and others that specify how tensor dimensions are organized in memory.

The set of memory layouts actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.

Enumerator
UNKNOWN 

Unknown memory layout.

NC 

Model batch, Channels (packed format).

NCH 

Model batch, Channels (packed format), Height.

NHC 

Model batch, Height, Channels (packed format).

NHW 

Model batch, Height, Width.

NWC 

Model batch, Width, Channels (packed format).

NHWC 

Model batch, Height, Width, Channels (packed format).

NCHW 

Model batch, Channels, Height, Width (planar format).

NHWC4 

Model batch, Height, Width, Channel groups of 4 (e.g. RGBA).

NHWC8 

Model batch, Height, Width, Channel groups of 8.

NC4HW4 

Model batch, Channels / 4, Height, Width, Channel groups of 4.

NC8HW8 

Model batch, Channels / 8, Height, Width, Channel groups of 8.

HCWNC4 

Height, Channels / 4, Width, N = 1, Channel groups of 4.

HCWNC8 

Height, Channels / 8, Width, N = 1, Channel groups of 8.

HCWNC16 

Height, Channels / 16, Width, N = 1, Channel groups of 16.

NHW16C4WC 

Model batch, Height, Width / 16, Channels / 4, Width groups of 16, Channel groups of 4.

NHW16WC4C 

Model batch, Height, Width / 16, Width groups of 16, Channels / 4, Channel groups of 4.

GENERIC 

Generic layout. See NpuTensorInfo::memory_layout_order for more info.
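
The layout names can be read as a dimension order from outermost to innermost. The sketch below illustrates that reading by computing the linear element offset of a logical coordinate (n, c, h, w) for three of the layouts. It is an illustration under stated assumptions, not VART code: `Shape4D` and the `offset_*` functions are hypothetical names, the buffers are assumed to be densely packed, and for NC4HW4 the channel count is assumed to be padded up to a multiple of 4.

```cpp
#include <cstddef>

// Hypothetical logical tensor shape used only in this sketch.
struct Shape4D {
  std::size_t n;  // model batch
  std::size_t c;  // channels
  std::size_t h;  // height
  std::size_t w;  // width
};

// NHWC: channels are innermost (packed format).
inline std::size_t offset_nhwc(const Shape4D& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
  return ((n * s.h + h) * s.w + w) * s.c + c;
}

// NCHW: width is innermost; each channel is a separate plane (planar format).
inline std::size_t offset_nchw(const Shape4D& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
  return ((n * s.c + c) * s.h + h) * s.w + w;
}

// NC4HW4: channels are split into groups of 4. The group index (c / 4) sits
// where C sits in NCHW, and the position within the group (c % 4) is innermost.
// Assumption: the channel count is padded up to a multiple of 4.
inline std::size_t offset_nc4hw4(const Shape4D& s, std::size_t n, std::size_t c,
                                 std::size_t h, std::size_t w) {
  const std::size_t groups = (s.c + 3) / 4;  // ceil(C / 4)
  return (((n * groups + c / 4) * s.h + h) * s.w + w) * 4 + c % 4;
}
```

With a shape of N = 1, C = 8, H = 2, W = 3, the element at (n = 0, c = 6, h = 0, w = 1) lands at offset 14 in NHWC, 37 in NCHW, and 30 in NC4HW4, which shows why a buffer written in one layout cannot be read in another without a conversion step.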

◆ MemoryType

enum class vart::MemoryType

Enumerates the various memory types utilized for tensors in the VART API.

This enumeration specifies the locations where tensor data is stored.

Enumerator
UNKNOWN 

Memory type is not specified or recognized.

XRT_BO 

Buffer object associated with XRT.

DMA_FD 

File descriptor used for Direct Memory Access (DMA).

USER_POINTER_CMA 

User-provided pointer to a physically contiguous memory block (CMA: Contiguous Memory Allocator).

USER_POINTER_NON_CMA 

User-provided pointer without contiguous memory guarantee (e.g. new, malloc).

◆ RoundingMode

enum class vart::RoundingMode

Enumerates the rounding modes used in quantization.

This enum defines the rounding modes that can be applied during quantization: rounding to the nearest value with ties going to the even value, or truncating toward zero.

Enumerator
UNKNOWN 

Unknown rounding mode.

ROUND_TO_NEAREST_EVEN 

Round to the nearest value; ties round to the even value.

ROUND_TOWARD_ZERO 

Round toward zero (truncate the fractional part).
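
The difference between the two modes shows up on ties and on negative values. The following sketch applies each mode with the C++ standard library and then uses it in a simple affine INT8 quantization. It rests on assumptions: the `RoundingMode` enum is a local stand-in for `vart::RoundingMode`, the helper names are hypothetical, and the formula `q = round(value / scale) + zero_point` is the common affine scheme, not necessarily the one VART or `QuantParameters` uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Local stand-in mirroring the enumerators documented above.
enum class RoundingMode { UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO };

// Rounds x to an integral value under the given mode.
inline double round_with_mode(double x, RoundingMode mode) {
  switch (mode) {
    case RoundingMode::ROUND_TO_NEAREST_EVEN:
      // std::nearbyint follows the current floating-point rounding mode,
      // which is round-to-nearest, ties-to-even unless the program changed it.
      return std::nearbyint(x);
    case RoundingMode::ROUND_TOWARD_ZERO:
      return std::trunc(x);
    case RoundingMode::UNKNOWN:
      break;
  }
  return x;  // UNKNOWN: the value is returned unchanged in this sketch.
}

// Hypothetical affine quantization to INT8 (assumed formula, see lead-in):
// q = round(value / scale) + zero_point, clamped to [-128, 127].
inline std::int8_t quantize_int8(float value, float scale, int zero_point,
                                 RoundingMode mode) {
  const double rounded =
      round_with_mode(static_cast<double>(value) / scale, mode);
  const double shifted = rounded + zero_point;
  const double clamped = std::max(-128.0, std::min(127.0, shifted));
  return static_cast<std::int8_t>(clamped);
}
```

For a scaled value of 2.5, ROUND_TO_NEAREST_EVEN gives 2 and for 3.5 it gives 4, while ROUND_TOWARD_ZERO maps 2.7 to 2 and -2.7 to -2.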

◆ RunnerType

enum class vart::RunnerType

Enumerates the types of runner implementations supported.

RunnerType identifies the runner implementation used to execute model inference.

Enumerator
VAIML 

VAIML-based runner implementation.

◆ StatusCode

enum class vart::StatusCode

Enumerates the status codes used in the VART API.

This enum defines the various status codes that can be returned by VART functions, indicating the success or failure of an operation.

Enumerator
SUCCESS 

Operation completed successfully.

FAILURE 

Operation failed.

INVALID_INPUT 

Invalid input parameters.

INVALID_OUTPUT 

Invalid output parameters.

OUT_OF_MEMORY 

Memory allocation failed.

RUNTIME_ERROR 

Runtime error occurred.

JOB_PENDING 

Job is still pending.

INVALID_JOB_ID 

Provided job ID is invalid.

RESOURCE_UNAVAILABLE 

Required resource is unavailable. This condition is transient; retry the operation.
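
Because RESOURCE_UNAVAILABLE is described as transient, callers may want to wrap an operation in a bounded retry loop. The following is a minimal sketch of that pattern, not a VART facility: the `StatusCode` enum is a local stand-in that mirrors the enumerators above, and `retry_while_unavailable`, its parameters, and the fixed delay between attempts are all hypothetical choices.

```cpp
#include <chrono>
#include <thread>

// Local stand-in mirroring the enumerators documented above.
enum class StatusCode {
  SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT,
  OUT_OF_MEMORY, RUNTIME_ERROR, JOB_PENDING, INVALID_JOB_ID,
  RESOURCE_UNAVAILABLE
};

// Hypothetical helper: calls op() up to max_attempts times, retrying only
// while it reports RESOURCE_UNAVAILABLE. Any other status, success or
// failure, is returned immediately. If max_attempts is not positive, op() is
// never called and RESOURCE_UNAVAILABLE is returned.
template <typename Operation>
StatusCode retry_while_unavailable(Operation&& op, int max_attempts,
                                   std::chrono::milliseconds delay) {
  StatusCode status = StatusCode::RESOURCE_UNAVAILABLE;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    status = op();
    if (status != StatusCode::RESOURCE_UNAVAILABLE) {
      return status;
    }
    if (attempt + 1 < max_attempts) {
      std::this_thread::sleep_for(delay);  // Wait before the next attempt.
    }
  }
  return status;
}
```

The callable passed as `op` would wrap whichever VART call returned the status. Only RESOURCE_UNAVAILABLE is retried here, since the other codes (for example INVALID_INPUT) are not described as transient and would most likely fail again.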

◆ TensorDirection

enum class vart::TensorDirection

Enumerates the supported tensor directions in the VART API.

This enum specifies whether a tensor is an input to or an output of model inference.

Enumerator
INPUT 

Input tensor direction.

OUTPUT 

Output tensor direction.

◆ TensorType

enum class vart::TensorType

Specifies the tensor types supported in the VART API.

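This enum distinguishes the tensor metadata defined by the ONNX model (CPU) from the hardware-specific metadata used on AMD AI engines (HW).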

Note
AMD optimizes its AI engines with unique data formats and memory layouts. As a result, the HW tensor layout and format typically differ from the CPU tensor representation defined by the ONNX model.
Enumerator
CPU 

Tensor metadata from the ONNX model, as defined for standard CPU execution.

HW 

AMD hardware-specific tensor metadata, formatted for direct execution on AMD AI engines.