VART-ML 0.3.0
VART (Vitis AI Runtime) ML inference API namespace.
Classes

| Kind | Name | Description |
|---|---|---|
| struct | NpuTensorInfo | Metadata structure describing a tensor used in VART. |
| class | NpuTensor | Represents a tensor in the VART API. |
| struct | QuantParameters | Quantization parameters for a tensor. |
| struct | JobHandle | Job handle for asynchronous execution. |
| class | Runner | Abstract base class for executing model inference operations. |
| class | RunnerFactory | Factory class for creating Runner instances. |
Enumerations

| Kind | Name | Enumerators | Description |
|---|---|---|---|
| enum class | DataType | UNKNOWN, BOOLEAN, INT8, UINT8, INT16, UINT16, BF16, FP16, INT32, UINT32, FLOAT32, INT64, UINT64 | Enumerates the supported data types for tensors in the VART API. |
| enum class | MemoryLayout | UNKNOWN, NC, NCH, NHC, NHW, NWC, NHWC, NCHW, NHWC4, NHWC8, NC4HW4, NC8HW8, HCWNC4, HCWNC8, HCWNC16, NHW16C4WC, NHW16WC4C, GENERIC | Enumerates the supported memory layouts for tensors in the VART API. |
| enum class | MemoryType | UNKNOWN, XRT_BO, DMA_FD, USER_POINTER_CMA, USER_POINTER_NON_CMA | Enumerates the memory types used for tensors in the VART API. |
| enum class | TensorDirection | INPUT, OUTPUT | Enumerates the supported tensor directions in the VART API. |
| enum class | TensorType | CPU, HW | Enumerates the tensor types supported in the VART API. |
| enum class | RunnerType | VAIML | Enumerates the types of runner implementations supported. |
| enum class | RoundingMode | UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO | Enumerates the rounding modes used in quantization. |
| enum class | StatusCode | SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT, OUT_OF_MEMORY, RUNTIME_ERROR, JOB_PENDING, INVALID_JOB_ID, RESOURCE_UNAVAILABLE | Enumerates the status codes used in VART. |
VART (Vitis AI Runtime) ML inference API namespace.
Provides ML inference APIs for AMD NPU hardware, including tensor management (NpuTensor), synchronous and asynchronous inference execution (Runner), and runner instantiation (RunnerFactory).
| enum class DataType | strong |
Enumerates the supported data types for tensors in the VART API.
This enum defines the various data types that can be used to represent tensor elements. It includes integer and floating-point formats, as well as specialized types such as BF16.
The set of data types actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.
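As an illustration of how these data types translate into buffer sizes, the following is a minimal sketch, not part of the VART API. The enum is a local mirror of the enumerators listed above, and `element_size` and `packed_bytes` are hypothetical helpers. The one-byte size for BOOLEAN is an assumption; the source does not state how booleans are stored.

```cpp
#include <cassert>
#include <cstddef>

// Local mirror of the documented enumerators, for illustration only.
// Real code would use the DataType enum from the VART headers.
enum class DataType {
  UNKNOWN, BOOLEAN, INT8, UINT8, INT16, UINT16, BF16, FP16,
  INT32, UINT32, FLOAT32, INT64, UINT64
};

// Hypothetical helper: bytes per element, or 0 for UNKNOWN.
// Assumption: BOOLEAN occupies one byte.
inline std::size_t element_size(DataType t) {
  switch (t) {
    case DataType::BOOLEAN:
    case DataType::INT8:
    case DataType::UINT8:   return 1;
    case DataType::INT16:
    case DataType::UINT16:
    case DataType::BF16:
    case DataType::FP16:    return 2;
    case DataType::INT32:
    case DataType::UINT32:
    case DataType::FLOAT32: return 4;
    case DataType::INT64:
    case DataType::UINT64:  return 8;
    case DataType::UNKNOWN: return 0;
  }
  return 0;
}

// Hypothetical helper: size in bytes of a densely packed buffer
// holding element_count elements of type t.
inline std::size_t packed_bytes(DataType t, std::size_t element_count) {
  assert(t != DataType::UNKNOWN && "element size is undefined for UNKNOWN");
  return element_size(t) * element_count;
}
```

For example, `packed_bytes(DataType::FLOAT32, 1 * 3 * 224 * 224)` yields 602112 bytes for a densely packed 1x3x224x224 tensor. Layouts that pad channel groups may need more space than this.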
| enum class MemoryLayout | strong |
Enumerates the supported memory layouts for tensors in the VART API.
This enum defines the various memory layouts that can be used to represent tensor data. It includes formats such as NHWC, NCHW, and others that specify how tensor dimensions are organized in memory.
The set of memory layouts actually supported is RunnerType-specific. For RunnerType::VAIML, see RunnerType::VAIML Configuration.
| Enumerator | Description |
|---|---|
| UNKNOWN | Unknown memory layout. |
| NC | Model batch, Channels (packed format). |
| NCH | Model batch, Channels (packed format), Height. |
| NHC | Model batch, Height, Channels (packed format). |
| NHW | Model batch, Height, Width. |
| NWC | Model batch, Width, Channels (packed format). |
| NHWC | Model batch, Height, Width, Channels (packed format). |
| NCHW | Model batch, Channels, Height, Width (planar format). |
| NHWC4 | Model batch, Height, Width, Channel groups of 4 (e.g. RGBA). |
| NHWC8 | Model batch, Height, Width, Channel groups of 8. |
| NC4HW4 | Model batch, Channels / 4, Height, Width, Channel groups of 4. |
| NC8HW8 | Model batch, Channels / 8, Height, Width, Channel groups of 8. |
| HCWNC4 | Height, Channels / 4, Width, N = 1, Channel groups of 4. |
| HCWNC8 | Height, Channels / 8, Width, N = 1, Channel groups of 8. |
| HCWNC16 | Height, Channels / 16, Width, N = 1, Channel groups of 16. |
| NHW16C4WC | Model batch, Height, Width / 16, Channels / 4, Width groups of 16, Channel groups of 4. |
| NHW16WC4C | Model batch, Height, Width / 16, Width groups of 16, Channels / 4, Channel groups of 4. |
| GENERIC | Generic layout. See NpuTensorInfo::memory_layout_order for more info. |
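To make the dimension orderings in the table concrete, here is a minimal sketch that maps a logical (n, c, h, w) coordinate to a linear element offset for three of the layouts. It is not VART code: `Shape` and the `offset_*` functions are hypothetical names, and the NC4HW4 variant assumes that the channel dimension is padded up to a multiple of 4, which the source does not confirm.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical logical tensor shape: batch, channels, height, width.
struct Shape {
  std::size_t n, c, h, w;
};

// NHWC: channels vary fastest (packed format).
inline std::size_t offset_nhwc(const Shape& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
  assert(n < s.n && c < s.c && h < s.h && w < s.w);
  return ((n * s.h + h) * s.w + w) * s.c + c;
}

// NCHW: width varies fastest (planar format).
inline std::size_t offset_nchw(const Shape& s, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
  assert(n < s.n && c < s.c && h < s.h && w < s.w);
  return ((n * s.c + c) * s.h + h) * s.w + w;
}

// NC4HW4: [N, C/4, H, W, 4]. Channels are split into groups of 4 and the
// position inside the group varies fastest.
// Assumption: C is padded up to a multiple of 4.
inline std::size_t offset_nc4hw4(const Shape& s, std::size_t n, std::size_t c,
                                 std::size_t h, std::size_t w) {
  assert(n < s.n && c < s.c && h < s.h && w < s.w);
  const std::size_t groups = (s.c + 3) / 4;  // number of channel groups
  return (((n * groups + c / 4) * s.h + h) * s.w + w) * 4 + c % 4;
}
```

The same pattern extends to the other grouped layouts by changing the group size and the order in which the dimensions are nested.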
| enum class MemoryType | strong |
Enumerates the various memory types utilized for tensors in the VART API.
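This enumeration identifies the kind of memory that backs a tensor's data: an XRT buffer object (XRT_BO), a DMA file descriptor (DMA_FD), or a user-supplied pointer to memory that is either CMA-allocated (USER_POINTER_CMA) or not (USER_POINTER_NON_CMA).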
| enum class RoundingMode | strong |
Enumerates the rounding modes used in quantization.
This enum defines the different rounding modes that can be applied during quantization, such as rounding to nearest even or truncating towards zero.
| Enumerator | Description |
|---|---|
| UNKNOWN | Unknown rounding mode. |
| ROUND_TO_NEAREST_EVEN | Round to nearest even value. |
| ROUND_TOWARD_ZERO | Truncate towards zero (no rounding). |
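The difference between the two modes only shows up for values with a fractional part, and most visibly at exact ties such as 2.5. The following sketch illustrates the arithmetic. The enum is a local mirror of the documented enumerators and `round_with_mode` is a hypothetical helper, not a VART function. How VART applies the mode inside its quantization path is not described here.

```cpp
#include <cassert>
#include <cmath>

// Local mirror of the documented enumerators, for illustration only.
enum class RoundingMode { UNKNOWN, ROUND_TO_NEAREST_EVEN, ROUND_TOWARD_ZERO };

// Hypothetical helper: round x to an integral value using the given mode.
// For UNKNOWN the input is returned unchanged (an arbitrary choice made
// for this sketch).
inline double round_with_mode(double x, RoundingMode mode) {
  assert(std::isfinite(x));
  switch (mode) {
    case RoundingMode::ROUND_TO_NEAREST_EVEN: {
      // Implemented by hand so the result does not depend on the
      // current floating-point rounding environment.
      const double lower = std::floor(x);
      const double frac = x - lower;
      if (frac < 0.5) return lower;
      if (frac > 0.5) return lower + 1.0;
      // Exact tie: choose whichever neighbor is even.
      return (std::fmod(lower, 2.0) == 0.0) ? lower : lower + 1.0;
    }
    case RoundingMode::ROUND_TOWARD_ZERO:
      return std::trunc(x);  // drops the fractional part
    case RoundingMode::UNKNOWN:
      break;
  }
  return x;
}
```

With this helper, 2.5 and 3.5 round to 2 and 4 under ROUND_TO_NEAREST_EVEN, while ROUND_TOWARD_ZERO maps 2.7 to 2 and -2.7 to -2.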
| enum class RunnerType | strong |
Enumerates the types of runner implementations supported.
| enum class StatusCode | strong |
Enumerates the status codes used in VART.
This enum defines the various status codes that can be returned by VART functions, indicating the success or failure of an operation.
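Callers typically want to log a status code by name and decide whether an operation is worth retrying. The sketch below shows one way to do that. The enum is a local mirror of the documented enumerators, and `to_string` and `is_transient` are hypothetical helpers. Treating JOB_PENDING and RESOURCE_UNAVAILABLE as transient is a guess based on their names, not documented behavior.

```cpp
#include <cassert>
#include <string>

// Local mirror of the documented enumerators, for illustration only.
enum class StatusCode {
  SUCCESS = 0, FAILURE, INVALID_INPUT, INVALID_OUTPUT, OUT_OF_MEMORY,
  RUNTIME_ERROR, JOB_PENDING, INVALID_JOB_ID, RESOURCE_UNAVAILABLE
};

// Hypothetical helper: enumerator name for logging.
inline std::string to_string(StatusCode s) {
  switch (s) {
    case StatusCode::SUCCESS:              return "SUCCESS";
    case StatusCode::FAILURE:              return "FAILURE";
    case StatusCode::INVALID_INPUT:        return "INVALID_INPUT";
    case StatusCode::INVALID_OUTPUT:       return "INVALID_OUTPUT";
    case StatusCode::OUT_OF_MEMORY:        return "OUT_OF_MEMORY";
    case StatusCode::RUNTIME_ERROR:        return "RUNTIME_ERROR";
    case StatusCode::JOB_PENDING:          return "JOB_PENDING";
    case StatusCode::INVALID_JOB_ID:       return "INVALID_JOB_ID";
    case StatusCode::RESOURCE_UNAVAILABLE: return "RESOURCE_UNAVAILABLE";
  }
  return "UNRECOGNIZED";
}

// Hypothetical helper: true for codes that may clear up on their own.
// Assumption: this classification is inferred from the enumerator names.
inline bool is_transient(StatusCode s) {
  return s == StatusCode::JOB_PENDING ||
         s == StatusCode::RESOURCE_UNAVAILABLE;
}
```

Because only SUCCESS has an explicit value, the remaining enumerators take consecutive values from 1 (FAILURE) to 8 (RESOURCE_UNAVAILABLE).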
| enum class TensorDirection | strong |
Enumerates the supported tensor directions in the VART API.
| enum class TensorType | strong |
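Enumerates the tensor types supported in the VART API.
A tensor type selects which view of a tensor's metadata is described: the view defined by the ONNX model (CPU) or the AMD hardware-specific view (HW).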
| Enumerator | Description |
|---|---|
| CPU | Tensor metadata from the ONNX model, as defined for standard CPU execution. |
| HW | AMD hardware-specific tensor metadata, formatted for direct execution on AMD AI engines. |