This is the API reference for AMD's VART-ML library, a high-performance C++ runtime for ML inference on AMD NPU hardware that exposes a unified API across AMD AI platforms. It documents the individual classes, methods, and types.

VART (Vitis AI Runtime) names the runtime and its C++ namespace (vart); VART-ML is its ML inference interface, which supports both hardware-native and ONNX-compatible tensor views. VART-ML offers zero-copy data paths, synchronous and asynchronous execution, and fine-grained control over tensor memory placement, all behind a small, consistent API. Applications interact only with the abstract interfaces; the concrete backend is selected through vart::RunnerType.

Key Classes

vart::RunnerFactory – Creates vart::Runner instances for a given vart::RunnerType.
vart::Runner – Abstract inference interface: synchronous and asynchronous execution, tensor allocation, and tensor metadata queries.
vart::NpuTensor – Handle that wraps a tensor buffer together with its metadata.
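
The relationship between the factory, the abstract runner, and the runner type can be illustrated with a small standalone mock. Everything below lives in a hypothetical sketch namespace and is an assumption made for illustration: the method names (create, execute, backend_name), the Mock enumerator, and the use of std::vector<float> in place of tensors are not taken from VART-ML. See the individual class pages for the real signatures.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins that mirror the design only; NOT the VART-ML declarations.
namespace sketch {

// Selects which concrete backend the factory builds.
enum class RunnerType { Mock };

// Abstract interface that application code programs against.
class Runner {
public:
    virtual ~Runner() = default;
    virtual std::vector<float> execute(const std::vector<float>& input) = 0;
    virtual std::string backend_name() const = 0;
};

// One concrete backend, hidden behind the interface. It doubles each input value.
class MockRunner final : public Runner {
public:
    std::vector<float> execute(const std::vector<float>& input) override {
        std::vector<float> out;
        out.reserve(input.size());
        for (float v : input) out.push_back(v * 2.0f);
        return out;
    }
    std::string backend_name() const override { return "mock"; }
};

// Factory: the enum value is the only place where the backend is named.
class RunnerFactory {
public:
    static std::unique_ptr<Runner> create(RunnerType type) {
        switch (type) {
            case RunnerType::Mock: return std::make_unique<MockRunner>();
        }
        throw std::invalid_argument("unknown RunnerType");
    }
};

}  // namespace sketch

// Usage: the caller holds only a pointer to the abstract Runner.
inline float run_once() {
    auto runner = sketch::RunnerFactory::create(sketch::RunnerType::Mock);
    const std::vector<float> out = runner->execute({1.0f, 2.0f});
    assert(out.size() == 2);
    return out[1];
}
```

Because the caller never names MockRunner, adding a second backend in this mock would only require a new enumerator and a new case in the factory.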

Supporting Types

Enums: vart::RunnerType, vart::MemoryType, vart::TensorType, vart::TensorDirection, vart::DataType, vart::MemoryLayout, vart::RoundingMode, vart::StatusCode.
Structs: vart::NpuTensorInfo, vart::JobHandle, vart::QuantParameters.

Headers

Include vart/vart_runner_factory.hpp for the full API. It pulls in vart/vart_npu_tensor.hpp internally, so all tensor types and enums are available through this single include. Include vart/vart_npu_tensor.hpp directly only when working with tensors without the Runner interface.
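
As a minimal sketch of the single-include approach, the fragment below pulls in the factory header only when the compiler can find it. The SKETCH_HAVE_VART macro and the vart_headers_found helper are hypothetical names introduced here, and the fragment assumes a compiler with __has_include support (standard since C++17) and that the VART-ML headers are installed under a vart/ directory on the include path.

```cpp
#include <cassert>

// Assumption: VART-ML headers are reachable as <vart/...> on the include path.
#if defined(__has_include)
#  if __has_include(<vart/vart_runner_factory.hpp>)
#    include <vart/vart_runner_factory.hpp>  // also brings in vart/vart_npu_tensor.hpp
#    define SKETCH_HAVE_VART 1               // hypothetical macro, not defined by VART-ML
#  endif
#endif
#ifndef SKETCH_HAVE_VART
#  define SKETCH_HAVE_VART 0
#endif

// Hypothetical helper: reports at compile time whether the header was found.
constexpr bool vart_headers_found() { return SKETCH_HAVE_VART == 1; }
```

A project that always builds against VART-ML can drop the guard and write the plain #include line on its own.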

RunnerType Configuration

Each vart::RunnerType has its own model loading mechanism, configuration options, and supported formats. Refer to the RunnerType-specific page for details:

RunnerType::VAIML Configuration