Class vart::Runner#

class Runner : public vart::BaseRunner<const void**, void**>#

The Runner class provides the API to use the runner.

A runner instance exposes member functions to control execution and to query its input and output tensors.

Sample code:

// This example assumes that you have a snapshot stored in the model_path.
// The way to create a runner to run the snapshot is shown below.

// create runner
auto runner = vart::Runner::create_runner(model_path, in_shape_format, out_shape_format);
// get input tensors
auto input_tensors = runner->get_input_tensors();
// get output tensors
auto output_tensors = runner->get_output_tensors();
// run runner
auto v = runner->execute_async(input, output);
auto status = runner->wait((int)v.first, 1000000000);

Public Functions

virtual ~Runner() = default#
virtual std::pair<uint32_t, int> execute_async(const void **input, void **output) = 0#

Executes the runner.

This is a non-blocking function.

Parameters:
  • input – An array of pointers to the input buffers. For a model with N input layers and a snapshot with batch size B, N*B pointers must be given; input n of batch b is located at position [b*N + n]. To execute an incomplete batch, pass NULL pointers for the unused batches. Each non-null pointer must point to a buffer with enough memory: if native format is enabled, the size is obtained from get_native_size; otherwise it is the size from the original model, given by the tensor structure.

  • output – An array of pointers to the output buffers, laid out like the input array.

Returns:

pair<jobid, status>; status 0 on success, other values for customized warnings or errors.

virtual std::pair<uint32_t, int> execute_async(const uint64_t *input, uint64_t *output) = 0#

Executes the runner.

This is a non-blocking function.

Parameters:
  • input – An array of physical addresses of the input buffers. For a model with N input layers and a snapshot with batch size B, N*B addresses must be given; input n of batch b is located at position [b*N + n]. To execute an incomplete batch, pass 0 for the unused batches. Each non-zero address must point to a buffer with enough memory: the size is obtained from get_native_size, and the data must be contiguous in physical memory.

  • output – An array of physical addresses of the output buffers, laid out like the input array.

Returns:

pair<jobid, status>; status 0 on success, other values for customized warnings or errors.

virtual int execute(const void **input, void **output, int jobid = -1) = 0#

Executes the runner.

This is a blocking function.

Parameters:
  • input – An array of pointers to the input buffers. For a model with N input layers and a snapshot with batch size B, N*B pointers must be given; input n of batch b is located at position [b*N + n]. To execute an incomplete batch, pass NULL pointers for the unused batches. Each non-null pointer must point to a buffer with enough memory: if native format is enabled, the size is obtained from get_native_size; otherwise it is the size from the original model, given by the tensor structure.

  • output – An array of pointers to the output buffers, laid out like the input array.

  • jobid – Job ID; negative for any ID, otherwise a specific job ID.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int execute(const uint64_t *input, uint64_t *output, int jobid = -1) = 0#

Executes the runner.

This is a blocking function.

Parameters:
  • input – An array of physical addresses of the input buffers. For a model with N input layers and a snapshot with batch size B, N*B addresses must be given; input n of batch b is located at position [b*N + n]. To execute an incomplete batch, pass 0 for the unused batches. Each non-zero address must point to a buffer with enough memory: the size is obtained from get_native_size, and the data must be contiguous in physical memory.

  • output – An array of physical addresses of the output buffers, laid out like the input array.

  • jobid – Job ID; negative for any ID, otherwise a specific job ID.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::vector<const npu_tensor_t*> get_input_tensors(void) = 0#

Get all input tensors of the runner.

Sample code:

auto inputTensors = runner->get_input_tensors();
for (auto input : inputTensors) {
    input->name;
    input->size;
    input->native_size;
    input->shape;
    input->coeff;
    input->data_type;
}

Returns:

All input tensors, as a vector of raw pointers to the tensors.

virtual std::vector<const npu_tensor_t*> get_output_tensors(void) = 0#

Get all output tensors of runner.

Sample code:

auto outputTensors = runner->get_output_tensors();
for (auto output : outputTensors) {
    output->name;
    output->size;
    output->native_size;
    output->shape;
    output->coeff;
    output->data_type;
}
Returns:

All output tensors, as a vector of raw pointers to the tensors.

virtual const npu_tensor_t *get_tensor(const std::string &name) = 0#

Return the input/output tensor with the given name.

Parameters:

name – Name of the tensor.

virtual size_t get_batch_size(void) = 0#

Return the batch size of the snapshot.

This is the maximum size of a batch of images the engine can process in a single call.

Returns:

The maximum batch size supported by the snapshot.

virtual int set_input_cacheable_attribute(bool value) = 0#

Specify cacheability of memory region where input data is stored.

If this attribute is set to true, input data is assumed to be in a cacheable memory region and copying will be skipped. Input data copying is performed by default.

This method is intended purely for performance tuning. Depending on whether the assumption reflects reality, performance can either increase or decrease.

Only affects input tensors of AIE nodes.

Parameters:

value – Boolean to specify if input data is assumed to be in a cacheable memory region.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_cacheable_attribute(bool value) = 0#

Specify cacheability of memory region where output data is stored.

If this attribute is set to true, output data is assumed to be in a cacheable memory region and copying will be skipped. Output data copying is performed by default.

This method is intended purely for performance tuning. Depending on whether the assumption reflects reality, performance can either increase or decrease.

Only affects output tensors of AIE nodes.

Parameters:

value – Boolean to specify if output data is assumed to be in a cacheable memory region.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual void *malloc_buffer(uint64_t size, uint8_t ddr = 0) = 0#

Return a pointer to a buffer of the given size.

The buffer is allocated by the runner and freed by the runner's destructor.

Parameters:
  • size – Size of the buffer.

  • ddr – Index of the target DDR. On a multi-DDR platform, pass the index of the DDR where the allocation should occur. Distributing data payloads over multiple DDRs can improve performance.

Returns:

Pointer to the newly created buffer.

virtual void free_buffer(void *buffer_ptr) = 0#

Free a previously allocated buffer.

The buffer allocation must have been done through the malloc_buffer function.

Parameters:

buffer_ptr – Pointer to the buffer.

virtual uint64_t get_physical_addr(void *buffer_ptr) = 0#

Return the physical address of a buffer.

The buffer allocation must have been done through the malloc_buffer function.

Parameters:

buffer_ptr – Pointer to the buffer.

Returns:

Physical address.

virtual uint8_t get_nb_ddrs(void) = 0#

Return the number of DDRs used by the NPU.

Returns:

Number of DDRs.

virtual std::vector<uint32_t> get_shape(const npu_tensor_t *tensor, bool include_batch_size = false) = 0#

Return the shape of a tensor (by default, without batch size).

Parameters:
  • tensor – Tensor to get the shape from.

  • include_batch_size – If true, keep batch size in shape.

Returns:

The shape of the tensor.

virtual std::string get_shape_format(const npu_tensor_t *tensor) = 0#

Get the shape format of a tensor.

Parameters:

tensor – Tensor to get the shape format from.

Returns:

The shape format of the tensor.

virtual int set_shape_format(const npu_tensor_t *tensor, std::string format) = 0#

Select the shape format of a tensor.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:
  • tensor – Tensor of which the shape format will be changed.

  • format – New shape format of the input or output tensor.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::string get_native_shape_format(const npu_tensor_t *tensor) = 0#

Get the native shape format of a tensor.

Parameters:

tensor – Tensor to get the native shape format from.

Returns:

The native shape format of the tensor.

virtual int set_input_shape_formats(std::string format) = 0#

Select the shape format for all input tensors of the graph.

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

format – New shape format of the input tensors.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_shape_formats(std::string format) = 0#

Select the shape format for all output tensors of the graph.

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:

format – New shape format of the output tensors.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual DataType get_data_type(const npu_tensor_t *tensor) = 0#

Get the data type of a tensor.

Parameters:

tensor – Tensor to get the data type from.

Returns:

DataType The data type of the given tensor.

virtual int set_data_type(const npu_tensor_t *tensor, DataType data_type) = 0#

Select the data type of a tensor.

If the data type differs from the one expected by the snapshot, a conversion will be applied during execution if possible.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:
  • tensor – Tensor of which the data type will be changed.

  • data_type – Data type of the data provided/expected by user for the tensor data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_input_data_types(DataType data_type) = 0#

Select the data type for all input tensors of the graph.

If the data types differ from the ones expected by the snapshot, conversions will be applied during execution if possible.

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

data_type – Data types of the data provided/expected by user for the input tensors data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_data_types(DataType data_type) = 0#

Select the data type for all output tensors of the graph.

If the data types differ from the ones expected by the snapshot, conversions will be applied during execution if possible.

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:

data_type – Data types of the data provided/expected by user for the output tensors data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual size_t get_type_size(const npu_tensor_t *tensor) = 0#

Return the size of a tensor’s data type.

Parameters:

tensor – Tensor to get the data type size from.

Returns:

The size of the data type of the tensor.

virtual size_t get_size(const npu_tensor_t *tensor) = 0#

Return the size of a tensor’s element (without batch size).

Parameters:

tensor – Tensor to get the element size from.

Returns:

The size of an element of the tensor.

virtual size_t get_native_size(const npu_tensor_t *tensor) = 0#

Return the native size in DDR of a tensor.

Parameters:

tensor – Tensor to get the native size from.

Returns:

The native size of the tensor.

virtual int set_native_format(const npu_tensor_t *tensor, int format) = 0#

Select the data format of a tensor.

The native format currently supports three values:

  • 0: non native format (pointer is virtual) (equivalent to ‘false’ in the previous API version)

  • 1: native format and pointer is virtual (equivalent to ‘true’ in the previous API version)

  • 2: native format and pointer is physical

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:
  • tensor – Tensor of which the data format will be changed.

  • format – Indicates the data format of the data provided/expected by user for the tensor data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_input_native_formats(int format) = 0#

Select the data format for all input tensors of the graph.

The native format currently supports three values:

  • 0: non native format (pointer is virtual) (equivalent to ‘false’ in the previous API version)

  • 1: native format and pointer is virtual (equivalent to ‘true’ in the previous API version)

  • 2: native format and pointer is physical

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

format – Indicates the data format of the data provided by user for the input tensors data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_native_formats(int format) = 0#

Select the data format for all output tensors of the graph.

The native format currently supports three values:

  • 0: non native format (pointer is virtual) (equivalent to ‘false’ in the previous API version)

  • 1: native format and pointer is virtual (equivalent to ‘true’ in the previous API version)

  • 2: native format and pointer is physical

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:

format – Indicates the data format of the data expected by user for the output tensors data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::vector<uint32_t> get_strides(const npu_tensor_t *tensor) = 0#

Return the strides of a graph’s tensor.

Parameters:

tensor – Tensor to get the strides from.

Returns:

The strides of the tensor.

virtual int set_strides(const npu_tensor_t *tensor, std::vector<uint32_t> strides) = 0#

Set the strides of a graph’s tensor.

The data is considered packed by default. If padding was applied, the strides must be adjusted accordingly.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:
  • tensor – Tensor to set the strides to.

  • strides – New strides for the tensor.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::pair<std::uint32_t, int> execute_async(InputType input, OutputType output) = 0#

Executes the runner with customized input and output types.

This is a non-blocking function.

Parameters:
  • input – inputs with a customized type

  • output – outputs with a customized type

Returns:

pair<jobid, status>; status 0 on success, other values for customized warnings or errors.

virtual int execute(InputType input, OutputType output, int jobid) = 0#

Executes the runner with customized input and output types.

This is a blocking function.

Parameters:
  • input – inputs with a customized type

  • output – outputs with a customized type

  • jobid – Job ID; negative for any ID, otherwise a specific job ID.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int wait(int jobid, int timeout = -1) = 0#

Waits for the end of DPU processing.

There are four modes:
  1. Blocking wait for a specific ID.
  2. Non-blocking wait for a specific ID.
  3. Blocking wait for any ID.
  4. Non-blocking wait for any ID.

Parameters:
  • jobid – Job ID; negative for any ID, otherwise a specific job ID.

  • timeout – Timeout; negative to block forever, 0 for non-blocking, positive to block with a limit in milliseconds.

Returns:

Status; 0 on success, other values for customized warnings or errors.

Public Static Functions

static std::unique_ptr<Runner> create_runner(const std::string &model_path, const std::string &in_shape_format = "NHWC", const std::string &out_shape_format = "NHWC", const std::optional<std::vector<std::string>> output_names = std::nullopt, bool aie_only = false)#

Factory function to create a runner instance from a snapshot.

Sample code:

// This API can be used like:
auto runner = vart::Runner::create_runner(model_path, in_shape_format, out_shape_format);
Parameters:

model_path – Snapshot directory.

Returns:

An instance of runner.