Namespace vart

namespace vart#

Typedefs

typedef struct vart::npu_tensor npu_tensor_t#

Structure of a tensor.

This structure contains all the parameters needed to define a tensor.

Enums

enum DataType#

Enum that lists the data types supported by the VART API.

Values:

enumerator INT8#

enumerator FLOAT32#

enumerator UINT8#

enumerator BF16#

enumerator INT64#

enumerator UNKNOWN#

Variables

static const int sys_log_level[] = {0, LOG_ERR, LOG_WARNING, LOG_NOTICE, LOG_INFO, LOG_DEBUG}#

static Logger &obj = Logger::get_instance()#

struct npu_tensor#

#include <runner.h>

Structure of a tensor.

This structure contains all the parameters needed to define a tensor.

template<typename InputType, typename OutputType> class BaseRunner#

Subclassed by Runner

Public Functions

virtual std::pair<std::uint32_t, int> execute_async(InputType input, OutputType output) = 0#

execute_async

Parameters:

input – inputs with a customized type
output – outputs with a customized type

Returns:

pair<jobid, status>. The status is 0 on success; any other value is a custom warning or error code.

virtual int execute(InputType input, OutputType output, int jobid) = 0#

execute

Parameters:

input – inputs with a customized type
output – outputs with a customized type
jobid – job id

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int wait(int jobid, int timeout = -1) = 0#

wait

Four modes are supported:

1. Blocking wait for a specific ID.
2. Non-blocking wait for a specific ID.
3. Blocking wait for any ID.
4. Non-blocking wait for any ID.

Parameters:

jobid – Job ID. A negative value waits for any job; any other value waits for that specific job.
timeout – Timeout in milliseconds. A negative value blocks forever, 0 does not block, and a positive value blocks for at most that many milliseconds.

Returns:

Status. 0 on success; any other value is a custom warning or error code.
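
The way jobid and timeout combine into the four modes can be sketched as a small classifier. This is only an illustration of the rules listed above: the WaitMode enum and classify_wait() are hypothetical names that are not part of the VART headers, and a positive timeout is treated as a (bounded) blocking wait.

```cpp
#include <cassert>

// Hypothetical names: the VART headers do not define this enum.
enum class WaitMode {
  BlockingSpecific,
  NonBlockingSpecific,
  BlockingAny,
  NonBlockingAny,
};

// Classify a (jobid, timeout) pair using the rules documented for wait():
//   jobid   <  0 -> any job, otherwise that specific job
//   timeout == 0 -> non-blocking, otherwise blocking (forever if negative,
//                   for at most `timeout` milliseconds if positive)
inline WaitMode classify_wait(int jobid, int timeout) {
  const bool any_job = jobid < 0;
  const bool blocking = timeout != 0;
  if (any_job) {
    return blocking ? WaitMode::BlockingAny : WaitMode::NonBlockingAny;
  }
  return blocking ? WaitMode::BlockingSpecific : WaitMode::NonBlockingSpecific;
}
```

For instance, classify_wait(-1, 0) corresponds to polling for any finished job without blocking.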

class Runner : public vart::BaseRunner<const void**, void**>#

#include <runner.h>

Class of the Runner, provides API to use the runner.

The runner instance has a number of member functions to control the execution and get the input and output tensors of the runner.

Sample code:

// This example assumes that a snapshot is stored in model_path.
// It shows how to create a runner and run the snapshot.

// create runner
auto runner = vart::Runner::create_runner(model_path, in_shape_format, out_shape_format);
// get input tensors
auto input_tensors = runner->get_input_tensors();
// get output tensors
auto output_tensors = runner->get_output_tensors();
// run runner (input and output are the buffer pointer arrays described under execute_async)
auto v = runner->execute_async(input, output);
// wait for the job to finish; the timeout is given in milliseconds
auto status = runner->wait((int)v.first, 1000000000);

Public Functions

virtual std::pair<uint32_t, int> execute_async(const void **input, void **output) = 0#

Executes the runner.

This is a non-blocking function.

Parameters:

input – An array of pointers to the input buffers. For a model with N input layers and a snapshot with batch size B, N*B pointers must be given. Input n of batch b is located at position [b*N + n]. To run an incomplete batch, pass a NULL pointer for the unused batches. Each non-null pointer must point to a buffer with enough memory: if native format is enabled, the size is obtained from get_native_size; otherwise, it is the size in the original model, taken from the tensor structure.
output – An array of pointers to the output buffers. The layout is the same as for the input.

Returns:

pair<jobid, status>. The status is 0 on success; any other value is a custom warning or error code.
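
The [b*N + n] layout and the use of NULL for unused batches can be sketched as follows. This is a minimal illustration, not code from the library: slot_index() and make_input_array() are hypothetical helpers, and the buffers themselves are assumed to be allocated and sized elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Position of input n of batch b in the flat pointer array, for a model
// with num_inputs input layers: [b * N + n].
inline std::size_t slot_index(std::size_t b, std::size_t n,
                              std::size_t num_inputs) {
  assert(n < num_inputs);
  return b * num_inputs + n;
}

// Build the N*B pointer array. `used` holds one vector of N buffers per
// batch entry that is actually filled; the remaining batch entries stay
// nullptr, which marks them as unused.
inline std::vector<const void*> make_input_array(
    const std::vector<std::vector<const void*>>& used,
    std::size_t num_inputs, std::size_t batch_size) {
  assert(used.size() <= batch_size);
  std::vector<const void*> flat(num_inputs * batch_size, nullptr);
  for (std::size_t b = 0; b < used.size(); ++b) {
    assert(used[b].size() == num_inputs);
    for (std::size_t n = 0; n < num_inputs; ++n) {
      flat[slot_index(b, n, num_inputs)] = used[b][n];
    }
  }
  return flat;
}
```

The data() pointer of the returned vector has type const void**, which matches the input parameter of this overload.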

virtual std::pair<uint32_t, int> execute_async(const uint64_t *input, uint64_t *output) = 0#

Executes the runner.

This is a non-blocking function.

Parameters:

input – An array of physical addresses of the input buffers. For a model with N input layers and a snapshot with batch size B, N*B addresses must be given. Input n of batch b is located at position [b*N + n]. To run an incomplete batch, pass 0 for the unused batches. Each non-null address must point to a buffer with enough memory: the size is obtained from get_native_size, and the data must be contiguous in physical memory.
output – An array of physical addresses of the output buffers. The layout is the same as for the input.

virtual int execute(const void **input, void **output, int jobid = -1) = 0#

Executes the runner.

This is a blocking function.

Parameters:

input – An array of pointers to the input buffers. For a model with N input layers and a snapshot with batch size B, N*B pointers must be given. Input n of batch b is located at position [b*N + n]. To run an incomplete batch, pass a NULL pointer for the unused batches. Each non-null pointer must point to a buffer with enough memory: if native format is enabled, the size is obtained from get_native_size; otherwise, it is the size in the original model, taken from the tensor structure.
output – An array of pointers to the output buffers. The layout is the same as for the input.
jobid – Job ID. A negative value means any ID; any other value selects a specific job ID.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int execute(const uint64_t *input, uint64_t *output, int jobid = -1) = 0#

Executes the runner.

This is a blocking function.

Parameters:

input – An array of physical addresses of the input buffers. For a model with N input layers and a snapshot with batch size B, N*B addresses must be given. Input n of batch b is located at position [b*N + n]. To run an incomplete batch, pass 0 for the unused batches. Each non-null address must point to a buffer with enough memory: the size is obtained from get_native_size, and the data must be contiguous in physical memory.
output – An array of physical addresses of the output buffers. The layout is the same as for the input.
jobid – Job ID. A negative value means any ID; any other value selects a specific job ID.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::vector<const npu_tensor_t*> get_input_tensors(void) = 0#

Get all input tensors of the runner.

Sample code:

auto inputTensors = runner->get_input_tensors();
for (auto input : inputTensors) {
    input->name;
    input->size;
    input->native_size;
    input->shape;
    input->coeff;
    input->data_type;
}

Returns:

All input tensors, as a vector of raw pointers to the input tensors.

virtual std::vector<const npu_tensor_t*> get_output_tensors(void) = 0#

Get all output tensors of runner.

Sample code:

auto outputTensors = runner->get_output_tensors();
for (auto output : outputTensors) {
    output->name;
    output->size;
    output->native_size;
    output->shape;
    output->coeff;
    output->data_type;
}

Returns:: All output tensors, as a vector of raw pointers to the output tensors.

virtual const npu_tensor_t *get_tensor(const std::string &name) = 0#

Return a copy of the input/output tensor with the given name.

Parameters:: name – Name of the tensor.

virtual size_t get_batch_size(void) = 0#

Return the batch size of the snapshot.

This is the maximum size of a batch of images the engine can process in a single call.

Returns:: The maximum batch size supported by the snapshot.

virtual int set_input_cacheable_attribute(bool value) = 0#

Specify cacheability of memory region where input data is stored.

If this attribute is set to true, input data is assumed to be in a cacheable memory region and copying will be skipped. Input data copying is performed by default.

This method is intended purely for performance tuning. Depending on whether the assumption reflects reality, performance can either increase or decrease.

Only affects input tensors of AIE nodes.

Parameters:: value – Boolean to specify if input data is assumed to be in a cacheable memory region.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_cacheable_attribute(bool value) = 0#

Specify cacheability of memory region where output data is stored.

If this attribute is set to true, output data is assumed to be in a cacheable memory region and copying will be skipped. Output data copying is performed by default.

This method is intended purely for performance tuning. Depending on whether the assumption reflects reality, performance can either increase or decrease.

Only affects output tensors of AIE nodes.

Parameters:: value – Boolean to specify if output data is assumed to be in a cacheable memory region.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual void *malloc_buffer(uint64_t size, uint8_t ddr = 0) = 0#

Return a pointer to a buffer of the given size.

The runner allocates the buffer, and the runner's destructor frees it.

Parameters:

size – Size of the buffer.
ddr – Index of the target DDR. On a multi-DDR platform, pass the index of the DDR where allocation will occur. Data payload distribution over multiple DDRs can improve the performance.

Returns:

Pointer to the newly created buffer.

virtual void free_buffer(void *buffer_ptr) = 0#

Free a previously allocated buffer.

The buffer allocation must have been done through the malloc_buffer function.

Parameters:: buffer_ptr – Pointer to the buffer.
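
Pairing malloc_buffer with free_buffer can be sketched with a scoped wrapper. The sketch below does not use the real Runner class: BufferAllocator is a hypothetical stand-in that exposes only the two signatures documented above, and HeapAllocator is a host-memory fake so that the wrapper can be exercised without hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in exposing only the two documented signatures.
struct BufferAllocator {
  virtual ~BufferAllocator() = default;
  virtual void* malloc_buffer(std::uint64_t size, std::uint8_t ddr = 0) = 0;
  virtual void free_buffer(void* buffer_ptr) = 0;
};

// Host-memory fake used to exercise the wrapper without hardware.
struct HeapAllocator : BufferAllocator {
  int live = 0;  // number of buffers currently allocated
  void* malloc_buffer(std::uint64_t size, std::uint8_t /*ddr*/ = 0) override {
    void* p = std::malloc(static_cast<std::size_t>(size));
    if (p != nullptr) ++live;
    return p;
  }
  void free_buffer(void* buffer_ptr) override {
    std::free(buffer_ptr);
    --live;
  }
};

// Deleter that hands the buffer back to the allocator that produced it.
// std::unique_ptr only invokes it for non-null pointers.
struct BufferReleaser {
  BufferAllocator* owner;
  void operator()(void* p) const { owner->free_buffer(p); }
};
using ScopedBuffer = std::unique_ptr<void, BufferReleaser>;

inline ScopedBuffer make_scoped_buffer(BufferAllocator& owner,
                                       std::uint64_t size,
                                       std::uint8_t ddr = 0) {
  return ScopedBuffer(owner.malloc_buffer(size, ddr), BufferReleaser{&owner});
}
```

With a wrapper of this kind, the scoped buffer must not outlive the object that allocated it, since the deleter calls back into that object.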

virtual uint64_t get_physical_addr(void *buffer_ptr) = 0#

Return the physical address of a buffer.

The buffer allocation must have been done through the malloc_buffer function.

Parameters:: buffer_ptr – Pointer to the buffer.
Returns:: Physical address.

virtual uint8_t get_nb_ddrs(void) = 0#

Return the number of DDRs used by the NPU.

Returns:: Number of DDRs.

virtual std::vector<uint32_t> get_shape(const npu_tensor_t *tensor, bool include_batch_size = false) = 0#

Return the shape of a tensor (by default, without batch size).

Parameters:

tensor – Tensor to get the shape from.
include_batch_size – If true, keep batch size in shape.

Returns:

The shape of the tensor.

virtual std::string get_shape_format(const npu_tensor_t *tensor) = 0#

Get the shape format of a tensor.

Parameters:: tensor – Tensor to get the shape format from.
Returns:: The shape format of the tensor.

virtual int set_shape_format(const npu_tensor_t *tensor, std::string format) = 0#

Select the shape format of a tensor.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

tensor – Tensor of which the shape format will be changed.
format – New shape format of the input or output tensor.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual std::string get_native_shape_format(const npu_tensor_t *tensor) = 0#

Get the native shape format of a tensor.

Parameters:: tensor – Tensor to get the native shape format from.
Returns:: The native shape format of the tensor.

virtual int set_input_shape_formats(std::string format) = 0#

Select the shape format for all input tensors of the graph.

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:: format – New shape format of the input tensors.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_shape_formats(std::string format) = 0#

Select the shape format for all output tensors of the graph.

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:: format – New shape format of the output tensors.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual DataType get_data_type(const npu_tensor_t *tensor) = 0#

Get the data type of a tensor.

Parameters:: tensor – Tensor to get the data type from.
Returns:: DataType The data type of the given tensor.

virtual int set_data_type(const npu_tensor_t *tensor, DataType data_type) = 0#

Select the data type of a tensor.

If the data type is different from the one expected by the snapshot, a conversion is applied during execution if possible.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

tensor – Tensor of which the data type will be changed.
data_type – Data type of the data provided/expected by user for the tensor data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_input_data_types(DataType data_type) = 0#

Select the data type for all input tensors of the graph.

If the data types are different from the ones expected by the snapshot, conversions are applied during execution if possible.

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:: data_type – Data types of the data provided/expected by user for the input tensors data.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_data_types(DataType data_type) = 0#

Select the data type for all output tensors of the graph.

If the data types are different from the ones expected by the snapshot, conversions are applied during execution if possible.

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:: data_type – Data types of the data provided/expected by user for the output tensors data.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual size_t get_type_size(const npu_tensor_t *tensor) = 0#

Return the size of a tensor’s data type.

Parameters:: tensor – Tensor to get the data type size from.
Returns:: The size of the data type of the tensor.

virtual size_t get_size(const npu_tensor_t *tensor) = 0#

Return the size of a tensor’s element (without batch size).

Parameters:: tensor – Tensor to get the element size from.
Returns:: The size of an element of the tensor.

virtual size_t get_native_size(const npu_tensor_t *tensor) = 0#

Return the native size in DDR of a tensor.

Parameters:: tensor – Tensor to get the native size from.
Returns:: The native size of the tensor.

virtual int set_native_format(const npu_tensor_t *tensor, int format) = 0#

Select the data format of a tensor.

The native format currently supports three values:

0: non-native format, virtual pointer (equivalent to ‘false’ in the previous API version)
1: native format, virtual pointer (equivalent to ‘true’ in the previous API version)
2: native format, physical pointer

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

tensor – Tensor of which the data format will be changed.
format – Indicates the data format of the data provided/expected by user for the tensor data.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.
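
The three integer values can be given names to make call sites easier to read. The sketch below is an assumption-laden illustration: NativeFormat and the helper functions are hypothetical and do not exist in the VART headers, which take a plain int.

```cpp
#include <cassert>

// Hypothetical named constants for the three documented integer values.
enum class NativeFormat : int {
  NonNative = 0,       // non-native layout, virtual pointer
  NativeVirtual = 1,   // native layout, virtual pointer
  NativePhysical = 2,  // native layout, physical address
};

// Map the boolean used by the previous API version onto the integer value.
constexpr int from_legacy_flag(bool native) {
  return native ? static_cast<int>(NativeFormat::NativeVirtual)
                : static_cast<int>(NativeFormat::NonNative);
}

// True when buffers are described by physical addresses.
constexpr bool uses_physical_address(NativeFormat f) {
  return f == NativeFormat::NativePhysical;
}

// True when the data follows the native layout, in which case buffer sizes
// come from get_native_size rather than from the original model.
constexpr bool uses_native_layout(NativeFormat f) {
  return f != NativeFormat::NonNative;
}
```

A call site could then pass static_cast<int>(NativeFormat::NativePhysical) as the format argument.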

virtual int set_input_native_formats(int format) = 0#

Select the data format for all input tensors of the graph.

The native format currently supports three values:

0: non-native format, virtual pointer (equivalent to ‘false’ in the previous API version)
1: native format, virtual pointer (equivalent to ‘true’ in the previous API version)
2: native format, physical pointer

Only works if all input tensors are linked to AIE nodes. If any input tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:: format – Indicates the data format of the data provided by user for the input tensors data.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual int set_output_native_formats(int format) = 0#

Select the data format for all output tensors of the graph.

The native format currently supports three values:

0: non-native format, virtual pointer (equivalent to ‘false’ in the previous API version)
1: native format, virtual pointer (equivalent to ‘true’ in the previous API version)
2: native format, physical pointer

Only works if all output tensors are linked to AIE nodes. If any output tensor is an output of a CPU node, the function has no effect and returns an error.

Parameters:: format – Indicates the data format of the data expected by user for the output tensors data.
Returns:: int The vaisw error (vaisw_error_id) code of the operation.

virtual std::vector<uint32_t> get_strides(const npu_tensor_t *tensor) = 0#

Return the strides of a graph’s tensor.

Parameters:: tensor – Tensor to get the strides from.
Returns:: The strides of the tensor.

virtual int set_strides(const npu_tensor_t *tensor, std::vector<uint32_t> strides) = 0#

Set the strides of a graph’s tensor.

The data is considered packed by default. If padding was applied, the strides need to be adjusted to the correct values.

Only works for tensors of AIE nodes. If the tensor is an input or an output of a CPU node, the function has no effect and returns an error.

Parameters:

tensor – Tensor to set the strides to.
strides – New strides for the tensor.

Returns:

int The vaisw error (vaisw_error_id) code of the operation.
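
The difference between packed and padded strides can be illustrated with a small computation. This is a sketch under stated assumptions: strides are taken to be row-major and counted in elements, which this page does not confirm, and packed_strides() and padded_strides() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major strides for densely packed data, counted in elements
// (assumption: the unit and ordering used by the real API may differ).
inline std::vector<std::uint32_t> packed_strides(
    const std::vector<std::uint32_t>& shape) {
  std::vector<std::uint32_t> strides(shape.size(), 1);
  // Walk from the innermost dimension outwards, accumulating the extent.
  for (std::size_t i = shape.size(); i-- > 1;) {
    strides[i - 1] = strides[i] * shape[i];
  }
  return strides;
}

// Same layout, but each innermost row is padded to `padded_inner` elements.
inline std::vector<std::uint32_t> padded_strides(
    const std::vector<std::uint32_t>& shape, std::uint32_t padded_inner) {
  assert(!shape.empty() && padded_inner >= shape.back());
  std::vector<std::uint32_t> padded_shape = shape;
  padded_shape.back() = padded_inner;
  return packed_strides(padded_shape);
}
```

For a 224x224x3 tensor whose innermost dimension is padded from 3 to 4 elements, only the padded variant describes the buffer correctly.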

virtual int wait(int jobid, int timeout = -1) = 0#

Waits for the end of DPU processing.

Four modes are supported:

1. Blocking wait for a specific ID.
2. Non-blocking wait for a specific ID.
3. Blocking wait for any ID.
4. Non-blocking wait for any ID.

Parameters:

jobid – Job ID. A negative value waits for any job; any other value waits for that specific job.
timeout – Timeout in milliseconds. A negative value blocks forever, 0 does not block, and a positive value blocks for at most that many milliseconds.

Returns:

Status. 0 on success; any other value is a custom warning or error code.

Public Static Functions

static std::unique_ptr<Runner> create_runner(const std::string &model_path, const std::string &in_shape_format = "NHWC", const std::string &out_shape_format = "NHWC", const std::optional<std::vector<std::string>> output_names = std::nullopt, bool aie_only = false)#

Factory function to create an instance of runner by snapshot.

Sample code:

// This API can be used like:
auto runner = vart::Runner::create_runner(model_path, in_shape_format, out_shape_format);

Parameters:: model_path – snapshot directory
Returns:: An instance of runner.

class Device#

#include <vart_device.hpp>

This module manages the hardware context and loading of xclbin on to the device.

Please check API documentation for more information. Any module utilizing hardware acceleration requires an instance of the Device class.

Public Functions

const std::shared_ptr<DeviceImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Retrieves pointer to Device implementation Base class

Returns:: Returns a constant reference of pointer to implementation class.

int32_t get_device_index() const#

get_device_index() - Retrieves device index of current instance

Returns:: Returns the device index

Public Static Functions

static std::shared_ptr<Device> get_device_hdl(const int32_t dev_idx, const std::string &xclbin_loc)#

get_device_hdl() - Creates a Device instance if one was not created earlier; otherwise returns the existing instance for the given device index and xclbin.

The following criteria determine whether a new Device instance is created:

------------------------------------------------------------
| dev_idx    | xclbin |   Result                           |
------------------------------------------------------------
| Exists     | Exists | Returns an existing Device instance|
| New        | New    | Creates a new Device instance      |
| Exists     | New    | Returns an error                   |
| New        | Exists | Creates a new Device instance      |
------------------------------------------------------------

Parameters:

dev_idx – Device index on which XCLBIN is to be loaded.
xclbin_loc – XCLBIN path.

Returns:

Returns a Device pointer created using given dev_idx, xclbin_loc
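
The decision table can be sketched as a lookup keyed by device index. This is not the library's implementation: DeviceTable and HandleOutcome are hypothetical, and "xclbin exists" is assumed to mean that the requested xclbin matches the one already recorded for that device index.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

enum class HandleOutcome { ReturnExisting, CreateNew, Error };

// Hypothetical bookkeeping: which xclbin is recorded for each device index.
class DeviceTable {
 public:
  // Apply the documented decision table and record newly created entries.
  HandleOutcome request(std::int32_t dev_idx, const std::string& xclbin_loc) {
    auto it = loaded_.find(dev_idx);
    if (it == loaded_.end()) {
      // New device index: an instance is created whether or not the
      // xclbin has been seen before.
      loaded_.emplace(dev_idx, xclbin_loc);
      return HandleOutcome::CreateNew;
    }
    // Known device index: reuse on a matching xclbin, error otherwise.
    return it->second == xclbin_loc ? HandleOutcome::ReturnExisting
                                    : HandleOutcome::Error;
  }

 private:
  std::map<std::int32_t, std::string> loaded_;
};
```

Requesting a different xclbin on an index that is already in use is the only combination that fails.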

class InferResult : public std::enable_shared_from_this<InferResult>#

#include <vart_inferresult.hpp>

This module is used to represent inference results.

Presently, the default supported types include classification and detection. Users can integrate new types by overriding base class methods to incorporate custom inference results. For additional information please check API documentation.

Public Functions

InferResult(InferResultType infer_res_type)#

InferResult() - Constructor for using supported infer result type implementations.

Parameters:: infer_res_type – Enum class to specify which implementation to instantiate

InferResult(std::shared_ptr<InferResultImplBase> ptr)#

InferResult() - Constructor for using user defined implementation.

Parameters:: ptr – Pointer to user’s implementation instance

const std::shared_ptr<InferResultImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Gives pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

void transform(InferResScaleInfo &info)#

transform() - Scales the inference results by a given scale factor.

Typically, in video pipelines, the original input resolution differs from the resolution of the input to the machine learning model. For example, detection results may be produced at 200x200 (the input resolution of the ML model) while the original input is 1920x1080. In such cases, this function can be used to transform the bounding box dimensions in the inference results to the resolution of the input image.

Parameters:: info – Scale factor information used to scale the inference results.
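
The arithmetic behind such a rescaling can be sketched with the 200x200 to 1920x1080 example. The Box type and scale_box() below are hypothetical; the fields of InferResScaleInfo and the real result structures are not shown on this page, so this only illustrates the scaling itself.

```cpp
#include <cassert>

// Hypothetical box type; the real result structures are not shown here.
struct Box {
  double x, y, width, height;
};

// Rescale a box from the model input resolution to the original frame.
// Horizontal values scale by frame_w / model_w, vertical ones by
// frame_h / model_h.
inline Box scale_box(const Box& b, double model_w, double model_h,
                     double frame_w, double frame_h) {
  assert(model_w > 0 && model_h > 0);
  return Box{b.x * frame_w / model_w, b.y * frame_h / model_h,
             b.width * frame_w / model_w, b.height * frame_h / model_h};
}
```

A 100x100 box at (50, 50) in the 200x200 model input maps to a 960x540 box at (480, 270) in the 1920x1080 frame.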

InferResultData *get_infer_result()#

get_infer_result() - Get inference results for a frame.

Returns:: Inference result object for the requested frame.

void add_child(std::shared_ptr<InferResult> child)#

add_child() - Add a child to this infer result node.

Use this method to build a cascaded inference result tree by adding child nodes to the previous inference level, starting with a dummy root node.

Parameters:: child – Child node to be added

void add_children(std::vector<std::shared_ptr<InferResult>> arg_children)#

add_children() - Add all children at once.

User can use this method to add all children at once.

Parameters:: arg_children – Child nodes to be added

std::vector<std::shared_ptr<InferResult>> get_children()#

get_children() - Returns all children of the node.

Returns:: vector of child nodes.

std::shared_ptr<InferResult> get_parent()#

get_parent() - Returns parent of the given inference result node.

Returns:: Parent result node.

std::shared_ptr<InferResult> get_root()#

get_root() - Returns root of the given inference result node.

Returns:: Root result node.

void traverse(const std::function<void(std::shared_ptr<InferResult>, void*)> &callback, TraversalOrder order, void *user_data)#

traverse() - Traverse through the tree nodes in a given order.

Parameters:

callback – User callback function to be called for each node as per the order chosen.
order – Order in which the tree nodes are to be traversed.
user_data – User data to be passed to the callback function.

size_t get_depth()#

get_depth() - Returns depth of the current node

Returns:: Depth of the current node.
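
The tree operations described for this class (adding children, traversing with a callback, querying depth) can be sketched on a stand-alone node type. Everything below is hypothetical: Node, Order, and the free functions only mirror the documented shape, the actual TraversalOrder values are not listed on this page, and the root is assumed to have depth 0.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical order names; the real TraversalOrder values may differ.
enum class Order { PreOrder, PostOrder };

struct Node {
  std::string label;
  std::weak_ptr<Node> parent;  // weak, to avoid a parent/child ownership cycle
  std::vector<std::shared_ptr<Node>> children;
};

inline std::shared_ptr<Node> make_node(const std::string& label) {
  auto n = std::make_shared<Node>();
  n->label = label;
  return n;
}

inline void add_child(const std::shared_ptr<Node>& parent,
                      const std::shared_ptr<Node>& child) {
  assert(parent && child);
  child->parent = parent;
  parent->children.push_back(child);
}

// Number of edges between the node and the root (assumption: root is 0).
inline std::size_t depth_of(const std::shared_ptr<Node>& node) {
  std::size_t depth = 0;
  for (auto p = node->parent.lock(); p; p = p->parent.lock()) ++depth;
  return depth;
}

using Callback = std::function<void(std::shared_ptr<Node>, void*)>;

// Visit every node below `node`, calling the callback before or after the
// children depending on the requested order.
inline void traverse(const std::shared_ptr<Node>& node,
                     const Callback& callback, Order order, void* user_data) {
  if (order == Order::PreOrder) callback(node, user_data);
  for (const auto& child : node->children) {
    traverse(child, callback, order, user_data);
  }
  if (order == Order::PostOrder) callback(node, user_data);
}
```

The user_data pointer lets the callback accumulate state, for example a string or a counter, without capturing anything.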

class Logger#

#include <vart_logger.hpp>

The logger module provides the logging support for VART modules.

Supports logging to console, file, syslog.

Public Functions

std::shared_ptr<struct log_context> mod_register(const char *modname, bool need_id)#

mod_register() - Registers a module with logger object

If the module is already registered with the logger, its instance id is incremented; otherwise a default instance id of 0 is assigned. The module instance log level is taken from the logger object’s database, or from the global log level if the user did not provide a per-module log level.

If need_id is set to false, the module instance id is not appended to the module’s instance name. Components such as ‘videoframe’ and ‘memory’ need to set this flag to false, because a single pipeline use case can contain too many instances of them and they can be distinguished by their pointers. Components representing IP blocks can set this flag to true to distinguish IP instances.

Parameters:

modname – Module name to register with logger
need_id – Flag describing the need for an id to be appended to module name during logging

Returns:

Returns logger context associated with a module instance.

void vart_logger_log_obj(LogLevel, std::shared_ptr<struct log_context> ctx, const char *filename, const char *func, uint32_t line, const char *fmt, ...)#

vart_logger_log_obj() - Logs a module’s message along with metadata like filename, function name, line number, process id, thread id.

Parameters:

LogLevel – Logging level with which module logs a message
ctx – Module’s logger context obtained with mod_register()
filename – Source code filename from which this logging is triggered
func – Source code function name
line – Source code line number
fmt – Format string passed for logging.

Public Static Functions

static inline Logger &get_instance()#

get_instance() - Creates a Logger instance if one was not created earlier; otherwise returns the existing instance.

When created, the Logger object reads the environment variable VART_CORE_DEBUG to get and store the log level of each module and the global log level. The default module instance id is stored as -1 and is updated later, when each module instance registers with the logger. The logger also reads the environment variable VART_LOG_FILE_PATH:

If VART_LOG_FILE_PATH is set to CONSOLE, messages are logged to the on-screen console.
If VART_LOG_FILE_PATH is set to SYSLOG, log messages are appended to syslog.
If VART_LOG_FILE_PATH is set to a file on disk, messages are logged to that file. If the file cannot be opened for logging, the logger defaults to syslog.

Returns:: Returns a logger instance
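
The destination rules for VART_LOG_FILE_PATH can be sketched as a small selection function. LogSink and resolve_sink() are hypothetical names, the file_opens flag stands in for the result of actually trying to open the file, and the behaviour when the variable is unset is not covered because it is not described here.

```cpp
#include <cassert>
#include <string>

enum class LogSink { Console, Syslog, File };

// Pick the log destination from the value of VART_LOG_FILE_PATH.
// `file_opens` stands in for the result of trying to open the log file.
inline LogSink resolve_sink(const std::string& log_file_path, bool file_opens) {
  assert(!log_file_path.empty());  // the unset case is outside this sketch
  if (log_file_path == "CONSOLE") return LogSink::Console;
  if (log_file_path == "SYSLOG") return LogSink::Syslog;
  // Any other value is treated as a file path, with syslog as the fallback.
  return file_opens ? LogSink::File : LogSink::Syslog;
}
```

In a real program the first argument would typically come from std::getenv("VART_LOG_FILE_PATH").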

class Memory#

#include <vart_memory.hpp>

This module is responsible for allocating and managing memory on the device.

Public Functions

Memory(MemoryImplType type, size_t size, uint8_t mbank_idx, std::shared_ptr<Device> device)#

Memory() - Constructor to allocate memory using implementation specific method based on input ‘type’.

Parameters:

type – Enum class to specify which type of memory allocation method to use
size – Size of the buffer
mbank_idx – memory bank index on which memory needs to be allocated
device – Device handle to be used by implementation

Memory(MemoryImplType type, uint8_t *data, size_t size, std::shared_ptr<Device> device)#

Memory() - Constructor to create Memory instance using user provided data pointer.

Parameters:

type – Enum class to specify which type of memory allocation method to use
data – user allocated buffer pointer
size – Size of the buffer
device – Device handle to be used by implementation

Memory(std::shared_ptr<MemoryImplBase> ptr)#

Memory() - Constructor for using user defined implementation.

Parameters:: ptr – Pointer to user’s implementation instance

const uint8_t *map(DataMapFlags map_flags)#

map() - Maps allocated memory to user space.

Only memory allocated with the XRT/CMA memory type can be used with map().

Parameters:: map_flags – Flag used to indicate mode of memory mapping
Returns:: returns the virtual address

void unmap()#: unmap() - Unmaps allocated memory from user space

const std::shared_ptr<MemoryImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Retrieves pointer to Memory Implementation Base class

Returns:: Returns a constant reference of pointer to implementation class.

std::shared_ptr<Device> get_device_handle() const#

get_device_handle() - Retrieves the device handle associated with memory allocation

Returns:: Returns Device handle

size_t get_size()#

get_size() - gets the size of the allocated memory

Returns:: returns the memory size

uint64_t get_physical_addr()#

get_physical_addr() - Retrieves the physical address of the allocated memory if memory type is XRT.

Returns:: returns the physical address

class MetaConvert#

#include <vart_metaconvert.hpp>

This module facilitates the conversion of Infer metadata into a format compatible with the overlay module.

MetaConvert also accepts configuration parameters as a JSON string, which provides further flexibility in configuring overlay information such as line thickness, font size, font type, etc. Please check the API documentation for more information. Additionally, users with custom metadata can override the base class to integrate customized functions that convert it into a format suitable for processing by the overlay module.

Public Functions

MetaConvert(InferResultType infer_res_type, std::string &json_data, std::shared_ptr<Device> device)#

MetaConvert() - Constructor for using existing metaconvert implementations.

Parameters:

infer_res_type – Enum class to specify which implementation to instantiate
json_data – JSON config string based on the implementation class
device – Device handle to be used

MetaConvert(std::shared_ptr<MetaConvertImplBase> ptr)#

MetaConvert() - Constructor for using user defined implementation.

Parameters:: ptr – Pointer to user’s implementation instance

const std::shared_ptr<MetaConvertImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Gives pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

std::shared_ptr<OverlayShapeInfo> prepare_overlay_meta(std::shared_ptr<InferResult> root_infer_res)#

prepare_overlay_meta() - Converts inference results to data structures needed for overlay.

Parameters:: root_infer_res – Root node of the Inference results to be interpreted and converted
Returns:: Overlay shape info as per the inference results.

class Overlay#

#include <vart_overlay.hpp>

This module facilitates the overlay of annotations onto the video frame. Overlay currently uses the OpenCV library to draw on frames, which is software-based.

Overlay supports drawing of bounding boxes, text, lines, arrows, circles and polygons on frames. Application can also incorporate custom implementation using base class.

Public Functions

Overlay(OverlayImplType overlay_impl_type, std::shared_ptr<Device> device)#

Overlay() - Constructor for creating an overlay instance, which accepts input type and device instance parameters.

Parameters:

overlay_impl_type – Enum class to specify which type of overlay implementation to use
device – Device handle to be used by implementation

Overlay(std::shared_ptr<OverlayImplBase> ptr)#

Overlay() - Constructor for creating an overlay instance, which accepts user defined pimpl parameters.

Parameters:: ptr – Pointer to user’s implementation instance

const std::shared_ptr<OverlayImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Retrieves pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

void draw_overlay(VideoFrame &frame, OverlayShapeInfo &shape_info)#

draw_overlay() - Draws the input overlay information on to the frame

Parameters:

frame – Video frame on which the drawing operation is performed
shape_info – Information used in drawing operation
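
The custom-implementation constructor can be illustrated with a minimal, self-contained sketch. Every type below (FrameSketch, ShapeInfoSketch, OverlayImplBaseSketch, OverlaySketch, LabelOnlyOverlay) is a hypothetical stand-in written for this example; the real OverlayImplBase interface is not shown in this reference, so the single virtual draw_overlay() hook is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: the real VideoFrame and OverlayShapeInfo come from
// the VART headers and their members are not modeled here.
struct FrameSketch { std::vector<std::string> drawn; };
struct ShapeInfoSketch { std::vector<std::string> labels; };

// Assumed shape of an implementation base class: one virtual draw hook.
class OverlayImplBaseSketch {
public:
  virtual ~OverlayImplBaseSketch() = default;
  virtual void draw_overlay(FrameSketch &frame, ShapeInfoSketch &shape_info) = 0;
};

// User-defined implementation: records each label instead of drawing pixels.
class LabelOnlyOverlay : public OverlayImplBaseSketch {
public:
  void draw_overlay(FrameSketch &frame, ShapeInfoSketch &shape_info) override {
    for (const auto &label : shape_info.labels) frame.drawn.push_back(label);
  }
};

// Front-end class that owns the implementation and forwards calls to it.
class OverlaySketch {
public:
  explicit OverlaySketch(std::shared_ptr<OverlayImplBaseSketch> ptr)
      : pimpl_(std::move(ptr)) {
    assert(pimpl_ && "implementation pointer must not be null");
  }
  const std::shared_ptr<OverlayImplBaseSketch> &get_pimpl_handle() const {
    return pimpl_;
  }
  void draw_overlay(FrameSketch &frame, ShapeInfoSketch &shape_info) {
    pimpl_->draw_overlay(frame, shape_info);
  }

private:
  std::shared_ptr<OverlayImplBaseSketch> pimpl_;
};

// Exercises the sketch end to end and returns how many labels were "drawn".
inline std::size_t draw_two_labels() {
  FrameSketch frame;
  ShapeInfoSketch info;
  info.labels = {"person", "car"};
  OverlaySketch overlay(std::make_shared<LabelOnlyOverlay>());
  overlay.draw_overlay(frame, info);
  return frame.drawn.size();
}
```

The front-end class never needs to know the concrete implementation type, which is why the same constructor shape appears on MetaConvert, PLKernel, PostProcess, and PreProcess.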

class PLKernel#

Public Functions

PLKernel(PLKernelImplType type, const std::string &kernel_name, std::string &json_data, std::shared_ptr<Device> device)#

PLKernel() - Constructs a PLKernel object.

Parameters:

type – The type of the kernel implementation.
kernel_name – The name of the kernel.
json_data – The JSON data for the kernel configuration.
device – The device on which the kernel will run.

PLKernel(std::shared_ptr<PLKernelImplBase> ptr)#

PLKernel() - Constructs a PLKernel object with a given implementation pointer.

Parameters:: ptr – A shared pointer to the kernel implementation.

~PLKernel()#: ~PLKernel() - Destructs a PLKernel object

const std::shared_ptr<PLKernelImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Gives pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

template<typename ...Args> inline void process(Args&&... args)#

process() - Sets the given arguments on the kernel and processes them.

Template Parameters:: Args – Variadic template arguments
Parameters:: args – Arguments to be processed. They are programmed as kernel arguments in the order given.
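
One way a variadic process() can apply its arguments in sequence is a C++17 fold expression over the comma operator. The sketch below is not the PLKernel implementation: KernelSketch, set_arg(), and recorded_args() are hypothetical names, and the class records each argument as text instead of programming a kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel front end: it stores "index=value" for
// each argument so the ordering can be inspected.
class KernelSketch {
public:
  template <typename... Args>
  void process(Args &&...args) {
    args_.clear();
    std::size_t index = 0;
    // Comma fold (C++17): the arguments are visited strictly left to right,
    // so each one receives the next consecutive index.
    (set_arg(index++, std::forward<Args>(args)), ...);
  }

  const std::vector<std::string> &recorded_args() const { return args_; }

private:
  template <typename T>
  void set_arg(std::size_t index, T &&value) {
    std::ostringstream os;
    os << index << '=' << value;
    args_.push_back(os.str());
  }

  std::vector<std::string> args_;
};

// Calls process() with three arguments and returns what was recorded.
inline std::vector<std::string> record_example_args() {
  KernelSketch kernel;
  kernel.process(640, 480, 0.5);
  return kernel.recorded_args();
}
```

Because the index is assigned from the argument position, the caller controls which kernel argument slot each value lands in simply by ordering the call.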

int wait(unsigned int timeout)#

wait() - Waits for the kernel to complete processing.

Parameters:: timeout – The timeout value in milliseconds.
Returns:: An integer indicating the status.

template<typename PLKernelAnyInfo> inline void set_config(PLKernelAnyInfo &info)#

set_config() - Sets the configuration for the kernel.

Template Parameters:: PLKernelAnyInfo – The type of the configuration information.
Parameters:: info – The configuration information.

template<typename PLKernelAnyInfo> inline void get_config(PLKernelAnyInfo &info)#

get_config() - Gets the configuration of the kernel.

Template Parameters:: PLKernelAnyInfo – The type of the configuration information.
Parameters:: info – The configuration information.

class PostProcess#

#include <vart_postprocess.hpp>

This module performs additional computations on the output tensor data from the NPU to produce a more meaningful interpretation.

By default, post-processing supports YOLOv2, ResNet50, and SSD-ResNet34; see the API documentation for usage and additional information. If an application requires custom post-processing, it can override the base class methods.

Public Functions

PostProcess(PostProcessType postprocess_type, std::string &json_data, std::shared_ptr<Device> device)#

PostProcess() - Constructor for using existing post-process implementations.

Parameters:

postprocess_type – Enum class to specify which implementation to instantiate
json_data – JSON config string based on the implementation class
device – Device handle to be used by implementations

PostProcess(std::shared_ptr<PostProcessImplBase> ptr)#

PostProcess() - Constructor for using user defined implementation.

Parameters:: ptr – Pointer to user’s implementation instance

const std::shared_ptr<PostProcessImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Gives pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

void set_config(std::vector<TensorInfo> &info, uint32_t batch_size)#

set_config() - Sets the PostProcessInfo configuration data before post-processing starts.

Use this method to set the batch size per tensor and the tensor information required to parse/process the ML network output. Call this method before the first call to the “process” method.

Parameters:

info – TensorInfo to be set.
batch_size – Supported batch size.

std::vector<std::vector<std::shared_ptr<InferResult>>> process(std::vector<int8_t*> data, uint32_t current_batch_size)#

process() - Process/parse tensors data from ML network output to create infer results.

Parameters:

data – Vector of tensors data. Each tensor will have the data for the entire batch of images.
current_batch_size – Number of inputs in the current batch

Returns:

Vector of inference result objects for every image in the batch.
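
Since each tensor buffer holds data for the whole batch, a caller or a custom implementation usually needs the offset of a single image inside it. The sketch below assumes a contiguous, batch-major layout (image 0 first, then image 1, and so on); that layout is an assumption of this example and is not stated in this reference. image_slice() and first_value_of_image() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a pointer to the first element of one image inside a tensor buffer
// that holds the entire batch. Assumes a contiguous, batch-major layout.
inline const std::int8_t *image_slice(const std::int8_t *tensor_data,
                                      std::size_t elems_per_image,
                                      std::uint32_t image_index,
                                      std::uint32_t current_batch_size) {
  assert(tensor_data != nullptr);
  assert(image_index < current_batch_size);
  return tensor_data + static_cast<std::size_t>(image_index) * elems_per_image;
}

// Demonstration: a batch of 3 images with 4 elements each, filled with 0..11.
// Returns the first element belonging to the requested image.
inline int first_value_of_image(std::uint32_t image_index) {
  std::vector<std::int8_t> tensor(12);
  for (std::size_t i = 0; i < tensor.size(); ++i) {
    tensor[i] = static_cast<std::int8_t>(i);
  }
  return *image_slice(tensor.data(), 4, image_index, 3);
}
```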

std::vector<std::vector<std::shared_ptr<InferResult>>> process(std::vector<std::vector<std::shared_ptr<vart::Memory>>> tensor_memory, uint32_t current_batch_size)#

process() - Process/parse tensors data from ML network output to create infer results.

Parameters:

tensor_memory – Vector of vart::Memory pointers. Each vart::Memory contains one tensor; the total number of tensors equals current_batch_size * number of tensors in each batch.
current_batch_size – Number of inputs in the current batch

Returns:

Vector of inference result objects for every image in the batch.
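
The tensor-count relationship for tensor_memory can be checked before calling process(). The following is a minimal sketch: MemorySketch stands in for vart::Memory, and has_expected_tensor_count() and make_tensor_memory() are hypothetical helpers. The check sums over the inner vectors, so it does not assume whether the outer index is the batch position or the tensor index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for vart::Memory; its real interface is not modeled.
struct MemorySketch {};

using TensorMemorySketch =
    std::vector<std::vector<std::shared_ptr<MemorySketch>>>;

// True when the nested vector holds exactly
// current_batch_size * tensors_per_input non-null entries.
inline bool has_expected_tensor_count(const TensorMemorySketch &tensor_memory,
                                      std::uint32_t current_batch_size,
                                      std::size_t tensors_per_input) {
  assert(tensors_per_input > 0);
  std::size_t total = 0;
  for (const auto &group : tensor_memory) {
    for (const auto &mem : group) {
      if (!mem) return false;  // a null entry cannot hold a tensor
      ++total;
    }
  }
  return total ==
         static_cast<std::size_t>(current_batch_size) * tensors_per_input;
}

// Builds a rows x cols grid of placeholder buffers for demonstration.
inline TensorMemorySketch make_tensor_memory(std::size_t rows,
                                             std::size_t cols) {
  TensorMemorySketch out(rows);
  for (auto &group : out) {
    for (std::size_t i = 0; i < cols; ++i) {
      group.push_back(std::make_shared<MemorySketch>());
    }
  }
  return out;
}
```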

class PreProcess#

#include <vart_preprocess.hpp>

The preprocessing module handles data preparation tasks such as normalization, scaling, and video format conversion.

This module supports software-based pre-processing as well as hardware-accelerated pre-processing for optimized performance. It ensures that input data is appropriately formatted for inference. Applications can also incorporate custom pre-processing by overriding base class methods.

Public Functions

PreProcess(PreProcessImplType type, std::string &json_data, std::shared_ptr<Device> device)#

PreProcess() - Constructor with implementation type and json data.

Parameters:

type – PreProcessImplType that selects which implementation to use.
json_data – Additional preprocessing/user related information in JSON format which can be used by implementations.
device – Device handle to be used by implementations.

PreProcess(std::shared_ptr<PreProcessImplBase> ptr)#

PreProcess() - Constructor for using user defined implementation.

Parameters:: ptr – Pointer to user’s implementation instance

void process(std::vector<PreProcessOp> &preprocess_ops)#

process() - Perform pre-processing based on the specified parameters.

Parameters:: preprocess_ops – Vector of PreProcessOp to be performed. PreProcessOp contains input/output frame and their corresponding ROI.

void get_input_vinfo(int32_t height, int32_t width, VideoFormat fmt, VideoInfo &vinfo)#

get_input_vinfo() - Fills the input video info from input params.

Parameters:

height – Height of the input frame.
width – Width of the input frame.
fmt – Video format of the input frame.
vinfo – VideoInfo to be filled by the API. The filled VideoInfo can be used to construct the input video frame. The caller must provide a valid VideoInfo instance.

const VideoInfo &get_output_vinfo()#

get_output_vinfo() - Get the output video info.

Returns:: Returns VideoInfo which can be used to construct output video frame.

void set_preprocess_info(PreProcessInfo &preprocess_info)#

set_preprocess_info() - Set the required Preprocess parameters.

Parameters:: preprocess_info – Structure with pre-processing parameters.

const PreProcessInfo &get_preprocess_info()#

get_preprocess_info() - Get the preprocess info

Returns:: Returns the structure of pre-processing parameters.

const std::shared_ptr<PreProcessImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Gives pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.

class VideoFrame#

#include <vart_videoframe.hpp>

This module simplifies the management of frame memory complexities and provides APIs for reading and writing a frame.

The VideoFrame class offers flexibility for applications to encapsulate their own memory into the VideoFrame class. In such instances, the application bears the responsibility for deallocating the frame memory.

Public Functions

VideoFrame(VideoFrameImplType type, size_t size, uint8_t mbank_idx, VideoInfo &vinfo, std::shared_ptr<Device> device)#

VideoFrame() - Constructor for creating video frame instance using implementation specific method based on input ‘type’.

Parameters:

type – Enum class to specify which type of memory allocation method to use
size – Size of the buffer
mbank_idx – memory bank index on which memory needs to be allocated
vinfo – VideoInfo instance which contains video frame specific information
device – Device handle to be used by implementation

VideoFrame(VideoFrameImplType type, std::vector<uint8_t*> &data_vec, VideoInfo &vinfo, std::shared_ptr<Device> device)#

VideoFrame() - Constructor for creating video frame instance using input data pointers.

Parameters:

type – Enum class to specify which type of memory allocation method to use
data_vec – Vector of user allocated buffer pointers
vinfo – VideoInfo instance which contains video frame specific information
device – Device handle to be used by implementation

VideoFrame(VideoFrameImplType type, uint8_t mbank_idx, std::vector<xrt::bo*> &bo_vec, VideoInfo &vinfo, std::shared_ptr<Device> device)#

VideoFrame() - Constructor for creating video frame instance using input XRT buffer objects (BOs)

Parameters:

type – Enum class to specify which type of memory allocation method to use
mbank_idx – memory bank index on which memory was allocated
bo_vec – Vector of XRT BOs allocated by user
vinfo – VideoInfo instance which contains video frame specific information
device – Device handle to be used by implementation

const VideoFrameMapInfo &map(DataMapFlags map_flags)#

map() - Maps video frame data to user space

Parameters:: map_flags – Flag used to indicate mode of memory mapping
Returns:: Returns information containing user space data pointer(s) and its corresponding video frame information.

void unmap()#: unmap() - Unmaps video frame data from user space
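
Because every map() should be paired with an unmap(), a small scope guard can make the pairing automatic. This is a sketch under stated assumptions: MappableFrameSketch is a hypothetical stand-in whose map() takes no flags and returns a raw pointer, whereas the real VideoFrame::map() takes DataMapFlags and returns a VideoFrameMapInfo. MappedFrame and fill_and_release() are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a video frame: only map()/unmap() are modeled,
// and a counter tracks how many mappings are currently outstanding.
class MappableFrameSketch {
public:
  explicit MappableFrameSketch(std::size_t size) : data_(size, 0) {}
  std::uint8_t *map() {
    ++map_count_;
    return data_.data();
  }
  void unmap() { --map_count_; }
  int map_count() const { return map_count_; }

private:
  std::vector<std::uint8_t> data_;
  int map_count_ = 0;
};

// Scope guard: maps in the constructor and unmaps in the destructor, so the
// frame is released even if the code in between returns early or throws.
class MappedFrame {
public:
  explicit MappedFrame(MappableFrameSketch &frame)
      : frame_(frame), ptr_(frame.map()) {}
  ~MappedFrame() { frame_.unmap(); }
  MappedFrame(const MappedFrame &) = delete;
  MappedFrame &operator=(const MappedFrame &) = delete;
  std::uint8_t *data() const { return ptr_; }

private:
  MappableFrameSketch &frame_;
  std::uint8_t *ptr_;
};

// Writes one byte through a scoped mapping and returns the number of
// mappings still outstanding after the guard has been destroyed.
inline int fill_and_release(MappableFrameSketch &frame, std::uint8_t value) {
  {
    MappedFrame mapped(frame);
    mapped.data()[0] = value;
    assert(frame.map_count() == 1);
  }
  return frame.map_count();
}
```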

const VideoInfo &get_video_info() const#

get_video_info() - Retrieves the video frame information from video frame

Returns:: Returns information related to the video frame

std::shared_ptr<Device> get_device_handle() const#: get_device_handle() - Retrieves the device handle associated with the current frame

const std::shared_ptr<VideoFrameImplBase> &get_pimpl_handle() const#

get_pimpl_handle() - Retrieves pointer to implementation class.

Returns:: Returns a constant reference of pointer to implementation class.
