Model Compilation

You can compile the model by executing a Python script that initializes an ONNX Runtime (ORT) inference session with the desired model and the AMD Vitis™ AI Execution Provider (EP). This process enables the Vitis AI EP to build the model, generating the necessary binaries for execution on the NPU.

Use the following code as a simple template to compile models for the NPU:

import onnxruntime

onnx_model = ...           # ONNX model to be compiled

provider_options_dict = {
    "config_file": ...,    # JSON config file for the Vitis AI compiler
    "cache_dir":   ...,    # Path to the cache directory
    "cache_key":   ...,    # Subfolder in the cache directory for the compiled model
    "target":   ...,       # Target platform for Vitis AI execution provider
}

session = onnxruntime.InferenceSession(
    onnx_model,
    providers=["VitisAIExecutionProvider"],  # Use the Vitis AI Execution Provider
    provider_options=[provider_options_dict] # The provider options for the Vitis AI EP
)
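Before creating the session, it can help to confirm that the installed ONNX Runtime build exposes the Vitis AI EP. The following is a minimal sketch that relies on `onnxruntime.get_available_providers()`. The helper name `vitis_ai_ep_available` and its `available` parameter are illustrative assumptions, not part of the Vitis AI tooling.

```python
def vitis_ai_ep_available(available=None):
    """Return True if the Vitis AI EP is among the available ORT providers.

    `available` is a hypothetical override that makes the helper easy to test.
    When it is omitted, the list is queried from the installed onnxruntime.
    """
    if available is None:
        # Imported lazily so the helper can be exercised without ORT installed.
        import onnxruntime
        available = onnxruntime.get_available_providers()
    return "VitisAIExecutionProvider" in available
```

Calling `vitis_ai_ep_available()` with no arguments inside the compilation environment gives an early signal when the provider is missing, before a session is created.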

Supported Devices for Compilation

The devices listed in the table below are supported for compilation. Specify the device in the Vitis AI config file. The following example config targets the XC2VE3858:

{
  "passes": [
    {
      "name": "init",
      "plugin": "vaip-pass_init"
    },
    {
      "name": "vaiml_partition",
      "plugin": "vaip-pass_vaiml_partition",
      "vaiml_config": {
        "device": "ve2-xc2ve3858",
        "logging_level": "info"
      }
    }
  ],
  "target": "VAIML",
  "targets": [
    {
      "name": "VAIML",
      "pass": ["init", "vaiml_partition"]
    }
  ]
}
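A quick consistency check on the config file can catch typos before a compilation run. The sketch below is only an illustration: the rules it checks (the top-level "target" names an entry in "targets", and every entry in a target's "pass" list names a defined pass) are inferred from the example above, not taken from a published schema. The function names are hypothetical.

```python
import json


def check_vitisai_config(config):
    """Return a list of problems found in a parsed Vitis AI config dict."""
    problems = []
    pass_names = {p.get("name") for p in config.get("passes", [])}
    targets = {t.get("name"): t for t in config.get("targets", [])}

    # The selected target should be one of the declared targets.
    target = config.get("target")
    if target not in targets:
        problems.append(f"target {target!r} is not defined in 'targets'")

    # Every pass referenced by a target should be declared under 'passes'.
    for name, entry in targets.items():
        for pass_name in entry.get("pass", []):
            if pass_name not in pass_names:
                problems.append(f"target {name!r} uses undefined pass {pass_name!r}")
    return problems


def load_and_check(path):
    """Parse the JSON file at `path` and run the checks on it."""
    with open(path, encoding="utf-8") as f:
        return check_vitisai_config(json.load(f))
```

An empty list means the checks passed. Because `json.load` is strict, `load_and_check` also fails on files that are not valid JSON.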

Device Configuration Options

Device       Vitis AI Config Option
XC2VE3858    "device": "ve2-xc2ve3858"
XC2VE3504    "device": "ve2-xc2ve3504"
XC2VE3558    "device": "ve2-xc2ve3558"
XC2VE3804    "device": "ve2-xc2ve3804"
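When switching between devices, the config file can be generated rather than edited by hand. The sketch below copies the device values from the table above and reuses the layout of the example config. The names `DEVICE_OPTIONS`, `make_vitisai_config`, and `write_vitisai_config` are hypothetical helpers, and the default file name `vitisai_config.json` is an assumption.

```python
import json

# Device names and config values copied from the table above.
DEVICE_OPTIONS = {
    "XC2VE3858": "ve2-xc2ve3858",
    "XC2VE3504": "ve2-xc2ve3504",
    "XC2VE3558": "ve2-xc2ve3558",
    "XC2VE3804": "ve2-xc2ve3804",
}


def make_vitisai_config(device, logging_level="info"):
    """Build a config dict for `device`, mirroring the example config layout."""
    # Raises KeyError if the device is not in the table.
    device_option = DEVICE_OPTIONS[device.upper()]
    return {
        "passes": [
            {"name": "init", "plugin": "vaip-pass_init"},
            {
                "name": "vaiml_partition",
                "plugin": "vaip-pass_vaiml_partition",
                "vaiml_config": {
                    "device": device_option,
                    "logging_level": logging_level,
                },
            },
        ],
        "target": "VAIML",
        "targets": [{"name": "VAIML", "pass": ["init", "vaiml_partition"]}],
    }


def write_vitisai_config(device, path="vitisai_config.json"):
    """Write the config for `device` to `path` and return the path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(make_vitisai_config(device), f, indent=2)
    return path
```

For example, `write_vitisai_config("XC2VE3504")` would produce a config whose "device" entry is "ve2-xc2ve3504".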

Note

Ensure that the working directory inside the Docker container has write permissions so that the generated model is saved successfully.
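One way to verify write access up front is to try creating a temporary file in the working directory before starting a compilation. This is a minimal sketch using only the Python standard library. The helper name `ensure_writable` is hypothetical.

```python
import os
import tempfile


def ensure_writable(directory="."):
    """Return the absolute path of `directory` if a file can be created in it.

    Raises PermissionError otherwise. Creating a real temporary file is used
    instead of os.access() because it reflects what the compiler will need to do.
    """
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            pass
    except OSError as exc:
        raise PermissionError(
            f"Cannot create files in {os.path.abspath(directory)}"
        ) from exc
    return os.path.abspath(directory)
```

Calling `ensure_writable()` at the top of the compile script fails fast, before any time is spent on compilation.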

Note

When a PyTorch model is converted to ONNX format and the resulting model is larger than 2 GB, the ONNX model can be split into two files:

  • model.onnx

  • model.onnx.data

import onnxruntime
import os

Model = 'mymodel'

# Build full paths to the ONNX model and the compilation cache directory
model_name = os.path.join(os.getcwd(), Model + '.onnx')
compile_cache = os.path.join(os.getcwd(), 'cache_dir')

provider_options_dict1 = {
    "config_file": 'vitisai_config.json',
    "cache_dir": compile_cache,
    "cache_key": Model,
    "ai_analyzer_visualization": True,  # Enable visualization output for analysis
    "ai_analyzer_profiling": True,      # Enable profiling data collection
    "target": "VAIML"
}

session1 = onnxruntime.InferenceSession(
    model_name,
    providers = ["VitisAIExecutionProvider"],
    provider_options = [provider_options_dict1])
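For the split-model case described in the note above, it may be worth checking that both files sit side by side before compiling. The sketch below assumes the external data file is named after the model with a `.data` suffix, as in the `model.onnx` / `model.onnx.data` pair listed in the note. Other exporters may use different names, and the helper `find_model_files` is hypothetical.

```python
from pathlib import Path


def find_model_files(model_path):
    """Return the .onnx file and, if present, its .onnx.data companion."""
    model = Path(model_path)
    if not model.is_file():
        raise FileNotFoundError(f"ONNX model not found: {model}")

    files = [model]
    # Assumed naming convention: <model>.onnx.data next to <model>.onnx.
    data = model.with_name(model.name + ".data")
    if data.is_file():
        files.append(data)
    return files
```

If the returned list has two entries, both files need to be available in the directory that is mounted into the Docker container.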

Steps to Invoke the Compiler

Follow these steps to invoke the compiler:

  1. Load the Docker image onto the host machine:

    docker load -i <Docker Name>.tgz
    
  2. Get the Docker repository and tag to be used:

    docker images
    
  3. Start the Docker container with the model folder and license directory mounted:

    docker run -it --rm --ulimit stack=-1:-1 --network host \
        -v </yourLicenseDir>:/usr/licenses \
        -v <HostDir>/MyModel:/MyModel \
        <Docker REPOSITORY:TAG> bash
    

    Note

    The -v option specifies the host directory to be mapped into a directory inside the Docker container. Ensure the host folder mapped into the container has write access for all users.

  4. Inside the container, compile the ONNX model with Vitis AI:

    python3 compile.py
    
compile.py creates an ONNX Runtime inference session using the VitisAIExecutionProvider. It reads compilation settings, including the target device, from vitisai_config.json. Running this script triggers model compilation and saves the compiled artifacts to the cache directory specified in the provider options.
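After the script finishes, a quick listing of the cache location shows whether anything was written. The sketch below assumes, following the provider option comments in the template, that the compiled model lands in the `cache_key` subfolder of `cache_dir`. It makes no assumption about the names of the generated files, and `list_compiled_artifacts` is a hypothetical helper.

```python
from pathlib import Path


def list_compiled_artifacts(cache_dir, cache_key):
    """Return the relative paths of all files under <cache_dir>/<cache_key>."""
    out_dir = Path(cache_dir) / cache_key
    if not out_dir.is_dir():
        return []  # Nothing was written, or the paths do not match the options.
    return sorted(
        p.relative_to(out_dir).as_posix()
        for p in out_dir.rglob("*")
        if p.is_file()
    )
```

An empty result after a run suggests that compilation did not complete or that `cache_dir` and `cache_key` differ from the values passed in the provider options.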