AI Analyzer#
AMD AI Analyzer is a tool that supports analysis and visualization of model compilation and inference on AMD Vitis™ AI. The primary goal of the tool is to help you better understand how models are processed by the hardware, and to identify performance bottlenecks that might be present during model inference. Using AI Analyzer, you can visualize graph and operator partitions between the NPU and CPU.
Enabling Profiling and Visualization#
The Vitis AI compiler configuration file vitisai_config.json should include at least the following content to prevent the compiler from removing artifacts required by AI Analyzer:
"vaiml_config": {
"keep_outputs": true,
...
}
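The setting above can also be applied programmatically before compilation. The following is a minimal sketch, not part of the Vitis AI tooling: `ensure_keep_outputs` is a hypothetical helper name, and it assumes the configuration file is plain JSON with `vaiml_config` as a top-level object, as the fragment suggests.

```python
import json
from pathlib import Path


def ensure_keep_outputs(config_path):
    """Set vaiml_config.keep_outputs to true, preserving all other settings."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(path.read_text()) if path.exists() else {}
    # Create the vaiml_config section if it is missing, then force the flag on.
    config.setdefault("vaiml_config", {})["keep_outputs"] = True
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `ensure_keep_outputs("vitisai_config.json")` leaves every other key untouched, so it can be run repeatedly without side effects.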
Note
Analysis and profiling options increase compilation time and add memory and performance overhead at runtime. Use them only for short development and analysis runs, not for long-running sessions or production deployment.
You can enable profiling and visualization by passing provider options to ONNX Runtime’s InferenceSession during both model compilation and hardware inference. This ensures that both stages can be visualized and profiled. Example:
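import onnxruntime as ort
# `model` (a loaded ONNX model), `cache_dir`, and `providers` are assumed to be defined earlier.
provider_options = [{
    'config_file': 'vaip_config.json',
    'cacheDir': str(cache_dir),
    'cacheKey': 'modelcachekey',
    'ai_analyzer_visualization': True,   # graph partition and operator fusion artifacts
    'ai_analyzer_profiling': True,       # inference profiling artifacts
}]
session = ort.InferenceSession(
    model.SerializeToString(),
    providers=providers,
    provider_options=provider_options
)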
The ai_analyzer_profiling flag enables generation of artifacts for inference profiling. It captures the time required to execute the different layers from the user perspective, including latency introduced by data transfers such as activations, spills, and weights. If you require a detailed report on data transfers between L3 (DDR) and L2 (memory tile within the AI Engine array), enable the enhanced profiling option as explained in DDR Throughput Profiling.
The ai_analyzer_visualization flag enables generation of artifacts related to graph partitions and operator fusion. These artifacts are generated as JSON files in the current run directory.
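Because both flags add overhead, it can help to switch them on only for analysis runs. The sketch below shows one way to do that. `build_provider_options` and the `AI_ANALYZER` environment variable are hypothetical names introduced for illustration; only the option keys come from the example above.

```python
import os


def build_provider_options(cache_dir, cache_key, config_file="vaip_config.json", analyze=None):
    """Return the provider_options list, adding AI Analyzer flags only on request."""
    if analyze is None:
        # AI_ANALYZER is a hypothetical environment variable, not a Vitis AI setting.
        analyze = os.environ.get("AI_ANALYZER", "0") == "1"
    options = {
        "config_file": config_file,
        "cacheDir": str(cache_dir),
        "cacheKey": cache_key,
    }
    if analyze:
        # Enable both artifact types for a short analysis run.
        options["ai_analyzer_visualization"] = True
        options["ai_analyzer_profiling"] = True
    return [options]
```

The returned list can be passed as `provider_options` to `ort.InferenceSession`. Leaving `analyze` unset keeps production runs free of the profiling overhead unless the environment variable is set.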
AI Analyzer also supports native ONNX Runtime profiling, which can be used to analyze portions of the session that run on the CPU. Enable ONNX Runtime profiling through session options and pass those options with the provider options, as in the following example:
import onnxruntime as ort
# `model`, `cache_dir`, and `providers` are assumed to be defined earlier.
# Configure session options for native ONNX Runtime profiling
sess_options = ort.SessionOptions()
sess_options.enable_profiling = True
provider_options = [{
    'config_file': 'vaip_config.json',
    'cacheDir': str(cache_dir),
    'cacheKey': 'modelcachekey',
    'ai_analyzer_visualization': True,
    'ai_analyzer_profiling': True,
}]
session = ort.InferenceSession(
    model.SerializeToString(),
    sess_options=sess_options,
    providers=providers,
    provider_options=provider_options
)
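ONNX Runtime writes its profile as a JSON file, and `session.end_profiling()` returns that file's name. As a rough illustration of what the profile contains, the sketch below totals kernel time per execution provider. The event layout it relies on (a `cat` of `"Node"`, names ending in `_kernel_time`, a `dur` in microseconds, and `args["provider"]`) is an assumption based on typical ONNX Runtime profile output and may differ between versions; `time_per_provider` is a hypothetical helper.

```python
import json
from collections import defaultdict


def time_per_provider(profile_path):
    """Sum kernel durations (microseconds) per execution provider."""
    with open(profile_path) as f:
        events = json.load(f)
    totals = defaultdict(int)
    for event in events:
        # Assumed layout: kernel events have cat == "Node" and a "_kernel_time" suffix.
        if event.get("cat") != "Node":
            continue
        if not event.get("name", "").endswith("_kernel_time"):
            continue
        provider = event.get("args", {}).get("provider", "unknown")
        totals[provider] += event.get("dur", 0)
    return dict(totals)
```

A typical use would be `time_per_provider(session.end_profiling())` after the inference calls have finished.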
Running Model on Hardware
After compiling the model with the appropriate configuration, run inference on the target hardware. During execution, the runtime automatically generates timestamp files and, when enhanced profiling is enabled, trace-dump files that contain profiling data for each inference. These files are required for AI Analyzer performance analysis.
After inference completes on the hardware, collect the generated profiling data into a vai_profile.zip archive using the following command on the hardware:
vaiprofile-collect-data .
You can then copy vai_profile.zip to your host machine and extract it into the folder containing your model artifacts:
unzip vai_profile.zip -d /path/to/model_dir
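On hosts without the unzip utility, the same extraction can be done with the Python standard library. This is a minimal sketch for the standard (non-enhanced) profiling case; `extract_profile` is a hypothetical helper name and the paths are placeholders.

```python
import zipfile
from pathlib import Path


def extract_profile(archive_path, model_dir):
    """Extract vai_profile.zip into model_dir and return the archived file names."""
    target = Path(model_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `extract_profile("vai_profile.zip", "/path/to/model_dir")` mirrors the unzip command above and returns the list of files that were unpacked.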
Note
If enhanced profiling is enabled, the archive contains additional files that must be processed to extract detailed profiling data. Instead of unzipping the file, use the following command on the host machine:
vaiprofile <VAIML_design_dir_path> vai_profile.zip --frequency <freqMHz>
cd <VAIML_design_dir_path>
cp analyzed_data/mlprofiler_ddr_merge/record_timer*.json analyzed_data
cp analyzed_data/mlprofiler_ddr_merge/onnxruntime_profile_*.json analyzed_data
Where <VAIML_design_dir_path> is the path to the design directory generated by the Vitis AI compiler (it contains the vitisai_config.json file), and <freqMHz> is the NPU clock frequency in MHz (for example, 1250 for 1.25 GHz). More details are available in DDR Throughput Profiling.
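The enhanced-profiling steps can be scripted. The sketch below builds the vaiprofile argument list, converting a clock given in GHz to the MHz value the command expects, and reproduces the two cp commands. The function names are hypothetical, and only the arguments shown in the commands above are used.

```python
import shutil
from pathlib import Path


def vaiprofile_command(design_dir, archive, freq_ghz):
    """Build the vaiprofile argument list; the frequency is passed in MHz."""
    freq_mhz = round(freq_ghz * 1000)
    return ["vaiprofile", str(design_dir), str(archive), "--frequency", str(freq_mhz)]


def promote_merged_profiles(design_dir):
    """Copy the merged profiling JSON files up into analyzed_data."""
    analyzed = Path(design_dir) / "analyzed_data"
    merged = analyzed / "mlprofiler_ddr_merge"
    copied = []
    for pattern in ("record_timer*.json", "onnxruntime_profile_*.json"):
        for src in sorted(merged.glob(pattern)):
            shutil.copy(src, analyzed / src.name)
            copied.append(src.name)
    return copied
```

The argument list can be handed to `subprocess.run(..., check=True)` on a host where vaiprofile is installed, followed by `promote_merged_profiles` on the same design directory.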
Installation#
Running AI Analyzer with Docker#
To run AI Analyzer inside the Docker container, map the same TCP port on both the host and the container, mount the directory containing your model artifacts into the container, and launch AI Analyzer against that mounted path.
Use the --no-browser option to prevent automatic browser launch and bind to 0.0.0.0 to allow access from remote hosts. If you are not using Docker, you can run aianalyzer directly on your host by pointing it to the model directory and specifying the port.
Command#
docker run --ulimit stack=-1:-1 -it -p <PORT>:<PORT> --rm -v /path/to/model_dir:/model_dir DOCKER_REPOSITORY:DOCKER_TAG bash
aianalyzer /model_dir --port <PORT> --no-browser --bind 0.0.0.0
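Since the same port number must be free on the host before it is mapped into the container, a quick way to pick a value for <PORT> is to ask the operating system for an unused one. This is a generic sketch using the standard library; `find_free_port` is a hypothetical helper and is not part of AI Analyzer.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on the given interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


port = find_free_port()
port_mapping = f"{port}:{port}"  # value for the docker run -p option
```

The port is released as soon as the function returns, so another process could claim it before Docker starts. For a development machine this race is usually acceptable.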
Example#
Start the Docker container and run aianalyzer inside it:
docker run --ulimit stack=-1:-1 -it -p 8011:8011 --rm -v "$PWD/aModel:/aModel" vitis_ai_ve2_docker:release_v6.1_0206 bash
aianalyzer /aModel --port 8011 --no-browser --bind 0.0.0.0
The server prints the address that can be accessed from the host machine, for example:
2026-**-** 00:13:51,582 INFO [client_id=n/a] 140184040990272 server.py:35 AI Analyzer 1.6.0.dev20251005221519+g1ea47349 serving on http://0.0.0.0:8011/dashboard?token=x8pCYzKWkXJFsdLPVwcjq3ungf2upNEreZeAnqjQuk (Press CTRL+C to quit)
Then open a browser on the host machine and go to that address: http://0.0.0.0:8011/dashboard?token=x8pCYzKWkXJFsdLPVwcjq3ungf2upNEreZeAnqjQuk
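The printed URL uses the bind address 0.0.0.0, which a browser on some systems may not be able to open directly. The sketch below extracts the URL from a startup log line and substitutes a reachable host name while keeping the port and token. It assumes the log line contains a single http URL, as in the example above; `dashboard_url` is a hypothetical helper.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def dashboard_url(log_line, host="localhost"):
    """Extract the dashboard URL from a log line and swap in a reachable host."""
    match = re.search(r"https?://\S+", log_line)
    if match is None:
        return None
    parts = urlsplit(match.group(0))
    # Keep the port, path, and token; replace only the host name.
    netloc = f"{host}:{parts.port}" if parts.port else host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Pass the IP address of the machine running Docker as `host` when the browser runs on a different machine.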
Launching AI Analyzer#
After the model has been compiled and/or inference has been run on the Versal AI Edge Series Gen2 board, invoke aianalyzer from the command line as follows:
aianalyzer <logdir> <additional options>
Positional Arguments#
logdir: Path to the folder containing the generated artifacts. This can be the directory where the model was compiled or the directory from which inference was run.
Additional Options#
-v, --version: Show the version info and exit.
-b ADDR, --bind ADDR: Hostname or IP address on which to listen. Default is ‘localhost’.
-p PORT, --port PORT: TCP port on which to listen. Default is ‘8000’.
-n, --no-browser: Prevents the opening of the default URL in the browser.
-t TOKEN, --token TOKEN: Token used for authenticating first-time connections to the server. The default is to generate a new, random token. Setting to an empty string disables authentication altogether, which is not recommended.
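When the server is started from a script, it can be convenient to supply your own token so the dashboard URL is known in advance. The sketch below builds an argument list from the options listed above. `aianalyzer_command` is a hypothetical helper, and it assumes that any URL-safe random string is accepted as a token, which the option description does not specify.

```python
import secrets


def aianalyzer_command(logdir, port=8000, bind="localhost", open_browser=True, token=None):
    """Build an aianalyzer argument list using only the documented options."""
    if token is None:
        # Assumption: any URL-safe random string is an acceptable token.
        token = secrets.token_urlsafe(32)
    command = ["aianalyzer", str(logdir), "--port", str(port), "--bind", bind, "--token", token]
    if not open_browser:
        command.append("--no-browser")
    return command, token
```

The returned list can be passed to `subprocess.Popen`, and the returned token can be kept to construct the dashboard URL for remote users.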
Features#
AI Analyzer provides visibility into how your AI model is compiled and executed on Vitis AI hardware. Its two main use cases are:
Analyzing how the model was partitioned and mapped onto the Vitis AI CPU and NPU accelerator.
Profiling model performance as it executes inference workloads.
When launched, the AI Analyzer server scans the folder specified with the logdir argument, then detects and loads all files relevant to compilation and/or inference, depending on which of the ai_analyzer_visualization and ai_analyzer_profiling flags were enabled.
You can instruct the AI Analyzer server to either start a browser on the same host or return a URL that you can then load into a browser on any host.
User Interface#
AI Analyzer has the following three sections as seen in the left-panel navigator:
Partitioning: A breakdown of how your model was assigned to execute inference across CPU and NPU.
NPU insights: A detailed look at how your model was optimized for inference execution on NPU.
Performance: A breakdown of inference execution through the model. This tab is key when it comes to analyzing the performance of your model and identifying bottlenecks. It is only available if you have profiling data from running inference on hardware, as described in DDR Throughput Profiling.