UNET Acceleration on FPGA
Description
Affiliation: University of Ostrava, Faculty of Medicine
This solution is designed for the diagnostic analysis of Parkinson's Disease using transcranial ultrasound imaging. The pipeline operates in two primary stages:

- Brain Stem Localization: The first deep neural network (U-Net) processes the raw ultrasound data to identify and segment the brain-stem mask.
- Substantia Nigra Identification: The segmented region is then used as the region of interest (ROI) for a second, specialized network whose goal is to identify the Substantia Nigra (SN), the primary focus of this research, since its echogenicity is a key biomarker for Parkinson's disease.

Final Output: Once the SN is identified, the system automatically fits an elliptical regressor around it.
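The two-stage flow above can be sketched as follows. This is a minimal illustration only: the `brainstem_net` and `sn_net` callables are hypothetical stand-ins for the repository's two compiled networks, and the ROI cropping is an assumed bounding-box scheme, not necessarily the one used in the actual code.

```python
import numpy as np

# Hypothetical stand-ins for the two networks; on the board these would be
# DPU runners loaded from the compiled .xmodel files.
def brainstem_net(frame):
    """Stage 1: segment the brain stem (stub returns a fixed region)."""
    mask = np.zeros(frame.shape, dtype=bool)
    mask[16:48, 16:48] = True
    return mask

def sn_net(roi):
    """Stage 2: segment the Substantia Nigra inside the ROI (stub)."""
    mask = np.zeros(roi.shape, dtype=bool)
    mask[8:16, 8:16] = True
    return mask

def segment(frame):
    """Chain the two stages: crop the stage-1 ROI, run stage 2 inside it,
    and map the SN mask back into full-frame coordinates."""
    bs_mask = brainstem_net(frame)
    ys, xs = np.nonzero(bs_mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    roi = frame[y0:y1, x0:x1]
    sn_mask = np.zeros(frame.shape, dtype=bool)
    sn_mask[y0:y1, x0:x1] = sn_net(roi)
    return bs_mask, sn_mask

frame = np.random.rand(64, 64).astype(np.float32)
bs, sn = segment(frame)
```

By construction, the SN mask always lies inside the brain-stem ROI, which mirrors the intent of the two-stage design: the second network never sees pixels outside the first network's segmentation.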
Targeted devices
- Kria KV260
- Pynq Z2
- MYIR FZ3
- MYIR FZ5
Key capabilities
- Training: Integration with Keras and the Vitis AI Quantizer to prepare U-Net models for low-precision FPGA inference. The training flow is highly configurable, allowing quick changes to the network architecture and training parameters.
- Vitis AI Flow: Full implementation of the Vitis AI compilation chain, transforming high-level models into hardware-optimized `.xmodel` files.
- DPU Acceleration: Leverages the Xilinx DPU for high-throughput, low-latency image segmentation.
- Webapp: Includes a ready-to-use application for real-time inference and demonstration on the Kria KV260; it can also run on the Pynq Z2, MYIR FZ3, and FZ5 boards.
- Performance Benchmarking: Tools to evaluate inference accuracy and latency directly on the target hardware.
- Software/Hardware Comparison: Both models are implemented in a software flow and a hardware-accelerated flow for direct comparison, and both are available in the included web application.
- Ellipse Regressor: The second part of the AI flow is implemented as a U-Net model plus a small ellipse regressor, which will be used for further study of SN echogenicity.
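To illustrate what an ellipse fit over a binary SN mask produces, here is a minimal moment-based fit in NumPy. This is an assumed stand-in for exposition, not the repository's actual regressor network: the real flow uses a learned regressor, whereas this sketch derives the ellipse from the mask's image moments.

```python
import numpy as np

def fit_ellipse_from_mask(mask):
    """Fit an ellipse to a binary mask via image moments.

    Returns the centre (cx, cy), the (major, minor) semi-axes (scaled to
    two standard deviations), and the major-axis angle in radians.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 5:
        raise ValueError("mask too small to fit an ellipse")
    cx, cy = xs.mean(), ys.mean()
    # 2x2 covariance of the pixel coordinates; its eigenvectors give the
    # ellipse orientation and its eigenvalues the axis lengths.
    cov = np.cov(np.stack([xs - cx, ys - cy]))
    evals, evecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    minor, major = 2.0 * np.sqrt(evals)
    angle = np.arctan2(evecs[1, 1], evecs[0, 1])
    return (cx, cy), (major, minor), angle

# Synthetic check: a filled circle of radius 10 should yield nearly equal
# semi-axes close to 10 (variance of a uniform disk is r**2 / 4).
yy, xx = np.mgrid[0:64, 0:64]
circle = ((xx - 32) ** 2 + (yy - 32) ** 2) <= 10 ** 2
center, axes, angle = fit_ellipse_from_mask(circle)
```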
Features
- Target Hardware: Specifically optimized for the Xilinx Kria KV260, Pynq Z2, MYIR FZ3, and FZ5.
- End-to-End Pipeline: Covers everything from data preparation and training to quantization, compilation, and final deployment.
- Hardware-Specific Scripts: Automated shell scripts for the entire Vitis AI flow (quantization, evaluation, and compilation).
- Webapp: A complete web-based UI for interacting with the deployed model on the board.
What's included
- Vitis AI Source: Comprehensive scripts (`vitis_ai/`) for the DPU compilation flow (scripts `0_...` through `5_...`).
- Training Logic: Quantized U-Net implementation and training scripts compatible with FPGA deployment requirements.
- Deployment Tools: The webapp source code and helper scripts for board-side execution.
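The latency side of the benchmarking tools can be approximated with a simple timing harness like the one below. The `infer` callable is hypothetical: on the board it would wrap a DPU runner, on the host a Keras model's predict step; here a trivial NumPy stub stands in so the sketch is self-contained.

```python
import time
import statistics
import numpy as np

def benchmark(infer, frame, warmup=5, runs=50):
    """Measure per-frame latency of an inference callable in milliseconds.

    Warm-up iterations are discarded so one-time costs (caching, buffer
    allocation) do not skew the statistics.
    """
    for _ in range(warmup):
        infer(frame)
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        infer(frame)
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.mean(samples), statistics.stdev(samples)

# Stub "model": an elementwise op standing in for a real network call.
frame = np.random.rand(1, 256, 256, 1).astype(np.float32)
mean_ms, std_ms = benchmark(lambda x: x * 0.5 + 1.0, frame)
```

Reporting both mean and standard deviation is useful on embedded targets, where scheduling jitter can matter as much as the average.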
Minimum requirements
- OS: Desktop Linux (Ubuntu 18.04/20.04 recommended for the Vitis AI tools).
- Runtime: Python 3.7+; Vitis AI Docker environment or local toolchain (Quantizer, Compiler).
- Development Hardware: Workstation with the Xilinx Vitis AI toolchain installed.
- Target Hardware: Xilinx Kria KV260, Pynq Z2, MYIR FZ3, or MYIR FZ5 with a DPU-based image.
- Data: Raw images for segmentation.
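Raw images typically need to be resized and normalised to the network's input shape before inference. The following is a minimal, assumed preprocessing sketch (nearest-neighbour resize plus min-max normalisation); the repository's actual preprocessing and input resolution may differ.

```python
import numpy as np

def preprocess(img, size=(256, 256)):
    """Nearest-neighbour resize to `size` and min-max normalise to [0, 1].

    An illustrative stand-in for the real preprocessing; the target
    resolution 256x256 is an assumption, not taken from the repository.
    """
    h, w = img.shape
    ys = np.arange(size[0]) * h // size[0]   # source row for each output row
    xs = np.arange(size[1]) * w // size[1]   # source column for each output column
    out = img[np.ix_(ys, xs)].astype(np.float32)
    lo, hi = out.min(), out.max()
    return (out - lo) / (hi - lo + 1e-8)

raw = np.random.rand(100, 120)               # stand-in for a raw ultrasound frame
x = preprocess(raw)
```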
How to cite
Cite the Zenodo DOI for this version.
Authors: Denis Kurka, Petr Čermák
Files
- denisuskurka/UNET_ACCEL-zenodo3.zip (48.6 MB, md5:2922f8f9f88affbc147006a07cc9ee3f)
Additional details
Related works
- Is supplement to
- Software: https://github.com/denisuskurka/UNET_ACCEL/tree/zenodo3 (URL)
Funding
- European Union
- The project National Institute for Neurological Research, Programme EXCELES, Next Generation EU LX22NPO510x
Software
References
- TensorFlow: Abadi, M., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org
- Keras: Chollet, F., et al. (2015). Keras: The Python Deep Learning API. https://keras.io
- U-Net (Original Paper): Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv preprint arXiv:1505.04597. https://doi.org/10.48550/arXiv.1505.04597
- FINN (Original Paper): Umuroglu, Y., et al. (2017). FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '17), 65–74. https://doi.org/10.1145/3020078.3021744
- FINN-R (Journal Paper): Blott, M., et al. (2018). FINN-R: An end-to-end deep-learning framework for fast exploration of quantized neural networks. ACM Transactions on Reconfigurable Technology and Systems (TRETS), 11(3), 1–23. https://doi.org/10.1145/3242897