Comparison of Vitis-AI and FINN for implementing convolutional neural networks on FPGA
DOI: https://doi.org/10.37537/rev.elektron.8.2.200.2024
Keywords: FPGA, CNN, FINN, Vitis-AI, Quantization
Abstract
Convolutional neural networks (CNNs) are essential for image classification and detection, and their implementation in embedded systems is becoming increasingly attractive due to their compact size and low power consumption. Field-Programmable Gate Arrays (FPGAs) have emerged as a promising option thanks to their low latency and high energy efficiency. Vitis AI and FINN are two development environments that automate the implementation of CNNs on FPGAs. Vitis AI uses a Deep Learning Processing Unit (DPU) and memory accelerators, while FINN is based on a streaming architecture with fine-grained control over per-layer parallelization. Both environments apply parameter quantization techniques to reduce memory usage. This work extends previous comparisons by implementing four models with different numbers of layers on the Xilinx Kria KV260 FPGA platform. The complete process from training to evaluation on the FPGA, including quantization and hardware implementation, is described in detail. The results show that FINN provides lower latency, higher throughput, and better energy efficiency than Vitis AI, while Vitis AI stands out for its simplicity in model training and ease of implementation on FPGA. The main finding of this study is that as model complexity increases (with more layers in the neural networks), the differences in performance and energy efficiency between FINN and Vitis AI are significantly reduced.
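The parameter quantization both flows rely on can be illustrated with a minimal uniform symmetric quantization sketch. This is illustrative only: Vitis AI and FINN use their own tooling (the Vitis AI quantizer and Brevitas, respectively), and the function names below are ours, not from either framework.

```python
def quantize_symmetric(weights, bits=8):
    """Uniform symmetric quantization: map floats to signed integers.

    Returns (q, scale) where each q[i] lies in [-(2**(bits-1)-1), 2**(bits-1)-1]
    and w[i] is approximately q[i] * scale. Lowering `bits` shrinks the memory
    footprint of the parameters at the cost of precision.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer representation."""
    return [v * scale for v in q]


# Example: 8-bit quantization of a small weight vector.
weights = [0.31, -0.82, 0.05, 0.47]
q, s = quantize_symmetric(weights, bits=8)
approx = dequantize(q, s)
```

The worst-case reconstruction error of this scheme is half the scale step, which is why aggressive bit-width reduction (down to binary weights, as in binarized networks) requires quantization-aware training rather than a purely post-hoc mapping.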
License
The authors who publish in this journal agree to the terms established by the Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0).