Comparison of Vitis-AI and FINN for Implementing Convolutional Neural Networks on FPGAs

Nicolás Urbano Pintos, Héctor Lacomi, Mario Lavorato

Abstract


Convolutional neural networks (CNNs) are essential for image classification and detection, and their implementation on embedded systems is increasingly attractive because of these platforms' compact size and low power consumption. FPGAs (Field-Programmable Gate Arrays) have emerged as a promising option thanks to their low latency and high energy efficiency. Vitis AI and FINN are two development environments that automate the implementation of CNNs on FPGAs. Vitis AI relies on a Deep Learning Processor Unit (DPU) and memory accelerators, whereas FINN is based on a dataflow (streaming) architecture with tunable parallelization. Both environments apply parameter quantization techniques to reduce memory usage. This work extends previous comparisons by evaluating both environments through the implementation of four models with different numbers of layers on the Xilinx Kria KV260 FPGA platform. The complete process is described in detail, from training to evaluation on the FPGA, including quantization and hardware implementation. The results show that FINN delivers lower latency, higher throughput, and better energy efficiency than Vitis AI; Vitis AI, however, stands out for its simpler model training and ease of deployment on the FPGA. The main finding of the study is that, as model complexity grows with additional layers, the performance and energy-efficiency gaps between FINN and Vitis AI narrow considerably.
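Both toolchains hinge on quantization, but they enter it at different points: FINN expects a network trained with quantization in the loop (via Brevitas), while Vitis AI quantizes an ordinary float32 model after training. The following is a minimal sketch of both entry points on a hypothetical toy CNN; the layer sizes, bit widths, and calibration data are placeholders rather than the models evaluated in the paper, the Brevitas export helper name varies across library versions, and the Vitis AI part assumes the `pytorch_nndct` package that ships inside the Vitis AI Docker image.

```python
import torch
import torch.nn as nn

# --- FINN path: quantization-aware training with Brevitas ----------------
# Hypothetical 4-bit toy CNN in the style of the FINN examples; the actual
# networks in the paper differ in depth and bit width.
from brevitas.nn import QuantConv2d, QuantIdentity, QuantLinear, QuantReLU

class TinyQuantCNN(nn.Module):
    def __init__(self, num_classes=10, bits=4):
        super().__init__()
        self.features = nn.Sequential(
            QuantIdentity(bit_width=bits),                     # quantize the input
            QuantConv2d(3, 16, 3, padding=1, weight_bit_width=bits),
            nn.BatchNorm2d(16),
            QuantReLU(bit_width=bits),
            nn.MaxPool2d(2),
        )
        self.classifier = QuantLinear(16 * 16 * 16, num_classes,
                                      bias=True, weight_bit_width=bits)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

qat_model = TinyQuantCNN()
# ... train qat_model with a standard PyTorch loop, then export to QONNX,
# the intermediate format consumed by the FINN compiler:
from brevitas.export import export_qonnx
export_qonnx(qat_model, torch.randn(1, 3, 32, 32), "tiny_qcnn.onnx")

# --- Vitis AI path: post-training quantization of a float model ----------
# Requires the Vitis AI environment (pytorch_nndct is not on PyPI).
from pytorch_nndct.apis import torch_quantizer

float_model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.MaxPool2d(2), nn.Flatten(), nn.Linear(16 * 16 * 16, 10),
)
dummy = torch.randn(1, 3, 32, 32)
quantizer = torch_quantizer("calib", float_model, (dummy,))
quant_model = quantizer.quant_model
for _ in range(8):                      # stand-in calibration batches
    quant_model(torch.randn(1, 3, 32, 32))
quantizer.export_quant_config()         # then re-run in "test" mode and
                                        # export the xmodel for the DPU
```

From here the flows diverge structurally: the Vitis AI model is compiled into DPU instructions executed by a fixed overlay, whereas FINN's builder turns the QONNX graph into a per-layer streaming pipeline in the fabric, which is what drives the latency and throughput differences reported above.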


Keywords


FPGA; CNN; FINN; Vitis-AI; Quantization





DOI: https://doi.org/10.37537/rev.elektron.8.2.200.2024



Copyright (c) 2024 Nicolás Urbano Pintos

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

