Comparison of Vitis-AI and FINN for Implementing Convolutional Neural Networks on FPGAs
Abstract
Convolutional neural networks (CNNs) are essential for image classification and detection, and their deployment on embedded systems is increasingly attractive due to their compact size and low power consumption. FPGAs (Field-Programmable Gate Arrays) have emerged as a promising option thanks to their low latency and high energy efficiency. Vitis AI and FINN are two development frameworks that automate the implementation of CNNs on FPGAs. Vitis AI uses a Deep Learning Processing Unit (DPU) and memory accelerators, whereas FINN relies on a streaming dataflow architecture with tunable parallelization. Both frameworks apply parameter quantization to reduce memory usage. This work extends previous comparisons by evaluating both frameworks through the implementation of four models with different numbers of layers on the Xilinx Kria KV260 FPGA platform. The complete process is described in detail, from training to evaluation on the FPGA, including quantization and hardware implementation. The results show that FINN achieves lower latency, higher throughput, and better energy efficiency than Vitis AI. Nevertheless, Vitis AI stands out for its simple model training and ease of FPGA deployment. The study's main finding is that, as model complexity grows with additional layers, the performance and energy-efficiency gaps between FINN and Vitis AI narrow considerably.
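Both toolchains reduce memory footprint through parameter quantization. As a rough, self-contained illustration only (not the exact scheme used by Vitis AI or FINN, whose quantizers are more sophisticated), uniform symmetric int8 weight quantization with a per-tensor scale can be sketched as:

```python
# Illustrative sketch of uniform symmetric int8 quantization:
# each float32 weight is mapped to an integer in [-128, 127]
# via a single per-tensor scale factor, cutting storage from
# 4 bytes to 1 byte per weight.

def quantize_int8(weights):
    """Map a list of float weights to int8 values plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.003, 0.89, -0.44]
q, scale = quantize_int8(weights)
recon = dequantize(q, scale)
```

The reconstruction error is bounded by half the scale step, which is why both frameworks retrain or fine-tune after quantization to recover accuracy.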
DOI: https://doi.org/10.37537/rev.elektron.8.2.200.2024
Copyright (c) 2024 Nicolás Urbano Pintos
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Revista elektron, ISSN-L 2525-0159
Facultad de Ingeniería. Universidad de Buenos Aires
Paseo Colón 850, 3er piso
C1063ACV - Buenos Aires - Argentina
revista.elektron@fi.uba.ar
+54 (11) 528-50889