Spectrograms of baleen whale records synthesized from Autoencoder architectures: CAE, VAE and CAE-LSTM

DOI:

https://doi.org/10.37537/rev.elektron.6.2.167.2022

Keywords:

Convolutional autoencoders, recurrent layers, spectrograms, underwater sound, synthesis

Abstract

In this paper, several simple convolutional network architectures are analyzed for generating synthetic spectrograms of baleen whale recordings. Simplicity matters in these models because this type of network is intended for implementation on embedded systems. In addition, the scarcity of available data calls for efficient models. With this aim, simple Autoencoder architectures with a low number of parameters are presented and trained. Suitable metrics are then computed and used to compare the alternative architectures. The results show that the most straightforward architecture is also the most convenient. Finally, synthetic spectrograms are generated from few data samples, employing a low-complexity architecture and assuming a normal distribution of the latent-space vectors obtained from the training data.
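The final step of the abstract, sampling new latent vectors under a normal-distribution assumption, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the latent dimension of 8, the independent (diagonal) Gaussian per dimension, and the random array standing in for encoder outputs are all assumptions made for illustration. In a real pipeline the sampled vectors would be passed to a trained decoder to obtain spectrograms.

```python
import numpy as np


def fit_latent_gaussian(latents):
    """Estimate the per-dimension mean and standard deviation of latent vectors."""
    latents = np.asarray(latents, dtype=float)
    return latents.mean(axis=0), latents.std(axis=0)


def sample_latents(mean, std, n_samples, seed=0):
    """Draw new latent vectors from an independent normal per dimension (assumed)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mean, scale=std, size=(n_samples, mean.shape[0]))


# Stand-in for the encoder outputs of 20 training spectrograms (hypothetical data).
train_latents = np.random.default_rng(42).normal(loc=1.0, scale=2.0, size=(20, 8))

mean, std = fit_latent_gaussian(train_latents)
new_latents = sample_latents(mean, std, n_samples=5)

# Each row of new_latents would be fed to a trained decoder (not shown here)
# to synthesize one spectrogram.
print(new_latents.shape)
```

With only a few training samples, the estimated mean and standard deviation are themselves noisy, so the quality of the synthesized spectrograms depends on how well the normal assumption holds for the learned latent space.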

Author Biographies

  • María Celeste Cabedio, Universidad Nacional de Mar del Plata
    Professor and PhD student in the Department of Electronic Engineering at the National University of Mar del Plata.
  • Marco Carnaghi, Universidad Nacional de Mar del Plata
    PhD student in the Department of Electronic Engineering at the National University of Mar del Plata.

Published

2022-12-15

Section

Computer Networks and Informatics

How to Cite

[1] M. C. Cabedio and M. Carnaghi, “Spectrograms of baleen whale records synthesized from Autoencoder architectures: CAE, VAE and CAE-LSTM”, Elektron, vol. 6, no. 2, pp. 129–134, Dec. 2022, doi: 10.37537/rev.elektron.6.2.167.2022.