DOI: 10.1145/3638530.3654432
poster

LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search

Published: 01 August 2024 Publication History

Abstract

Evolutionary Algorithms and Generative Deep Learning are two of the most powerful tools for sound generation tasks. However, both have limitations: Evolutionary Algorithms require complicated designs, which makes them hard to control and makes realistic sound generation difficult, while Generative Deep Learning models often copy from the dataset and lack creativity. In this paper, we propose LVNS-RAVE, a method that combines Evolutionary Algorithms and Generative Deep Learning to produce realistic and novel sounds. We use the RAVE model as the sound generator and the VGGish model as a novelty evaluator in the Latent Vector Novelty Search (LVNS) algorithm. The reported experiments show that the method can successfully generate diversified, novel audio samples under different mutation setups using different pre-trained RAVE models. The characteristics of the generation process can be easily controlled with the mutation parameters. The proposed algorithm can be a creative tool for sound artists and musicians.
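
The loop described in the abstract (mutate latent vectors, decode them to audio, embed the audio, keep the most novel candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_decode` and `toy_embed` are hypothetical stand-ins for the RAVE decoder and the VGGish embedding, the k-nearest-neighbour novelty score follows the usual novelty-search formulation, and every name, default value, and selection rule here is an assumption made for illustration.

```python
import math
import random


def toy_decode(z, n_samples=64):
    """Hypothetical stand-in for a RAVE decoder: latent vector -> short waveform."""
    return [
        sum(a * math.sin((i + 1) * 2 * math.pi * t / n_samples) for i, a in enumerate(z))
        for t in range(n_samples)
    ]


def toy_embed(audio):
    """Hypothetical stand-in for VGGish: waveform -> small feature vector."""
    n = len(audio)
    energy = sum(x * x for x in audio) / n
    zero_crossings = sum(1 for a, b in zip(audio, audio[1:]) if a * b < 0) / n
    peak = max(abs(x) for x in audio)
    return [energy, zero_crossings, peak]


def novelty(feature, others, k=3):
    """Mean Euclidean distance from `feature` to its k nearest neighbours in `others`."""
    nearest = sorted(math.dist(feature, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def mutate(z, rate, sigma, rng):
    """Add Gaussian noise (std `sigma`) to each latent dimension with probability `rate`."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in z]


def latent_novelty_search(dim=4, pop_size=8, generations=5, rate=0.5, sigma=0.3, seed=0):
    """Evolve latent vectors toward behavioural novelty; returns (population, archive)."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    archive = []  # (latent, feature) pairs of the most novel candidate per generation
    for _ in range(generations):
        children = [mutate(z, rate, sigma, rng) for z in population]
        candidates = population + children
        feats = [toy_embed(toy_decode(z)) for z in candidates]
        archive_feats = [f for _, f in archive]
        # Score each candidate against all other candidates plus the archive.
        scores = [
            novelty(f, feats[:i] + feats[i + 1:] + archive_feats)
            for i, f in enumerate(feats)
        ]
        ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        population = [candidates[i] for i in ranked[:pop_size]]
        archive.append((candidates[ranked[0]], feats[ranked[0]]))
    return population, archive


if __name__ == "__main__":
    final_population, archive = latent_novelty_search()
    print(len(final_population), "latent vectors kept;", len(archive), "archived")
```

In this sketch the `rate` and `sigma` arguments play the role of the mutation parameters mentioned in the abstract: larger values move candidates further through the latent space in each generation. With a real pre-trained model, the two stand-in functions would be replaced by the model's decoder and an audio embedding network.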


Published In

GECCO '24 Companion: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2024
2187 pages
ISBN: 9798400704956
DOI: 10.1145/3638530
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. neural audio synthesis
  2. variational autoencoder
  3. latent vector evolution

Qualifiers

  • Poster

Conference

GECCO '24 Companion

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
