The Aesthetics of Disharmony: Harnessing Sounds and Images for Dynamic Soundscapes Generation

Published: 04 October 2023

Abstract

This work presents an autonomous approach to the dynamic generation of relaxing soundscapes for games and artistic installations. Unlike past work, the system generates music and images simultaneously while preserving human intent and coherence. We present our algorithm for generating audiovisual instances, together with a system built on this approach, and verify the quality of its output against current approaches to image and music generation. We also open a discussion of the new paradigm in the arts in which the creative process is delegated to autonomous systems with limited human participation. Our user study (N=74) shows that our approach surpasses current deep learning models in perceived quality, with its output recognized as human production, as if it were generated from an endless musical improvisation performance.

Published In

Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue CHI PLAY
November 2023, 1360 pages
EISSN: 2573-0142
DOI: 10.1145/3554313

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 October 2023
Published in PACMHCI Volume 7, Issue CHI PLAY

Author Tags

  1. AI
  2. HCI
  3. Procedural
  4. art
  5. audio
  6. games
  7. landscape
  8. level design
  9. music
  10. soundscape

Qualifiers

  • Research-article
