DOI: 10.1145/3580305.3599772

An Interpretable, Flexible, and Interactive Probabilistic Framework for Melody Generation

Published: 04 August 2023 Publication History

Abstract

Demand for algorithmic music generation is growing quickly across entertainment, art, education, and other fields. Unfortunately, most recent models are practically impossible to interpret or musically fine-tune, as they use deep neural networks with thousands of parameters. We introduce SchenkComposer, an interpretable, flexible, and interactive model for melody generation that empowers users to be creative in all aspects of the music generation pipeline and allows them to learn from the process. We divide the task of melody generation into steps modeled on the process that a human composer with music-theoretical domain knowledge might follow. First, the model determines phrase structure based on form analysis and identifies an appropriate number of measures. Using concepts from Schenkerian analysis, the model then finds a fitting harmonic rhythm, middleground harmonic progression, foreground rhythm, and melody in a hierarchical, scaffolded approach using a probabilistic context-free grammar based on musical contours. By incorporating theories of musical form and harmonic structure, our model produces music with long-term structural coherence. In extensive human experiments, we find that music generated with our approach passes a Turing test while current state-of-the-art approaches fail, and we further demonstrate superior performance and preference for our melodies compared to existing melody generation methods. Additionally, we developed and deployed a public website for SchenkComposer and conducted preliminary user surveys, whose analysis indicates that SchenkComposer is viable and enjoyable to use.
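
To make the idea of a probabilistic context-free grammar over musical contours more concrete, the following is a minimal illustrative sketch in Python. It is not SchenkComposer's grammar or implementation: the rule set TOY_PCFG, its probabilities, the contour symbols ("up", "down", "hold"), and the helper names expand and realize are all hypothetical, chosen only to show how weighted rewrite rules can be sampled top-down and then realized as pitches.

```python
import random

# Hypothetical toy grammar (an assumption, not the paper's rules).
# Each nonterminal maps to a list of (probability, right-hand side).
TOY_PCFG = {
    "PHRASE": [(0.6, ["ARCH", "CADENCE"]), (0.4, ["RISE", "FALL", "CADENCE"])],
    "ARCH": [(1.0, ["RISE", "FALL"])],
    "RISE": [(0.5, ["up"]), (0.5, ["up", "RISE"])],
    "FALL": [(0.5, ["down"]), (0.5, ["down", "FALL"])],
    "CADENCE": [(0.7, ["down", "hold"]), (0.3, ["up", "hold"])],
}

# Assumed pitch material for realizing a contour: one octave of C major.
C_MAJOR = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]


def expand(symbol, rng, depth=0, max_depth=8):
    """Recursively sample a sequence of terminal contour symbols."""
    if symbol not in TOY_PCFG:
        return [symbol]  # terminal symbol
    rules = TOY_PCFG[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest rule so recursion ends.
        rhs = min((r for _, r in rules), key=len)
    else:
        weights = [p for p, _ in rules]
        rhs = rng.choices([r for _, r in rules], weights=weights, k=1)[0]
    out = []
    for child in rhs:
        out.extend(expand(child, rng, depth + 1, max_depth))
    return out


def realize(contour, start=2):
    """Turn contour steps into note names, clamped to the scale range."""
    deltas = {"up": 1, "down": -1, "hold": 0}
    idx = start
    notes = [C_MAJOR[idx]]
    for step in contour:
        idx = max(0, min(len(C_MAJOR) - 1, idx + deltas[step]))
        notes.append(C_MAJOR[idx])
    return notes


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    contour = expand("PHRASE", rng)
    print(contour)
    print(realize(contour))
```

Because every rule and probability is explicit, a user could inspect or edit a single production (for example, making longer rises more likely) and see the effect directly, which is the kind of interpretability the abstract contrasts with deep neural models.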

Supplementary Material

MP4 File (adfp490-2min-promo.mp4)
Short promotional video for SchenkComposer: An Interpretable, Flexible, and Interactive Probabilistic Framework for Melody Generation.

Cited By

  • (2024) Artificial intelligence in music: recent trends and challenges. Neural Computing and Applications 37:2 (801-839). DOI: 10.1007/s00521-024-10555-x. Online publication date: 16-Nov-2024

Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. algorithmic music generation
  2. interpretable machine learning
  3. musical form
  4. probabilistic context-free grammars
  5. schenkerian analysis

Qualifiers

  • Research-article

Conference

KDD '23
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 299
  • Downloads (last 6 weeks): 13
Reflects downloads up to 24 Jan 2025
