Novice-AI Music Co-Creation via AI-Steering Tools for Deep Generative Models

Authors:

Cheng Zhi Huang,

Carrie J. Cai

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

Pages 1 - 13

https://doi.org/10.1145/3313831.3376739

Published: 23 April 2020

Abstract

While generative deep neural networks (DNNs) have demonstrated their capacity for creating novel musical compositions, less attention has been paid to the challenges and potential of co-creating with these musical AIs, especially for novices. In a needfinding study with a widely used, interactive musical AI, we found that the AI can overwhelm users with the amount of musical content it generates, and frustrate them with its non-deterministic output. To better match co-creation needs, we developed AI-steering tools, consisting of Voice Lanes that restrict content generation to particular voices; Example-Based Sliders to control the similarity of generated content to an existing example; Semantic Sliders to nudge music generation in high-level directions (happy/sad, conventional/surprising); and Multiple Alternatives of generated content to audition and choose from. In a summative study (N=21), we discovered the tools not only increased users' trust, control, comprehension, and sense of collaboration with the AI, but also contributed to a greater sense of self-efficacy and ownership of the composition relative to the AI.
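To make two of the steering ideas concrete, here is a minimal sketch of how a voice-lane restriction and an example-based slider could be combined with multiple alternatives. It is not the authors' implementation: `lane_mask`, `toy_generate`, `similarity`, and `pick_by_slider` are hypothetical names, the random pitch generator only stands in for a real generative model, and the note-matching similarity measure is an assumption chosen for brevity.

```python
import random

# Four-part voicing is an assumption for illustration.
VOICES = ["soprano", "alto", "tenor", "bass"]


def lane_mask(selected_voices, num_steps):
    """Voice lanes: True marks cells the generator is allowed to rewrite."""
    return {v: [v in selected_voices] * num_steps for v in VOICES}


def toy_generate(score, mask, rng):
    """Stand-in for a generative model: only masked cells get new pitches."""
    return {
        v: [
            rng.randint(48, 84) if mask[v][t] else score[v][t]
            for t in range(len(score[v]))
        ]
        for v in VOICES
    }


def similarity(candidate, example, voice):
    """Fraction of time steps where candidate and example share a pitch."""
    a, b = candidate[voice], example[voice]
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_by_slider(candidates, example, voice, slider):
    """Example-based slider in [0, 1]: 0 = most different, 1 = most similar."""
    ranked = sorted(candidates, key=lambda c: similarity(c, example, voice))
    return ranked[round(slider * (len(ranked) - 1))]


if __name__ == "__main__":
    rng = random.Random(0)
    score = {v: [60] * 8 for v in VOICES}   # placeholder score
    mask = lane_mask({"soprano"}, 8)        # regenerate the soprano lane only
    # Multiple alternatives: sample several candidates to choose from.
    alternatives = [toy_generate(score, mask, rng) for _ in range(5)]
    choice = pick_by_slider(alternatives, score, "soprano", slider=1.0)
    print(choice["soprano"])
```

In this sketch the slider selects among already-sampled alternatives rather than conditioning the model itself; a real system could steer generation in other ways.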

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods
      1. User studies
    2. Interaction paradigms
      1. Collaborative interaction

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

April 2020

10688 pages

ISBN:9781450367080

DOI:10.1145/3313831

General Chairs:
Regina Bernhaupt (Eindhoven University of Technology, Netherlands)
Florian 'Floyd' Mueller (Monash University, Australia)
David Verweij (Newcastle University, UK)
Josh Andres (RMIT, Australia)

Program Chairs:
Joanna McGrenere (University of British Columbia, Canada)
Andy Cockburn (University of Canterbury, New Zealand)
Ignacio Avellino (University of Maryland Baltimore County, USA)
Alix Goguey (Grenoble Alpes University, France)
Pernille Bjørn (University of Copenhagen, Denmark)
Shengdong (Shen) Zhao (National University of Singapore, Singapore)
Briane Paul Samson (Future University Hakodate, Japan & De La Salle University, Philippines)
Rafal Kocielnik (University of Washington, USA)

Copyright © 2020 Owner/Author.

This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CHI '20: CHI Conference on Human Factors in Computing Systems

April 25 - 30, 2020

Honolulu, HI, USA

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
