Completed a Ph.D. in computer science and cognitive science. Dissertation involved a computer model of music cognition called "Musicat". Supervisor: Douglas Hofstadter
We present a technique for partitioning an audio file into maximally-sized segments having nearly uniform spectral content, ideally corresponding to notes or chords. Our method uses dynamic programming to globally optimize a measure of simplicity or homogeneity of the intervals in the partition. Here we have focused on an entropy-like measure, though there is considerable flexibility in choosing this measure. Experiments are presented for several musical scenarios.
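As a rough illustration of the dynamic-programming idea in this abstract, the sketch below partitions a sequence of spectral frames so that the summed segment costs are globally minimal. The specific cost (segment length times the entropy of the pooled spectrum) and the per-segment `penalty` that favors longer segments are assumptions made for the example, not the measure used in the paper; `segment_cost` and `partition` are hypothetical names.

```python
import math

def segment_cost(frames, start, end, penalty):
    """Assumed entropy-like cost of frames[start:end], plus a fixed per-segment penalty."""
    n_bins = len(frames[0])
    # Pool the spectral energy of every frame in the candidate segment.
    pooled = [sum(frames[t][k] for t in range(start, end)) for k in range(n_bins)]
    total = sum(pooled)
    if total == 0:
        return penalty
    entropy = -sum((p / total) * math.log(p / total) for p in pooled if p > 0)
    return (end - start) * entropy + penalty

def partition(frames, penalty=0.5):
    """Return the list of (start, end) segments with the minimum total cost."""
    n = len(frames)
    best = [0.0] + [math.inf] * n   # best[j]: minimal cost of partitioning frames[:j]
    back = [0] * (n + 1)            # back[j]: start index of the last segment ending at j
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + segment_cost(frames, start, end, penalty)
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Walk the back-pointers to recover the segment boundaries.
    segments = []
    end = n
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    segments.reverse()
    return segments

# Three frames with energy in bin 0, then two frames with energy in bin 1.
toy_frames = [[1.0, 0.0, 0.0]] * 3 + [[0.0, 1.0, 0.0]] * 2
print(partition(toy_frames))  # [(0, 3), (3, 5)]
```

This version recomputes each segment cost from scratch, which is fine for a toy input; a practical implementation would presumably cache pooled spectra with prefix sums.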
So many people have been friends and colleagues, providing emotional and academic support over my eight years in Bloomington, it seems difficult to list them all. Please accept my apologies for any omissions. First, I would like to thank my immediate family for a lifetime of support: my parents Ann and Clay and my three little sisters Jenni, Sara, and Julie. I'm lucky to have fairly frequent communication with everyone thanks to email and cell phones; in a way I feel closer to my family in Bloomington, thanks to technology, than when I was an undergraduate at Montana State in Bozeman. A side note: my mom used to run a cat-breeding business, and we gave musical names to all the cats. The business was named "Mewsicats Cattery", an amusing source of inspiration for the name of my project (Musicat). Second, I owe a debt of gratitude to a host of my professors at Indiana University. My first class in Bloomington was an excellent artificial intelligence course taught by David Leake that got me off to a great start. Soon after I started at IU, Chris Raphael joined the faculty, and after auditing some of his courses, he and I had some very successful collaborations on various music informatics projects. I consider him an unofficial member of my thesis committee, and I'm happy about the hours of mathematical and musical discussions we've had over the years. My official committee has also been extremely helpful. Eric Isaacson taught a seminar on Music Cognition during my first semester, which, in combination with Dr. Leake's course, was a perfect way to start my grad school career; it gave me a great introduction to the field. Mike Gasser's courses on biologically-inspired computing and natural language processing were also extremely insightful.
Don Byrd was a great musical presence in Music Informatics group meetings and at ISMIR conferences, and has always given me great feedback on this project. Finally, it goes without saying that my main advisor, Doug Hofstadter, has been a tremendous influence, not only during my eight years at IU, but indeed ever since I was 18 and started reading Gödel, Escher, Bach. I'm especially grateful that he made it possible for me to tag along to Paris in 2010 (il m'a emmenné dans ses valises, as they say) when he spent a semester there on sabbatical. I expected to learn about artificial intelligence when I came to Bloomington, but Doug actually taught me about cognitive science, communication, writing, and even French pronunciation, among many other things. It's truly been an honor and a privilege! Third, thanks to my many grad student compatriots from Bloomington: current and past students including
Operations Research and Cyber-Infrastructure, 2009
We consider the problem of finding an optimal path through a trellis graph when the arc costs are linear functions of an unknown parameter vector. In this context we develop an algorithm, Linear Dynamic Programming (LDP), that simultaneously computes the optimal path for all values of the parameter. We show how the LDP algorithm can be used for supervised learning of the arc costs for a dynamic-programming-based sequence estimator by minimizing empirical risk. We present an application to musical harmonic analysis in which we optimize the performance of our estimator by seeking the parameter value generating the sequence best agreeing with hand-labeled data.
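To make the setting concrete, here is a minimal sketch of the underlying problem rather than of LDP itself: an ordinary shortest-path search through a small trellis whose arc costs are linear in a parameter vector `theta`, solved for one fixed `theta` at a time. The function name `best_path`, the `features[t][i][j]` layout, and the toy feature values are all assumptions chosen for illustration. LDP, as described above, would instead characterize the optimal path for every value of the parameter at once.

```python
def best_path(features, theta):
    """Cheapest path through a trellis for one fixed parameter vector.

    features[t][i][j] is the (assumed) feature vector of the arc from state i
    at stage t to state j at stage t + 1; the arc cost is its dot product with theta.
    Returns (total_cost, list_of_states).
    """
    n_states = len(features[0])
    cost = [0.0] * n_states          # any starting state is free in this sketch
    back = []
    for stage in features:
        new_cost = [float("inf")] * n_states
        pointers = [0] * n_states
        for i in range(n_states):
            for j in range(n_states):
                arc = sum(w * f for w, f in zip(theta, stage[i][j]))
                if cost[i] + arc < new_cost[j]:
                    new_cost[j] = cost[i] + arc
                    pointers[j] = i
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final state.
    path = [min(range(n_states), key=cost.__getitem__)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return cost[path[-1]], path

# Toy trellis: feature 0 marks a state change, feature 1 is a per-state mismatch score.
mismatch = [[0.0, 2.0], [1.0, 0.0]]
toy = [[[(float(i != j), m[j]) for j in range(2)] for i in range(2)] for m in mismatch]

print(best_path(toy, (0.1, 1.0))[1])  # cheap switching:  [0, 0, 1]
print(best_path(toy, (5.0, 1.0))[1])  # costly switching: [0, 0, 0]
```

The two calls show why the parameter matters: changing the weight on the switching feature changes which path is optimal, which is the dependence that a learning procedure would tune against labeled data.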
2012 IEEE 12th International Conference on Data Mining, 2012
Online video presents a great opportunity for up-and-coming singers and artists to be visible to a worldwide audience. However, the sheer quantity of video makes it difficult to discover promising musicians. We present a novel algorithm to automatically identify talented musicians using machine learning and acoustic analysis on a large set of "home singing" videos. We describe how candidate musician videos are identified and ranked by singing quality. To this end, we present new audio features specifically designed to directly capture singing quality. We evaluate these vis-à-vis a large set of generic audio features and demonstrate that the proposed features have good predictive performance. We also show that this algorithm performs well when videos are normalized for production quality.
Promiscuous plasmids replicate in a wide range of bacteria and therefore play a key role in the dissemination of various host-beneficial traits, including antibiotic resistance. Despite the medical relevance, little is known about the evolutionary dynamics through which drug resistance plasmids adapt to new hosts and thereby persist in the absence of antibiotics. We previously showed that the incompatibility group P-1 (IncP-1) minireplicon pMS0506 drastically improved its stability in novel host Shewanella oneidensis MR-1 after 1,000 generations under antibiotic selection for the plasmid. The only mutations found were those affecting the N terminus of the plasmid replication initiation protein TrfA1. Our aim in this study was to gain insight into the dynamics of plasmid evolution. Changes in stability and genotype frequencies of pMS0506 were monitored in evolving populations of MR-1 (pMS0506). Genotypes were determined by sequencing trfA1 amplicons from individual clones and by 454 ...
Risky sexual behaviors, including the decision to have unprotected sex, result from interactions between individuals and their environment. The current study explored the use of Agent-Based Modeling (ABM), a methodological approach in which computer-generated artificial societies simulate human sexual networks, to assess the influence of heterogeneity of sexual motivation on the risk of contracting HIV. The models successfully simulated some characteristics of human sexual systems, such as the relationship between individual differences in sexual motivation (sexual excitation and inhibition) and sexual risk, but failed to reproduce the scale-free distribution of number of partners observed in the real world. ABM has the potential to inform intervention strategies that target the interaction between an individual and his or her social environment.
In this talk we present the Fluid Concepts cognitive model, developed by Douglas Hofstadter and his students over the last 30 years. Models in the Fluid Concepts tradition aim to provide a psychologically plausible description of perception and analogy-making, incorporating ...
Proceedings of the 14th international conference on Intelligent user interfaces, 2009
We present data-driven methods for supporting musical creativity by capturing the statistics of a musical database. Specifically, we introduce a system that supports users in exploring the high-dimensional space of musical chord sequences by parameterizing the variation among chord sequences in popular music. We provide a novel user interface that exposes these learned parameters as control axes, and we propose two automatic approaches for defining these axes. One approach is based on a novel clustering procedure, the other on principal components analysis. A user study compares our approaches for defining control axes both to each other and to an approach based on manually-assigned genre labels. Results show that our automatic methods for defining control axes provide a subjectively better user experience than axes based on manual genre labeling.
We propose a system for contrapuntal music generation based on a Neural Machine Translation (NMT) paradigm. We consider Baroque counterpoint and are interested in modeling the interaction between any two given parts as a mapping between a given source material and an appropriate target material. As in translation, the former imposes some constraints on the latter, but doesn't define it completely. We collate and edit a bespoke dataset of Baroque pieces, use it to train an attention-based neural network model, and evaluate the generated output via BLEU score and musicological analysis. We show that our model is able to respond with some idiomatic trademarks, such as imitation and appropriate rhythmic offset, although it falls short of having learned stylistically correct contrapuntal motion (e.g., avoidance of parallel fifths) or stricter imitative rules, such as canon.
Music cognition research in the past several decades has invoked many disparate paradigms in an attempt to explain the cognitive basis of musical syntax and semantics. Rather than focusing on any particular established model of music cognition, this paper serves to highlight the centrality of analogy in music cognition. It presents a collection of specific examples of musical analogy-making on different hierarchical levels, pointing the way towards future work in music cognition driven by recognition of the centrality of analogy.
A New Answer for Bernstein
In 1976 Leonard Bernstein gave a series of lectures at Harvard entitled The Unanswered Question, in which he discussed possible relationships between Chomskian linguistics and music. In the second of the six lectures he asserts that “there are similar functions, cognate processes operating in both music and language which are discoverable by linguistic method.” He then proceeds to construct a “quasi-scientific analogy between verb...
We have applied a Long Short-Term Memory neural network to model S&P 500 volatility, incorporating Google domestic trends as indicators of the public mood and macroeconomic factors. In a held-out test set, our Long Short-Term Memory model gives a mean absolute percentage error of 24.2%, outperforming linear Ridge/Lasso and autoregressive GARCH benchmarks by at least 31%. This evaluation is based on an optimal observation and normalization scheme which maximizes the mutual information between domestic trends and daily volatility in the training set. Our preliminary investigation shows strong promise for better predicting stock behavior via deep learning and neural network models.
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 1, 2020
Realistic recordings of soundscapes often have multiple sound events co-occurring, such as car horns, engines, and human voices. Sound event retrieval is a type of content-based search that aims to find audio samples similar to an audio query based on their acoustic or semantic content. State-of-the-art sound event retrieval models have focused on single-label audio recordings, with only one sound event occurring, rather than on multi-label audio recordings (i.e., multiple sound events occur in one recording). To address this latter problem, we propose different Deep Learning architectures with a Siamese structure and a Pairwise Presence Matrix. The networks are trained and evaluated using the SONYC-UST dataset containing both single- and multi-label soundscape recordings. The performance results show the effectiveness of our proposed model.
We present a novel method for assigning fingers to notes in a polyphonic piano score. Such a mapping (called a "fingering") is of great use to performers. To accommodate performers' unique hand shapes and sizes, our method relies on a simple, user-adjustable cost function. We use dynamic programming to search the space of all possible fingerings for the optimal fingering under this cost function. Despite the simplicity of the algorithm we achieve reasonable and useful results.
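The dynamic-programming search described in this abstract can be sketched on a much simpler case. The example below handles only a single melodic line for one hand, whereas the paper addresses polyphonic scores, and the `transition_cost` function is entirely made up for illustration (it assumes roughly two semitones per finger step). Names such as `best_fingering` and `FINGERS` are hypothetical; the point is only to show how a swappable cost function plugs into the search.

```python
FINGERS = (1, 2, 3, 4, 5)  # thumb through little finger

def transition_cost(pitch_a, finger_a, pitch_b, finger_b):
    """Made-up cost of moving between two notes (MIDI pitches) with the given fingers."""
    if pitch_a == pitch_b:
        # Repeated note: keeping the same finger is free, switching is mildly penalized.
        return 0.0 if finger_a == finger_b else 1.0
    if finger_a == finger_b:
        # Different notes on the same finger force a hand jump.
        return 10.0
    # Assume a comfortable span of about two semitones per finger step.
    return abs((pitch_b - pitch_a) - 2 * (finger_b - finger_a))

def best_fingering(pitches, cost=transition_cost):
    """Return (total_cost, fingers) minimizing the summed transition costs."""
    # best[f] holds the cheapest (cost, fingering) that ends on finger f.
    best = {f: (0.0, [f]) for f in FINGERS}
    for prev, cur in zip(pitches, pitches[1:]):
        best = {
            f: min(
                ((c + cost(prev, g, cur, f), path + [f]) for g, (c, path) in best.items()),
                key=lambda item: item[0],
            )
            for f in FINGERS
        }
    return min(best.values(), key=lambda item: item[0])

# Five ascending whole-tone steps fit the assumed span exactly.
print(best_fingering([60, 62, 64, 66, 68]))  # (0.0, [1, 2, 3, 4, 5])
```

Because the cost function is passed in as an argument, a performer-specific variant (for example, one that tolerates wider stretches) could be substituted without touching the search.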
Composers of popular music weave lyrics, melody, and instrumentation together to create a consistent and compelling emotional scene. The relationships among these elements are critical to musical communication, and understanding the statistics behind these relationships can contribute to numerous problems in music information retrieval and creativity support. In this paper, we present the results of an observational study on a large symbolic database of popular music; our results identify several patterns in the relationship between lyrics and melody.
We present a method for transcribing arbitrary pitched music into a piano-roll-like representation that also tracks the amplitudes of the notes over time. We develop a probabilistic model that gives the likelihood of a frame of audio data given a vector of amplitudes for the possible notes. Using an approximation of the log likelihood function, we develop an objective function that is quadratic in the time-varying amplitude variables, while also depending on the discrete piano-roll variables. We optimize this function using a variant of dynamic programming, by repeatedly growing and pruning our histories. We present results on a variety of different examples using several measures of performance including an edit-distance measure as well as a frame-by-frame measure.
Papers by Eric Nichols