1 Introduction

Machine learning allows a machine to acquire knowledge from data, forming concrete conceptual spaces, while conceptual blending [10] between two input spaces allows new spaces to be constructed, expressed as new structural relations or even new elements, creating new and potentially unforeseen output [27]. In music, harmony is a characteristic and well-circumscribed element of an idiom that can be learned from human-annotated musical data using techniques such as Hidden Markov Models, N-grams, probabilistic grammars and inductive logic programming (see [6, 7, 12, 15, 17, 20–23, 25, 26] among others). In the context of computational creativity in music, a challenging task tackled in the Concept Invention Theory (COINVENT) [3, 18, 24] project is to blend diverse input harmonic idioms learned from data to create new idioms that are creative supersets of the input ones.

The paper at hand briefly presents the extension of a melodic harmonisation assistant (introduced in [15]) that learns harmonic idioms by statistical learning on human data, for inventing new harmonic spaces by blending transitions between chords. The blended transitions are created by combining the features characterising pairs of transitions belonging to two idioms (expressed as sets of potentially learned transitions) according to an amalgam-based algorithm [5, 9] that implements the theory presented in [10] for conceptual blending, through the categorical-based methodology presented in [11]. The transitions are then used in an extended harmonic space that accommodates the two initial harmonic spaces, linked with the new blended transitions.

Fig. 1. System overview.

2 Statistical Learning of Harmonies from Human Annotated Datasets

Before blending harmonies, the system learns different aspects of harmony from annotated training data, and it produces new harmonisations according to guidelines provided in a melody input file given by the user. The system learns the chord types available within diverse datasets (according to their root notes) based on the General Chord Type (GCT) [2] representation, which can be used not only to represent but also to describe meaningful relations between harmonic labels [16], even in non-tonal music idioms [1, 14]. The training data include musical scores from many idioms (from modal harmonisations in the Middle Ages to harmonisations of popular music and jazz in the 20th century), with expert annotations. Specifically, the harmonic learning process uses the notes of manually annotated harmonic reductions, in which only the most important harmonic notes are included; additional annotated layers of information are given regarding the tonality and the metric structure of each piece. Accordingly, the format of the user melody input file includes indications of several desired attributes that the resulting harmonisation should have.

After the system is trained, it is able to harmonise a user-given melody that, at this stage, includes manual annotations about harmonic rhythm, harmonically important notes, key and phrase structure. Additionally, the user is free to choose specific chords at desired locations (constraint chords), forcing the system to creatively produce chord sequences that comply with the user-provided constraints, and therefore allowing the user to ‘manually’ increase the interestingness of the produced output.

The cHMM [17] algorithm is used for modelling and learning chord progression probabilities for a given idiom. Statistical information from the user-defined melody is then combined with the chord progression model to generate chord progressions that best represent the idiom. Additionally, the algorithm offers the possibility of determining intermediate ‘checkpoint’ chords in advance [4]. On the one hand, the fixed intermediate chords help to preserve some essence of higher-level harmonic structure through the imposition of intermediate and final cadences; on the other hand, they allow interactivity by enabling the user to place a desired chord at any position. Statistics for cadences are learned during the training process, where expert-annotated files including annotations for phrase endings are given as training material to the system. After collecting the statistics about cadences from all idioms, the system, before employing the cHMM algorithm, assigns cadences as fixed chords to the locations indicated by user input. The cadence to be imported is selected based on three criteria: (a) whether it is a final or an intermediate cadence; (b) the cadence likelihood (how often it occurs in the training pieces); and (c) how well it fits with the melody notes that are harmonised by the cadence’s chords. Direct human intervention allows the user of the system to specify a harmonic ‘spinal chord’ of anchor chords that are afterwards connected by chord sequences that give stylistic reference to a learned idiom.
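The idea of decoding with fixed checkpoint chords can be illustrated with a minimal sketch of Viterbi decoding in which some positions are forced to a given state. This is not the cHMM implementation of [17]; the function name `constrained_viterbi`, the three-chord state set and all probability values below are hypothetical and chosen only for illustration.

```python
import math


def constrained_viterbi(states, start, trans, emit, observations, fixed):
    """Return the most probable state path, with positions in `fixed` forced.

    `fixed` maps a time index to the state that must occur there
    (a stand-in for user-defined or cadence 'checkpoint' chords).
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    def allowed(t):
        # Only the imposed chord is allowed at a checkpoint position.
        return [fixed[t]] if t in fixed else states

    # best[t][s] = (log-probability of the best path ending in s at t, previous state)
    best = [{s: (log(start[s]) + log(emit[s][observations[0]]), None)
             for s in allowed(0)}]
    for t in range(1, len(observations)):
        layer = {}
        for s in allowed(t):
            score, prev = max((best[t - 1][r][0] + log(trans[r][s]), r)
                              for r in best[t - 1])
            layer[s] = (score + log(emit[s][observations[t]]), prev)
        best.append(layer)

    # Backtrack from the best final state.
    last = max(best[-1], key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return path[::-1]


# Hypothetical toy model: three chords, melody notes as observations.
states = ["I", "IV", "V"]
start = {"I": 0.8, "IV": 0.1, "V": 0.1}
trans = {
    "I": {"I": 0.2, "IV": 0.4, "V": 0.4},
    "IV": {"I": 0.3, "IV": 0.1, "V": 0.6},
    "V": {"I": 0.8, "IV": 0.1, "V": 0.1},
}
emit = {
    "I": {"C": 0.4, "D": 0.05, "E": 0.3, "F": 0.05, "G": 0.2},
    "IV": {"C": 0.3, "D": 0.05, "E": 0.05, "F": 0.4, "G": 0.2},
    "V": {"C": 0.05, "D": 0.4, "E": 0.05, "F": 0.1, "G": 0.4},
}
melody = ["E", "F", "D", "C"]

# Force a IV chord on the second note and a final I chord.
path = constrained_viterbi(states, start, trans, emit, melody, {1: "IV", 3: "I"})
print(path)
```

In this sketch the checkpoints simply prune the set of states considered at the constrained positions, so the remaining chords are still chosen to maximise the joint probability of the whole sequence.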

Regarding voice leading, experimental evaluation of methodologies that utilise statistical machine learning techniques demonstrated that an efficient way to harmonise a melody is to add the bass line first [26]. The presented harmoniser, having defined the optimal sequence of GCT chords, uses a modular methodology for determining the bass voice leading presented in [19], which utilises independently trained modules that include (a) a hidden Markov model (HMM) deciding for the bass contour (hidden states), given the melody contour (observations); (b) distributions on the distance between the bass and the melody voice; and (c) statistics regarding the inversions of the chords in the given chord sequence.

The bass voice motion provides abstract information about the motion of the bass. However, assigning actual pitches for a given set of GCT chords requires additional information: inversions and melody-to-bass distance distributions are also learned from data. The inversions of a chord play an important role in determining how eligible each pitch class of the chord is to be a bass note, while the melody-to-bass distance captures statistics about the pitch height region within which the bass voice is allowed to move relative to the melody. After obtaining the exact bass pitch, the exact voicing layout, i.e. exact pitches for all chord notes, is defined for each GCT chord. To this end, a simple statistical model is utilised that finds the best combination of the intermediate voices for every chord according to some simple criteria. These criteria include proximity to a pitch-attractor, evenness of the distances between neighbouring notes and inner voice movement distances between chords. These criteria form an aggregate weighted sum that defines the optimal setting for all the intermediate notes (between the bass and the melody) in every GCT chord.
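The aggregate weighted sum over the three voicing criteria could take a form like the following minimal sketch. The exact cost terms, the weights, the attractor pitch and all function names are assumptions made for illustration; the source does not specify them. Pitches are MIDI numbers, and each inner pitch class is placed exactly once strictly between the bass and the melody.

```python
from itertools import product


def candidate_voicings(pitch_classes, bass, melody):
    """All placements of each pitch class once, strictly between bass and melody."""
    options = [[p for p in range(bass + 1, melody) if p % 12 == pc]
               for pc in pitch_classes]
    return [tuple(sorted(v)) for v in product(*options)]


def voicing_cost(voicing, bass, melody, previous, attractor, weights):
    """Weighted sum of the three criteria (lower is better); all terms are assumed forms."""
    w_attr, w_even, w_move = weights
    # (1) Proximity to a pitch attractor.
    attraction = sum(abs(p - attractor) for p in voicing)
    # (2) Evenness of distances between neighbouring notes, bass and melody included.
    stack = [bass, *voicing, melody]
    gaps = [hi - lo for lo, hi in zip(stack, stack[1:])]
    mean_gap = sum(gaps) / len(gaps)
    unevenness = sum(abs(g - mean_gap) for g in gaps)
    # (3) Inner voice movement from the previous chord's voicing, if there is one.
    movement = sum(abs(p - q) for p, q in zip(voicing, previous)) if previous else 0
    return w_attr * attraction + w_even * unevenness + w_move * movement


def best_voicing(pitch_classes, bass, melody, previous=None,
                 attractor=60, weights=(1.0, 1.0, 1.0)):
    """Pick the candidate voicing with the lowest aggregate cost."""
    return min(candidate_voicings(pitch_classes, bass, melody),
               key=lambda v: voicing_cost(v, bass, melody, previous, attractor, weights))


# C major chord: bass C3 (48), melody C5 (72), inner pitch classes E (4) and G (7).
inner = best_voicing([4, 7], bass=48, melody=72)
print(inner)  # → (55, 64)
```

With equal weights the sketch prefers G3 and E4, the placement that sits closest to the attractor and spaces the four voices most evenly.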

3 Blending Learned Harmonies

In the presented system, the harmony of an idiom is represented by a first-order Markov transition matrix, which includes one row and one column for each chord in the examined idiom. The value in the i-th row and the j-th column is the probability of the i-th chord being followed by the j-th; the probabilities of each row sum to one. Figure 2(a) illustrates a grayscale interpretation of the transition matrix in a set of major-mode Bach chorales. An important question is: given two input idioms as chord transition matrices, how would a blended idiom be expressed in terms of a transition matrix? The idea examined in the present system is to create an extended transition matrix that includes new transitions that allow moving across chords of the initial idioms by potentially using new chords. The examined methodology uses transition blending to create new transitions that incorporate blended characteristics for creating a smooth ‘morphing’ harmonic effect when moving from chords of one space to chords of the other. An abstract illustration of an extended matrix is given in Fig. 2(b).
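As a minimal sketch of how such a first-order transition matrix can be estimated, the following counts consecutive chord pairs in a set of chord sequences and normalises each row. The Roman-numeral labels and the two toy sequences are hypothetical stand-ins; the system itself works with GCT chords learned from annotated data.

```python
def transition_matrix(sequences):
    """Estimate first-order transition probabilities from chord sequences."""
    chords = sorted({c for seq in sequences for c in seq})
    index = {c: k for k, c in enumerate(chords)}
    counts = [[0] * len(chords) for _ in chords]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[index[a]][index[b]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Chords never followed by anything keep an all-zero row (an assumption).
        matrix.append([n / total if total else 0.0 for n in row])
    return chords, matrix


sequences = [["I", "IV", "V", "I"], ["I", "V", "I"]]
chords, P = transition_matrix(sequences)
for chord, row in zip(chords, P):
    print(chord, row)
# I [0.0, 0.5, 0.5]
# IV [0.0, 0.0, 1.0]
# V [1.0, 0.0, 0.0]
```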

Fig. 2. Graphical description of (a) the transition matrix in a set of major-mode Bach chorales and (b) an extended matrix that includes transition probabilities of two initial idioms, like the ones depicted in (a), and of several new transitions generated through transition blending.

In an extended matrix (Fig. 2(b)), when using transitions in \(\mathtt {I_i}\) only chords of the i-th idiom are used, while (blended) transitions in \(\mathtt {A_{i-j}}\) create direct jumps from chords of the i-th to chords of the j-th idiom. Transitions in \(\mathtt {B_{i-X}}\) constitute harmonic motions from a chord of idiom i to a chord newly created by blending. For moving from idiom i to idiom j using one external chord \(c_x\) that was produced by blending, a chain of two consecutive transitions is needed (\(\mathtt {B_{i-X-j}}\)): \(c_i\) \(\rightarrow \) \(c_x\) followed by a transition \(c_x\) \(\rightarrow \) \(c_j\), where \(c_i\) belongs to idiom i and \(c_j\) to idiom j. Transitions in \(\mathtt {C}\) are disregarded since they incorporate pairs of chords that both lie outside the i-th and j-th idioms.
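The block structure described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how probabilities are assigned to blended transitions, so the weights passed in `blended`, the row renormalisation step, the function name `extended_matrix` and the example chords are all hypothetical.

```python
def extended_matrix(chords1, P1, chords2, P2, new_chords, blended):
    """Assemble an extended matrix over idiom 1, idiom 2 and new blended chords.

    `blended` lists (source, target, weight) triples whose endpoints are
    labels of the form (1, chord), (2, chord) or ("X", chord).
    """
    labels = ([(1, c) for c in chords1] + [(2, c) for c in chords2]
              + [("X", c) for c in new_chords])
    pos = {label: k for k, label in enumerate(labels)}
    n = len(labels)
    M = [[0.0] * n for _ in range(n)]

    # Blocks I_1 and I_2: the transitions learned within each idiom.
    for i, row in enumerate(P1):
        for j, p in enumerate(row):
            M[i][j] = p
    offset = len(chords1)
    for i, row in enumerate(P2):
        for j, p in enumerate(row):
            M[offset + i][offset + j] = p

    # Blocks A and B: blended transitions across idioms or via new chords.
    for src, dst, weight in blended:
        if src[0] == "X" and dst[0] == "X":
            continue  # Block C (new chord to new chord) is disregarded.
        M[pos[src]][pos[dst]] = weight

    # Renormalise so each non-empty row sums to one again (an assumption).
    for row in M:
        total = sum(row)
        if total:
            row[:] = [p / total for p in row]
    return labels, M


labels, M = extended_matrix(
    ["C", "G"], [[0.5, 0.5], [1.0, 0.0]],
    ["Dm7", "G7"], [[0.0, 1.0], [0.5, 0.5]],
    ["Db7"],
    [((1, "G"), (2, "Dm7"), 0.2),        # A_{1-2}: direct jump between idioms
     ((1, "C"), ("X", "Db7"), 0.1),      # B_{1-X}: idiom 1 to a new chord
     (("X", "Db7"), (2, "G7"), 1.0),     # second half of a B_{1-X-2} chain
     (("X", "Db7"), ("X", "Db7"), 0.5)]  # falls in C, so it is skipped
)
```

Tagging every chord with its idiom keeps chords that share a label in both idioms as separate rows, which mirrors the separate \(\mathtt {I_1}\) and \(\mathtt {I_2}\) blocks of the extended matrix.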

Based on this analysis of the extended matrix, a methodology is proposed for using blends between transitions in \(\mathtt {I_1}\) and \(\mathtt {I_2}\). Thereby, transitions in \(\mathtt {I_1}\) are blended with ones in \(\mathtt {I_2}\) and a number of the best blends is stored for further investigation, creating a pool of best blends. A greater number of blends in the pool of best blends introduces a larger number of possible commuting paths in \(\mathtt {A_{i-j}}\) or in \(\mathtt {B_{i-X-j}}\). Transition blending is performed through amalgam-based conceptual blending that has already been applied to invent chord cadences [8, 28]; in this setting, cadences are considered as special cases of chord transitions—pairs of chords, where the first chord is followed by the second one—that are described by means of features such as the roots or types of the involved chords, or intervals between voice motions, among others. For more information on transition blending the reader is referred to [13].

4 Harmonisation Examples

To briefly demonstrate the effect that transition blending has on forming the extended matrix that combines two initial idioms, harmonisations of the first part of ‘Ode to Joy’ by Beethoven are illustrated in Fig. 3. Initially, one can observe that the harmonic features learned from the Bach chorales and the jazz sets (Fig. 3(a) and (b) respectively) are reflected in the harmonisations that the system produces. In the case of the Bach chorales, the most convenient (yet not so musically impressive) sequence of chords is generated, where the V-I pattern is repeated. The jazz harmonisation includes modifications of the usual ii-V-I pattern. By blending the transitions of the two initial idioms, the produced harmonisation (illustrated in Fig. 3(c)) features new structural relations, incorporating chords and chord sequences that are not usual in either of the initial idioms. However, even though the chords and chord transitions per se are unusual, the features they encompass reflect musical attributes of the initial idioms.

Fig. 3. Beethoven’s ‘Ode to Joy’ theme harmonised by the system with learned idioms and their blend.

5 Conclusions

This paper describes a melodic harmonisation system which receives as inputs a melody file and two harmonic idioms and produces a melodic harmonisation with the blended harmony of the two input idioms. To this end, diverse harmonic datasets were compiled and annotated by experts, while the harmonic description of each idiom was based on chords extracted with the General Chord Type (GCT) algorithm and on statistical learning of chord progressions, cadences and chord voicing over these data. Blending of harmonies is performed through blending chord transitions (one chord leading to another) from the input idioms using an algorithmic realisation of conceptual blending based on category theory. Chord transition blending combines features between pairs of transitions from the two input harmonic idioms, producing new transitions that potentially include chords that are new to both idioms and incorporate blended features. These new blended transitions act as connection points between the two input harmonic idioms, generating the extended idiom that is a blended harmonic superset of the two input ones.

A thorough experimental process that evaluates the usefulness of the produced harmonisations in real-world applications (e.g. when the system is used as an assistant for composers) is underway. Initial experimental results indicate that the blended melodic harmonisations are more interesting than the ones produced by using each input idiom separately. Additional experimental processes are expected to provide insights into whether the blended harmonic spaces are perceived as alterations of one of the input spaces (one-sided blends), balanced blends or radically new harmonic idioms, as well as into the role of the user-defined melody in the process of using blended or non-blended harmonising idioms.