Discrimination of interval size in short tone sequences

Toby J. W. Hill and Ian R. Summers a)

Biomedical Physics Group, School of Physics, University of Exeter, Exeter EX4 4QL, United Kingdom
(Received 2 August 2006; revised 16 November 2006; accepted 21 January 2007)
This study investigates the discrimination of small changes of interval size in short sequences of
musical tones. Major, minor and neutral thirds were varied in increments of 15 cents. The nine
subjects had varying degrees of amateur musical experience—their level of musical training was
lower than that of professional musicians. In some experiments the stimuli were presented purely
melodically and in others they were presented together with a sustained tone at a higher pitch. Some
subjects were able to make use of the additional cues from beats in the latter case. Category widths
for identification were measured at around 70 cents and just-noticeable differences in frequency
were measured at around 10 cents. Little significant variation of inter-stimulus sensitivity index d′
was observed across the stimulus sets, i.e., there was little evidence for “anchors” or “landmarks”
within the range of tunings employed. However, for major thirds, discrimination of the 15 cent
increment between 400 and 415 cents was reduced compared to discrimination of other 15 cent
increments within the stimulus sets. © 2007 Acoustical Society of America.
[DOI: 10.1121/1.2697059]
PACS number(s): 43.75.Cd, 43.75.Bc, 43.66.Hg, 43.66.Fe [DD] Pages: 2376–2383

a) Author to whom correspondence should be addressed. Electronic mail: i.r.summers@exeter.ac.uk

I. INTRODUCTION

In Western music, the intervals used within the musical scale (minor second, major second, minor
third, etc.) may be subject to small variations of tuning according to particular performance
practices (Rasch, 1983). For example, when using equal temperament (based on a logarithmic division
of the octave into 12 equal steps) the major third is intended to be 400 cents; when using a tuning
scheme with a “just” major third (i.e., a fundamental-frequency ratio of 5:4) this interval is
intended to be 386 cents. (Note: One cent is the unit obtained by logarithmic division of the
octave into 1200 equal steps.) The present study is concerned with the extent to which such small
changes in tuning are apparent to the listener.

The literature documents a large number of studies related to perception of the tuning of musical
intervals. Experiments cover measurements of just-noticeable difference in frequency for pure and
complex tones (e.g., Harris, 1952; Nelson et al., 1983; Moore and Glasberg, 1990; Sek and Moore,
1995), measurements of consonance and dissonance (e.g., Kameoka and Kuriyagawa, 1969a, 1969b; Tufts
et al., 2005), and a wide range of topics relating to the perception of tone sequences and
information transfer via such sequences (Watson et al., 1975, 1976; Deutsch, 1980; Spiegel and
Watson, 1984; Kidd and Watson, 1992; Schellenberg and Trehub, 1994; Parncutt and Cohen, 1995;
Thompson et al., 2001; Creel et al., 2004; Smith and Schmuckler, 2004).

Several previous studies have investigated perception of the tuning of melodic or harmonic
intervals presented in isolation (e.g., Burns and Ward, 1978; Hall and Hess, 1984; Vos, 1986; Burns
and Campbell, 1994). Equivalent experiments in the context of “real” music are problematical: the
inherent complexity of the musical material and the listener’s cognitive response mean that
subjects’ performance is difficult to analyze. Hence, investigators (e.g., Rasch, 1985; Vos, 1988)
have chosen to work on musical fragments or short tone sequences which provide a quasi-musical
context for measurements, while avoiding undue complexity.

Rasch (1985) used musical fragments in the form of two simultaneous tone sequences, with
experiments involving the mistuning of tones in one or both of the sequences. Melodic
(interval-width) and harmonic (beat) cues were found to contribute to the detection of mistuning.
Vos (1988) used similar musical fragments, with experiments involving rating and paired comparison
of six common intonation systems. Overall acceptability was found to relate to the “purity” of the
intervals (i.e., their closeness to just intonation), suggesting that harmonic (beat) cues were
dominant in this case.

An aim of the present study is to further explore this “middle ground” between experiments on
single intervals in isolation and experiments based on real music. Four experiments have been
carried out using short tone sequences to provide a quasi-musical context. Identification tasks
were used to determine subjects’ ability to detect small changes in tuning. There is evidence
(Burns and Ward, 1978; Burns and Campbell, 1994; Perlman and Krumhansl, 1996) that the perceptual
“landscape” contains “anchors” or “landmarks” at particular positions in the pitch range. For
example, sensitivity to small pitch changes might be less around unfamiliar intervals such as the
“neutral” third (approximately 350 cents) and greater around familiar intervals such as the major
and minor thirds (equal tempered at 400 and 300 cents and just intonation at 386 and 316 cents).
However, other investigators (Parncutt and Cohen, 1995) have not observed this effect. In the
present study, stimuli were designed to investigate the possibility of anchors or landmarks within
the range of tunings employed.
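
The cent arithmetic used in the Introduction can be made concrete with a short sketch. The Python helpers below convert between fundamental-frequency ratios and cents, using the definition of 1200 logarithmic steps per octave, and recover the approximate sizes quoted for the just thirds. The function names are illustrative and do not come from the study; the 6:5 ratio for the just minor third is standard music theory rather than something stated in the text.

```python
import math

def ratio_to_cents(ratio: float) -> float:
    """Interval size in cents for a given fundamental-frequency ratio."""
    return 1200.0 * math.log2(ratio)

def cents_to_ratio(cents: float) -> float:
    """Fundamental-frequency ratio for an interval given in cents."""
    return 2.0 ** (cents / 1200.0)

just_major_third = ratio_to_cents(5 / 4)   # close to the 386 cents quoted in the text
just_minor_third = ratio_to_cents(6 / 5)   # close to the 316 cents quoted in the text
equal_major_third = 4 * 100.0              # four equal-tempered semitones of 100 cents

print(f"just major third:  {just_major_third:.1f} cents")
print(f"just minor third:  {just_minor_third:.1f} cents")
print(f"equal major third: {equal_major_third:.0f} cents")
```

On this scale the equal-tempered and just major thirds differ by roughly 14 cents, which is comparable to the 15 cent increment used in the experiments.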

II. METHOD

A. Overview

This study includes four related experiments. In each experiment subjects were presented with
short sequences of musical tones, each sequence being identical except for small changes in the
tuning of the component tones. The task was to identify the tuning system used for each sequence:
a choice from five alternatives. As indicated above, the experimental design is intended to provide
a quasi-musical context within which specific psychophysical measurements may be made. It is not
the intention to measure pre-learned notions of “good” or “bad” intonation, but rather to see
whether particular intonations are perceptually more distinct (perhaps as a result of being good or
bad).

The tone sequences are illustrated in Fig. 1. For Experiment 1 each sequence was of three tones,
rising and then falling by an interval of a third. For Experiment 2 these three-note sequences were
accompanied by a sustained tone at a higher pitch (at an interval of a perfect fifth above the
start note of the sequence). For Experiment 3 each sequence was of seven tones, rising and falling
in a scale-like manner. For Experiment 4 these scale-like sequences were accompanied by a sustained
tone at a higher pitch (again, at an interval of a perfect fifth above the start note of the
sequence). The double-headed arrows in Fig. 1 indicate the tones whose pitch was varied to achieve
the different tuning systems: the 2nd tone within each sequence of three in Experiments 1 and 2,
and the 2nd, 3rd, 5th and 6th tones within each sequence of seven in Experiments 3 and 4.
Experiments 3 and 4 were intended to provide a more complex task than Experiments 1 and 2, but with
additional cues. An important issue in relation to experiments on intonation is the relative
importance of melodic (or interval-width) and harmonic (or beat) cues in the perception of the
various stimuli. The intention in the present study was to provide melodic cues to pitch changes in
Experiments 1 and 3 and both harmonic and melodic cues in Experiments 2 and 4.

FIG. 1. Schematic pitch-time diagrams for the tone sequences employed in the four experiments: (a)
Experiment 1, melodic third; (b) Experiment 2, harmonized third; (c) Experiment 3, melodic
scale-like sequence; (d) Experiment 4, harmonized scale-like sequence. The double-headed arrows
indicate tones whose pitch was varied.

Details of the stimulus tunings are given in Tables I and II. The experimental pitches were
designed in sets of five based around the equal-tempered major third (400 cents), the
equal-tempered minor third (300 cents) and the neutral third (350 cents). Preliminary pilot studies
indicated that increments of 15 cents would not only yield meaningful results but would also allow
close approximations of the just-intonation major and minor thirds (386 and 316 cents,
respectively) to be incorporated. In the tone sequences for Experiments 3 and 4 the tones at a
major second above the start note (i.e., tones 2 and 6 in the sequence; see Fig. 1) were tuned so
as to bisect the interval of the major third above the start note (see Table II), following the
scheme adopted in many tuning systems by which a major third contains two equal-size major seconds
(Rasch, 1983).
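
As a rough illustration of how the stimulus sets described above are constructed, the following Python sketch builds five tunings in 15 cent steps around a central third and, for the scale-like sequences of Experiments 3 and 4, places the major second at half the size of the third. The function names are hypothetical, and the code only restates the scheme given in the text and in Tables I and II; it is not the authors' stimulus software.

```python
STEP_CENTS = 15  # increment between neighboring stimuli, as stated in the text

def third_set(center_cents, n_stimuli=5, step=STEP_CENTS):
    """Return n_stimuli third sizes (cents), widest first, centered on center_cents."""
    half = n_stimuli // 2
    return [center_cents + step * k for k in range(half, -half - 1, -1)]

def scale_tuning(third_cents, fourth_cents=500):
    """Tones 1 to 7 of the rising and falling scale-like sequence (cents re start note).

    The major second bisects the major third, so it is half the size of the third.
    The fourth tone is fixed at 500 cents, following Table II.
    """
    second = third_cents / 2
    rising = [0, second, third_cents, fourth_cents]
    return rising + rising[-2::-1]  # mirror back down to the start note

major = third_set(400)    # centered on the equal-tempered major third
neutral = third_set(350)  # centered on the neutral third
minor = third_set(300)    # centered on the equal-tempered minor third

for label, third in zip("abcde", major):
    print(label, scale_tuning(third))
```

With these sets, the 385 and 315 cent stimuli fall within about 1.5 cents of the just-intonation thirds, which is the approximation the pilot studies were said to allow.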

TABLE I. Stimulus tunings used in Experiments 1 and 2. Each variant of the test involves five
stimuli. Stimuli are presented melodically in Experiment 1 and accompanied by a higher-pitch tone
in Experiment 2.

Test variant   Stimulus   Tone 1 in       Tone 2 in       Tone 3 in       Higher pitch in
               label      sequence        sequence        sequence        Experiment 2
                          (cents re A3)   (cents re A3)   (cents re A3)   (cents re A3)

Major 3rd      a          0               430             0               700
               b          0               415             0               700
               c          0               400             0               700
               d          0               385             0               700
               e          0               370             0               700
Neutral 3rd    ℓ          0               380             0               700
               m          0               365             0               700
               n          0               350             0               700
               o          0               335             0               700
               p          0               320             0               700
Minor 3rd      v          0               330             0               700
               w          0               315             0               700
               x          0               300             0               700
               y          0               285             0               700
               z          0               270             0               700

TABLE II. Stimulus tunings used in Experiments 3 and 4. Each test involves five stimuli. Stimuli
are presented melodically in Experiment 3 and accompanied by a higher-pitch tone in Experiment 4.

Stimulus   Tones 1, 7      Tones 2, 6      Tones 3, 5      Tone 4          Higher pitch in
label      in sequence     in sequence     in sequence     in sequence     Experiment 4
           (cents re A3)   (cents re A3)   (cents re A3)   (cents re A3)   (cents re A3)

a          0               215             430             500             700
b          0               207.5           415             500             700
c          0               200             400             500             700
d          0               192.5           385             500             700
e          0               185             370             500             700
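
The tunings in Table I, together with the sustained tone at 700 cents, determine the beat rates that the later discussion of task dimensionality quotes for Experiments 2 and 4. The Python sketch below reproduces those quoted values to within rounding. It rests on one value given in the text (A3 at 220 Hz) and on one assumption inferred here rather than stated by the authors: that the dominant beat arises between the lowest pair of nearly coinciding harmonics, i.e., the 6th harmonic of the varied tone against the 5th of the sustained tone for the major-third set, and the 5th against the 4th for the minor-third set. All names are illustrative.

```python
A3_HZ = 220.0          # fundamental of the start note, as given in the text
SUSTAINED_CENTS = 700  # higher-pitch tone in Experiments 2 and 4 (Table I)

def cents_to_hz(cents, ref_hz=A3_HZ):
    """Fundamental frequency of a tone lying `cents` above the reference pitch."""
    return ref_hz * 2.0 ** (cents / 1200.0)

def beat_hz(tone_cents, tone_harmonic, sustained_harmonic):
    """Signed beat rate between one harmonic of the varied tone and one of the sustained tone."""
    return (tone_harmonic * cents_to_hz(tone_cents)
            - sustained_harmonic * cents_to_hz(SUSTAINED_CENTS))

# Assumed harmonic pairings: 6 against 5 for the major-third set,
# 5 against 4 for the minor-third set.
major = {lab: beat_hz(c, 6, 5) for lab, c in zip("abcde", [430, 415, 400, 385, 370])}
minor = {lab: beat_hz(c, 5, 4) for lab, c in zip("vwxyz", [330, 315, 300, 285, 270])}

for name, table in (("major", major), ("minor", minor)):
    print(name, {lab: round(hz, 1) for lab, hz in table.items()})
```

Because only the magnitude of a beat rate is audible, stimuli on either side of the beat-free tuning (c and e in the major-third set, v and x in the minor-third set) end up with similar beat cues, which is the point the text makes about these pairs.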

The waveforms used for the tones in these experiments were designed to have an unambiguous pitch,
to sound “musical” but not like any instrument in particular, and to be of a duration (400 ms)
representative of “real” music. The pitch on which the intervals were constructed was A3, with a
fundamental frequency of 220 Hz. This pitch was found to be relatively comfortable to listen to
repetitively and is in the midrange of pitches used melodically in music. Moreover, it has been
suggested that the perception of tones at this pitch is representative of perception over a wide
range of other musical pitches (Vos and van Vianen, 1985). Each tone comprised ten frequency
components, the fundamental and the next nine harmonics, with a spectral amplitude envelope falling
at 6 dB/octave [equivalent to 1/(harmonic number)]. The tones had rise and release times of 40 ms,
giving them a bland, organ-like timbre. Stimuli were specified in software, using a 40 kHz sample
rate.

B. Apparatus

The experiments were conducted in a room designed for audiological purposes, with low
reverberation. The subjects were tested singly and were seated in the middle of the room in front
of a computer monitor and keyboard on which they received instructions and entered their responses.
The experiments were self-paced by the subject. The stimuli were presented to the subject via two
loudspeakers, which carried identical signals, positioned on either side of the monitor. The sound
level at the subject’s head was measured to be 62 dB(A) for the melodic stimuli and 64 dB(A) for
the harmonized stimuli. This represents a comfortable listening level.

C. Subjects

Subjects were unpaid volunteers, predominantly graduate students in physics, with varying degrees
of amateur musical experience. Their level of musical training was lower than that of professional
musicians. The same nine subjects, eight male and one female, participated in each of the four
experiments. Subjects’ ages ranged from 20 to 45. None of the subjects reported any hearing
impairment, and none possessed the ability of absolute pitch determination.

D. Paradigm and protocol

A single-interval, five-alternative, forced-choice identification paradigm was employed in each of
the four experiments. Subjects were asked to identify the tuning system employed for each of a
series of melodic fragments, using a labeling system (see Tables I and II) which was demonstrated
at the start of the experiment. The experimental protocol comprised a sequence of three sections
that was repeated within each test session. First, in a demonstration block, subjects were
presented with the stimuli and their identifying labels (e.g., ℓ, m, n, o, p; each variant of the
test involved five stimuli). Subjects were able to repeat this demonstration block on request.
Second, subjects received a training block in which they identified stimuli and received
trial-by-trial feedback and an overall score; the initial training block in a session comprised 25
items and all subsequent training blocks comprised ten items. At the end of a training block
subjects were offered further demonstration or training or they could proceed to the third section
of the protocol: the test blocks. Each test block comprised 25 stimuli without feedback, presented
in a balanced, pseudo-random order in which each stimulus occurred five times. At the end of each
test block subjects were informed of their performance, with a breakdown of score for each of the
five stimuli. This protocol, which is similar to that reported by Mori and Ward (1995), was
intended to assist subjects in self-motivation (by being aware of their performance) and in
maintaining their decision criteria and concentration levels. The modular structure also
facilitated the statistical assessment of learning effects. It was not the intention to intensively
train subjects in order to measure their optimal performance; rather the intention was to overcome
any asymptotic tendencies in the response function due to subjects gaining familiarity with the
task and stimuli, in order to measure their latent ability to do the task.

Computer simulations (Hill, 2000) of absolute identification tasks suggested that a minimum of 40
trials per stimulus category (i.e., 200 trials in the case of five stimulus categories) is required
per subject in order to satisfactorily extract values for information transfer and sensitivity
index from individual confusion matrices (see following sections). Consequently, nine test blocks
(each of 25 trials) were used for each variant (major third, neutral third and minor third) of
Experiments 1 and 2. In each case this gave an experimental session lasting typically between 45
and 60 min. Pilot studies suggested that longer sessions might produce problems with subjects
maintaining their attention. Therefore, for the longer, more complex stimuli of Experiments 3 and 4
(see Fig. 1) it was decided to use two test sessions of five blocks in order to keep the average
session time below 60 min. Subjects were free to pause or take a short break in

between test blocks and most subjects opted to take one break halfway through a session. In
Experiments 1 and 2 the order in which subjects completed major-, neutral- and minor-third tests
was permutated among the subjects to balance learning effects.

E. Data analyses

Individual-subject data for each identification task were analyzed to give values for information
transfer IT over the stimulus set and sensitivity index d′ between stimuli within the set. These
quantities indicate the extent to which the stimulus categories are perceptually distinct. The
full-range sensitivity index D′ (i.e., cumulative d′ across the full stimulus range) is closely
related to IT, since both D′ and IT indicate the number of discriminable categories within the
stimulus range (Braida and Durlach, 1972).

1. Calculation of IT

For an identification task with s stimulus categories, s response categories and n test items,
experimental data may be represented by an s × s confusion matrix with n entries. In this case,
s = 5 and n = 225 or 250 (for a single subject). Information transfer to the subject was calculated
from the confusion matrix according to the formula given by Miller and Nicely (1955), incorporating
the correction suggested by Miller (1955) for the case when n is not large.

2. Calculation of d′ and D′

(Implicit in the discussion which follows is the assumption that the identification task is one
dimensional, i.e., that the various stimuli within the stimulus set are ranged along a single
perceptual dimension. This assumption is necessary in order for d′ analysis of confusion matrices
to be tractable. The experimental results provide evidence that the tasks in the present study are,
to some extent, multidimensional (see below for discussion of this) but do not suggest that the
one-dimensional assumption represents a major distortion of subjects’ strategies.)

Braida and Durlach (1972) proposed a method by which d′ values can be calculated from a confusion
matrix. This method works well for well-populated matrices, i.e., when n is very large. However,
there are numerical problems for smaller values of n. For this reason a similar but more robust
method (Hill, 2000), which is better able to cope with smaller n, has been used in the present
study. In summary, this method pools all “high” errors and all “low” errors for a given stimulus
category in order to produce more reliable estimates of inter-stimulus d′s.

In the present study, d′ values are calculated from individual-subject confusion matrices. In a
very few instances a negative d′ is obtained between neighboring stimulus categories because of a
subject’s anomalous response pattern, in which cases the negative value has been set to zero. For
some of the higher scoring subjects, a given stimulus category may produce a zero error count, in
which cases an infinite inter-stimulus d′ is suggested. However, this is a consequence of the
quantized nature of the subjects’ response patterns; a more realistic d′ estimate is obtained by
attributing a zero error count to a “true” count in the range 0–0.5 and hence calculating d′ from
an error count of 0.25.

3. Calculation of just-noticeable differences and category widths

A just-noticeable difference JND can be defined as the stimulus change which corresponds to d′ = 1.
For a particular set of experimental data, a JND value (in cents) can thus be calculated as the
inverse slope of a cumulative plot of d′ vs stimulus separation (in cents) or, equivalently, as the
quotient R / D′ of the stimulus range R and the full-range sensitivity index D′. (R = 60 cents for
the experiments in this study.)

The IT calculated for a particular stimulus set is conventionally interpreted as indicating the
number of stimulus categories 2^IT within the stimulus range R. The category width r, i.e., the
separation (in cents) required for two stimuli to be reliably categorized as different, may thus be
calculated using the formula r = R / (2^IT − 1).

III. RESULTS

TABLE III. Results from Experiment 2: pooled confusion matrix for the major-third variant of the
test.

Stimulus   Response
           R1 (a)   R2 (b)   R3 (c)   R4 (d)   R5 (e)

S1 (a)     306      86       10       3
S2 (b)     63       226      85       26       5
S3 (c)     17       83       264      37       4
S4 (d)     3        21       98       252      31
S5 (e)              4        7        76       318

FIG. 2. Sensitivity indices d′ for nearest-neighbor stimulus pairs: data from Experiments 1 and 2,
averaged over nine subjects; error bars show the standard error. The stimulus pairs are labeled
according to the mean of the principal thirds, e.g., a : b as 422.5 cents, y : z as 277.5 cents.

FIG. 3. Sensitivity indices d′ for nearest-neighbor stimulus pairs: data from Experiments 3 and 4,
averaged over nine subjects; error bars show the standard error. Corresponding data from
Experiments 1 and 2 are shown for comparison. The stimulus pairs are labeled according to the mean
of the principal thirds, e.g., a : b as 422.5 cents, d : e as 377.5 cents.

Data are available for eight identification tasks: for the various major thirds in Experiments 1,
2, 3 and 4, for the various neutral thirds in Experiments 1 and 2, and for the various minor thirds
in Experiments 1 and 2. All eight tasks produce broadly similar confusion matrices. An example is
given in Table III, which shows data pooled over subjects for identification of major thirds in
Experiment 2. It can be seen that the degree of difficulty of the task produces an acceptable
distribution of errors. Mean values of d′ for discrimination within the various stimulus sets
(averaged over single-subject values calculated from individual confusion matrices) are given in
Figs. 2 and 3. These figures show mean d′ for nearest neighbor stimuli (a : b, b : c, c : d, etc.),
i.e., the perceptual distance corresponding to a 15 cent increment, as it varies with the position
of the increment within the overall pitch range of the stimulus set. Cumulative d′
values (corresponding to larger pitch increments) may be calculated from these data, if required.
Note that in Experiments 3 and 4, in addition to cues available from pitch increments in the
principal third, cues are also available from pitch increments in the tones at a major second above
the start note. Hence, in these experiments the experimental results relate to discrimination of
changes in the overall tuning system, although for convenience these changes are labeled according
to pitch changes in the principal third.

Table IV shows mean values of IT (averaged over single-subject values calculated from individual
confusion matrices) and mean values for full-range sensitivity index D′ (averaged over
single-subject values calculated from slopes of individual cumulative d′ plots). Table IV also
gives values for category width r (calculated from IT), for JND (calculated from D′) and for the
quotient r / JND. The quoted uncertainties are standard errors.

TABLE IV. Summary results from Experiments 1, 2, 3 and 4: mean values for information transfer IT,
category width r, full-range sensitivity index D′, just-noticeable difference JND and the quotient
r / JND; the quoted uncertainties are standard errors.

Experiment   IT (bits)        r (cents)   D′          JND (cents)   r / JND

1 major      1.020 ± 0.065    58 ± 5      6.3 ± 0.5   9.5 ± 0.8     6.1
1 neutral    0.897 ± 0.055    70 ± 6      5.3 ± 0.4   11.3 ± 0.8    6.2
1 minor      0.900 ± 0.045    69 ± 5      5.5 ± 0.4   10.9 ± 0.8    6.3
2 major      1.153 ± 0.132    49 ± 8      8.0 ± 1.1   7.5 ± 1.1     6.5
2 neutral    1.132 ± 0.112    50 ± 7      7.5 ± 0.9   8.0 ± 1.0     6.3
2 minor      1.153 ± 0.146    49 ± 9      7.9 ± 1.2   7.6 ± 1.2     6.4
3 major      0.698 ± 0.116    96 ± 21     4.9 ± 0.7   12.2 ± 1.8    7.9
4 major      0.896 ± 0.180    70 ± 20     6.8 ± 1.3   8.8 ± 1.7     8.0

A. Results from Experiment 1: Three-tone melodic sequence

Analysis of variance (ANOVA) on the overall scores for each test block, between and within each
test session, showed no significant learning effects. The variation of inter-stimulus d′ across the
stimulus set is shown in Fig. 2. Differences between the three variants of the experiment (major,
neutral and minor thirds) were found not to be significant at the p = 0.05 level [one-way ANOVA on
data pooled over each class of third; three levels of “interval” variable; F(2, 16) = 3.26,
p = 0.065]. Differences within the three variants of the experiment were investigated using three
separate one-way ANOVAs (four levels of “subinterval” variable); a significant effect was found at
the p = 0.05 level in the neutral-third case [F(3, 24) = 6.17, p = 0.003] but not in the
major-third case [F(3, 24) = 1.65, p = 0.20] or the minor-third case [F(3, 24) = 0.99, p = 0.41].
(Note: No attempt was made to adjust the threshold on the test statistic to take account of
multiple implementations of the ANOVA.)

Inter-stimulus d′ values have a mean of around 1.5 for the pitch increments of 15 cents. Hence the
values of full-range sensitivity index D′ (for 60 cent range) are around 6.0, and the JND values
(for d′ = 1) are around 10 cents (Table IV). The mean values for IT are around 1 bit, equivalent to
perfect transmission of around two categories. Hence the category width r is approximately equal to
the stimulus range R, i.e., 60 cents.

B. Results from Experiment 2: Three-tone harmonized sequence

As with Experiment 1, an ANOVA found no significant learning effects over the course of the
experiment. The variation of inter-stimulus d′ across the stimulus set is shown in Fig. 2.
Differences between the three variants of the experiment (major, neutral and minor thirds) were
found not to be significant at the p = 0.05 level [one-way ANOVA on data pooled over each class of
third; three levels of interval variable; F(2, 16) = 0.35, p = 0.71]. Differences within the three
variants of the experiment were investigated using three separate one-way ANOVAs (four levels of
“subinterval” variable); a significant effect was found at the p = 0.05 level in the major-third
case [F(3, 24) = 4.34, p = 0.014] but not in the neutral-third case [F(3, 24) = 0.31, p = 0.82] or
the minor-third case [F(3, 24) = 0.15, p = 0.93]. (Note: No attempt was made to adjust the
threshold on the test statistic to take account of multiple implementations of the ANOVA.)

Inter-stimulus d′ values have a mean of around 2.0 for the pitch increments of 15 cents, higher
than in Experiment 1, indicating the effect of the extra cues provided by the additional sustained
tone. Hence the values of full-range sensitivity index D′ (for 60 cent range) are around 8.0, and
the JND values (for d′ = 1) are around 7.5 cents, the latter correspondingly lower than in
Experiment 1 (Table IV). A two-way ANOVA on the complete data sets for d′ from Experiments 1 and 2
(12 levels of “interval/subinterval” variable; two levels of “experiment” variable; data summarized
in Fig. 2) indicates that the additional tone in Experiment 2 provides a significant overall
benefit, compared to Experiment 1 [F(1, 8) = 6.55, p = 0.034]. The error bars in Fig. 2 indicate
that, compared to Experiment 1, there is greater variation in performance over the subject group.
Some subjects find the additional tone of little benefit, whereas other subjects make good use of
the additional information. The mean values for IT are just over 1 bit, equivalent to perfect
transmission of just over two categories. The category width r is consequently slightly less than
the stimulus range R and is calculated to be around 50 cents.

C. Results from Experiment 3: Melodic scale-like sequence

A two-tailed t test and nonparametric runs test indicate no significant learning effects in
Experiment 3, either in each experimental session or over the course of the whole experiment. The
variation of inter-stimulus d′ across the stimulus set is shown in Fig. 3. Differences across the
stimulus range were investigated using a one-way ANOVA (four levels of subinterval variable) and a
significant effect was found [F(3, 24) = 15.44, p < 0.001]. Inter-stimulus d′ values have a mean of
around 1.2 for the pitch increments of 15 cents, lower than in Experiment 1, reflecting the
additional complexity of the task. The value of full-range sensitivity index D′ (for 60 cent range)
is 4.9 ± 0.7, and the JND is 12.2 ± 1.8 cents, the latter correspondingly higher than in
Experiment 1 (Table IV). The mean value for IT is approximately 0.7 bits. The category width r is
consequently greater than the stimulus range R and is calculated to be around 100 cents.

D. Results from Experiment 4: Harmonized scale-like sequence

The results from Experiment 4 similarly exhibit no learning effects when examined by a two-tailed
t test and a nonparametric runs test. The variation of inter-stimulus d′ across the stimulus set is
shown in Fig. 3. Differences across the stimulus range were investigated using a one-way ANOVA
(four levels of subinterval variable) and, as for Experiment 3, a significant effect was found
[F(3, 24) = 12.11, p < 0.001]. Inter-stimulus d′ values have a mean of around 1.6 for the pitch
increments of 15 cents, lower than in Experiment 2 (reflecting the additional complexity of the
task), but higher than in Experiment 3 (indicating the effect of the extra cues provided by the
additional sustained tone). The value of full-range sensitivity index D′ (for 60 cent range) is
6.8 ± 1.3, and the JND is 8.8 ± 1.7 cents, the latter correspondingly higher than in Experiment 2
and lower than in Experiment 3 (Table IV). A two-way ANOVA on the data sets for d′ from Experiments
3 and 4 (four levels of subinterval variable; two levels of experiment variable; data summarized in
Fig. 3) indicates that the benefit provided by the additional tone in Experiment 4, compared to
Experiment 3, is (just) not significant at the p = 0.05 level [F(1, 8) = 4.53, p = 0.066]. The
error bars in Fig. 3 again indicate a significant variation in performance over the subject group,
as in Experiment 2. The mean value for IT is approximately 0.9 bits. The category width r is
consequently slightly greater than the stimulus range R and is calculated to be around 70 cents.

E. Dimensionality of the task

As mentioned above, the method of d′ analysis used here, similar to that reported by Braida and
Durlach (1972), is technically only valid for stimuli that vary along a single dimension. In
practice, the addition of the upper tone in Experiments 2 and 4 (see Fig. 1) may introduce a second
dimension relating to beat frequencies.

For the small pitch changes used in these experiments, the beat frequency (in Hz) varies linearly
with the pitch increment (in cents), to a good approximation. However, because the listener cannot
distinguish between positive and negative beat frequencies, the perceived beat frequency has a less
simple relation to the pitch increment. For example, for the major thirds in Experiments 2 and 4,
the dominant beat frequencies are as follows: a, 44.0 Hz; b, 29.4 Hz; c, 14.9 Hz; d, 0.6 Hz; e,
(–)13.6 Hz. Stimulus d is effectively beat free and stimuli c and e are effectively identical with
respect to beat frequency. Similarly, for the minor thirds, the dominant beat frequencies are as
follows: v, 12.5 Hz; w, 1.0 Hz; x, (–)10.4 Hz; y, (–)21.7 Hz; z, (–)32.9 Hz. Stimulus w is
effectively beat free and stimuli v and x are effectively identical with respect to beat frequency.
[For the neutral thirds the beat frequencies relating to mistuning from the

major third are ℓ, (−)4.2 Hz; m, (−)18.3 Hz; n, (−)32.4 Hz; o, (−)46.3 Hz; p, (−)60.1 Hz. The beat frequencies relating to mistuning from the minor third are ℓ, 51.5 Hz; m, 39.7 Hz; n, 27.9 Hz; o, 16.3 Hz; p, 4.8 Hz.]

According to Braida and Durlach’s analysis, in the one-dimensional case the quotient r/JND of the category width r and the JND (d′ = 1) should be approximately 5. The data in the final column of Table IV show this quotient to be typically 6 or more in the present study, perhaps suggesting that the one-dimensional assumption represents a distortion of the true situation but not a major distortion. Further evidence is provided by error patterns within the pooled subject data (see, for example, Table III, which relates to Experiment 2), which are generally as expected for the one-dimensional case.

In Experiments 2 and 4 there is some direct evidence that the higher scoring subjects are making use of beat cues, i.e., for those subjects the stimulus set is not one dimensional. For example, for stimulus e in Experiment 2, some subjects produce more erroneous identifications as stimulus c than as the (nearest neighbor) stimulus d. This may be explained on the basis (see above) that beats give little or no information to distinguish stimulus e (370 cents) and stimulus c (400 cents) from each other, but provide a significant cue to distinguish these two stimuli from stimulus d (385 cents).

The effect of beat cues in Experiment 4 is complex. One of the higher scoring subjects commented that stimulus d (see Table II) could be reliably identified by the lack of beats on the 3rd and 5th tones in the sequence (whose relation to the sustained tone at 700 cents is very close to a beat-free 316 cents), whereas stimulus c could be reliably identified by the lack of beats on the 2nd and 6th tones in the sequence (whose relation to the sustained tone at 700 cents is very close to a beat-free 498 cents). The task for that subject was then to distinguish the other three stimuli, which provided no obvious beat cues.

F. General discussion of Experiments 1, 2, 3 and 4

The addition of the continuous upper tone in Experiments 2 and 4 enhances subjects’ mean performance compared to Experiments 1 and 3, respectively, giving higher values for the full-range sensitivity index D′, and lowering JNDs by 2 or 3 cents (Table IV). Comparison of mean data for major thirds in Experiment 1 and Experiment 2 with those of Experiment 3 and Experiment 4 shows that the addition of extra melodic content leads to lower values for the full-range sensitivity index D′ and an increase of 2 or 3 cents in JND, reflecting the greater complexity of the task. However, this reduction of performance is not seen for the better subjects: they maintain a similar performance, presumably because the increased complexity of the task is balanced by the additional cues available within the more complex stimuli.

As mentioned above, a significant variation of inter-stimulus d′ might be expected across the stimulus range; for example, sensitivity might be less around unfamiliar intervals, such as the neutral third (approximately 350 cents) and greater around familiar intervals, such as the major and minor thirds (equal tempered at 400 and 300 cents and just intonation at 386 and 316 cents). However, in the results from Experiments 1 and 2, the perceptual landscape appears generally featureless: Fig. 2 shows that within each experiment there is little variation of inter-stimulus (nearest neighbor) d′ across the stimulus range. This is true for small movements within the stimulus range (as shown by the individual lines in Fig. 2) and also for larger movements within the stimulus range (i.e., moving between major, neutral and minor thirds in the data shown in Fig. 2). The grossly mistuned intervals (365 and 335 cents) and the “halfway” neutral third (350 cents) are not distinguished much better or worse than the stimuli representing “in tune” musical intervals (equal-tempered thirds at 400 and 300 cents and just-intonation thirds at 386 and 316 cents). The statistically significant variation for neutral thirds in Experiment 1 is not observed in Experiment 2, and may thus be viewed with some uncertainty. The statistically significant variation for major thirds in Experiment 2 is not observed in Experiment 1; however, a similar significant variation is also observed in Experiments 3 and 4, and hence, on the basis of these results for major thirds, it may be concluded that the perceptual landscape is not entirely featureless. In fact, results from all four experiments suggest that the b : c discrimination between major thirds of 415 and 400 cents is reduced compared to the other pairs of nearest-neighbor major thirds, as indicated by the dip in all four lines in Fig. 3. (As an alternative statistical treatment, a simple binomial analysis suggests a probability of less than 3% for chance observation that one discrimination from four is consistently easier or harder than the rest over the four experiments; hence we may conclude that, for this particular subject group, the b : c discrimination is significantly different from the rest.) It is not obvious why perceptual cues should be weaker for this particular discrimination. The variation across the range of major-third stimuli (Fig. 3) is more marked in Experiments 3 and 4 (where, in principle, there are cues from the tuning of the major seconds as well as from the major thirds) than in Experiments 1 and 2 (where cues are only available from the major thirds).

IV. CONCLUSIONS

In the quasi-musical context of these experiments, category widths for identification were measured at around 70 cents and JNDs in frequency for the complex tones were measured at around 10 cents. (As expected, the addition of the continuous upper tone in Experiments 2 and 4 produced an enhancement of subjects’ mean performance compared to Experiments 1 and 3, apparently due to the availability of beat cues.) The measured values for category width r and JND compare well with the semitone (around 100 cents), which forms the basis of conventional musical practice, and the comma (around 20 cents), which distinguishes intervals such as just intonation and Pythagorean major thirds. The measured category width is less than a semitone, indicating that intervals of a semitone or more will be reliably identified; the measured JND is less than a comma, suggesting that
differences between intervals such as just intonation and Pythagorean major thirds are likely to be apparent to the listener. The measured values for category width and JND also compare quite well with the findings of Parncutt and Cohen (1995): falloff in performance for a melodic identification task was observed when the melody was constructed with step sizes less than 40 cents, and close-to-chance performance was observed with a step size of 10 cents.

The sensitivity index d′ for a 15 cent increment was typically calculated in the range 1.0–2.0. No significant difference in response was found between discrimination of major, minor or neutral thirds, although sensitivity was slightly higher for major thirds. Similarly, little significant difference in response was found between intervals within each of the major, minor or neutral stimulus sets (although, for major thirds, discrimination of the 15 cent increment between 400 and 415 cents was reduced compared to discrimination of other 15 cent increments within the stimulus sets). This is in line with the findings of Parncutt and Cohen (1995), whose results for a melodic identification task with various step sizes in the range 25–133 cents showed no peak in performance at the “in tune” step size of 100 cents. However, results for the present study do not correspond with the findings of some previous studies (Burns and Ward, 1978; Burns and Campbell, 1994) which indicated nonuniformity across the stimulus range. In the latter study, the average sensitivity index d′ for a 25 cent increment was found to be around 1.0, but with periodic minima in d′ across the range of stimulus tunings (for intervals at multiples of 100 cents). It should be noted that, in that study, the response categories (with 50 cent increments) did not match the stimulus categories. This hinders comparison with the present study, but it may be conjectured that the relatively constant d′ values observed in the present study may be a consequence of the experimental paradigm, which may have encouraged subjects to disregard their previous musical experience. However, the lack of evidence for anchors or landmarks does not necessarily imply that these have no effect on perception in the context of these experiments; it may simply be that they stretch and compress perceptual space so as to produce little effect overall.

ACKNOWLEDGMENTS

The authors thank the participants for their generosity. This work was supported by the UK Engineering and Physical Sciences Research Council. Comments by an anonymous reviewer were particularly helpful.

Braida, L. D., and Durlach, N. I. (1972). “Intensity perception: II. Resolution in one-interval paradigms,” J. Acoust. Soc. Am. 51, 483–502.
Burns, E. M., and Campbell, S. L. (1994). “Frequency and frequency-ratio resolution by possessors of absolute and relative pitch – examples of categorical perception,” J. Acoust. Soc. Am. 96, 2704–2719.
Burns, E. M., and Ward, W. D. (1978). “Categorical perception – phenomenon or epiphenomenon: Evidence from experiments in the perception of melodic musical intervals,” J. Acoust. Soc. Am. 63, 456–468.
Creel, S. C., Newport, E. L., and Aslin, R. N. (2004). “Distant melodies: Statistical learning of nonadjacent dependencies in tone sequences,” J. Exp. Psychol. Learn. Mem. Cogn. 30, 1119–1130.
Deutsch, D. (1980). “The processing of structured and unstructured tonal sequences,” Percept. Psychophys. 28, 381–389.
Hall, D. E., and Hess, J. T. (1984). “Perception of musical interval tuning,” Music Percept. 2, 166–195.
Harris, J. D. (1952). “Pitch discrimination,” J. Acoust. Soc. Am. 24, 750–755.
Hill, T. J. W. (2000). “Experiments on the perception of pitch increments in simple tone sequences,” Unpublished Ph.D. thesis, University of Exeter, Exeter, UK.
Kameoka, A., and Kuriyagawa, M. (1969a). “Consonance theory. I. Consonance of dyads,” J. Acoust. Soc. Am. 45, 1451–1459.
Kameoka, A., and Kuriyagawa, M. (1969b). “Consonance theory. II. Consonance of complex tones and its calculation method,” J. Acoust. Soc. Am. 45, 1460–1469.
Kidd, G. R., and Watson, C. S. (1992). “The ‘proportion-of-the-total duration (PTD) rule’ for the discrimination of auditory patterns,” J. Acoust. Soc. Am. 92, 3109–3118.
Miller, G. A. (1955). “Note on the bias of information estimates,” in Information Theory in Psychology: Problems and Methods, edited by H. Quastler (Free Press, Glencoe, IL), pp. 95–100.
Miller, G. A., and Nicely, P. E. (1955). “An analysis of perceptual confusions among English consonants,” J. Acoust. Soc. Am. 27, 338–352.
Moore, B. C. J., and Glasberg, B. R. (1990). “Frequency discrimination of complex tones with overlapping and non-overlapping harmonics,” J. Acoust. Soc. Am. 87, 2163–2177.
Mori, S., and Ward, L. M. (1995). “Pure feedback effects in absolute identification,” Percept. Psychophys. 57, 1065–1079.
Nelson, D. A., Stanton, M. E., and Freyman, R. L. (1983). “A general equation describing frequency discrimination as a function of frequency and sensation level,” J. Acoust. Soc. Am. 73, 2117–2123.
Parncutt, R., and Cohen, A. J. (1995). “Identification of microtonal melodies – effects of scale-step size, serial order, and training,” Percept. Psychophys. 57, 835–846.
Perlman, M., and Krumhansl, C. L. (1996). “An experimental study of internal interval standards in Japanese and Western musicians,” Music Percept. 14, 95–116.
Rasch, R. A. (1983). “Description of regular twelve-tone musical tunings,” J. Acoust. Soc. Am. 73, 1023–1035.
Rasch, R. A. (1985). “Perception of melodic and harmonic intonation of two-part musical fragments,” Music Percept. 2, 441–458.
Schellenberg, E. G., and Trehub, S. E. (1994). “Frequency ratios and the discrimination of pure-tone sequences,” Percept. Psychophys. 56, 472–478.
Sek, A., and Moore, B. C. J. (1995). “Frequency discrimination as a function of frequency, measured in several ways,” J. Acoust. Soc. Am. 97, 2479–2486.
Smith, N. A., and Schmuckler, M. A. (2004). “The perception of tonal structure through the differentiation and organization of pitches,” J. Exp. Psychol. Hum. Percept. Perform. 30, 268–286.
Spiegel, M. F., and Watson, C. S. (1984). “Performance on frequency-discrimination tasks by musicians and nonmusicians,” J. Acoust. Soc. Am. 76, 1690–1695.
Thompson, W. F., Hall, M. D., and Pressing, J. (2001). “Illusory conjunctions of pitch and duration in unfamiliar tone sequences,” J. Exp. Psychol. Hum. Percept. Perform. 27, 128–140.
Tufts, J. B., Molis, M. R., and Leek, M. R. (2005). “Perception of dissonance by people with normal hearing and sensorineural hearing loss,” J. Acoust. Soc. Am. 118, 955–967.
Vos, J. (1986). “Purity ratings of tempered fifths and major thirds,” Music Percept. 3, 221–258.
Vos, J. (1988). “Subjective acceptability of various regular twelve-tone tuning systems in two-part musical fragments,” J. Acoust. Soc. Am. 83, 2383–2392.
Vos, J., and van Vianen, B. G. (1985). “Thresholds for discrimination between pure and tempered intervals: The relevance of nearly coinciding harmonics,” J. Acoust. Soc. Am. 77, 176–187.
Watson, C. S., Wroton, H. W., Kelly, W. J., and Benbassat, C. A. (1975). “Factors in the discrimination of tonal patterns. I. Component frequency, temporal position and silent intervals,” J. Acoust. Soc. Am. 57, 1175–1185.
Watson, C. S., Kelly, W. J., and Wroton, H. W. (1976). “Factors in the discrimination of tonal patterns. II. Selective attention and learning under various levels of stimulus uncertainty,” J. Acoust. Soc. Am. 60, 1176–1186.

J. Acoust. Soc. Am., Vol. 121, No. 4, April 2007 T. Hill and I. Summers: Discrimination of interval size 2383
