The Florida State University
DigiNole Commons
Electronic Theses, Treatises and Dissertations
The Graduate School
6-23-2008
A Choral Conductor's Reference Guide to Acoustic Choral Music Measurement: 1885 to Present
Brenda Kaye Scoggins Fauls
Florida State University
Follow this and additional works at: http://diginole.lib.fsu.edu/etd
This Dissertation - Open Access is brought to you for free and open access by The Graduate School at DigiNole Commons. It has been accepted for inclusion in Electronic Theses, Treatises and Dissertations by an authorized administrator of DigiNole Commons. For more information, please contact [email protected].
Recommended Citation
Fauls, Brenda Kaye Scoggins, "A Choral Conductor's Reference Guide to Acoustic Choral Music Measurement: 1885 to Present" (2008). Electronic Theses, Treatises and Dissertations. Paper 4492.
-
FLORIDA STATE UNIVERSITY
COLLEGE OF MUSIC
A CHORAL CONDUCTOR'S REFERENCE GUIDE
TO ACOUSTIC CHORAL MUSIC MEASUREMENT:
1885 TO 2007
By
BRENDA KAYE SCOGGINS FAULS
A Dissertation submitted to the College of Music
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Degree Awarded: Summer Semester, 2008
Copyright 2008 Brenda K. S. Fauls
All Rights Reserved
-
ii
The members of the committee approve the dissertation of Brenda K. S. Fauls defended on June 23, 2008.
______________________________
André Thomas
Professor Directing Dissertation
______________________________
Richard Morris
Outside Committee Member
______________________________
Judy Bowers
Committee Member
______________________________
Kevin Fenton
Committee Member
The Office of Graduate Studies has verified and approved the above named committee members.
-
To the Tom's in my life,
I dedicate this document.
One brought back the music,
The other unlocked my heart.
Life has begun anew.
iii
-
ACKNOWLEDGEMENTS
The path of a doctoral degree is not walked alone. My journey has been blessed with the
guidance and support of a loving family, wise mentors, and wonderful friends - indeed too many
to name. I extend my sincere gratitude to the members of my committee.
First, my deepest gratitude to my major professor, Dr. André J. Thomas, whose generous
spirit and navigational fortitude provided for a future of dream fulfillment.
To Dr. Judy Bowers, I extend my thanks for her continued modeling of trailblazing
artistry and fierce dedication to excellence in choral music education.
I would like to thank Dr. Kevin Fenton for his openness and guidance throughout my
degree program.
To Dr. Richard Morris, I extend my appreciation for his continued willingness to bridge
our worlds with the generous sharing of knowledge, resources, and opportunities.
In closing, I thank those who dedicated countless hours, resources, and motivation to the
successful and joyous completion of this personal goal.
iv
-
TABLE OF CONTENTS
List of Tables .... vi
List of Figures .... vii
Abstract .... viii
1. INTRODUCTION
   Purpose of the Study .... 1
   Need For Study .... 1
   Delimitations .... 1
   Organization of Study .... 1
   Introduction of Topic .... 2
2. SELECTIVE REVIEW OF LITERATURE
   Choral Blend .... 6
   Amplitude .... 7
   Formants ~ Resonances .... 10
   Frequency .... 12
   Quality of Tone .... 17
   Registration .... 27
3. HISTORY OF ACOUSTIC CHORAL MUSIC MEASUREMENT
   1878-1969 .... 37
   1970-1979 .... 44
   1980-1989 .... 52
   1990-1999 .... 66
   2000-Present .... 75
4. SUMMARY .... 88
5. DISCUSSION AND CONCLUSIONS .... 93
6. APPENDICES
   A. Glossary .... 100
   B. Equipment .... 138
   C. Respiratory System .... 153
   D. Laryngeal System .... 155
   E. Articulatory System .... 157
   F. Comparison of Different Interval Naming Systems .... 159
   G. Piano Pitch ~ Hertz Chart .... 161
   H. IPA English Chart .... 163
7. REFERENCES .... 164
8. BIOGRAPHICAL SKETCH .... 176
v
-
LIST OF TABLES
Table 1. Terminology Correlate Chart .... 3
Table 2. Register Definition by Physiological Activity .... 30
Table 3. Comparison of the Average Formant Frequencies of Timbre Types to the Average Formant Frequencies of Voice Classifications .... 49
vi
-
LIST OF FIGURES
Figure 1. Amplitude Chart .... 4
Figure 2. Spectrogram of the Vowel /o/ at 131 Hz (C3) .... 4
Figure 3. Singing Tasks .... 45
Figure 4. Warm-up Cadence .... 54
Figure 5. Choral Formations .... 74
Figure 6. Chamber Choir Spacings .... 78
Figure 7. Organization of Choral Formation by Vocal Parts .... 82
Figure 8. A Choral Exercise .... 83
vii
-
viii
ABSTRACT
The study of choral sound is accomplished through acoustic choral music measurement.
Physical acoustics comprises the aspects of sound that can be quantifiably measured, and
psycho-acoustics is how we perceive what we hear. This study of choral sound will focus on the
measurable physical acoustic facets of amplitude, frequency and the quality of sound. These
facets of acoustic choral sound have psycho-acoustical correlates of loudness, pitch and timbre.
The success of individual singers within a choral setting is largely dependent upon the
conductor's capacity to identify unconscious vocal habits and provide guidance for their
ameliorated vocal function. A clear understanding of the acoustics of choral sound and the
appropriate application of this knowledge can enable choral conductors to better facilitate the
creation of a superior choral sound. To assist the conductor, appropriate solo and speech
research literature has been included to provide an historical foundation and additional
clarification of apropos subject matter.
An extensive glossary has been provided in this document that codifies terminology from
music acoustics, voice science, choral studies, voice studies, equipment guides and usage,
mathematics, and statistics. The goal of this glossary is to facilitate the intermingling of many
divergent disciplines present in this document and to provide a resource for reference when
reading documents not included in this writing.
The acoustics of choral sound are introduced to provide a unified document in a concise
format that can serve as a springboard for informed practice, rehearsal and study.
-
CHAPTER ONE
PURPOSE OF THE STUDY
The purposes of this study were to provide a concise overview of the history of acoustic choral
music measurement; to provide selective, applicable solo voice measurement studies for a foundational
understanding of subject matter; to provide a detailed glossary of definitions, abbreviations, and
equipment to aid understanding of acoustic choral music literature; and to provide suggested applications
of the findings of acoustic choral music measurement.
NEED FOR STUDY
Acoustic choral music research is widespread and can be difficult to access both logistically and
physically. The wealth of diverse subject matter, each area having its own specific language, equipment,
and procedures, makes the research difficult to understand and apply to outside settings. The measurement
process is continually changing, diverse and confusing. A concise, thorough reference source is needed to inform
conductors, singers, and students alike.
DELIMITATIONS
The present study excludes two areas of acoustic choral music research: bone conducted sound
and its effect on the singer, and children's choir research.
ORGANIZATION OF STUDY
A review of selected solo voice research articles and empirical choral sound articles is presented
first by subject matter and then historically. The study begins with selected early investigations
(1879-1969) into singing research which have had direct impact on choral research. The subsequent chapters
present the research sequentially within each decade up to and including 2007. Following a summary of
research to date, a discussion chapter is devoted to suggested choral applications of research findings for
the choral conductor. The closing section is an equipment reference guide and a detailed, multi-subject
glossary of terms, abbreviations, and procedures.
1
-
INTRODUCTION OF TOPIC
Choral conductors have the enviable goal of bringing aural life to a composer's work through a
group of individual singers who, once their voices are lifted together, create the choral sound. Consider
Mayer's (1964) words: "On occasion, when listening to a fine choir, one hears tone of such infinite beauty
that it is evident that the sum is far greater than the parts; that is, the sound produced is of greater beauty
than would normally be expected from the individual voices involved."1 Is this an occasion made possible
by fate or by luck? No! This greater beauty is the result of an educated, well-informed conductor who,
while working with and developing exceptional, individual voices, continually strives toward an
amalgamated excellence of such intertwined talents that only one sound is heard: a superior choir.
What is required for a choir to be recognized as superior? How does a choral conductor aid the
individual choir singers toward the ultimate choral sound, a perfectly blended choir? Most would agree
that vowel unification, diction timing, loudness variance, pitch precision, vibrato amalgamation, timbre
mergence, and choice of registration would be primary considerations. Each of these components is
indigenous to choral sound and, equally, has measurable physical properties which can be examined
within the acoustic study of choral sound.
Acoustics is the study of sound. Acousticians and choral conductors alike are interested in sound:
how sound is made, how a produced sound travels, and then how the sound is heard. How sound is
made is the production of the sound. How sound travels is known as the propagation of the sound. How
we interpret what we hear is the perception of sound.2
What then is choral sound? Is there a way to measure a characteristic of a choral sound? The
answer is yes, and the field is known as acoustic choral measurement: the process of determining the
dimensions and/or specifics of the sound of voices singing together. The study of choral sound employs
both physical acoustics and psycho-acoustics. Physical acoustics is the reality of sound: the aspects of
sound that can be quantifiably measured. Psycho-acoustics is our reaction to sound: how we perceive
what we hear. Each time a choir sings forte, is it the same degree of loudness as the previous time they
sang forte? Choral conductors would agree that a choir will produce forte at a level that is in response to
the prior level of sound. Amplitude is the physical measurement of the choir singing forte.
1 Mayer, F. (1964). The relations of blend and intonation in the choral art. Music Educators Journal, 51 (1), 109-110.
2 Hall, D. (1980). Musical Acoustics: An Introduction. Belmont, CA: Wadsworth Publishing Co. pp. 4-6.
2
-
If an acoustician were to measure each occurrence
of forte singing in a song selection, the amplitude would most definitely vary, yet the choir and
conductor may feel that the fortes were all equal. This is but one difference between what we perceive
(psycho-acoustics) and what we can measure (physical acoustics), and it is an example of the
crux of this document. For choral conductors and acousticians to understand one another,
an agreement in terminology is crucial. As choral conductors, we describe music with terms that express
our perceptions of music: loudness, pitch, timbre and duration. The correlation of perception terminology
to physical terminology is represented in the chart below. The perceptual component of loudness
corresponds to amplitude; pitch to frequency; timbre to the quality of the sound; and duration to the
length of the sound in time.
Table 1: Terminology Correlate Chart
Psycho-acoustics (Perceptual)   Physical Acoustics   Measurement                                      Abbreviation
Loudness                        Amplitude            Decibels                                         dB
Pitch                           Frequency            Hertz, Cycles-per-second, Kilo-hertz             Hz, cps, kHz
Timbre                          Quality of Sound     Formants, Formant Frequencies, and Resonances    FN or RN
Duration                        Length of Sound      Milliseconds or Seconds                          ms, Sec
Our discussion of acoustic choral measurement is now properly framed for both the choral conductor and
the acoustician.
Frequency is the number of sound wave cycles in a given duration of time. It is usually
measured in Hertz (Hz); for instance, A4 is 440 Hz, or 440 sound wave cycles per second (cps).
Notice in Figure 1 that each wave is periodic, but each has a different amplitude (displacement
from the baseline).
3
-
Figure 1: Amplitude Chart3 (axis labels: Decibels, Milliseconds)
All three lines represent a sound that is the same frequency. Listeners would perceive all three sounds as
being the same pitch. Pitch is that which we can discern as being within a continuum of low to high or
high to low. Figure 2 is a spectrogram of a singing sample which shows us both the amplitude and the
frequency of a recorded song sample. Here you will note that the x-axis is in decibels (dB) representing
the amplitude of the sample. The y-axis is in kilo-hertz (kHz) representing the frequency of the singing
sample.
Figure 2: Spectrogram of the Vowel /o/ at 131 Hz (C3)
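The note names and Hertz values used above (A4 at 440 Hz, C3 at 131 Hz) can be related with a short Python sketch. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz and the common MIDI note-numbering convention; the function names and the tuning assumption are illustrative and do not come from the studies cited here. The last function samples a sine wave so that tones sharing a frequency but differing in amplitude, the situation described for Figure 1, can be compared.

```python
import math

A4_HZ = 440.0   # reference pitch stated in the text
A4_MIDI = 69    # MIDI note number of A4 (assumed numbering convention)
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(name, octave):
    """Equal-tempered frequency in Hz of a natural note, e.g. ("C", 3)."""
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]
    return A4_HZ * 2 ** ((midi - A4_MIDI) / 12)


def period_ms(freq_hz):
    """Duration of one cycle in milliseconds."""
    return 1000.0 / freq_hz


def sine_sample(amplitude, freq_hz, t_seconds):
    """Displacement from the baseline of a sine tone at time t."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t_seconds)


if __name__ == "__main__":
    print(round(note_to_hz("C", 3), 2), "Hz for C3")
    print(round(period_ms(A4_HZ), 3), "ms per cycle at A4")
    quarter_cycle = 1 / (4 * A4_HZ)  # time of the first peak
    for amp in (0.25, 0.5, 1.0):     # same pitch, three loudness levels
        print(amp, round(sine_sample(amp, A4_HZ, quarter_cycle), 3))
```

All three sampled waves complete a cycle in the same time, so they would be heard as the same pitch; only the peak displacement changes.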
3 (http://www.acs.appstate.edu/~kms/classes/psy3203/SoundPhysics/amplitude_waves.jpg)
4
-
Ladefoged explains the quality of sound quite simply: this is the difference between two notes that
are equal in pitch and loudness but have been produced by different instruments, such as a piano and a
violin.4 When choral conductors talk about the differences between voices they will often use descriptors
such as warm, thin, or full. In other words, choral conductors often use the psycho-acoustical term timbre
when talking about the physical acoustic component, quality of sound.
The production of human sound requires the interaction of the respiratory system (air), the
laryngeal system (vibrator), and the articulatory system (shaper). The respiratory system provides the
energy (air) for sound production. The air moves from the lungs into the trachea until reaching the
closed vocal folds (the vibrator of the laryngeal system). Air pressure increases until the vocal folds are
forced apart and caused to vibrate. As the air moves between the vibrating vocal folds, sound is emitted.
This is called the voice source,5 which is a rich spectrum of harmonics: whole-number multiples of the
fundamental frequency. The sound now moves through the vocal tract (the mouth, throat, and nose)6 and
is molded into speech sounds by the articulatory system (the tongue, lips, teeth, and soft palate).
Depending upon the length, shape and degree of mouth opening, these cavities resonate at different
frequencies and shape the sound source into vowels, consonants, and vocal colors that make up the sound
you and I recognize as the human voice. These resonances, also known as formants, are distinct
characteristics of the singer's morphology, training and habitual use of the voice.
The success of individual singers within a choral setting is largely dependent upon the conductor's
capacity to identify unconscious vocal habits and provide guidance for their ameliorated vocal function.
A clear understanding of the acoustics of choral sound and the appropriate application of this knowledge
can enable choral conductors to better facilitate the creation of a superior choral sound. To assist the
conductor, appropriate solo and speech research literature has been included to provide an historical
foundation and additional clarification of apropos subject matter. A conductor's conscious understanding
of the individual's vocal production and its contribution to the synergized acoustical delivery of the
ensemble creates that phenomenon known not only to audiences but most especially to the creators of
that unique experience: that which we know as the choral experience.
4 Ladefoged, P. (1996), 14.
5 Sundberg, J. (1987). The Science of the Singing Voice. Northern Illinois University Press: DeKalb, Illinois, p. 49.
6 Ladefoged, (1996), 92.
5
-
CHAPTER TWO
REVIEW OF LITERATURE
Choral Blend
The joining of individual voices to create a combined sound, a choir, requires choral blend. When
one voice is heard above all others, choral blend is adversely affected. Cashmore (1964) points out that
an individual's attempt to lead his or her vocal section is both vocally taxing and a detriment to the
growth of independent singers.7 Often, though, an individual may have a larger voice than their fellow
choir members. Voice instructors often take issue with conductors who ask such singers to sing with a
minimized production in order for the choir to achieve an overall choral blend.
To achieve this perfect choral blend, Mayer8 believed the focus needed to be on timbre, dynamics
and pitch. Vibrato and the tuning accuracy of singers have great impact on the choir's overall intonation.
His method for improving choir intonation involved both just and non-tempered tuning and began by
tuning perfect octaves on the pitches of D4 and/or E4. Mayer would start with the bass section and then
add each vocal section, one at a time at a mf level, until all of the singers were participating. Once the
octaves were in tune, Mayer would move into perfect fourths and fifths, again centered on D4, and would
remain there until the intervals were mastered. Moving gradually through this process, his choir was able
to master intonation and thereby achieve a more perfect choral blend.9 This empirical approach is used
by many fine conductors.
F. Melius Christiansen and Weston Noble are recognized as two important American conductors
of the twentieth century. Giardiniere, in his 1991 dissertation, explained Weston Noble's re-definition of
F. Melius Christiansen's concept of voice matching to achieve choral blend. Christiansen directed singers
to alter their sound to match the person(s) next to them, whereas Noble positioned singers next to other
singers whose vocal character was similar. Noble believed an acoustic phenomenon would occur when
voices were placed correctly. Recordings of Noble's voice matching procedures of two to seven singers
were compiled into a cassette perception survey, which was then mailed to active choral musicians (N =
218). Auditors showed marked preference for Noble's final arrangement of voices in more than half of
the listening survey. Auditors were not consistent in responses which had only two voices (duets). The
study drew many responses concerning the quality of the recordings, the process of mailing the tapes, and
the varying quality of the listening equipment and its effect on listener preference.10
7 Cashmore, D. (1964). A good performance. The Musical Times, 105 (1451), 56-57.
8 Mayer, F. (1964), 109-110.
9 Ibid.
6
-
Amplitude
Amplitude is the measurable physical attribute of what is perceived as loudness. Specifically,
amplitude is the extent of the variation in air pressure from normal air pressure. When air pressure
reduces, the sound is perceived as less loud and conversely, when the air pressure increases, the sound is
perceived as louder. However, how much air pressure increases or decreases is not equal to how much
louder or softer the sound is perceived.11 When measuring the sound pressure level (SPL) of a vowel
sound, the amplitude of the voice source is the sound produced by the vocal fold vibrations. The main
controlling element for this amplitude is subglottal pressure. Another element involved is the relationship
between the resonance frequencies of the vocal tract and the partials present in the spectrum. When air
pressure increases, the amplitude increases and conversely when air pressure decreases the amplitude
decreases. Gramming (1991) designed the following experiments to study the effect of loud and soft
phonation on the spectral envelope.
In the first experiment, a female participant was recorded speaking the vowel /a/ at approximately
400 Hz, first in soft phonation and then in loud phonation. The loud phonation revealed all 12 partials
below five kHz. However, in soft phonation, only the first two partials were present and the F0 (fundamental frequency) was stronger. The F0 (fundamental frequency) is the number of repeating cycles
of the vocal folds in one second and is measured in Hertz (Hz). The first partial of a sound is also called
the F0 (fundamental frequency). A partial of the sound is a component of a complex sound which can be
the F0 (fundamental frequency), a harmonic of the F0 (fundamental frequency), or an overtone of the F0
(fundamental frequency). This single participant pilot study was utilized as a basis for the next study.
Participants (N = 20, n = 10 women and n = 10 men, all with normal, untrained voices) were recorded
speaking the vowel /a/ in soft and loud phonation. Overall, the F0 (fundamental frequency) remained
louder than the partials in all participants' soft phonation. However, in loud phonation, a partial, which
represented an overtone, was the loudest. A consistent observation was that, when the resonance
frequencies remained the same although the F0 (fundamental frequency) increased, the
strongest partial in the loud phonation spectrum correlated with the first resonant frequency. Again, as in
the pilot study, the louder phonations had many more partials than the softer phonations.
10 Giardiniere, D. (1991). Voice matching: an investigation of vocal matches, their effect on choral sound and procedures of inquiry conducted by Weston Noble (Doctoral dissertation, New York University, 1991). UMI ProQuest Digital Dissertation Abstracts, 241, AAT 9213181.
11 Ladefoged, P. (1996), 14-16.
7
-
The increases were evident
when the F0 (fundamental frequency) was at a lower pitch, as in the male participants.
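As a rough check on the pilot-study figures, the following Python sketch lists the whole-number multiples of a fundamental that fall below a ceiling frequency. It treats the voice as an idealized harmonic series; the function name and the 110 Hz comparison value are illustrative assumptions, not data from Gramming.

```python
def harmonics_below(f0_hz, limit_hz):
    """Partials (whole-number multiples of F0) that lie below limit_hz."""
    partials = []
    n = 1
    while n * f0_hz < limit_hz:
        partials.append(n * f0_hz)
        n += 1
    return partials


if __name__ == "__main__":
    # F0 of about 400 Hz, as in the pilot study, with a 5 kHz ceiling.
    print(len(harmonics_below(400.0, 5000.0)), "partials for F0 = 400 Hz")
    # Hypothetical lower fundamental, chosen only for comparison.
    print(len(harmonics_below(110.0, 5000.0)), "partials for F0 = 110 Hz")
```

With F0 at 400 Hz the series yields 12 partials below 5 kHz, which agrees with the count reported for loud phonation. A lower fundamental leaves room for many more partials in the same band, which may be one reason the increase was more evident at lower pitches.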
Participants (N = 22 speech therapy students) in the second experiment were recorded speaking
the vowels /a/, /i/, and /u/. The averaged phonetogram results showed the vowel /a/ was ~10 dB higher in
sound pressure level (SPL) than /i/ or /u/ when the participant sounded a low F0 (fundamental frequency).
As the F0 (fundamental frequency) rose, the sound pressure level (SPL) differences between the vowels
reduced. In loud phonation, the sound pressure level (SPL) increased as the frequency of the first formant
increased. In soft phonation, there was no difference between the vowels because the F0 (fundamental
frequency) was the strongest partial.
Gramming's third study utilized both healthy (N = 20 men and women) and non-healthy (N = 10
female patients diagnosed with non-organic dysphonia) participants. Again, phonetograms were made of
the vowel /a/ on a pitch chosen by the participant. The pitch chosen by the participant was evaluated and
described in relation to the participant's full range. The goal of this study was to examine the short-term
variance in sound pressure level (SPL) in loud and soft phonation. The patient participants, who used soft phonation,
more than 60% of the time, chose a frequency in the higher part of their range which showed significantly
more sound pressure level (SPL) variation. The resulting sound pressure level (SPL) variation mean for
loud phonation was 2 dB, whereas in soft phonation the sound pressure level (SPL) variation mean was 5
dB, which led Gramming to conclude that voice control was more difficult when the patient participants used
soft phonation.
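The decibel differences reported above (about 10 dB between vowels, 2 dB versus 5 dB of variation) are logarithmic ratios of sound pressure, which is part of why a change in pressure does not map one-to-one onto a change in perceived loudness. The Python sketch below uses the standard definition of sound pressure level, 20 times the base-10 logarithm of pressure over a 20 micropascal reference. The function names are illustrative, and the pressures in the usage lines are arbitrary values rather than measurements from these studies.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure in air: 20 micropascals


def spl_db(pressure_pa):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    return 20 * math.log10(pressure_pa / P_REF_PA)


def pressure_ratio(db_difference):
    """Factor by which sound pressure changes for a given dB difference."""
    return 10 ** (db_difference / 20)


if __name__ == "__main__":
    # Doubling an arbitrary pressure adds about 6 dB, not "twice the decibels".
    print(round(spl_db(0.04) - spl_db(0.02), 2), "dB gained by doubling pressure")
    for db in (2, 5, 10):
        print(db, "dB corresponds to a pressure factor of", round(pressure_ratio(db), 2))
```

Read this way, a 10 dB difference between vowels corresponds to roughly a threefold difference in sound pressure, while 2 dB and 5 dB of variation correspond to factors of about 1.26 and 1.78.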
Weber (1992) was interested in the effect of vibrato versus straight tone singing on sound
pressure level (SPL). College choir sopranos (N = 20) were recorded singing /a/ for representative low,
middle, and high pitches in loud and soft dynamics with both vibrato and straight tone. This resulted in
24 trials per soprano (three pitches, two dynamics, and two tone conditions, with each condition
repeated). Analysis of the recordings
found no significant difference in sound pressure level (SPL) for any condition except for a slight
difference in the loud vibrato condition. Weber concluded that conductors should base the choice of
straight tone or vibrato on the acoustic characteristics of the performance location, since the
sound pressure level (SPL) showed very little variance.12
Sundberg et al. (1999) chose an unexplored musician population to investigate voice source
characteristics, one of which was intensity. Singing participants (N = 6 premier male country singers)
wore a Rothenberg mask and were recorded speaking and singing the CV (consonant-vowel orientated)
syllable /pae/.
12 Weber, S. T. (1992). An investigation of intensity differences between vibrato and straight tone singing (Doctoral Dissertation, Arizona State University, 1992). ProQuest Dissertation Abstracts International, AAT 9223155.
8
-
The speech condition was twofold: in speech condition one, the participants started at
basal pitch (lowest comfortable pitch) and repeated /pae/ in soft, medium, and loud voice. This pattern
was repeated at four successive thirds, imitating in speech the pitch pattern of an arpeggio. In the second
speech condition, the participants spoke the syllable /pae/ to the pattern of a limerick in soft, medium and
loud voice.
The singing conditions were also twofold. The participants chose and sang a song from their
country repertoire on a starting pitch of their choice without accompaniment. The participant was
encouraged to sing with all the same inflections, dynamics, and intensity as in a performance. The second
singing condition had the participants sing The National Anthem at a starting pitch of their choice.
Extensive detail was given to the recording and analysis process including the equipment used.
Listening participants (N = 19 singing experts) listened to a perception test designed to answer the
question "How much pressedness do you hear in this voice?" Answers were given on a 100-mm visual
analog scale which ranged from "None" to "Extreme." One third of the samples were replayed to test for
reliability. Listening participants' perceptions included an awareness of different voice quality between
the chosen country song and The National Anthem. The participants reported that the amount of
pressedness heard in the samples increased with higher pitches that were coupled with louder volume.
The correlation between the pressedness of the voice on higher pitches with increases in sound
pressure level (SPL), which would be expected to also double the subglottal pressure (Ps), was not evident
in the results. The results suggested that the smaller the sound pressure level (SPL) gain, the greater the
perceived pressedness. But, as expected by the authors, the closed quotient (CQ) and the glottal
compliance were greater in loud speech than in soft speech whereas in singing, the participants used
similar or slightly higher closed quotient (CQ) values. The authors concluded that a voice source
characteristic of country singing was very high closed quotient (CQ) values in loud singing. This
characteristic, often considered a cause of vocal damage (pressedness), had not manifested itself in the
vocal fold pathology of these participants.13
Miller, Schutte and Doing (2001) explored soft phonation in professional tenors. Participants (N = 2)
were fitted with an electroglottograph collar and an esophageal balloon while singing into a microphone
four vocal tasks: 1) a sustained Ab4, 2) an Ab4 arpeggio, 3) a sustained note in falsetto, and 4) a sustained
note in modal production.
13 Sundberg, J., Cleveland, T., Stone, R., & Iwarsson, J. (1999). Voice source characteristics in six premier country singers. Journal of Voice, 13 (2), 168-183.
9
-
Each vocal task was performed at a soft level and then at a medium level, diminishing
gradually to a very soft level while maintaining the same vocal production. One participant's vocal
timbre was described as lyrical while the other voice was described as robust. The lyric tenor had no
difficulty with the requested tasks. The robust tenor experienced a moment of silence as the voice would
equalize from a louder production to a softer production. This was attributed to a longer closed quotient
(CQ) phase that was incomplete and a steeper slope in the electroglottography (EGG) signal that became
significantly shallower in the very soft level. The lyric tenor maintained a steady subglottal pressure (Ps)
throughout the entire task. These data prompted the authors to suggest that messa di voce is a voice
register, not a vocal task.14
Formants ~ Resonances
Pulsating air flow through the glottis (the space between open vocal folds) is known as the voice
source. When sound is measured at the voice source, the fundamental frequency (F0) will have the
greatest amplitude. Each cavity of the vocal tract will have a resonance that will be represented in the
source spectrum envelope as peaks of amplitude at various frequencies. These peaks of amplitude are
formants. Beginning with the first spectral peak occurring at the lowest frequency, the formants are
labeled in order F1, F2, F3, and so on. Each successive formant is higher in frequency.15 The resonance frequencies
change as the vocal tract molds articulation. Specific frequencies increase with individual vowels that are
articulated in a specific region of the articulatory system. The first frequency peak (F1) is usually
associated with the pharyngeal space (back cavity of the mouth) particularly with the vowels /e/, /i/, and
//. The second frequency peak (F2) is generated in the front cavity of the mouth for the back vowels /u/,
/o/, and //. The third frequency peak (F3) is dependent upon the front of the tongue, especially in vowels
/u/, /o/, /i/ and //. The fourth and fifth frequency peaks again have front of the tongue influence on the
//, //, and /e/ whereas the back of the tongue influences /u/, /o/, and /i/. The fifth peak is strongly
impacted by the larynx tube.16 It is the unique morphology of each singer that requires individually
specific training to achieve maximum resonances from the vocal tract. Knowledge of the production and
14 Miller, D. G., Schutte, H. K., Doing, J. (2001). Soft phonation in the male singing voice: preliminary study. Journal of Voice, 15 (4), 483-491.
15 Fant, G. (1970). Acoustic Theory of Speech Production: With Calculations Based on X-ray Studies of Russian Articulations. Mouton: The Hague, pp. 17-20.
16 Fant, (1970). 121-122.
-
propagation of these resonances will aid in developing voices that are capable of singing healthily over
orchestras and in producing full rich choral ensembles.
One of the most cited articles in voice research is Fant et al.'s (1972) article on the measurement of
subglottal formants. Measurements were taken of the first through third formants (F1, F2, and F3) of the
recordings of participants speaking the CV (consonant-vowel) syllable /pa/. Results of this study
suggested the glottal strength of participants had a direct impact on the measurement of subglottal
formants. Weak and/or breathy voices showed more subglottal formant traces than those of normal
voices. Formant measurement data garnered in this study was used to develop computer models of
synthesized voices.17
Miller and Schutte (1990) defined formant tuning as using vowel modification to approximate one
or both of the two lowest resonances of the vocal tract to harmonics of the glottal source.18 A leading
Netherlands opera baritone was recorded singing melodic patterns on a variety of vowels and CV
nonsense syllables with a catheter (fitted with a miniature wide band pressure transducer) inserted through
the neck and into the glottis area as well as an EGG (electroglottographic) neck band. Vocal production
began once the topical anesthesia had faded. Supra- and sub-glottal pressures were measured as well as
the formant frequencies and harmonics. Phonations were made at the participant's choice of pitch and
ranged from 230 Hz to 380 Hz (Bb3 to F4), an area where vocal tract realignment is usually needed to
move baritones into full head voice. In other words, the participant reduced the sub-glottal pressure (Ps)
and modified the vowel to make a smooth transition into head voice.19
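The idea of formant tuning defined above, bringing a vocal tract resonance close to a harmonic of the glottal source, can be illustrated with a short calculation. The following is only a minimal sketch: the function name nearest_harmonic and the example formant value of 700 Hz are assumptions made for illustration and are not taken from Miller and Schutte's data.

```python
def nearest_harmonic(f0_hz: float, formant_hz: float) -> tuple[int, float]:
    """Return (harmonic number, formant minus harmonic in Hz) for the
    harmonic of f0 that lies closest to the given formant frequency."""
    # Harmonics of the source sit at integer multiples of f0: f0, 2*f0, 3*f0, ...
    n = max(1, round(formant_hz / f0_hz))
    return n, formant_hz - n * f0_hz


if __name__ == "__main__":
    f0 = 349.23   # F4 in equal temperament, upper end of the range discussed above
    f1 = 700.0    # hypothetical first-formant frequency, assumed for illustration
    n, offset = nearest_harmonic(f0, f1)
    print(f"Nearest harmonic: H{n} at {n * f0:.1f} Hz; formant offset {offset:+.1f} Hz")
```

A small offset indicates that the formant sits nearly on a harmonic, which is the condition the singer approaches through vowel modification.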
Miller and Schutte (1992) continued their research into subglottal pressure and formant
measurement by recording professional male singers (n = 2), equipped with two glottis transducers, an
electroglottograph (EGG), and a microphone at a distance of 30 centimeters. The recorded singing tasks
were four scales on the vowel /a/ and sustained /a/ vowels on four range-representative pitches.
Conclusions included confirmation of measurement tools to show center frequencies of pitches when
vibrato was present in the singers' vocal production. The same equipment was able to accurately measure
17 Fant, G., Ishizaka, K., Lindqvist-Gauffin, J., Sundberg, J. (1972). Subglottal formants. STL-QPSR, 13 (1), 001-012.
18 Miller, D. G., Schutte, H. K. (1990). Formant tuning in a professional baritone. Journal of Voice, 4 (3), 231.
19 Ibid.
-
the frequency distance between harmonics and a dominant formant. The vocal tract configuration was
confirmed, in this study, as a variable in determining formant frequency modulation.20
Ternström (2007) chose to investigate formant frequencies by using a professional barbershop
quartet. Three four-track recordings were made of the participants singing Paper Moon in an absorbent room.
The recordings included the participants singing together but with each singer placed in one of the four
corners of the room, each participant singing alone, each participant speaking alone, and then all
participants speaking together. Each singer wore a small microphone taped on the end of his nose. The
recordings were analyzed through inverse filtering utilizing Decap software to determine the identity of
formant frequencies, the measurements of the spread of formant frequencies, and the relationship of
partials in both individual and ensemble measurements. The vowels chosen for analyzing were /u/ (to), /i/
(be), and /a/ (divine).
Results suggested singers separated their formants from each other as evidenced in widespread
formant frequencies. Formant frequencies were often on or close to a partial of the individual singer as
well as to the common partials of another singer. The spread formant frequencies may have been in an
effort to hear oneself better so that the combined sound might have seemed larger and more expanded, in
other words, more resonant. In the barbershop world this is referred to as locked and rung!21 Success for
this quartet was achieved through varied vowel production versus attempting to sing exactly the same
vowel, the opposite of choral singing. Barbershop quartets may be able to increase their resonance by
adjusting their vowel quality.22
Frequency
Frequency is the rate of vibration of a periodic event. In phonated sound this means the number of
sound wave cycles per second (cps). When we measure frequency it is expressed in hertz (Hz). We
assign a specific name to a pitch because we do not hear frequencies. Our available hearing range of
frequency is approximately 20 Hz to 20,000 Hz.23 The lowest note that we can hear is what would be the
lowest C (C0) on the piano if it were extended two whole tones. Each successive C going from left to
20 Schutte, H., Miller, D., Svec, J. G. (1995). Measurement of formant frequencies and bandwidths in singing. Journal of Voice, 9 (3), 290-296.
21 Ternström, S., & Kalin, G. (2007). Formant frequency adjustment in barbershop quartet singing. International Congress on Acoustics, Madrid, September 2007, 1-6.
22 Ibid.
23 Ladefoged, (1996), 21.
-
right on the piano is ordered numerically C1, C2 and so on. These notes are said to be an octave apart.
C4 is commonly referred to as middle C. A4 is the fourth A on the piano from right to left and is
commonly known as A440 because the vibration of the air stream as it passes through the glottis is 440
cycles per second (cps) or 440 Hz. When we speak of pitch, we are using a perceptual term of relativity
that functions on a scale from low to high. When we speak of frequency, we are speaking in absolutes
using a term of measurement of the number of sound waves occurring within a second. (See Appendix D).
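The relationship between a named pitch and its frequency can be sketched in a few lines of code. This is only an illustration under stated assumptions: it assumes twelve-tone equal temperament with A4 = 440 Hz and borrows the MIDI note-numbering convention (C4 = 60, A4 = 69) purely as a counting device; the function name note_to_hz is hypothetical.

```python
# Semitone offsets of the natural notes above C within one octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(name: str, a4_hz: float = 440.0) -> float:
    """Convert a pitch name such as 'A4', 'Bb3', or 'F#5' to a frequency in Hz,
    assuming equal temperament tuned to a4_hz."""
    letter, rest = name[0].upper(), name[1:]
    semitone = NOTE_OFFSETS[letter]
    # Apply any accidentals: '#' raises and 'b' lowers by one semitone.
    while rest and rest[0] in "#b":
        semitone += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    midi = 12 * (octave + 1) + semitone  # assumed MIDI convention: A4 = 69
    # Each semitone multiplies the frequency by the twelfth root of two.
    return a4_hz * 2 ** ((midi - 69) / 12)


if __name__ == "__main__":
    for pitch in ("C0", "C4", "A4", "D5"):
        print(f"{pitch}: {note_to_hz(pitch):.2f} Hz")
```

Under these assumptions each octave doubles the frequency, so successive Cs (C1, C2, and so on) stand in a 2:1 ratio to one another.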
In 1979, Shipp et al. recorded participants (total N not provided; n = 10 professional operatic singers; the
number of spastic dysphonic patients was not provided) singing a variety of sustained vocal lines utilizing
targeted frequencies throughout their ranges. Acoustic analysis revealed many differences between the
sub-groups. The singer participants' variance of vibrato pitch was within 0.5 semitones whereas the
patient participants had very little vibrato as reflected in their signal amplitude. The patient participants
had very large cycle-to-cycle variations whereas the singer participants' variations were very small.
However, the variation mean rate of vibrato was similar for both the singers and the patients. The results
suggested the physiological manifestation of vocal tremor and vibrato are similar, yet singers may have
mastered a stabilizing technique in which the nerve pulses of muscles are inhibited except for the superior
laryngeal nerve which stimulates the cricothyroid muscle. Perhaps patients and less experienced singers
allow, or do not suppress, stimulation of muscle nerves in areas of the vocal tract (including the
respiratory system) that cause muscles to engage that are not needed for phonation.24
The next three landmark studies investigated the understanding of singers' vowel production in a
variety of singer modes of phonation. Bloothooft and Plomp (1984) first recorded each singer (N = 14
professional singers, n = 7 male and n = 7 female) in an anechoic room singing the nine Dutch vowels for
one to two seconds in each of the following tone qualities: neutral, light, dark, free, pressed, soft, loud,
straight, and extra vibrato. These terms were taken from accepted vocal pedagogy and the participants
confirmed knowledge of and an understanding of each of the terms. Comparison of the nine modes'
average sound pressure level (SPL) revealed that the neutral mode and the free mode appeared to be
interchangeable descriptors of the same mode of singing. Comparison of the nine modes' spectral
compositions showed that the presence of, or increased use of, vibrato did not vary the spectral
24 Shipp, T. & Izdebski, K. (1979). Elements of frequency and amplitude modulation in the trained and pathologic voice. Acoustical Society of America Supplement, 1 (66), Fall 1979, 56.
-
compositions. From these conclusions, Bloothooft and Plomp reduced the number of modes to six: soft,
light, dark, neutral, pressed, and loud.
Each singer's classification was used to determine the fundamental frequencies (F0) used for each
participant (five for men and four for women). The sopranos and tenors showed twice the spectral
variance in the F0 (fundamental frequency) across the vowels and modes of singing as that of the bass and
alto participants. Although no perceptual data were taken, the authors suggested sopranos and tenors needed
better intelligibility of vowels. The greatest vowel variance for all the participants was the vowel /u/. The
vowels /a/, / /, and // showed half the variance of the vowel /u/. Information was not
provided regarding measurement tools used for the vowel variances; however, great detail was given to
the measurement process and results.25
Bloothooft and Plomp's (1985) second article used the same subjects and data to discuss the vowel
spectrum for each participant with respect to the main effect of the four vowels. Each vowel was
measured in dB at increments of ten milliseconds with a 1/3-octave band filter spectrum that was
normalized for SPL (sound pressure level). A comparison was made between the perception-oriented
spectrum space (formant frequencies) and the production-oriented spectrum space (from 1/3 octave
spectra). The vowels were represented as the most important single source of spectra variance for low
fundamental frequencies (F0). Male and female variants were consistent with one another. The
relationship between the average sound level of the singer's formant (Fs) and the fundamental frequency
(F0) was found to be vowel dependent. When the fundamental frequency (F0) was higher than 392 Hz, the
results showed a lower singer's formant for women. The modal register had less variability in the first
formant (F1) than the falsetto register and it was hypothesized that in singing higher frequencies, the first
formant (F1) is very close to the fundamental frequency (F0). Bloothooft references Sundberg's (1981)
results which showed strong acoustic coupling between glottis and vocal tract26 and suggested this was a
possible cause for these results.27
25 Bloothooft, G., Plomp, R. (1984). Spectral analysis of sung vowels: I. variation due to differences between vowels, singers, and modes of singing. Journal of the Acoustical Society of America, 75 (4), 1259-1264.
26 Sundberg, J. (1981). Formants and fundamental frequency control in singing. An experimental study of coupling between vocal tract and voice source. Acustica, 49, 47-54.
27 Bloothooft, G., Plomp, R. (1985). Spectral analysis of sung vowels. II. The effect of fundamental frequency on vowel spectra. Journal of the Acoustical Society of America, 77 (4), 1580-1588.
-
Again, the same data is used in Bloothooft and Plomp's third study, which compared the individual
participant's spectra of the different modes of phonation. The overall conclusions confirmed that primary
differences in the fundamental frequency (F0) were associated with the differing lengths of the male vocal
tract whereas in the women, the main difference was associated with the glottal opening. The pressed-
dark mode of singing in the participants clearly showed increased pharyngeal volume which was directly
influenced by the height of the larynx.28
Maxwell (1985) investigated the effect of masking on a singer's ability to sing in tune. Masking is
the obscuring of one sound by another. In singing, the inability to hear oneself sing is often the result of a
masking noise which sometimes is the loudness of the surrounding singers. The greatest masking effect
within a choir occurs within one's own vocal section, for those singers are singing the same frequencies
(what we think of as pitches).
In the first of three experiments, participants (N = 24 college voice majors) were recorded singing
vocalises and song excerpts with and without masking noise. The second experiment recorded
participants (N = 15) as they sang The Star Spangled Banner in a key of their choosing in which
masking noise was added at an unknown, random point. The third experiment was a 10-week
longitudinal study with four treatment conditions: 1) normal lessons and normal practice (CG, control
group); 2) white noise lessons and normal practice; 3) normal lessons and white noise practice; and 4)
white noise lessons with white noise practice. In each study, recordings were made of each participant
before and after the experiment. From these recordings of the
first two experiments, a listening tape was made for judge participants (N = 9, n = 3 voice teachers, n = 3
professional non-voice musicians, and n = 3 lay musicians). The listening tape contained excerpts from
the pre- and post-recordings of participants. The judges ranked the voice quality of the first excerpt as
compared to the voice quality of the second excerpt as better, same, or worse (studies one and two). This
same procedure was executed for an intonation comparison of the paired excerpts. For the third
experiment, the judge participants were asked to rank the singer participants' vocal progress between the
first of the paired excerpts as compared to the second of the paired excerpts. Five options were provided
for the ranking: great progress, considerable progress, some progress, same, and worse.
The judges' perceptions of the first experiment participants' samples found white noise adversely
affected participant intonation and voice quality. It was not surprising that the judges were able to detect
28 Bloothooft, G., Plomp, R. (1986). Spectral analysis of sung vowels III. Characteristics of singers and modes of singing. Journal of the Acoustical Society of America, 79 (3), 852-864.
-
the point when masking noise had been introduced in the second study. The sample group of the third
study, which received the highest mean score ranking, had masking noise during their lessons and practice
time. However, comparison of variance within groups found much greater variance within all other
groups outside the control group. Teacher guidance with white noise indeed produced greater results.
Participants without teacher guidance of white noise regressed. In all studies, participants tended to flat
ascending passages, sharp descending passages, sharp sustained notes, and modify / / to // or /a/ when
masking was introduced. The recording, editing, and playback equipment are unknown for the listening
participants' perception listening tape. Also, the production of white noise is unknown. These specifics
would aid in understanding the conclusions drawn, the perceptions of the auditors, and would provide a
roadmap from which to apply the information garnered. However, great detail is given to the statistical
analyses of the listening participants' responses.29
Gramming et al. (1988) investigated the relationship between changes in voice pitch and changes in
loudness. Male and female singers and non-singers (N = 20) were
recorded singing triads (singers) and pitch glides (non-singers) to provide data for phonetograms. The
same participants were asked to read a lengthy (non-related) passage, first in a quiet environment,
followed by three additional readings in steadily increasing noisy environments. Singers were found to
use a stronger fundamental frequency (F0) and an elevated frequency with increased noise in the
environment. Non-singers showed no difference in pitch. The authors proposed singers' wider pitch range
accessibility and familiarity with their full pitch range as an explanation for these results. Additionally,
this may be a reason for reduced pathology in similar life settings.30
Nordmark and Ternström (1996) looked at intonation from a very different angle. The most
defining interval of Western tuning systems (Pythagorean, pure, and equal temperament) is the major and
the minor third. Helmholtz believed that intervals which were not "purely" tuned caused a "beating" which
would be heard as a dissonance.31 Nordmark and Ternström created synthesized non-beating ensemble
sounds to add to the existing knowledge of beating ensemble sounds and their relationship to intonation. To
create these sounds, synthesized violas were used because they most closely resembled human sounds
once a flutter component was added. The average ensemble flutter level was found to be between 10-15
cents (Ternström, 1993).32
29 Maxwell, D. (1986). The effect of white noise masking on singers. Journal of Research in Singing, 8 (2), 9-19.
30 Gramming, P., Sundberg, J., Ternström, S., Leanderson, R., Perkins, W. H. (1988). Relationship between changes in voice pitch and loudness. Journal of Voice, 2 (2), 118-126.
31 Helmholtz, (1885), 24.
-
For this experiment, nine cents of flutter was added to the synthesized viola
sounds. Two groups of three ensemble sounds were used to create versions of major thirds: the first
group had the fundamental frequency (F0) set at 220 Hz; the second group was set at 390 cents above the
fundamental frequency (F0) for a slightly larger major third interval than a pure major third which would
have been at 386 cents above the fundamental frequency (F0). Once created, the dyad was replicated 9
more times at different fundamental (F0) pitches. Each dyad was repeated twice in random order on a 20
dyad perception test. The headphoned listening participants (N = 16, n = 11 undergraduate choral music
education students, and n = 5 orchestra musicians) were given the opportunity to tune each dyad to their
preference for a major third. The range of cents above the fundamental was 350 to 450 cents. If a
participant expressed preference for a deviation above or below this range, the computer would not allow
the participant to move on to the next dyad. The results showed listener preference for interval size of a
major third was 395.4 cents - which is closer to equal temperament than to pure intonation (386 cents).
Participant results suggested that non-beating intervals (pure intonation) are not preferred. However,
participant preference reliability was inconsistent in this study.33
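The interval sizes quoted in this study can be checked with the standard definition of the cent, 1200 times the base-2 logarithm of the frequency ratio. The sketch below is illustrative only: it assumes the pure major third is the 5:4 ratio and the equal-tempered major third is 400 cents, and the function names are hypothetical.

```python
import math


def cents(ratio: float) -> float:
    """Size of an interval in cents for a given frequency ratio."""
    return 1200 * math.log2(ratio)


def shift_by_cents(f_hz: float, c: float) -> float:
    """Frequency reached by moving c cents above (or below, if negative) f_hz."""
    return f_hz * 2 ** (c / 1200)


if __name__ == "__main__":
    pure_third = cents(5 / 4)   # assumed pure (just) major third, about 386 cents
    equal_third = 400.0         # four equal-tempered semitones
    preferred = 395.4           # mean listener preference reported in the study

    print(f"Pure major third: {pure_third:.1f} cents")
    print(f"Upper tone 390 cents above 220 Hz: {shift_by_cents(220.0, 390):.1f} Hz")
    print(f"Preferred third differs from equal temperament by {equal_third - preferred:.1f} cents")
    print(f"Preferred third differs from the pure third by {preferred - pure_third:.1f} cents")
```

Under these assumptions the reported preference of 395.4 cents lies roughly twice as far from the pure third as from the equal-tempered third, which is consistent with the conclusion stated above.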
Quality of Tone
Helmholtz (1885) described the quality of a tone as being sometimes called its color, timbre, or
register.34 When one is able to discern one pitch of the same frequency, duration, and loudness from
another it is because its quality of sound is different from the others. Helmholtz determined that the
difference must be in the manner in which the motion is performed within the period of each single
vibration.35 This manner can be perceived as brighter or more acute; it could be the way the tone begins
(onset) or ends (offset); the amount of resonance (or the lack of resonance) in the sound; or the effect of
one's pronunciation on the tone.36 Fillebrown believed the quality of a tone was the result of the singer's
mood or emotion; an expression of the individual which was completely unique to the singer.37
32 Ternström, (1993), 7.
33 Nordmark, J. & Ternström, S. (1996). Intonation preferences for major thirds with non-beating ensemble sounds. TMH-QPSR, 37 (1), 57-62.
34 Helmholtz, (1885), 24.
35 Ibid., 19.
36 Ibid. pp. 65, 66, 113.
37 Fillebrown, (1911), 7-8.
-
Fillebrown did not have a scientific, anatomical, physiological explanation for the quality of a singer's
tone but believed the answer would be found through continued research.
Schoen (1921), a student of Carl Seashore, studied the presence of vibrato in professional
sopranos (N = 5). Professional recordings of Nellie Melba, Alma Gluck, Frances Alda, Emma Eames,
and Emma Destinn singing Bach-Gounod's Ave Maria were analyzed by tonoscope (early stroboscopy).
The selected pitch was the third note of the composition, D5 (~ 587.33 Hz). Each participant's sample
was analyzed with respect to the attack of the note, the accuracy of intonation, the fluctuation of the
frequency, the release of the note, and the tonal movements leading to the note and away from the note.
Individual characteristics were provided for each participant. The overall conclusions showed this tone
was led to from a lower note and resulted in a low attack frequency. Schoen surmised a time interval
might have elapsed before the intensity of breath was engaged fully. Equally interesting was that the
release of the note was high in frequency even though the next note was lower. Schoen conjectured that
this might be due to an attempt to maintain a steady pitch to the end of the tone and that breath support
might wane, causing the participant to press more breath support which raised the pitch. Each time the
same pitch from the same musical phrase was repeated, the participant sang it differently. The vowel
quality seemed to have no effect on the pitch accuracy. Movement from tone to tone seemed to be glide-
like, almost a portamento. The participants seemed to sing sharp with respect to both pure and tempered
intonation. Schoen concluded [erroneously] that although vibrato was present in every voice, it was only
present when there was strain in the accompanying muscles. Schoen suggested the muscle strain was in
response to the singer's emotional excitement while singing and that vibrato was the result of a neuro-
muscular condition characteristic of the singing mechanism and therefore a periodic-pitch phenomenon.38
Bartholomew (1934) hypothesized that oscillator recordings of singers, both professional and
amateur, would reveal the physiological structure(s) responsible for various qualities. With this
information, singers, as much as was possible, would be able to consciously control the voice mechanism.
Bartholomew recorded 46 films and from them defined four characteristics of good male voice quality:
vibrato, tonal intensity, the presence of a strengthened low partial at 500 cycles per second (cps) or lower,
and the presence of a high formant lying between 2400 and 3200 cycles per second (cps). Sometimes
another peak occurred around 5700 cycles per second (cps), which the author surmised occurred when the
larynx pipe was energized strongly enough that its natural octave began to appear. There were similar
38 Schoen, M. (1921). An Experimental Study of the Pitch Factor in Artistic Singing. Ph.D. Dissertation: University of Iowa, August, 1921.
-
indications for female voices but with the following exceptions: the high formant centered higher around
3200 cycles per second (cps); and the coloratura had almost no high formant yet the tone quality was
deemed "good" because of its "purity".39
Twenty-two years later, at the 51st conference proceedings for the Acoustical Society of America,
Bartholomew suggested a classification of singer tones was necessary. Spectrographic and X-ray studies
of singing would allow for voice classification according to the singer's voice quality, the singer's
expressed mood, and the vowel sung by the singer. Bartholomew proposed twenty-seven classifications
of physiological differences visually noted in X-rays coupled with acoustic differences found in
spectrograms.40 Fry (1956) immediately responded with 27 voice classifications, but based the system on
three specific voice types: light, lyric, and dramatic. Although a definition of these three voice types is
not provided, the general thought was that light described a voice that did not have a professional quality
to the sound, perhaps lacking the singer's formant (Fs). Lyric and dramatic voices represented opposites
of the professional voice spectrum. The three types were determined by the position of the larynx and the
configuration of the epiglottis, pharynx, and root of the tongue. Additional factors taken into
consideration were the mood of the singer and the vowel being articulated.41
Rshevkin (1956) recorded male voices singing vowels /u/, /a/, /i/, and /o/ on pitches ranging from
94 cycles per second (cps) to 490 cycles per second (cps) for a duration of approximately 0.1 seconds.
Harmonic analysis revealed two distinct increases within two narrow bands of spectrum, 400-600 cycles
per second (cps) and 2200-2800 cycles per second (cps), which were not present in untrained baritones.
The higher formant frequency in the 2200-2800 cycles per second (cps) region was labeled the singer's
formant (Fs). Listeners described voices with the singer's formant (Fs) as metallic. Rshevkin suggested
that these peaks occurred only at the beginning of the vowel which the trained singer learned to modify to
39 Bartholomew, W.T. (1934). A physical definition of good voice-quality in the male voice. Journal of the Acoustical Society of America, 5 (3), 25-33.
40 Bartholomew, W. (1956). A basis for the acoustical study of singing. The Journal of the Acoustical Society of America, 28 (4), 757.
41 Fry, D. B. (1956). A basis for the acoustical study of singing. Program of the Fifty-First Meeting of the Acoustical Society of Americas Joint Meeting with the Second ICA Congress. Cambridge, Massachusetts, 34.
-
the vowel singing position. These results agreed with the findings of his earlier research (1927) and
those of Bartholomew who found a high singer's formant around 2800-3200 cycles per second (cps).42
Delattre (1958) felt the work of correlating voice formants with types and classes of voices had not
yet been successfully accomplished. The design of this study was not provided, but through an acoustic
articulatory comparison of vowel color and its effect on voice quality, Delattre reached the conclusion that
the quality of a singer's voice seemed to be characterized by the two or three formants whose frequencies
are just above the vowel formants.43
Arment's (1960) dissertation sought to compare the spectra of vowel tones with the perceptual
designation of the same tones on a bright to dark hierarchy. For the initial pilot study, participants (N = 2
sopranos with perceptually different tone qualities) were asked to sing four different pitches (D4, A4, D5,
F#5) on three different vowels (/i/, /a/, /u/) for a duration of four seconds per tone. The recordings were
made in an 8' x 10' acoustically dead room. To make the perception tape, the tones had both their onset
and offset trimmed leaving a two-second tone. Auditors (N = 6 voice teachers and singers) were asked to
rate the vowel on a bright to dark ranking scale of: 1) very bright, 2) moderately bright, 3) neither
predominantly bright nor dark [neutral], 4) moderately dark, or 5) very dark. Analysis of the auditors'
preferences included: 1) brightness to darkness rating for each vowel, 2) brightness to darkness rating for
each pitch, 3) brightness to darkness ratings for each vowel on each pitch, 4) brightness to darkness
ratings for each singer, and 5) brightness to darkness ratings for each vowel as sung by each singer.
Those tones which received the highest agreement on the dark to bright hierarchy were chosen for spectral
analysis, including identification of formants and intensities of harmonics.
Spectral analysis of the tones that the auditors found to be very bright revealed narrow
formants. The second formant (F2) was high in intensity, and the high harmonics were strong overall. The dark vowel
spectra had broad formants, a third formant (F3) low in intensity, and a broad formant between 3000 and
5000 Hz. Another variable, two different singers with perceptually different tone qualities, was apparent
in the overall spectra (no definitive information is given regarding this statement).
The primary study recorded participants (N = 5 sopranos with a minimum of five years of vocal
training) singing D4, F#4, B4, D5, C#5, A4, G4, and E4. Each tone was sung on each of six vowels: /i/,
/e/, /a/, /o/, /u/, and //. The participants were asked to sing specific tones on a specific vowel in a
particular voice quality: bright, dark, or neutral.
42 Rshevkin, S. N. (1956). Some results of the analysis of singing voice. Program of the Fifty-First Meeting of the Acoustical Society of Americas Joint Meeting with the Second ICA Congress. Cambridge, Massachusetts. 34-36.
43 Delattre, P. (1951). The physiological interpretation of sound spectrograms. Publication of the Modern Language Association (PMLA), 66 (5), 864-875.
-
Each singer was given time to study the required order
of tasks and then given time for a practice run prior to the official recording. To aid the participant in
maintaining the intensity level between singing tasks, a decibel meter was positioned in the participant's
sight line. The target intensity level was 75-80 dB. The participants were recorded in the same 8' x 10'
acoustically dead room with a microphone thirty-two inches from the singer and forty-eight inches from
the floor.
The listening participants (N = 16 singing teachers and graduate level singers, n = 8 men and n = 8
women) were asked to evaluate a series of tones on a 10-point Likert scale from extremely bright to
neutral to extremely dark. Each tone in the series was to receive its own evaluation although the auditor
was going to hear six tones at a time. Spectral analysis of each tone was completed for all harmonic,
formant, and vibrato data. These spectral data were cross referenced with the listening participants'
answers. Arment concluded the brightness or darkness of a tone may be regarded as a continuum of tonal
characteristics.44 The brightness to darkness continuum might be influenced by the vowel, the intensity,
and/or the pitch of the tone, but ultimately it stands alone as a significant descriptor of the tone. Varying
loudness of tones showed no effect on the brightness or darkness of tones. However, the vowel did seem to
have a direct effect on the brightness or darkness of tone. Just as in the pilot study, bright tones had
strong high harmonics whereas dark tones had strong low harmonics. Bright tones had narrow formant
bands in comparison to wide banded dark tones. Tones which ranked the brightest showed greater
intensity of the second formant (F2) and an increase in the amount of harmonics in the tone.45
Coleman (1973) investigated exactly what physiological components define the quality of a
speaker's voice such that the speaker's sex is known. Two experiments were devised in which participants
(N = 40 university students, n = 20 males and n = 20 females) were recorded speaking a variety of speech
tasks and repeated some of the tasks using a laryngeal vibrator. The recordings were analyzed and the
vocal tract resonances (VTR) and laryngeal fundamental frequencies (LFF) were computed for each
participant. The first perception test utilized five-second samples from each participant played
backwards. In this test, auditors' (N = 17 university students) responses were significant (p < .01) with
94% accuracy in identifying the sex of the sample with respect to the laryngeal fundamental frequency
44 Arment, H. (1960). A Study By Means of Spectrographic Analysis of the Brightness and Darkness Qualities of Vowel Tones in Womens Voices. (University Microfilms No. AAG6002989). 45 Ibid.
21
-
(LFF). Accuracy dropped to 56% when the sex of the sample was compared to the average mean of the
vocal tract resonances (VTR).
The second perception test utilized the laryngeal vibrator samples which had the highest vocal
tract resonances (VTR) for the females (n = 5) and the lowest vocal tract resonances (VTR) for the males
(n = 5). Only two pitches were used for the samples: 240 Hz and 120 Hz. The samples had equal
representations of the following descriptors: low vocal tract resonances (VTR) and low laryngeal
fundamental frequencies (LFF), high vocal tract resonances (VTR) and high laryngeal fundamental
frequencies (LFF), high vocal tract resonances (VTR) with low laryngeal fundamental frequencies (LFF),
low vocal tract resonances (VTR) with high laryngeal fundamental frequencies (LFF). The auditors (N =
25 university students) were asked to determine the sex of the speaker and the results showed correct sex
identification 245 out of 250 times in the first two descriptors above (those in which the VTR and the LFF
are indicative of the same sex). When the descriptors were jumbled, male characteristics (low VTR and
low LFF) were perceptually more prominent. The results of these experiments led Coleman to the
conclusion that laryngeal fundamental frequency plays a heavier role in our ability to discern between
male and female speakers.46
Teie (1976) used a variety of singers (N = 31, n = 5 male first year voice students, n = 5 female
first year voice students, n = 5 male fourth year voice students, n = 5 female fourth year voice students, n
= 3 male untrained singers, n = 3 female untrained singers, and n = 5 voice faculty members) to look at
the effect of vocal training on presence of the singer's formant (Fs), an increase in energy in the 2800-
3200 Hz range. Each participant was recorded singing the vowels /a/, /i/, and /u/ on two pitches. The
male participants sang at 160 Hz (E3) and 288 Hz (D4) and the female participants sang on 288 Hz (D4)
and 512 Hz (C5). These pitches were chosen to represent the upper and lower voice registers of both the
male and female participants. It was also deemed important for one of the pitches to be sung by all
participants (288 Hz, D4). The participants were instructed to sing at full volume and to vary the distance
of their mouth to the microphone by watching an oscilloscope so that 125 dB was maintained. The
recordings were conducted in a soundproof speech laboratory room.
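The frequencies paired with the note names above do not match modern concert pitch (A4 = 440 Hz), where E3, D4, and C5 fall near 165 Hz, 294 Hz, and 523 Hz. They are consistent instead with scientific pitch (C4 = 256 Hz) combined with just-intonation ratios. That reading is an inference from the numbers, not something the source states. The short Python sketch below, using a hypothetical helper named just_frequency, only checks the arithmetic under that assumption.

```python
from fractions import Fraction

# Assumption (not stated in the source): scientific pitch, C4 = 256 Hz.
C4_HZ = 256

# Assumption: just-intonation ratios above C for the two note names needed.
JUST_RATIOS = {
    "C": Fraction(1),
    "D": Fraction(9, 8),
    "E": Fraction(5, 4),
}


def just_frequency(note: str, octave: int) -> Fraction:
    """Frequency in Hz of a note under the two assumptions above."""
    # Each octave above or below octave 4 doubles or halves the frequency.
    return C4_HZ * JUST_RATIOS[note] * Fraction(2) ** (octave - 4)


# The three sung pitches reported for the study:
# E3 -> 160 Hz, D4 -> 288 Hz, C5 -> 512 Hz under these assumptions.
reported = {("E", 3): 160, ("D", 4): 288, ("C", 5): 512}
for (note, octave), hz in reported.items():
    assert just_frequency(note, octave) == hz
```

Exact fractions are used so the comparison with the reported whole-number frequencies does not depend on floating-point rounding.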
46 Coleman, R. (1973). A comparison of the contributions of two vocal characteristics to the perception of maleness and femaleness in the voice. STL-QPSR, 14 (2-3), 13-22.
All six of each participant's samples were analyzed through spectrography for the fundamental
frequency (F0) and the presence of partials in the tone. Teie concluded that the amount of training affects
the frequencies higher than the second formant (F2), most specifically the range of 2 kHz to 4 kHz. The
training effect was present in the intensity levels of the tones, because both trained and untrained participants
had similar configuration and breadth within the singer's formant (Fs) range, 2800-3200 Hz. Little
difference between all of the subsets of participants was apparent in the F1 and the F2 when the singers
were singing the same vowel at the same pitch. In the trained singers' samples, there were spectral energy
peaks in the 6 to 8 kHz region. Most interesting was that the untrained singers produced tones with
almost as prominent a singer's formant (Fs) as did the trained singers on the /i/ vowel. This suggests singers
should strive to have the /i/ vowel quality in all vowel sounds to enhance the singer's formant (Fs) region.
Teie went so far as to conclude the essence of consistent tone quality is the ability to color all vowel
sounds with an /i/ resonance.
Teie felt his results should be viewed with caution due to the low number of participants in each subset category.
Additionally, the dynamic level chosen may have had an impact on the results and therefore a greater
variety of dynamic levels would have provided keener insight as to this effect. The Fs was inconsistent in
the female singers' spectra. In closing, Teie conjectured as to the effect consonants would have on the
presence of the singer's formant (Fs).47
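One piece of arithmetic may help in reading these results: the higher the sung pitch, the fewer harmonics can fall inside a fixed frequency band. The Python sketch below counts the harmonics of each of the three sung pitches that land in the 2800-3200 Hz range named above. It is an illustration only, not an analysis Teie reports; the helper name harmonics_in_band and the choice to treat the band edges as inclusive are assumptions.

```python
# Band given in the text for the singer's formant (Fs), in Hz.
FS_BAND_HZ = (2800, 3200)


def harmonics_in_band(f0_hz: int, band_hz: tuple = FS_BAND_HZ) -> list:
    """Return the harmonic frequencies (integer multiples of f0_hz)
    that fall inside band_hz, with both band edges treated as inclusive."""
    low, high = band_hz
    first = -(-low // f0_hz)  # ceiling division: first harmonic at or above low
    last = high // f0_hz      # floor division: last harmonic at or below high
    return [n * f0_hz for n in range(first, last + 1)]


for f0 in (160, 288, 512):
    print(f0, harmonics_in_band(f0))
# 160 [2880, 3040, 3200]
# 288 [2880, 3168]
# 512 [3072]
```

With only one or two harmonics available in the band at the higher pitches, the measured energy there depends heavily on exactly where those harmonics sit relative to the resonance.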
To examine vocal registration, Cleveland (1977) recorded male participants (N = 8 professional
Swedish singers) singing the vowels /i/, /e/, //, /o/, /u/ on the pitches C3, F3, A3, E4. A listening test was
designed with three hearings of each vowel vocalization presented in two twenty-five minute sessions
each separated by a thirty-minute break (five vowels x four pitches x eight subjects). Some vowel sounds
were synthesized by a source-filter network. Auditors (number unknown) were asked to determine the
voice classification of the singers as bass, baritone, or tenor.
47 Teie, E. (1976). A comparative study of the development of the third formant in trained and untrained voices. (Doctoral Dissertation, University of Minnesota, 1976). Dissertation Abstracts International, 37, (10A), 6135.
Source spectra, formant frequency, and sonogram measurements were employed on the vowel
vocalizations. The information showed the voice classification was dependent on vocal tract size and
dimension; for example, the vocal tract length of basses singing /i/ was 19 centimeters as compared
to tenors at 15.5 centimeters. Cleveland also found timbre type classification to be strongly influenced by
formant frequency and suggested that its importance outweighed pitch. The correlation between formant
frequency of spoken vowels and sung vowels was quite high and could be useful in future voice
classification. Since vocal timbre exists at an earlier age than full range capability, Cleveland suggested it
is a better indicator of voice classification.48
Magill and Jacobson (1978) asked professional singers (n = 15) and college music students (n = 15) to
identify their voice classification and then recorded them singing sustained vowels and major arpeggios
appropriately pitched for their self-proclaimed voice categories. Analysis of the recordings showed the
presence of increased spectral energy in the range of the singer's formant (Fs) in both males and females
and in all voice categories. There was more singer's formant (Fs) presence in the male voices, which
Magill and Jacobson hypothesized may have been due to a lower first formant (F1) that allowed for a
greater number of harmonics to fall within the area of the singer's formant (Fs) frequency envelope. The
strength of the energy in the singer's formant (Fs) region showed a direct correlation to the participant's
amount of training and experience.49
Colton and Estill (1979) recorded participants (N is not provided) singing in four separate voice
qualities on selected pitches throughout their vocal range. Auditors (N is unknown) had a high degree of
accuracy in identifying the participants' modes of phonation, even at the ends of the vocal ranges. The
acoustical results of the recordings showed definite frequency bandwidths, specific resonant peak
locations with representative spectral envelopes to dynamic ranges. Physiological results were equally
definitive of each vocal mode. The results suggested the unique features of each voice mode could
provide singers with a roadmap toward a variety of healthy vocal modes of phonation that in turn would
offer singers multiple voice qualities.50
48 Cleveland, T.F. (1977). Acoustic properties of voice timbre types and their influence on voice classification. Journal of the Acoustical Society of America, 61, 1622-1629.
49 Magill, P., & Jacobson, L. (1978). A comparison of the singing formant in the voices of professional and student singers. Journal of Research in Music Education, 26 (4), 456-469.
50 Colton, R., & Estill, J. (1979). Elements of quality variation voice modes and singing. Acoustical Society of America Supplement, 1(66), Fall 1979, 55-56.
Murray (1979) explored the presence of jitter in female spoken phonation as compared to sung
phonation. Jitter is the presence of irregular periodicity in the action of the vocal folds and is often
perceived as hoarseness. Female singers (N = 4) were recorded speaking the vowel /a/ and then singing
the vowel /a/ in four different conditions (conditions are unknown). The recorded samples were measured
for frequency perturbation (jitter). A panel of participants (N is not provided) was asked to listen to the
recorded samples and determine if the samples were sung or spoken. Perception participants were unable
to discern differences between spoken and sung vowels. The analysis results showed less jitter in spoken
vowels than in the sung vowels.51
Hertegard et al. (1990) used sung vowels to investigate "open" versus "covered" vowels.
Participants (N = 11 professionally trained male singers, n = 5 tenors, n = 3 baritones, and n = 3
basses) were recorded singing in both head and covered technique with a variety of acoustical equipment
in many conditions. Participants received no training as to the difference between covered and open
singing technique as all participants confirmed they had received such training from singing experts
during their years of vocal study.
The first study utilized a flexible fiberoptic endoscope to allow video of the working mechanism
during both the open and covered singing of a one octave scale on the vowel /ae/. The participants were
instructed to choose a scale that would cross the passaggio near the top of the scale. At the end of the
scale the participants were asked to sing an octave interval to return to the starting pitch. The participants
then sang a sustained note on the vowel /ae/ near the passaggio. No directions were given for dynamics in
either task.
The resulting recordings (both audio and visual) were observed and listened to by participants (N
= 3, n = 2 phoniatricians and n = 1 logopedist) to evaluate whether or not the flexible fiberoptic
endoscope recordings presented any differences between open and covered techniques. The designated
form for the participants' conclusions had the categories "no difference" and "obvious difference" for the
visual recordings and "obvious," "slight," or "nil" for the audio recordings. Obvious differences were noted
by the panel participants in the recordings between open and covered vocal production. Visual analysis
revealed the soft palate was consistently higher in seven of the subjects in covered singing. Ten of the
subjects widened their pharynx for covered singing. Of the five participants in which the larynx was
clearly visible, all five participants widened the laryngeal ventricles and tilted the larynx forward in
covered singing.
51 Murray, T. (1979). Vocal jitter in singers voice. The 98th Meeting of Acoustical Society of America, November 1979, Salt Lake City, Utah, 55.
In the second study, participants (N = 7 male singers) wore a Rothenberg mask
(pneumotachograph mask) and were recorded singing /pae/ at a pitch of their choosing near the passaggio,
alternating between open and covered singing. The recorded samples were inverse filtered and produced
a transglottal airflow waveform, or flow glottogram (FGG), for analysis of the first and second formants
(F1, F2). A flow glottogram (FGG) shows specific activity of the vocal fold cycle: peak-to-peak flow
amplitude in milliliters per second, glottal leakage in milliliters per second, period time in milliseconds,
and duration of the quasi-closed phase in milliseconds. The results obtained from the inverse filtering were varied.
Subglottal pressure (Ps) and sound pressure level (SPL) showed little or no variation between covered and
open singing. The first formant (F1) was generally lower during covered singing whereas the second
formant (F2) was generally higher in covered singing. Also, the voice source appeared different between
open and covered singing although no definitive information was detailed.
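The four flow glottogram quantities listed above can be read off a single sampled cycle of airflow. The Python sketch below shows one simple way to do so. It is not the procedure used in the study: the function name, the toy data, and in particular the rule for the quasi-closed phase (samples within 5% of the peak-to-peak range above the minimum flow) are assumptions made only for illustration.

```python
def flow_glottogram_measures(flow_ml_per_s, sample_rate_hz, closed_tolerance=0.05):
    """Summarize one glottal cycle of sampled airflow.

    flow_ml_per_s    -- flow samples covering exactly one cycle (ml/s)
    sample_rate_hz   -- sampling rate of those samples
    closed_tolerance -- assumed fraction of the peak-to-peak range used to
                        decide which samples count as quasi-closed
    """
    if not flow_ml_per_s:
        raise ValueError("flow samples are required")
    minimum = min(flow_ml_per_s)
    maximum = max(flow_ml_per_s)
    peak_to_peak = maximum - minimum
    # Samples at or below this level are counted as the quasi-closed phase.
    threshold = minimum + closed_tolerance * peak_to_peak
    closed_samples = sum(1 for value in flow_ml_per_s if value <= threshold)
    ms_per_sample = 1000.0 / sample_rate_hz
    return {
        "peak_to_peak_ml_per_s": peak_to_peak,
        "leakage_ml_per_s": minimum,  # flow that never shuts off
        "period_ms": len(flow_ml_per_s) * ms_per_sample,
        "quasi_closed_ms": closed_samples * ms_per_sample,
    }


# Hypothetical one-cycle waveform sampled at 1000 Hz (not measured data).
cycle = [20, 20, 20, 20, 120, 220, 120, 20]
measures = flow_glottogram_measures(cycle, sample_rate_hz=1000)
# peak-to-peak 200 ml/s, leakage 20 ml/s, period 8 ms, quasi-closed 5 ms
print(measures)
```

In practice the cycle boundaries and the closed-phase criterion have to be chosen by the analyst, which is one reason results from inverse filtering can vary.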
The participants from the second study also participated in the third study. Participants were
recorded singing a sustained vowel /ae/ near the passaggio at a pitch of their choosing in both open and
covered technique. Spectral analysis of these recordings gave information regarding fundamental
frequency (F0), the level of harmonics, and the frequencies of the first and sometimes second formant (F1,
F2). The spectrogram of each participant's open singing was superimposed on that participant's covered
singing spectrogram for six of the seven participants. The energy of the singer's formant region was
unchanged between open and covered production. The highest energy level was located at the harmonic
closest to the first formant (F1), but it was unclear whether this was in open or covered singing.
When the participant used covered technique to equalize the passaggio, the frequency of the fourth
harmonic (F4) would often agree with the frequency of the second formant (F2). Perhaps a relationship
existed between the passaggio and this match. Was this result due to the increased loudness (averaging
eight dB) in covered singing versus open singing? Another factor in the sound spectrum was that the
amplitude of the fundamental frequency (F0) was higher in covered singing, just as the first formant (F1)
was lower. These changes were speculated to be due to changes in the voice source. Most importantly,
these combined results suggested that covered singing reduced strain on the vocal mechanism and could
prevent hyper-functional strain of the larynx.52
52 Hertegard, S., Gauffin, J., Sundberg, J. (1990). Open and covered singing as studied by means of fiber optics, inverse filtering, and spectral analysis. Journal of Voice, 4, 220-230.
Detweiler (1993) designed a study to confirm Sundberg's concept of the singer's formant (Fs).
In this concept, the source of the singer's formant (Fs) resonance was held to be a direct relationship
between the ventricular spaces in pulse phonation and the laryngopharyngeal outlet cross-sectional area,
which would result in a 6:1 ratio. One tenor and two baritones (N = 3) were recorded singing during
laryngeal stroboscopy and an MRI procedure. Although the participants produced consistent energy
increases in the singer's formant region (Fs) in all procedures, and in both modal and pulse modes, these
participants did not meet the 6:1 ratio requirement. However, the MRI images were obtained while the
participant was lying down, which produced a vertical orientation. The result was an overestimation of
the area to be measured (Sundberg, 2003). Nevertheless, this study confirmed the existence of singer's
formant resonances (Fs1 and Fs2) in the pulse registers of these participants.53
Female barbershop tenors have a very specific voice quality which is perceived as light and having
very little vibrato. Abbott (2001) recorded the speaking and singing voices of female barbershop tenors
(N = 27). Acoustic analysis of the recordings revealed consistencies throughout the participant group. The
female barbershop tenors' voices were characterized by an increased fundamental frequency variation in
their speaking voice when compared to existing data for similar aged women. When singing, the
participants had great variability in the upper passaggios, higher spectral energy in the fundamental and
lower harmonics, and vibrato present in only 25% of the time recorded (an extremely low percentage).54
Registration
Helmholtz (1885) believed the tension of vocal folds not only determined the pitch of the tone, but
also the register in which the tone originated. He also asserted the thickness of the vocal folds played a part in
the sound of the tone, for example, the head voice was thought to be the product of the drawing aside of
the mucous coat below the chords (sic) thus rendering the edge of the chords sharper, and the weight of
the vibrating part less, while the elasticity is unaltered.55 The breast voice (chest or modal voice) was a
result of the tissue below the vocal folds pulling at the bottom of the vocal folds, thereby making them in
effect heavily weighted.56 The articles that we are about to examine are built on the foundation Helmholtz
provided for us, yet many surprises are in store.
Fillebrown (1911) acknowledged that head tones, chest tones, closed tones, and open tones were
accepted vernacular of the day, but strongly advocated that registers were not a natural feature of the
voice. He supported his claim through a series of statements by surgeons and professional singing
teachers. These included Manuel García, the creator of the laryngoscope, who was reported to have
confirmed Fillebrown's belief in the "one voice" system.57 Although Fillebrown denied the existence of vocal
registers, he provided this definition of registers: "a series of tones of a characteristic clang or quality,
produced by the same mechanism."58
53 Detweiler, R. (1994). An investigation of the laryngeal system as the resonance source of the singer's formant. Journal of Voice, 8 (4), 303-313.
54 Abbott, S. E. (2001). Acoustic evaluation and analysis of the female barbershop tenor voice. Unpublished doctoral dissertation, The Florida State University.
55 Helmholtz, (1885), 101.
56 Ibid.
57 Fillebrown, (1911), 2.
Janwillem van den Berg (1963) agreed that vocal registers were pr