In February of 1999, the Late Show with David Letterman brought on a musician from the Republic of Tuva, Russia, who would astonish American audiences with one of the most remarkable singing techniques in the world, one whose origins go back to ancient times (Rothman & Foley, 1999). Kongar-ol Ondar, a world-renowned musician and living national treasure of Tuva, demonstrated his mastery of a singing technique called Khoomei, also known as Tuvan throat singing or overtone singing, in which the singer produces a high-frequency “whistling” note that overlays their voice, so that two notes seem to sound at once. Ondar, who brought the world’s attention to Khoomei through a musical collaboration with blues musician Paul Pena, once again captured our attention and our curiosity about the mystery behind Khoomei. The origins of throat singing can be traced to the Eastern Steppe of Central Asia, with roots in Tuvan and Mongolian cultures as well as many others (Ruiz & Wilken, 2018). In this article, we will look inside the vocal tract to find out how overtone singing works.
Here is a brilliant demonstration of overtone singing, along with a spectrogram for visual understanding. The performer is Anna-Maria Hefele, a vocal artist, musician, and content creator on YouTube.
To begin, it is important to understand the concepts of fundamental frequency and harmonics. The fundamental frequency is, in simple terms, the lowest-frequency wave in any given sound, including your voice or a sung note (Teach Me Audio, 2020). It is the note that our ears register as the pitch of the sound, and it is usually also the loudest or most amplified. For example, if you play all strings open on a guitar, think of the low E string, the string that produces the lowest note, as taking the role of the fundamental frequency. Harmonics are the waves that sit above the fundamental frequency. They come from the same vibrating source as the fundamental, but each one vibrates at a whole-number multiple of the fundamental frequency, so its wavelength is correspondingly shorter. Think back to the guitar: every string spans the same length, yet each subsequent string vibrates at a faster rate, thus creating a higher frequency. (The analogy is a loose one, since the harmonics of a real note all come from a single vibrating string rather than from separate strings.) Another way to think of the fundamental frequency and harmonics is to picture a rainbow, which represents a sound, with red being the lowest note and each ascending color representing a harmonic. With each additional color, the pitch rises, all the way up to violet, the highest pitch.
Refer to Figure 1 for a graph that gives a simplified view of the relationship between the fundamental frequency and harmonics. Note that the fundamental frequency is the most amplified and that each harmonic graphed along the X-axis is 10 Hz higher than the one before it. This is because the frequency of each harmonic is a whole-number multiple of the fundamental frequency, so neighboring harmonics are always spaced one fundamental frequency apart. It is also important to note that harmonics are not always less amplified as their frequency increases. Figure 2 shows what the fundamental frequency and harmonics look like in action. Note that with each increase in frequency (Hz), more cycles of vibration fit along the axis of the wave.
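To make the arithmetic concrete, here is a minimal Python sketch of a harmonic series. It assumes a 10 Hz fundamental, which is what the 10 Hz spacing described for Figure 1 implies; the function name `harmonic_series` is made up for this illustration.

```python
def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics.

    Harmonic number n has frequency n * fundamental_hz,
    so harmonic 1 is the fundamental itself.
    """
    return [n * fundamental_hz for n in range(1, count + 1)]


# Assumed value: a 10 Hz fundamental, matching the spacing described for Figure 1.
series = harmonic_series(10, 5)
print(series)  # [10, 20, 30, 40, 50]

# The gap between neighboring harmonics always equals the fundamental.
spacing = [high - low for low, high in zip(series, series[1:])]
print(spacing)  # [10, 10, 10, 10]
```

The same function works for a sung note: a fundamental of 110 Hz would give harmonics at 110, 220, 330 Hz, and so on.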
Now let us take a look at overtone singing. When we speak or sing, air from our lungs passes through the vocal folds (the source), causing them to vibrate rapidly, which creates sonic energy (energy you can hear, or sound). That sound then travels through the vocal tract (the filter), and the way you shape the filter is responsible for how you produce vowels. Lowering your jaw to produce an “ahh” sound versus raising your jaw to produce an “ee” sound are both examples of what the filter does (see Figure 3 for visualization).
In overtone singing, however, the filter is manipulated with such a degree of control that the listener begins to hear two notes of different pitches coming from the same singer. The singer produces the fundamental phonation and then cultivates the overtone, the “whistling” note, which can be clearly heard when sung by masters of the art. The prerequisite to achieving an overtone, at least in Khoomei, is an initial vocal fry, which is done by constricting the vocal folds to produce the lowest note while the airflow supplies the amplitude. The tongue, placed in a “whistling position,” forms a resonance chamber within the mouth and, with enough practice, allows Khoomei singers to achieve an overtone (Ruiz & Wilken, 2018). Overtone singing relies heavily on manipulating the shape of the filter, the same mechanism used to produce vowels, in order to cultivate that distinctive “whistling” sound on top of the fundamental (Foresman, 2008).
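The source and filter idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not a measurement of any real singer: the 150 Hz drone, the source whose harmonics weaken as 1/n, the bell-shaped (Gaussian) resonance, and the 60 Hz bandwidth are all hypothetical values chosen for illustration. It models a single resonance peak only, so it says nothing about how loud the drone itself remains.

```python
import math


def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Gain of one bell-shaped resonance peak (Gaussian shape is an assumption)."""
    return math.exp(-0.5 * ((freq_hz - center_hz) / bandwidth_hz) ** 2)


def filtered_harmonics(fundamental_hz, count, center_hz, bandwidth_hz):
    """Return (harmonic number, frequency, output level) for harmonics 1..count."""
    rows = []
    for n in range(1, count + 1):
        freq = n * fundamental_hz
        source_level = 1.0 / n  # assumed source: higher harmonics start out weaker
        level = source_level * resonance_gain(freq, center_hz, bandwidth_hz)
        rows.append((n, freq, level))
    return rows


def loudest_overtone(fundamental_hz, count, center_hz, bandwidth_hz):
    """Harmonic number above the fundamental that the filter boosts the most."""
    rows = filtered_harmonics(fundamental_hz, count, center_hz, bandwidth_hz)
    return max(rows[1:], key=lambda row: row[2])[0]


# The drone stays at 150 Hz; only the resonance (the "filter") moves.
for center_hz in (900, 1200, 1500):
    print(center_hz, loudest_overtone(150, 16, center_hz, 60))
# prints: 900 6, then 1200 8, then 1500 10
```

In this sketch the sung fundamental never changes, yet moving the resonance selects a different harmonic each time, which is the toy version of a singer changing the “whistling” note by reshaping the mouth.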
Studies using spectrograms can provide great insight into what is happening in the sound of overtone singing by examining its formants. Formants are resonances determined by the shape of the vocal tract, such as how low the jaw is or where the tongue is placed. Formants in spectrograms show where resonance is amplifying the sound and are categorized by number (Foresman, 2008), with formant 1 being formed in the area above the flat surface of the tongue and formant 2 being formed by the back of the tongue. These two areas are separate air spaces, or cavities, in which harmonic energy is shaped, giving us a formant whose pitch depends on how that part of the vocal tract is being manipulated (Voice Science Works, n.d.). Such a cavity is referred to as a resonance chamber (Bergevin et al., 2020).
Bergevin et al. (2020) presented spectrogram data that visualize what happens to the voice of an overtone singer during Khoomei. They show that formants 2 and 3 converge, revealing that singers use their vocal tract to concentrate energy into what sounds like a single note. Formants do not fuse into a single item, the way mixing yellow and blue paint creates green; rather, they overlap, the way placing a yellow lens on top of a blue lens creates a green one. A visualization of this can be observed above in Figure 4.
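The effect of two formants converging can be illustrated with a small Python sketch. It is a toy model, not the analysis used by Bergevin et al. (2020): the two formants are represented as bell-shaped peaks whose gains simply add, every harmonic of the source is given equal strength, and the 150 Hz drone, the 60 Hz bandwidth, and the peak positions are hypothetical numbers picked for illustration.

```python
import math


def peak(freq_hz, center_hz, bandwidth_hz=60.0):
    """Gain of one bell-shaped resonance peak (Gaussian shape is an assumption)."""
    return math.exp(-0.5 * ((freq_hz - center_hz) / bandwidth_hz) ** 2)


def overtone_levels(fundamental_hz, count, f2_hz, f3_hz):
    """Level of harmonics 2..count under two peaks (flat source assumed)."""
    levels = {}
    for n in range(2, count + 1):
        freq = n * fundamental_hz
        levels[n] = peak(freq, f2_hz) + peak(freq, f3_hz)
    return levels


def focus(levels):
    """Return (strongest harmonic, strongest level divided by runner-up level)."""
    ranked = sorted(levels.items(), key=lambda item: item[1], reverse=True)
    (best_n, best_level), (_, second_level) = ranked[0], ranked[1]
    return best_n, best_level / second_level


# Peaks far apart: two different harmonics are boosted about equally.
separated = focus(overtone_levels(150, 20, 1200, 2400))
# Peaks close together: both overlap on the harmonic at 1500 Hz.
merged = focus(overtone_levels(150, 20, 1450, 1550))

print("separated focus ratio:", round(separated[1], 2))
print("merged: harmonic", merged[0], "focus ratio:", round(merged[1], 2))
```

When the peaks are far apart, no single overtone stands out, so the ratio stays near 1. When they sit next to each other, one harmonic receives a boost from both at once and rises well above its neighbors, much like the two overlapping lenses in the analogy above.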
Although Bergevin et al. (2020) provide evidence that overtone singing is a result of overlapping formants, the exact mechanics of the filter required to produce one are still not fully understood. MRI scans suggest that it is highly likely the result of the singer’s tongue pressing against the alveolar ridge while the throat is simultaneously constricted near the uvula. This can be seen above in Figure 3. Comparing their results with previous studies on the importance of creating a cavity, that is, a resonance chamber, Bergevin et al. (2020) argue that the “rear cavity” (the cavity of the throat) is responsible for determining the pitch of the overtone, whereas the “front cavity” (the cavity created by the tongue) is responsible for the merging of formants. Foresman (2008) also supports the resonance chamber as a likely explanation for the development of an overtone in Khoomei, citing previous studies that point to the same conclusion.
As it stands, the best explanation that research has for the development of the overtone lies within the structure of the vocal tract: a combination of constricting the vocal folds and throat together with a specific placement of the tongue. That being said, exactly how this happens is not entirely settled; what exists is a reasonable theory that is backed up by observations of Khoomei singers. Overtone singing is, in this article’s opinion, yet another of the wonderful and seemingly endless surprises that humans are capable of. In the hope of helping others appreciate the complexity of overtone singing, curious readers are encouraged to take the initiative to learn about the mysteries behind the linguistic capabilities of humans.
Bergevin, C., Narayan, C., Williams, J., Mhatre, N., Steeves, J. K. E., Bernstein, J. G. W., & Story, B. (2020). Overtone focusing in biphonic Tuvan throat singing. eLife, 9, e50476. https://doi.org/10.7554/eLife.50476
Foresman, B. (2008). Acoustical measurement of the human vocal tract: Quantifying speech & throat-singing [Senior thesis]. Pomona College.
Rothman, R. (Writer), & Foley, J. (Director). (1999, February 12). Show #1168 (Season 6) [TV series episode]. In R. Burnett (Executive Producer), The Late Show with David Letterman. Worldwide Pants Inc.
Ruiz, M. J., & Wilken, D. (2018). Tuvan throat singing and harmonics. Physics Education, 53(3). https://doi.org/10.1088/1361-6552/aaa921
Teach Me Audio. (2020, May 6). Fundamental and harmonic frequencies. https://www.teachmeaudio.com/recording/sound-reproduction/fundamental-harmonic-frequencies
Voice Science Works. (n.d.). Harmonics vs. formants. https://www.voicescienceworks.org/harmonics-vs-formants.html
I would like to thank Dr. Presley Pizzo for helping me grasp the complexity of how overtone singing works, as well as for sharing sources that help explain the process. The structure of this article is greatly influenced by the way in which Dr. Pizzo helped me understand the parts of phonetics that lead up to the explanation of overtone singing.