Evolution of song
Save for early manuscripts and the sacred music still sung in synagogue and church, little can be known about the origins of song. In his brilliant survey of the evolution of speech in relation to language and human brain development (1997), the neuroscientist Deacon has observed that there are no extant ‘simple’ languages, only the present complex ones. These comprise different linguistic forms of grammar and syntax, moulded by cultural idiosyncrasies, mutual interaction, and, above all, the specific influence that the language potential of early hominids had on brain evolution through natural selection. This surely would also have been the case for song's prehistory, which of course is also unknown. However, its nature may perhaps be surmised from the first linguistic communication between mother and infant which, if not in the womb itself through transmitted sound, is in the instinctive, inherently tuneful, and innately beautiful sound of the lullaby, whether sung or simply hummed. Musicologists recognize that Jewish and Greek sacred chant, plainsong, psalmody, and motet, both liturgical and secular, and their parallels in folksong, the lyric poetry of the troubadours' song and the madrigal, were all formative in the subsequent development of opera as an important aspect of the Renaissance. All of these, together with the later development of lieder, that distinctive German song style for the accompanied solo voice, which joins music, poetry, and its interpretation into a singular whole, have contributed to sung music, both classical and in the multitude of forms today.
As a musical instrument the singing voice has wide tonal compass and uniquely variable pitch, intensity, and stress. These and other prosodic features can fill even simple tones with emotional charge, as in psalmody. Here, in the solemn context of the Mass and of the expectations of those hearing it, agony and love are portrayed as spiritual and reverential, appropriate to sacred music; intensely beautiful in its own way but, through religious observance, containing, not freeing, the spirit. When it encompasses all emotions, beautiful singing is epitomized by the ideals sought in the Italianate Bel Canto tradition, which came to perfection through the development of the operatic form. Its essence and secular appeal rested then, and to this day, on the emotional response of the listener to the production particularly of perfectly sung vowels in the arias for the solo voice, now in harmony with orchestra and chorus. Indeed, such emphasis on the quality of the open vowels, achieved by years of assiduous practice, was also at the heart of Gregorian chant, in which voices of different tessitura (the natural centre and tonal surround of each voice) would sing the tenor or falsetto parts. Then, the ‘tenor’ voice was the ‘lower’ voice, ‘holding’ the plainsong melody in long, drawn-out notes, progressing smoothly from one note to the next in ‘legato’, sung either solo or by one half of the chorus. The upper (falsetto) voices responded in Amens or antiphons (short refrains). These, as well as other exclamatory additions, such as the Kyrie and Gloria in excelsis, allowed a more florid style of singing, but constrained nevertheless, so that emotion was subservient to the awe and solemnity appropriate to the Mass. But not so in opera, where legato in a favourite aria can still an audience and then climactically bring it to its feet over the full spectrum of human emotions, from love to hate, from grief to happiness. It can reveal evil or good intent as in Iago's scheming, or display the ambivalence of Carmen's love.
And so on to the German lieder, epitomized by Schubert's extraordinary genius in creating songs and song cycles. These were no longer constructed so that each repeating verse had the same musical form; instead, and based on the poetry of Goethe, Schiller, and their successors, sonority and melody now captured and enhanced the dramatic content of their lyric poetry, for each consonant, word, and line.
Beautiful singing, then, perhaps even more than speech itself, proclaims the emotional state of mind of the singer and hence, through the linguistic and emotional content of the words and the quality of the composed music, recapitulates a similar state in the minds of listeners, magically uniting composer, performer, and audience. The commonality of such a collective experience thus reveals the full extent of the communicative power of the highly evolved, culturally moulded gift (but from whom?) of human language, expressed in song as well as speech.
Sound generation
The terms ‘tenor’ and ‘falsetto’ refer to the early recognition that in the singing of notes of ascending or descending pitch within the great scale, individual voices show a ‘break’, requiring a distinct readjustment of voice production. The lower range was called voce di piena or voce di petto, meaning ‘full voice’ or ‘voice of the chest’; the upper one, voce finte, or ‘head voice’. These descriptions reveal the early recognition of the two principal voice ‘registers’ by relating them to the perceived placement or apparent source of the voice. In either case there is in fact only one sound source, that of the ‘phonating’ glottis (the aperture between the vocal cords). It is this sound which is modulated by the articulatory movements of the jaw, tongue, lips, and palate to add syllable, fricative, and other phonetically distinct components to shape the natural sound or timbre of the individual voice. The willed intention to sing or speak, or for that matter the occurrence of an involuntary gasp or groan, depends on two sets of movements due to muscular activity: those of the thorax leading to expiratory airflow (see breathing), and those of the laryngeal cartilages in the voice box. In the latter case the vocal cords are brought together (adduction), interrupting the airflow. This process is not itself directly perceived, only the sense of vocal effort in generating the intensity of the intended sound, which through lifelong learning is inextricably bound to the sound heard ‘in one's head’. In common usage it is said that the vocal cords ‘vibrate’, but this is not actually correct in physical terms. Instead, the sound is generated secondarily to the sudden interruption of the expiratory airflow by cord adduction; the driving pressure in the airway below the cords (sub-glottal pressure) then forces the cords apart and a spring-like action closes them again. This cycle repeats in oscillatory fashion until the singing breath is exhausted or the ‘voicing’ is ceased through voluntary action, by the moving apart (abduction) of the vocal cords. Such ‘valving’ of the airflow occurs at a frequency governed by the endowed mass and thickness of the vocal cords, and most importantly by the tension within them; this latter, a function of their length, is determined by the position of the different laryngeal cartilages, which is governed by activity in the extrinsic laryngeal muscles, supported by the intrinsic ‘vocalis’ muscle. The actual sound is generated by the repeated compression and decompression of the gas particles immediately above the glottis, this process being acoustically magnified and harmonically enriched by the resonance and filtering properties of the vocal tract above. But control over the harmonic balance, and hence over the timbre of the voice, can only occur within the harmonic range determined by the overall frequency content of sound emission from the cords themselves.
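The dependence of the valving frequency on cord length and tension can be illustrated, very roughly, by treating the vocal cords as an ideal stretched string. This is a simplifying assumption introduced here for illustration only, not a model taken from the sources cited: the function name `string_model_frequency`, the tissue density of 1040 kg/m³, and the length and stress values are all hypothetical, and real cords are layered, non-uniform tissue. Under that assumption the fundamental frequency is f = (1 / 2L) · sqrt(σ / ρ), where L is the vibrating length, σ the longitudinal stress, and ρ the density.

```python
import math


def string_model_frequency(length_m, stress_pa, density_kg_m3=1040.0):
    """Fundamental frequency (Hz) of an ideal stretched string.

    Illustrative assumption only: the vocal cords are treated as a uniform
    string of vibrating length `length_m` (metres) under longitudinal stress
    `stress_pa` (pascals). The default density is a hypothetical soft-tissue
    value, not a measurement from the article.
    """
    if length_m <= 0 or density_kg_m3 <= 0 or stress_pa < 0:
        raise ValueError("length and density must be positive, stress non-negative")
    # Wave speed along the string is sqrt(stress / density); the fundamental
    # has a wavelength of twice the string length.
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)


if __name__ == "__main__":
    # Hypothetical values: a 16 mm cord under increasing stress.
    for stress in (10e3, 40e3, 160e3):
        freq = string_model_frequency(0.016, stress)
        print(f"stress {stress / 1e3:6.0f} kPa -> about {freq:6.1f} Hz")
```

In this idealization a fourfold increase in stress doubles the frequency, that is, raises the note by an octave, whereas lengthening the string at fixed stress lowers it. In the living larynx, as the text notes, tension is itself a function of length, so the two effects are not independent.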
These scientific facts only complement that which those versed in the Gregorian and Bel Canto traditions already knew, and which is still emphasized: that the aesthetic goal of perfectly sung vowels can only be met, with few exceptions, by years of diligent practice. The trained singer learns to control these properties, not through proprioception, as with the learning of limb motor skills, but through the acoustic goal of the quality of the sounds produced. Interestingly — save perhaps for the low frequency (6–8 Hz) intensity modulation of tones that occurs in vibrato — the frequency of sung notes is not directly represented in the frequency of the neural commands to the laryngeal muscles; rather, the frequency of action potentials in the motor nerves is simply that which is necessary to generate the muscle tensions for the intended note. Thus a larynx removed from a cadaver will generate rich, pure tones if the vocal cords are manually adducted in the presence of a supplied flow of air: a macabre scientific fact about the production of human sound in stark contrast to aesthetic considerations!
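The low-frequency intensity modulation mentioned above can be sketched numerically as a steady tone whose loudness rises and falls a few times per second. The following is a minimal illustration only: the function names, the 220 Hz note, the modulation depth, and the sample rate are assumptions chosen for the example, the 6 Hz rate is simply the lower end of the range quoted above, and the sketch ignores any accompanying fluctuation in pitch.

```python
import math


def intensity_envelope(t, vibrato_hz=6.0, depth=0.2):
    """Slowly varying amplitude factor at time t (seconds).

    `depth` is a hypothetical modulation depth: 0.2 means the amplitude
    swings between 0.8 and 1.2 times its mean value.
    """
    return 1.0 + depth * math.sin(2.0 * math.pi * vibrato_hz * t)


def vibrato_tone(note_hz=220.0, vibrato_hz=6.0, depth=0.2,
                 duration_s=1.0, sample_rate=8000):
    """Return samples of a sine tone with slow intensity modulation.

    All parameter values are illustrative assumptions. Samples are scaled
    so that they stay within the range -1.0 to 1.0.
    """
    n_samples = int(duration_s * sample_rate)
    peak = 1.0 + depth  # largest value the envelope can reach
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        carrier = math.sin(2.0 * math.pi * note_hz * t)  # the sung note
        samples.append(intensity_envelope(t, vibrato_hz, depth) * carrier / peak)
    return samples


if __name__ == "__main__":
    tone = vibrato_tone()
    print(len(tone), "samples; peak magnitude", max(abs(s) for s in tone))
```

The note frequency and the modulation frequency enter the calculation separately, which mirrors the point made in the text: the slow modulation is on a timescale that neural commands could follow, whereas the frequency of the note itself is set by the mechanical state of the cords.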
Musical prosody
Simulation, however, could never match the human skills used in singing: the way the physical attributes of intensity, pitch, and harmonic content are used serially to create stress and intonation. These, together with tempo and rhythm, link the unitary phonetic events (phonemes) of consonant, syllable, and fricative into the linguistically complete words which symbolically represent, through verb and noun, both the world of action and things about us, and also our emotions (‘affect’) generated within.
The fundamental importance of prosody in relation to human speech and song, where the timing of stress within a word can determine its linguistic function as noun or verb, is well expressed in ‘office psalmody’. There each syllable is represented by a single or sustained note as in Do mi nus vo bis cum, and it is only the entire tonal progression of notes and intervals within this simple vocal line which conjoins the phonemes into linguistically meaningful words. The control over sound intensity needed to produce a beautiful, sustained tone at constant or smoothly changing pitch and intensity in legato, or needed for stress, as in vocal attack, is wholly dependent on the dynamics of the pressure (sub-glottal pressure) that drives the expiratory airflow and that can be said to ‘power’ vocalization. The release of this pressure for the normal, vocally ‘clear’ attack (coup de glotte), or the mezza voce or ‘breath’ attack, requires precise timing of respiratory and laryngeal movements. If this timing fails, tones are slurred; if the vocal cords do not fully oppose, a ‘breathy’ sound is produced. It is the timing of such skilled movements that is probably disrupted in the particular disturbance of vocalization (dysarthria) associated with lesions of the cerebellum, a structure intimately involved in motor ‘learning’ and now shown by imaging techniques to be active during vocalization. These and related topics, including the mechanical characteristics of breathing, bear also on the usually contentious matter of breath control in singing.
One surprising feature of the scientific analysis of breathing movements in singing is the finding that, except at high lung volume and at the onset of high notes at low intensity, the diaphragm is not actively involved, contrary to the emphasis given to the diaphragm's importance by voice teachers. Pressure measurements have shown that the diaphragm is mainly in a passive state, not undergoing active contraction through nervous control, so that it does not directly power sound production. However, it does serve mechanically to couple abdominal and ribcage motions, in which case abdominal muscle activity would relieve the ribcage from gravitational effects due to the mass of visceral organs. This frees the ribcage to contribute to the dynamics of subglottal pressure changes used in vocal stress. Nevertheless, the sensations generated in the chest wall during singing have traditionally been referred to the diaphragm, and this practice will doubtless continue.
When, through lifetime learning, habit, and experience, our movements become more and more automatic, the total sensory experience associated with them reduces to one of ‘effortless’ action that barely intrudes into consciousness. Breath and laryngeal control during speech and singing epitomize this state and when achieved set the cornerstone of supreme vocal performance, freeing the mind to dwell solely on artistic matters of interpretation — and it is these which eventually unite singer and audience.
Crocker, R. L. (1966). A history of musical style. McGraw-Hill, New York.
Deacon, T. (1997). The symbolic species. The Penguin Press, Allen Lane.
Sears, T. A. (1977). Some neural and mechanical aspects of singing. In Music and the brain (ed. M. Critchley and R. A. Henson), pp. 78–94. Heinemann, London.
See also larynx; music and the body; voice.