Sound devices are computer peripherals that produce, manipulate, or record sound or electronic signals representing sound. Virtually all modern music and movie sound production is done digitally using computer sound devices.
Speakers and Signals
Whether it is in a computer, a pair of headphones, or a stereo system, a speaker produces sound by causing its cone (the external surface) to vibrate. The cone, often made of paper, is attached to the voice coil, an electromagnet. Behind the voice coil is a second magnet. When a positive or negative electrical signal is sent to the voice coil, the coil is alternately attracted to and repelled from the second magnet. By continuously alternating the signal between positive and negative, the voice coil, and therefore the cone, can be made to vibrate. The faster the cone vibrates, the higher the frequency of the sound; the larger the vibration, the louder the sound.
Instead of directly recording or manipulating a continuously varying (analog) signal, sound devices record a digital signal: a series of whole numbers, represented in binary notation, taken at regular intervals. For example, a compact disc records a number between 0 and 65,535 at a rate of 44,100 times per second. In order to move from the analog world of speakers and microphones to the digital one, a device must sample (measure), quantize (divide), and compand (renumber) the signal.
First, the analog signal is sampled once in each interval (in this example, every 1/44,100 of a second). Any changes that happen between samples are lost. Second, the entire range of the signal, from zero volume to full volume, is quantized into sections (in this example, 65,535 of them) and the value of each sample is moved to the closest quantization level (dividing line). Third, each dividing line is renumbered, or companded, to be a whole number (in this example, between 0 and 65,535).
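The sampling and quantizing steps can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular sound device: the function names, the 440 Hz test tone, and the choice to represent the signal's range as amplitudes from -1.0 to 1.0 are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 44_100   # samples per second, as on a compact disc
LEVELS = 65_536        # 2**16 quantization levels, numbered 0 to 65,535


def sample(signal, duration_s, rate=SAMPLE_RATE):
    """Measure the analog signal once per sampling interval."""
    count = round(duration_s * rate)
    return [signal(n / rate) for n in range(count)]


def quantize(value, levels=LEVELS):
    """Move an amplitude in [-1.0, 1.0] to the nearest level and number it."""
    clipped = max(-1.0, min(1.0, value))
    return round((clipped + 1.0) / 2.0 * (levels - 1))


def tone(t):
    """Hypothetical analog input: a 440 Hz sine wave at half volume."""
    return 0.5 * math.sin(2 * math.pi * 440 * t)


# One hundredth of a second of sound becomes 441 whole numbers.
codes = [quantize(v) for v in sample(tone, 0.01)]
print(len(codes), min(codes), max(codes))
```

Anything the tone does between two consecutive sampling instants never reaches `codes`, which is the loss described above.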
These samples can now be easily converted into binary data (in this example, 16 bits per sample) and recorded to a digital medium such as a hard drive, floppy disk, or compact disc. In order to reconstruct (play back) the sound, the binary data are read from the medium, companded back to the original range (no volume to full volume), and each sample is held until the next sample is due (in this example, every 1/44,100 of a second).
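Storage and playback can be sketched the same way. The example below assumes each sample is stored as an unsigned 16-bit number in little-endian byte order and that the original range is -1.0 to 1.0; both are choices made for illustration, and real file formats differ in these details. The helper names are hypothetical.

```python
import struct

LEVELS = 65_536


def to_bytes(codes):
    """Pack each code (0 to 65,535) as an unsigned 16-bit little-endian integer."""
    return struct.pack(f"<{len(codes)}H", *codes)


def from_bytes(data):
    """Read the 16-bit codes back from the binary data."""
    return list(struct.unpack(f"<{len(data) // 2}H", data))


def to_amplitude(code, levels=LEVELS):
    """Map a code back to the original range, assumed here to be [-1.0, 1.0]."""
    return code / (levels - 1) * 2.0 - 1.0


def hold(amplitudes, repeats):
    """Keep playing each sample until the next one arrives (zero-order hold)."""
    return [a for a in amplitudes for _ in range(repeats)]


codes = [0, 32_768, 65_535]
data = to_bytes(codes)                      # 3 samples occupy 6 bytes
restored = [to_amplitude(c) for c in from_bytes(data)]
```

The `hold` function stands in for what playback hardware does in time: the output stays at one sample's value for the whole interval before stepping to the next.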
Encoding and Compression
Converting an analog signal to a digital one is called "encoding." The encoding method described earlier is linear pulse code modulation. This means that the quantization levels are equally spaced (linear), the signal is sampled regularly (pulse), the signal is encoded (code), and the signal is being converted from an analog signal (modulation). The human ear, however, is better at detecting changes in quiet sounds than those in loud sounds. In linear encoding, then, the ear often cannot distinguish between two adjacent levels in loud sound.
Most encoding methods take advantage of this by placing the levels very close together at low volumes and farther apart at high volumes. Every sound format (e.g., WAVE from Microsoft and IBM, AIFF from Apple, μ-Law from NeXT and Sun Microsystems) has its own method of nonlinear quantization, though most space the levels logarithmically.
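One widely used logarithmic spacing is the μ-law curve. The sketch below uses the continuous form of that curve with μ = 255, followed by uniform 8-bit quantization; it is a simplified illustration rather than the exact bit layout used by any file format or telephone standard, and the function names are assumptions.

```python
import math

MU = 255  # the value conventionally paired with 8-bit mu-law encoding


def compress(x, mu=MU):
    """Logarithmic curve: stretches quiet amplitudes, squeezes loud ones."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode(x, bits=8):
    """Compress an amplitude in [-1.0, 1.0], then quantize it uniformly."""
    levels = 2 ** bits
    return round((compress(x) + 1.0) / 2.0 * (levels - 1))


def decode(code, bits=8):
    """Turn a code back into an amplitude."""
    levels = 2 ** bits
    return expand(code / (levels - 1) * 2.0 - 1.0)


quiet_step = decode(129) - decode(128)   # spacing of levels near silence
loud_step = decode(255) - decode(254)    # spacing of levels near full volume
print(quiet_step, loud_step)
```

The two printed values show the effect described above: neighboring levels near silence are far closer together than neighboring levels near full volume.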
With carefully chosen quantization levels, nonlinearly encoded sound can take roughly four fewer bits per sample than linearly encoded sound of the same quality. In general, if the digital signal is changed to contain less information (e.g., fewer bits per sample), then the signal has been "compressed."
Compression is important because it takes an extraordinary amount of binary data to represent high-quality digital sound. A three-minute song, recorded accurately enough for the human ear, requires about 30 megabytes of storage. A compact disc, though it can store more than 700 megabytes of data, can hold only 74 minutes of music.
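These figures can be checked with a little arithmetic. The sketch below assumes stereo sound (two channels), which the text does not state, recorded at the compact disc rate of 44,100 sixteen-bit samples per second for each channel.

```python
SAMPLE_RATE = 44_100     # samples per second
BYTES_PER_SAMPLE = 2     # 16 bits
CHANNELS = 2             # stereo: an assumption, not stated in the text


def pcm_bytes(seconds, rate=SAMPLE_RATE, width=BYTES_PER_SAMPLE, channels=CHANNELS):
    """Bytes of uncompressed linear PCM needed for the given duration."""
    return seconds * rate * width * channels


song = pcm_bytes(3 * 60)     # a three-minute song
disc = pcm_bytes(74 * 60)    # 74 minutes of music
print(f"song: {song:,} bytes, disc: {disc:,} bytes")
```

Under these assumptions a three-minute song needs 31,752,000 bytes, or about 30 megabytes, and 74 minutes of music needs 783,216,000 bytes.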
Many encoding techniques compress the data after they have been digitized (sampled) and companded. Though these techniques are "lossy" (they degrade the quality of the recording), like nonlinear encoding, they take advantage of characteristics of human hearing to minimize the audible effect of those losses. No matter what method is used, however, there is always a trade-off between the quality of the sound and the amount of compression.
As with companding, there are many methods of compression. The MPEG-1 Audio Layer III (MP3) format is particularly popular because it can compress sound by a factor of ten while reliably reproducing popular music. It also allows the user to trade off compression for quality by explicitly specifying the number of bits per second of music.
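The relationship between a chosen bit rate and the resulting compression can be worked out directly. The sketch below assumes the uncompressed source is stereo compact disc audio and uses 128 kilobits per second, a commonly used MP3 setting, as the example; the function names are hypothetical.

```python
CD_BITRATE = 44_100 * 16 * 2   # bits per second of stereo CD audio


def compression_ratio(mp3_kbps):
    """How many times smaller the MP3 stream is than the CD stream."""
    return CD_BITRATE / (mp3_kbps * 1000)


def file_size_bytes(seconds, mp3_kbps):
    """Approximate size of the compressed audio, ignoring file headers."""
    return seconds * mp3_kbps * 1000 // 8


print(compression_ratio(128))        # a little over a factor of eleven
print(file_size_bytes(3 * 60, 128))  # a three-minute song
```

At this setting a three-minute song occupies 2,880,000 bytes. Asking for more bits per second lowers the ratio and raises the quality, and fewer bits per second does the opposite.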
There are many other methods of encoding and compression, both public (e.g., differential pulse code modulation, adaptive differential pulse code modulation) and proprietary (e.g., RealAudio from RealNetworks, Advanced Streaming Format from Microsoft). The Musical Instrument Digital Interface (MIDI) is a method for electronic instruments to communicate and is also a form of encoding. Instead of encoding analog signals, it encodes the length, pitch, volume, and instrument of each note.
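The idea behind differential pulse code modulation can be shown with a deliberately simplified sketch. Instead of storing each sample, it stores the difference from the previous sample, which is usually a much smaller number. Practical encoders also quantize those differences to save bits, and adaptive versions change the step size as they go; neither refinement is shown here, and the function names are assumptions.

```python
def dpcm_encode(samples):
    """Store each sample as its difference from the previous one."""
    previous = 0
    deltas = []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas


def dpcm_decode(deltas):
    """Rebuild the samples by keeping a running total of the differences."""
    total = 0
    samples = []
    for d in deltas:
        total += d
        samples.append(total)
    return samples


original = [1000, 1004, 1009, 1007]
encoded = dpcm_encode(original)      # [1000, 4, 5, -2]
assert dpcm_decode(encoded) == original
```

After the first value, the encoded numbers are small enough to fit in far fewer bits than the samples themselves, which is where the savings come from.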
In a computer, digital sound data are read from a medium and decoded or uncompressed by the central processing unit, but the sound card performs the signal processing (companding, quantization, mixing, etc.). It is the sound card that digitizes the signal from the microphone or other input device or converts it to analog for the speakers.
In the case of MIDI encoding, the digital data include only descriptive information about notes and no actual recorded sound. For MIDI, sound cards have methods for emulating specific instruments. In "wave table synthesis," the sound card itself contains a short recording of each instrument; this method is very accurate because it is based upon recordings of real instruments. In "frequency modulation synthesis" (FM synthesis), the sound card contains information about how to simulate each instrument. This method is less realistic because the simulation is not exact. In either case, by shifting the frequency of samples and combining them according to the MIDI data, the sound card can reproduce an entire piece of music. By using polyphony (sounding more than one note or instrument at the same time), the sound card can simulate entire groups of instruments.
Sound cards have other techniques for enhancing sound or making it more realistic. "Head-related transfer functions" allow the device to warp the music in order to make the sound seem to originate from somewhere other than the speakers. Digital filters can boost or change components of the sound (e.g., boost the bass) in ways that are difficult and expensive on analog equipment. Unlike analog devices, digital devices rarely lose accuracy over time.
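A rough sketch of FM synthesis and polyphony follows. It relies on the standard MIDI convention that note number 69 is the A above middle C at 440 Hz; everything else, including the function names and the `ratio` and `index` settings that shape the tone, is an assumption chosen for illustration and does not describe how any particular sound card simulates an instrument.

```python
import math

SAMPLE_RATE = 44_100


def note_to_hz(note):
    """Convert a MIDI note number to a frequency (note 69 is 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def fm_voice(note, seconds, ratio=2.0, index=1.5, rate=SAMPLE_RATE):
    """Two-oscillator FM: a modulator sine wave bends the phase of a carrier."""
    carrier = note_to_hz(note)
    modulator = carrier * ratio
    out = []
    for n in range(round(seconds * rate)):
        t = n / rate
        phase = 2 * math.pi * carrier * t
        phase += index * math.sin(2 * math.pi * modulator * t)
        out.append(math.sin(phase))
    return out


def mix(voices):
    """Polyphony: average the voices sample by sample to stay within [-1, 1]."""
    return [sum(frame) / len(voices) for frame in zip(*voices)]


# Three notes sounding at once: C, E, and G above middle C.
chord = mix([fm_voice(n, 0.05) for n in (60, 64, 67)])
```

Changing `ratio` and `index` changes the character of the simulated instrument, while the MIDI data supply only which notes to play and for how long.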
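A digital bass boost can be sketched with one of the simplest possible filters. The example below is an assumption-laden illustration: it uses a one-pole low-pass filter to pick out the low frequencies and adds them back on top of the original signal. The 200 Hz cutoff, the gain, and the function names are arbitrary choices, and real sound cards use more refined filters.

```python
import math


def low_pass(samples, cutoff_hz, rate=44_100):
    """One-pole low-pass filter: the output drifts toward each new input."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / rate)
    out, state = [], 0.0
    for s in samples:
        state += alpha * (s - state)
        out.append(state)
    return out


def bass_boost(samples, cutoff_hz=200.0, gain=1.0, rate=44_100):
    """Add the low-passed (bass) part of the signal back onto the original."""
    bass = low_pass(samples, cutoff_hz, rate)
    return [s + gain * b for s, b in zip(samples, bass)]
```

Applied to a 50 Hz tone, this roughly doubles the amplitude, while a 5,000 Hz tone passes through almost unchanged. The same few lines of arithmetic give the same result every time they run, which is the practical meaning of a digital device not losing accuracy.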
Music, like any artistic work, is copyrighted under U.S. law. Like patents, copyrights were created so that people could make their work public without losing the right to protect and profit from their work. Copyright law dictates that the copyright holder has control over the distribution of the work, for profit or otherwise.
Though making audiotape "mixes" for other people is illegal, the labor involved in making them and the degradation of quality associated with copying from one tape to another have limited their popularity. Digital technology created new ways for users to "share" music widely without the sound quality problems associated with tape recordings. During the late 1990s, the speed of modern computers, the availability of high-speed Internet connections for private use, the low price of computer compact disc players and recorders, and the wide availability of compression techniques like MP3 turned computers into tools for obtaining and storing enormous amounts of music. These collections, like taped mixes, are legal only if their owner purchased the original tape or compact disc as well.
By the end of the twentieth century, the illegal copying and distribution of music had become so widespread that the music industry feared such actions were substantially infringing on the rights of the copyright holders. The music industry began experimenting with cryptographic methods for ensuring that music is not copied. Industry watchers know that any solutions are likely to be temporary, since digital security is always vulnerable to the efforts of technologically savvy programmers who seek ways to break through new digital boundaries.
see also Codes; Coding Techniques; Music, Computer.
Salvatore Domenick Desiano