A vocoder (a portmanteau of vox/voc (voice) and encoder) is a speech analyzer and synthesizer. Speech refers to the processes associated with the production and perception of Sounds used in Spoken language. It was originally developed as a speech coder for telecommunications applications in the 1930s, the idea being to code speech for transmission. The 1930s were described as an abrupt shift to more radical and conservative lifestyles as countries were struggling to find a solution to the Great Depression. In Communications a code is a rule for converting a piece of Information (for example a letter, Word, Phrase, or Its primary use in this fashion is for secure radio communication, where voice has to be digitized, encrypted and then transmitted on a narrow, voice-bandwidth channel. The vocoder has also been used extensively as an electronic musical instrument. An electronic musical instrument is a Musical instrument that produces its sounds using Electronics.
The vocoder is related to, but essentially different from, the computer algorithm known as the "phase vocoder". A phase vocoder is a type of Vocoder which can scale both the Frequency and Time domains of audio signals by using phase information
Whereas the vocoder analyzes speech, transforms it into electronically transmitted information, and recreates it, the voder (from Voice Operating Demonstrator) generates synthesized speech by means of a console with fifteen touch-sensitive keys and a foot pedal, basically consisting of the "second half" of the vocoder, but with manual filter controls, needing a highly trained operator. [1]
Contents |
The human voice consists of sounds generated by the opening and closing of the glottis by the vocal cords, which produces a periodic waveform with many harmonics. Kraftwerk (ˈkʁaftvɛɐk German for " power plant " or " Power station " is an influential Electronic music band from The glottis is defined as the combination of the Vocal folds and the space in between the folds (the Rima glottidis) The vocal folds, also known commonly as vocal cords, are composed of twin infoldings of Mucous membrane stretched horizontally across the Larynx. In Acoustics and Telecommunication, the harmonic of a Wave is a component Frequency of the signal that is an Integer This basic sound is then filtered by the nose and throat (a complicated resonant piping system) to produce differences in harmonic content (formants) in a controlled way, creating the wide variety of sounds used in speech. An audio filter is a type of filter used for processing Sound signals. In Physics, resonance is the tendency of a system to Oscillate at maximum Amplitude at certain frequencies, known as the system's A formant is a peak in the Frequency spectrum of a sound caused by acoustic Resonance. There is another set of sounds, known as the unvoiced and plosive sounds, which are not modified by the mouth in the same fashion. Voice or voicing is a term used in Phonetics and Phonology to characterize speech sounds, with sounds described as either voiceless A stop, plosive, or occlusive is a Consonant sound produced by stopping the airflow in the Vocal tract.
The vocoder examines speech by finding this basic carrier wave, which is at the fundamental frequency, and measuring how its spectral characteristics are changed over time by recording someone speaking. In Telecommunications, a carrier wave, or carrier is a Waveform (usually Sinusoidal) that is modulated (modified with an input signal The fundamental tone, often referred to simply as the fundamental and abbreviated fo, is the lowest frequency in a harmonic series. This results in a series of numbers representing these modified frequencies at any particular time as the user speaks. In doing so, the vocoder dramatically reduces the amount of information needed to store speech, from a complete recording to a series of numbers. To recreate speech, the vocoder simply reverses the process, creating the fundamental frequency in an oscillator, then passing it through a stage that filters the frequency content based on the originally recorded series of numbers. An electronic oscillator is an Electronic circuit that produces a repetitive electronic signal often a Sine wave or a Square wave.
Most analog vocoder systems use a number of frequency channels, all tuned to different frequencies (using band-pass filters). An analog or analogue signal is any continuous signal for which the time varying feature (variable of the signal is a representation of some other A band-pass filter is a device that passes frequencies within a certain range and rejects ( Attenuates frequencies outside that range The various values of these filters are stored not as the raw numbers, which are all based on the original fundamental frequency, but as a series of modifications to that fundamental needed to modify it into the signal seen in the output of that filter. During playback these settings are sent back into the filters and then added together, modified with the knowledge that speech typically varies between these frequencies in a fairly linear way. The result is recognizable speech, although somewhat "mechanical" sounding. Vocoders also often include a second system for generating unvoiced sounds, using a noise generator instead of the fundamental frequency.
The first experiments with a vocoder were conducted in 1928 by Bell Labs engineer Homer Dudley, who eventually patented it in 1935. Bell Laboratories (also known as Bell Labs and formerly known as AT&T Bell Laboratories and Bell Telephone Laboratories) is the Research organization Homer W Dudley (1896-1987 was a pioneering electronic and acoustic engineer who created the first electronic voice synthesizer for Bell Labs in the 1930s and led the development Dudley's vocoder was used in the SIGSALY system, which was built by Bell Labs engineers (Alan Turing was briefly involved) in 1943. In Cryptography, SIGSALY (also known as the X System, Project X, Ciphony I, and the Green Hornet) was a secure speech Bell Laboratories (also known as Bell Labs and formerly known as AT&T Bell Laboratories and Bell Telephone Laboratories) is the Research organization Alan Mathison Turing, OBE, FRS (ˈt(jʊ(ərɪŋ (23 June 1912 &ndash 7 June 1954 was an English Mathematician The SIGSALY system was used for encrypted high-level communications during WW-II. In Cryptography, SIGSALY (also known as the X System, Project X, Ciphony I, and the Green Hornet) was a secure speech Later work in this field has been conducted by James Flanagan. James Loton Flanagan is an electrical engineer and was Rutgers University 's vice president for research until 2004
Since the late 1970s, most non-musical vocoders have been implemented using linear prediction, whereby the target signal's spectral envelope (formant) is estimated by an all-pole IIR filter. This article is about the Decade 1970-1979 For the Year 1970 see 1970. Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples IIR redirects here For the conference company IIR see Informa. In Electronics, Computer science and Mathematics, a digital filter is a system that performs mathematical operations on a sampling, In linear prediction coding, the all-pole filter replaces the bandpass filter bank of its predecessor and is used at the encoder to whiten the signal (i. e. , flatten the spectrum) and again at the decoder to re-apply the spectral shape of the target speech signal. In contrast with vocoders realized using bandpass filter banks, the location of the linear predictor's spectral peaks is entirely determined by the target signal and need not be harmonic, i. In Acoustics and Telecommunication, the harmonic of a Wave is a component Frequency of the signal that is an Integer e. , a whole-number multiple of the basic frequency.
Even with the need to record several frequencies, and the additional unvoiced sounds, the compression of the vocoder system is impressive. Standard systems to record speech record a frequency from about 500 Hz to 3400 Hz, where most of the frequencies used in speech lie, which requires 64kbit/s of bandwidth (the Nyquist rate). In Signal processing, the Nyquist rate is two times the bandwidth of a Bandlimited signal or a bandlimited channel However a vocoder can provide a reasonably good simulation with as little as 2400 bit/s of data rate, a 26× improvement.
Several vocoder systems are used in NSA encryption systems:
(ADPCM is not a proper vocoder but rather a waveform codec. STE redirects here for other meanings see STE (disambiguation. ITU has gathered G. 721 along with some other ADPCM codecs into G. 726. )
Vocoders are also currently used in developing psychophysics, linguistics, computational neuroscience and cochlear implant research.
For musical applications, a source of musical sounds is used as the carrier, instead of extracting the fundamental frequency. Music is an Art form in which the medium is Sound organized in Time. For instance, one could use the sound of a synthesizer as the input to the filter bank, a technique that became popular in the 1970s. This article is about the Decade 1970-1979 For the Year 1970 see 1970.
In 1970, electronic music pioneers Wendy Carlos and Robert Moog developed one of the first truly musical vocoders. Wendy Carlos (born Walter Carlos, November 14 1939 is an American Composer and electronic Musician. Dr Robert Arthur Moog (ˈmoʊɡ to rhyme with "rogue" ( May 23, 1934 &ndash August 21, 2005) was an American pioneer of A 10-band device inspired by the vocoder designs of Homer Dudley, it was originally called a spectrum encoder-decoder, and later referred to simply as a vocoder. The carrier signal came from a Moog modular synthesizer, and the modulator from a microphone input. The modular synthesizer is a type of Synthesizer consisting of separate specialized modules connected by wires (patch cords to create a so-called patch. The output of the 10-band vocoder was fairly intelligible, but relied on specially articulated speech. Speech refers to the processes associated with the production and perception of Sounds used in Spoken language. Later improved vocoders use a high-pass filter to let some sibilance through from the microphone; this ruins the device for its original speech-coding application, but it makes the "talking synthesizer" effect much more intelligible. A sibilant is a type of Fricative or Affricate Consonant, made by directing a jet of air through a narrow channel in the Vocal tract towards
Carlos' and Moog's vocoder was featured in several recordings, including the soundtrack to Stanley Kubrick's A Clockwork Orange, in which the vocoder sang the vocal part of Beethoven's Ninth Symphony. A Clockwork Orange is a 1971 Satirical Science fiction Film adaptation of a 1962 novel of the same name, by Anthony Ludwig van Beethoven ( English ˈlʊdvɪg væn ˈbeɪtoʊvən, 16 December 1770 &ndash 26 March 1827 was a German Composer and Pianist. Also featured in the soundtrack was a piece called "Timesteps," which featured the vocoder in two sections. Originally, "Timesteps" was intended as merely an introduction to vocoders for the "timid listener", but Kubrick chose to include the piece on the soundtrack, much to the surprise of Wendy Carlos.
One of the first rock songs to feature a vocoder was "The Raven" by progressive rock band The Alan Parsons Project on their 1976 album Tales of Mystery and Imagination, and also was used on later albums such as I Robot. " The Raven " is a 1976 song by The Alan Parsons Project from their album Tales of Mystery and Imagination. Progressive rock (often shortened to " progressive " " prog " or " prog rock " is a form of Rock music that evolved The Alan Parsons Project was a British Progressive rock band active between 1975 and 1990 founded by Eric Woolfson and Alan Parsons. Tales of Mystery and Imagination is a Progressive rock album by The Alan Parsons Project, released in 1976 (see 1976 in music) I Robot is a Progressive rock Album recorded by The Alan Parsons Project, engineered by Alan Parsons and Eric Woolfson Following Alan Parsons' example, vocoders began to appear in pop music in the late 1970s, for example, on disco recordings. Alan Parsons (b 20 December 1948 in London) is a British Audio engineer, Musician, and Record producer Pop music as a genre features a noticeable rhythmic element catchy melodies and hooks, a mainstream style and conventional structure Disco is a Genre of dance-oriented music whose origins are hard to define Jeff Lynne of Electric Light Orchestra used the vocoder in several albums such as Time. Jeffrey Lynne (born 30 December 1947 in Shard End, Birmingham) is a Grammy Award -winning English rock Songwriter Time is a Concept album by Electric Light Orchestra released in 1981. Pink Floyd made extensive use of the vocoder on the album Animals, even going so far as to put the sound of a barking dog through the device. Pink Floyd are Animals is the debut album by Oxford band This Town Needs Guns. Another example is Giorgio Moroder's 1977 album From Here to Eternity. Hansjörg "Giorgio" Moroder (born on April 26 1940 in Urtijëi, Italy) is an Italian Record producer, songwriter From Here To Eternity is a 1977 album composed produced and performed by Giorgio Moroder. Vocoders are often used to create the sound of a robot talking, as in the Styx song "Mr. Roboto". Styx ( pronounced: /stɪks/ is an American Rock band. Their hit songs have included " Come Sail Away " " Babe " " " Mr Roboto " is a song written by Dennis DeYoung and performed by the band Styx on their 1983 Concept album Kilroy Was Here It was also used for the introduction to the Main Street Electrical Parade at Disneyland. The Main Street Electrical Parade was a regularly-scheduled parade created by Bob Jani, famous for its long run at Disneyland at the Disneyland Resort The late R&B singer Roger Troutman used the vocoder extensively in his songs from the 70s up until the late 80s. Roger Troutman ( November 29 1951 &ndash April 25 1999) was the lead singer of the band Zapp who helped to pave the way for
Vocoders have appeared on pop recordings from time to time ever since, most often simply as a special effect rather than a featured aspect of the work. The illusions used in the Film, Television, Theater, or Entertainment industries to simulate the imagined events in a story are traditionally called However, many experimental electronic artists of the New Age music genre often utilize vocoder in a more comprehensive manner in specific works, such as Jean Michel Jarre (on Zoolook, 1984) and Mike Oldfield (on Five Miles Out, 1982). New Age music is peaceful Music of various styles which is intended to create inspiration relaxation and positive feelings often used by listeners for Yoga, Jean-Michel André Jarre (born 24 August 1948, Lyon) is a French Composer, performer and Music producer. Zoolook is the fifth album by Jean Michel Jarre, and released in 1984 on Disques Dreyfus. Michael Gordon Oldfield (born 15 May 1953 in Reading, Berkshire) is an English Multi-instrumentalist Musician Five Miles Out is a record album written and mostly performed by Mike Oldfield. There are also some artists who have made vocoders an essential part of their music, overall or during an extended phase. Examples include the German synthpop group Kraftwerk, jazz/fusion keyboardist Herbie Hancock during his late 1970s disco period, French jazz organist Emmanuel Bex, Patrick Cowley's later recordings and more recently, avant-garde pop groups Trans Am, Black Moth Super Rainbow, Daft Punk, the Christian synthpop band Norway, as well as metal bands such as At All Cost and The Devil Wears Prada. Synthpop is a subgenre of New Wave and Pop music in which the Synthesizer is the dominant musical instrument Kraftwerk (ˈkʁaftvɛɐk German for " power plant " or " Power station " is an influential Electronic music band from Herbert Jeffrey Hancock ("Herbie" born April 12 1940 is a Jazz Pianist and Composer. Disco is a Genre of dance-oriented music whose origins are hard to define Patrick Joseph Cowley ( October 19, 1950 - November 12, 1982) was a Disco and Hi-NRG dance music composer and recording Trans Am is a three-piece band that performs a mix of Synth pop and rock music Black Moth Super Rainbow is an American experimental band from Pittsburgh, Pennsylvania. Daft Punk At All Cost is an experimental metalcore band from Austin Texas.
More recently, the combination of using vocoders and pitch-correction software can be heard in the 2007 single "Sensual Seduction" by Snoop Dogg. Year 2007 ( MMVII) was a Common year starting on Monday of the Gregorian calendar in the 21st century. "Sexual Eruption" (also known by Cordozar Calvin Broadus Jr (born October 20 1971 His mother nicknamed him " Snoopy " as a child because of the way he dressed and because of his love of Clay Aiken's Falling, on the album On My Way Here, also employs a vocoder effect in the bridge. Clay Aiken (born Clayton Holmes Grissom on November 30, 1978) is an American pop Singer who began his rise to fame on the
"Robot voices" became a recurring element in popular music during the late twentieth century. Several methods of producing variations on this effect have arisen, of which the vocoder remains the best known and most widely-used. The following other pieces of music technology are often confused with the vocoder:
The Talk box (Sonovox), Autotuner, Linear predictive coding, Ring modulation, Speech synthesis, and Comb filter. A talk box is an effects device that allows a musician to modify the sound of a Musical instrument. Auto-Tune is a proprietary Audio processor created by Antares Audio Technologies which uses a Phase vocoder to correct pitch in vocal and instrumental Linear predictive coding ( LPC) is a tool used mostly in Audio signal processing and Speech processing for representing the Spectral envelope Ring modulation is a signal-processing effect in electronics related to Amplitude modulation or frequency mixing, performed by multiplying two signals where one Speech synthesis is the artificial production of human speech. In Signal processing, a comb filter adds a delayed version of a signal to itself causing constructive and destructive interference.
Cher's Believe is an example of a vocal sound that is often confused with the vocoder. Cher ( IPA: /ʃɛr/ born Cherilyn Sarkisian, May 20 1946 " Believe " is a Grammy Award winning global number one Multi-Platinum Dance Song which served as the world-wide lead single for American singer Mark Taylor, producer of Believe, used the vocoder minimally where the effect would become most striking. " Believe " is a Grammy Award winning global number one Multi-Platinum Dance Song which served as the world-wide lead single for American singer For example, in the lyrics "Do you believe in life after love?", the vocoder was used by shifting it into "Do you believe in life after lo-" instead of "-love" which added the dimension Taylor was looking for. Because Taylor was concerned with keeping the trade secret of the true effect on the song, he originally attributed the effects to the Digitech Talker [1] [2] which he claimed to produce the synthesized sound, but not before the vocals were filtered through the Drawmer DS404 [3]. However, as later defined as the Cher effect, the vocal effects were in fact created by Auto-Tune, used deliberately to create the effect. The Cher effect is an informal term used to describe extreme digital Pitch correction of singing to produce a semi-artificial voice Sound effect (who also use Auto-Tune is a proprietary Audio processor created by Antares Audio Technologies which uses a Phase vocoder to correct pitch in vocal and instrumental [4]
The sub-page Robotic voice effects includes more detailed comparisons. "Robot voices" became a recurring element in popular music during the late twentieth century, and several methods of producing variations on this effect have arisen
Vocoders have also been used in television, film and games usually for robots or talking computers:
Most recently a famous song that is composed completely using vocoder is Imogen Heap's song "Hide and Seek" Where Heap plays full chords through her vocal without any other acompaniment
Example of vocoder
"Mr. Blue Sky" by the Electric Light Orchestra (1977)