This is the under-the-hood view of the keyboard for the Voder (Voice Operating Demonstrator), the first electronic device capable of generating continuous human speech. It accomplishes this feat through a series of keys that generate the syllables, plosives, and affricates normally produced by the human larynx and shaped by the throat and tongue. This week’s film is a picture montage paired with the audio from the demonstration of the Voder at the 1939 World’s Fair.
The Voder was created by one [Homer Dudley] at Bell Laboratories. He did so in conjunction with the Vocoder, which analyzes human-generated speech for encrypted transfer and re-synthesizes it on the other end. [Dudley] spent over 40 years researching speech at Bell Laboratories. His development of both the Voder and the Vocoder was instrumental in the SIGSALY project, which aimed to deliver encrypted voice communication to the theatres of WWII.
In this film, the Voder is first demonstrated with a flat, robotic rendition of the phrase “she saw me”. The operator then runs through the various possible inflections to show the flavor that the foot pedal provides. Inside the Voder is a group of band pass filters in parallel that span the frequency range of human speech. Excitation comes from either the noise generator or the relaxation oscillator, and the operator selects between the two with the wrist bar. The pitch is controlled with the foot pedal. The band pass outputs are fed to ten gain pots under the operator’s fingers. Three additional keys manipulate the excitation to produce consonant stop sounds like /t/, /d/, /p/, /b/, /k/, and /g/.
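To make that signal path a little more concrete, here is a rough Python sketch of the same idea: a buzz or hiss source feeding ten parallel band pass filters, each with its own gain, summed into one output. The sample rate, the band center frequencies, the filter Q, and every function name are assumptions made up for illustration. The real Voder was analog hardware, and the three stop-consonant keys are left out entirely.

```python
import math
import random

SAMPLE_RATE = 16000  # Hz; an assumption for this sketch

# Ten illustrative band centers (Hz), log-spaced from 150 to 6000.
# These are NOT the Voder's real band edges, just placeholders.
BAND_CENTERS = [150.0 * (6000.0 / 150.0) ** (i / 9) for i in range(10)]


def buzz(pitch_hz, n):
    """Sawtooth wave standing in for the relaxation oscillator (voiced source)."""
    out, phase = [], 0.0
    for _ in range(n):
        out.append(2.0 * phase - 1.0)
        phase = (phase + pitch_hz / SAMPLE_RATE) % 1.0
    return out


def hiss(n, seed=0):
    """White noise standing in for the noise generator (unvoiced source)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def band_pass(signal, center_hz, q=4.0):
    """Second-order (biquad) band pass with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the b1 coefficient is zero
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def voder_frame(gains, voiced, pitch_hz=120.0, n=800):
    """Mix ten filtered copies of the source, one gain per band.

    gains    -- ten values, playing the role of the ten finger keys
    voiced   -- True picks the buzz, False the hiss (the wrist bar)
    pitch_hz -- buzz pitch (the foot pedal)
    """
    if len(gains) != len(BAND_CENTERS):
        raise ValueError("expected one gain per band")
    source = buzz(pitch_hz, n) if voiced else hiss(n)
    mix = [0.0] * n
    for gain, center in zip(gains, BAND_CENTERS):
        if gain == 0.0:
            continue  # key not pressed, band contributes nothing
        for i, y in enumerate(band_pass(source, center)):
            mix[i] += gain * y
    return mix
```

Calling `voder_frame` with a different pattern of gains changes the spectral shape of the output, which is roughly what the operator's fingers were doing on the ten keys.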
The Voder’s pitch can be adjusted to emulate all kinds of voices, from man to woman to child. It is capable of speaking any language the operator can speak. As a special bonus, the Voder makes very convincing cow and pig sounds.
In creating the Voder, it was discovered that non-inflected vowels sounded like a foghorn, so vibrato was added to make them more human. This of course means that Voder can sing, and the operator gives a heartwarming performance of “Auld Lang Syne”.
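For the curious, vibrato is easy to sketch in software: wobble the pitch of the buzz source slowly around its base value. The Python below is only an illustration of that idea. The 6 Hz rate, the 3% depth, the sample rate, and both function names are assumptions, not figures from the Voder.

```python
import math


def vibrato_pitch(base_hz, t, rate_hz=6.0, depth=0.03):
    """Pitch at time t (seconds): base_hz with a slow sinusoidal wobble."""
    return base_hz * (1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t))


def sawtooth_with_vibrato(base_hz, seconds, sample_rate=16000):
    """Sawtooth buzz whose pitch follows vibrato_pitch sample by sample."""
    out, phase = [], 0.0
    for i in range(int(seconds * sample_rate)):
        out.append(2.0 * phase - 1.0)
        pitch = vibrato_pitch(base_hz, i / sample_rate)
        phase = (phase + pitch / sample_rate) % 1.0
    return out
```

Setting `depth=0.0` gives the steady, foghorn-like tone the paragraph above describes.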
For an operator, getting the Voder to speak is a difficult undertaking. Generating a single word requires the keying of several sounds in quick succession, along with simultaneous wrist bar action and pedal work to color the inflection. Bell Labs auditioned a few hundred girls to train in Voder operation, but ultimately had fewer than 30 expert operators. [Helen Harper], who you hear in this film, was considered the best. According to [Helen], mastery required about a year of constant practice.
[Thanks to Fran for the tip!]
[Voder keyboard image source]