This is the under-the-hood view of the keyboard for the Voder (Voice Operating Demonstrator), the first electronic device capable of generating continuous human speech. It accomplishes this feat through a series of keys that generate the syllables, plosives, and affricatives normally produced by the human larynx and shaped by the throat and tongue. This week’s film is a picture montage paired with the audio from the demonstration of the Voder at the 1939 World’s Fair.
The Voder was created by one [Homer Dudley] at Bell Laboratories. He did so in conjunction with the Vocoder, which analyzes human-generated speech for encrypted transfer and re-synthesizes it on the other end. [Dudley] spent over 40 years researching speech at Bell Laboratories. His development of both the Voder and the Vocoder was instrumental in the SIGSALY project, which aimed to deliver encrypted voice communication to the theatres of WWII.
In this film, the Voder is first demonstrated with a flat, robotic rendition of the phrase “she saw me”. The operator then runs through the various possible inflections to show the flavor that the foot pedal provides. Inside the Voder is a bank of band-pass filters in parallel that together span the frequency range of human speech. The filters are excited by either the noise generator or the relaxation oscillator, with the wrist bar selecting between the two, and the foot pedal controls the pitch. The band-pass outputs are fed to ten gain pots under the operator’s fingers. Three additional keys manipulate the excitations to produce the stop consonants /t/, /d/, /p/, /b/, /k/, and /g/.
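For readers who think better in code, here is a rough software sketch of that signal path: a buzz or hiss source feeding ten parallel band-pass filters, each with its own gain. It is not a model of the real hardware. The sample rate, the band center frequencies, the filter Q, the sawtooth standing in for the relaxation oscillator, and every function name are assumptions made for illustration.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed sample rate in Hz

# Assumed: ten log-spaced band centers from 150 Hz to 6 kHz.
# The real Voder's band edges are not given in this article.
BAND_CENTERS = [150.0 * (6000.0 / 150.0) ** (i / 9) for i in range(10)]


def relaxation_oscillator(pitch_hz, n):
    """Buzz source: a sawtooth standing in for the relaxation oscillator."""
    phase, out = 0.0, []
    for _ in range(n):
        out.append(2.0 * phase - 1.0)
        phase = (phase + pitch_hz / SAMPLE_RATE) % 1.0
    return out


def noise_generator(n, seed=0):
    """Hiss source: uniform white noise for unvoiced sounds."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def band_pass(signal, center_hz, q=4.0):
    """Second-order band-pass biquad with unity gain at the center frequency."""
    w0 = 2.0 * math.pi * center_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the x[n-1] coefficient is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def voder_frame(gains, voiced, pitch_hz=110.0, n=1600):
    """Mix the ten filtered bands, weighted by the ten 'key' gains.

    voiced=True selects the buzz source (the wrist bar's job);
    pitch_hz plays the role of the foot pedal.
    """
    if len(gains) != len(BAND_CENTERS):
        raise ValueError("expected one gain per band")
    if voiced:
        source = relaxation_oscillator(pitch_hz, n)
    else:
        source = noise_generator(n)
    out = [0.0] * n
    for gain, center in zip(gains, BAND_CENTERS):
        if gain == 0.0:
            continue  # key not pressed, band contributes nothing
        for i, sample in enumerate(band_pass(source, center)):
            out[i] += gain * sample
    return out
```

Calling `voder_frame([0, 0, 1, 1, 0, 0, 0.5, 0, 0, 0], voiced=True)` gives a tenth of a second of a buzzy, vowel-like tone, while `voiced=False` gives a shaped hiss. The stop-consonant keys are left out of this sketch.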
Voder’s pitch can be adjusted to emulate all kinds of voices, from man to woman to child. It is capable of speaking any language the operator can speak. As a special bonus, Voder makes very convincing cow and pig sounds.
In creating the Voder, it was discovered that non-inflected vowels sounded like a foghorn, so vibrato was added to make them more human. This of course means that Voder can sing, and the operator gives a heartwarming performance of “Auld Lang Syne”.
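The vibrato idea is easy to sketch in software: wobble the pitch of the buzz source slowly around its nominal value. The 6 Hz rate and 3% depth below are assumed values picked for illustration, not figures from the Voder, and the function names are hypothetical.

```python
import math


def vibrato_pitch(base_hz, t, rate_hz=6.0, depth=0.03):
    """Instantaneous pitch at time t (seconds) with sinusoidal vibrato.

    rate_hz and depth are assumed values, not measured from the Voder.
    """
    return base_hz * (1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t))


def sawtooth_with_vibrato(base_hz, seconds, sample_rate=16000,
                          rate_hz=6.0, depth=0.03):
    """Sawtooth buzz whose frequency follows the vibrato contour."""
    phase, out = 0.0, []
    for i in range(int(seconds * sample_rate)):
        freq = vibrato_pitch(base_hz, i / sample_rate, rate_hz, depth)
        out.append(2.0 * phase - 1.0)
        # Advance the phase by the current (wobbling) frequency.
        phase = (phase + freq / sample_rate) % 1.0
    return out
```

Setting `depth=0` gives the steady "foghorn" tone back, which makes for an easy before-and-after comparison.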
For an operator, getting the Voder to speak is a difficult undertaking. Generating a single word requires the keying of several sounds in quick succession, along with simultaneous wrist bar action and pedal work to color the inflection. Bell Labs auditioned a few hundred girls to train in Voder operation, but ultimately had fewer than 30 expert operators. [Helen Harper], who you hear in this film, was considered the best. According to [Helen], mastery required about a year of constant practice.
[Thanks to Fran for the tip!]
[Voder keyboard image source]
http://www.youtube.com/watch?v=Bht96voReEo
Thread over.
Developed for singing. Yeah, sure.
What does a fleshlight prototype have to do with this thread?
It should have been a 3d printed fleshlight.
Everyone knows it should sing “Daisy”
My father, Stan Watkins, after perfecting the speech of Pedro, the Voder, also taught it to sing Daisy (I have a recording of that, alongside one of HAL, the computer in 2001: A Space Odyssey, singing Daisy as he dies). Arthur C. Clarke said he got the idea from a friend at Bell Labs.
What, no schematic?
This blows Vocaloid away!
The phone companies still use it today. Some services do not use a person for voice response. In fact…. Some of the PBX services use it. Now we have voice synthesis instead of that method……
Do we really live in modern times?
Bell Labs offered “Science Kits” when I was in high school in the late 60’s, one of which taught you about speech synthesis. It was probably the first moderately complex electronic kit I built that actually worked.
http://www.beatriceco.com/bti/porticus/bell/belllabs_kits_ss.html
More on SIGSALY here:
http://www.nsa.gov/about/cryptologic_heritage/center_crypt_history/publications/sigsaly_story.shtml#6
and more details:
http://www.nsa.gov/about/cryptologic_heritage/center_crypt_history/publications/sigsaly_start_digital.shtml
The Voder (or at least a Voder) still exists; it’s on display in the lobby at one of the AT&T Labs facilities in NJ.
When Stan Watkins retired from the Labs in 1948, they offered him one of the Voders he had worked so hard to make talk. But as they would not pay for its transportation to England he declined. I’m glad to know there is still one on view.
I’ve been working directly with Bell Labs to produce a modern replica of the VODER. It was just completed and delivered April 8, 2016. We have found two of the original ‘practice’ equipment racks for the VODER, but our team would love to hear from anyone with information about the location of any of the operator consoles (with the keyboards). They have been lost from the AT&T archives.
Kudos for doing this. I have lots of information from my Dad’s work on the original 1936-39 Voder but nothing about the consoles except photos, which I think have already been posted online.
I’d love to know your company (or is this an independent project?) and what you propose doing with the Voder. Do keep in touch.
I apologize for not seeing this until now. My company is Synthetic Sound Labs in NJ. http://www.steamsynth.com. I’d love to chat sometime. How do we arrange that without the world getting our private info?
pretty cool
Miku sure has a very unique family tree
Current cell phones only transmit a 2 octave (4 to 1) range, the high pitches of diction. The hum-tone of the voice is encoded this old fashioned way! That’s why people sound so fake on a cell phone. Imagine it with fixed pitch. Robot voice on every call.
Coming soon: 5 octave cell phones. They will finally sound about as good as an old fashioned landline.
Weird. People sound exactly the same on the cell phone as they do in person on my phone. Are you still carrying around a startac or something? Or maybe you just have a bunch of robots calling you?
My old StarTAC sounded pretty darn good, with a dynamic range similar (to my ears) to that of a landline phone. It is actually my favorite cell phone of all the ones I’ve used over the years.
Well, maybe someone pranked echodelta’s phone and made it so that it calls chatbots instead of the people they think they are calling.
Point is: wtf? I forgot after thinking up that brilliant idea for a prank!
Analog mobile voice signals at one time carried nearly full fidelity voice signals. This is part of the reason analog used so much power. The power needs and the throughput are two reasons the analog signals gave way to digital. Our ‘modern’ mobile phones sound horrible compared with the first mobile phones, which were analog.
Umm…ok. If you say so.
But, hey my 8 track is much better than my CD player, too so I get it.
:/
justice099: No, it’s true. The reason is that developments in cell phones have been more about reducing the bit rate required, than about improving voice quality. The only objective has been, how few bits can we use while still sounding “good enough”? Of course, “good enough” is a bit of a moving target that depends very much on how badly the developer wants to reduce the bit rate.
At one point, Bell Labs rejected the idea of using satellites for relaying telephone calls, because their research showed that people could discern latency (delays) as short as a few tens of milliseconds in conversation, and the round-trip time for satellite communications exceeded this. Now we are happy to use services that delay the voice by more than a second. Yeah, it sounds crappy, but it gets the job done.
It’s not necessarily the limited frequency band that makes the voice sound bad, it’s the way it is encoded: http://www.radio-electronics.com/info/cellulartelecomms/gsm_technical/audio-codecs-vocoders-amr-celp.php
I don’t know how things are around the world, but here in europe I have enjoyed HD voice for quite some years on all networks I tried. Of course, the other person on the line has to have the right phone as well..
https://www.youtube.com/watch?v=MV7PJ5uolgk
You’re welcome. Posted Dave Tomkins’ book last week and now we have this.
We dodged a bullet with the Voder. At its debut there was other new technology demonstrated at the same fair that combined with the Voder and tape recordings could have made voice activated voice mail and robo-calling with synthesized voice possible.
Just imagine Ma Bell with huge banks of tape recorders to store messages and Voder voices on tape saying things like “You have. Zero. New messages.” “To delete. Message. Say Seven.”
Thankfully, that mashup wasn’t conceived until well into the solid state digital electronics era!
My father, Stan Watkins, Bell Labs engineer, taught the Voder to talk in his 1936 work, and then taught 28 telephonists to play it for the World’s Fair in NY and the Golden Gate Exposition in 1939. And he also taught the Voder to sing Daisy. I have the record. Look at my blog which will be featuring my father’s work (when I stop doing the NoBloPoMo blog-a-day). My site is stanwatkins.wordpress.com.
They say in the video that they had no thought of commercial applications, but I think the voder was an early attempt at speech compression. It is similar to linear predictive coding, which compresses by using a model of the vocal tract, so that only the parameters of the model need to be transmitted. I think Bell Labs deemed the voder a dead end, which is why they dismiss it as an educational demonstration. Well, if it really was just an educational demonstration, why did they test 300 girls to find 20 who could become proficient?
Mixing up the VODER and the VOCODER is fairly common and has led to much confusion. The VOCODER (Voice Coder) was indeed developed as an early form of speech compression, and later used in the war effort to encrypt speech (SIGSALY). You are correct that the VOCODER technology led to many other forms of speech compression, including LPC. The VODER, however, was developed mostly as a publicity stunt for the 1939 World’s Fair & SF Expo and did use much of the technology that was developed for the VOCODER. While the BL team tried to find commercial / assistive applications for the VODER, it proved too difficult to learn to become practical. That is also the reason that only 20 or so girls passed muster to become Voderettes after intense training for about a year.
Had Bell Labs and some other companies working on voice synthesis, recording, compression and transmission put their knowledge together after WW2 they could have invented voice mail. Another company demonstrated simple voice control tech at the same fair Bell Labs demonstrated the Voder.
AT&T could’ve sold it as a premium service with a computer built to operate several Voders, and voice command to recognize a few words like Next, Back, Delete, Keep. For the outgoing message and incoming message storage each customer would’ve had a dedicated reel tape recorder. I’d expect for the era that to pay for it they would’ve needed the biggest 1,000 business phone customers in the USA as clients.
What would be the selling point, especially when there were automatic phone recorders using Edison cylinders or belts that recorded in Edison style before WW2? (One of the belt type can be seen in one of the old Blondie movies.) The convenience of being able to get messages from anywhere there’s a phone, without having to rely on being called by a secretary, and trusting the secretary’s transcription was correct. Wouldn’t have to play phone tag with the office or keep the office informed of your location or wait by a phone for a message call.
In other words exactly like the voicemail we have now, with the exception of not having mobile phones. Companies like Fuller Brush or Watkins that had legions of traveling salesmen would’ve loved it. Their sales force could’ve called in orders around the clock, and then been able to retrieve messages about their order status at any time. Time they’d previously spent waiting for calls or having to make calls in specific time periods could be used for selling product.
Voicemail in the late 1940’s and 1950’s would’ve been a ‘force multiplier’ by un-tying business from the 9-5 workday, and it would’ve been a godsend to people working on the opposite coast from their home office. For example in Pacific or Mountain time, dealing with an Eastern time company is done by 2 or 3 o’clock local time. That pushes afternoon order placing into the next day. But if the eastern company had voicemail, they’d have late orders ready to process first thing in the morning, before the people out West had their breakfast.
But those companies just saw their inventions as interesting experiments and abandoned them. Voicemail would have to wait over 40 years to be invented. In the early 80’s business uptake of VMX was slow because the systems were expensive, and it wasn’t originally sold as a telco service. There were also FAX machines and pagers and tape answering machines to communicate without having to have people sitting by telephones. Telco-provided voicemail only became a thing quite a while after “Ma Bell” saw that several large businesses were operating their own systems and the Telcos were not getting a piece of that action.
Well, they DID. Only they were called answering machines, and when new, they were conglomerations of relays, motors, belts, and other mechanical nightmares that cost a fortune and broke down frequently. What they didn’t have were microcontrollers and solid state memory.
What makes you think I’m confusing Voder with Vocoder? I’m well aware of the differences. The Vocoder was indeed a form of speech compression; the Voder did only the synthesis side, with no matching analysis device. The problem is that analyzing speech for formant content is a much more difficult problem.
If you really believe that Bell Labs recruited and trained 300 women to find 20 proficient Voder operators, just for a publicity stunt, well, I’ll let you figure that out.