Vocoding With A Piano

October 5, 2009

This really cool project allows a grand piano to “speak”. We don’t know any details about its construction, but we had to share. The keys are struck by solenoids in a way that replicates human speech. Click through to the video; it’s worth it. You may have to allow the popup to see the video, and it is in German, but the piano is clearly speaking English. We want one to keep around the office. It could read our emails to us.

(Edit from 2015: The link went bad, but it can be found elsewhere on YouTube.)

[via matrixsynth]

43 thoughts on “Vocoding With A Piano”

Ben Ryves says:

October 5, 2009 at 12:36 pm

I think the link should be http://www.3sat.de/kulturzeit/tips/138237/index.html ?
Very interesting stuff, though I don’t think I’d have understood what it was “saying” without the subtitles..!

arthur92710 says:

October 5, 2009 at 12:44 pm

That link does not work. Use the one in comment 1
That sounds weird. They should do the reverse: play a song and translate it into text. I’m curious about the results.

nate says:

October 5, 2009 at 12:48 pm

Here’s a direct link to the video: http://wstreaming.zdf.de/3sat/veryhigh/091002_klavier_kuz.asx

Alan says:

October 5, 2009 at 12:58 pm

Actually unfortunately it is replicating an audio file of the document read by a kid. Think of it as making a speaker out of a piano.

1. Greenaum says:
  
  November 12, 2015 at 2:10 am
  
  Yep but if the computer was using a speech synthesizer, the output would still be a waveform, just the same as recording it from a kid gives. I’m sure any old speech synth software would do, although apparently the guy with the very German hairdo put some work into tweaking it manually. I can’t think of any other way of making a piano speak than this.
  
  1. Flávio says:
    
    February 25, 2016 at 11:41 am
    
    They just quantized a spectrogram at the frequencies of a normally tuned piano, then fed the result to a piano with solenoids on each key. There are a lot more compositions, not only this one… by the way, this one is called “Deus cantando”.
    
Stromlo says:

October 5, 2009 at 12:59 pm

Awesome, but scary!

Hackius says:

October 5, 2009 at 1:09 pm

They should have gotten someone without a thick accent.

heegemcgee says:

October 5, 2009 at 1:11 pm

@alan:
Isn’t that splitting hairs? Are you any less amazed?

pony says:

October 5, 2009 at 1:17 pm

@ alan
Not quite. It’s almost as though they are doing Fourier synthesis by combining keystrokes instead of sine waves.

Kyle says:

October 5, 2009 at 1:37 pm

Nice! As pony said, considering that it’s a percussive instrument and not just a sine wave, that’s quite a feat!

Bryan says:

October 5, 2009 at 1:48 pm

For some reason, that piano creeps me the hell out…..

trueowen@verizon.net says:

October 5, 2009 at 2:05 pm

Would be awesome to have a whole band of instruments talking to each other.

Ian says:

October 5, 2009 at 2:07 pm

@Bryan
I agree. It sounds creepy as hell.

If I had the time and money I’d pop the solenoids directly on the strings and go for more of a Kraftwerk “Man-Machine” sound. Maybe modulate the carrier by adjusting the damping on the string.

Ray says:

October 5, 2009 at 2:10 pm

@Alan
The piano makes the noise, yes like a speaker. What else exactly should it do? Not sure why it matters if the audio is coming from a recording or a voice synth program?

will d. says:

October 5, 2009 at 2:11 pm

if that’s really just replicating an audio file, i wonder how it would sound replicating other instruments, or even a whole orchestra. this seems like something that could be done easily with a software piano synth.

Ray says:

October 5, 2009 at 2:12 pm

And by “like a speaker” I mean in the abstract sense that it’s producing sound, regardless of the method.

Josh says:

October 5, 2009 at 2:26 pm

This would make one hell of a haunted house prop. Simply amazing.

nate says:

October 5, 2009 at 2:58 pm

I’m not sure how he’s doing it, but if it really is just converting frequency ranges to keystrokes, it should be dead simple to write a program to convert audio files to MIDI. I might play around with the idea some when I get the time.
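As a rough illustration of the frequency-to-keystroke idea in this comment, the mapping from a frequency to the nearest piano key can be sketched in a few lines of Python. This is not taken from the project itself: the function names are hypothetical, and it assumes equal temperament, A4 = 440 Hz, and a standard 88-key range (MIDI notes 21 to 108).

```python
import math

A4_HZ = 440.0                     # assumed concert pitch
A4_MIDI = 69                      # MIDI note number of A4
PIANO_LOW, PIANO_HIGH = 21, 108   # A0..C8 on a standard 88-key piano


def freq_to_midi(freq_hz):
    """Return the nearest piano MIDI note for a frequency, or None if it is off the keyboard."""
    if freq_hz <= 0:
        return None
    # 12 semitones per octave, measured relative to A4
    note = round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))
    return note if PIANO_LOW <= note <= PIANO_HIGH else None


def midi_to_freq(note):
    """Return the equal-tempered frequency in Hz of a MIDI note number."""
    return A4_HZ * 2 ** ((note - A4_MIDI) / 12)
```

Anything above C8 or below A0 simply has no key to land on, which is one reason the consonants in the video are hard to make out.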

sly says:

October 5, 2009 at 3:59 pm

GLaDOS cometh

spacecoyote says:

October 5, 2009 at 4:33 pm

There’s a shareware program called TS Audiotomidi that does this.

Alex says:

October 5, 2009 at 4:38 pm

I uploaded this to youtube for those with slower connections.

http://www.youtube.com/watch?v=muCPjK4nGY4

Cynyr says:

October 5, 2009 at 4:55 pm

A better direct link, since mplayer didn’t like the .asx one. mms://ondemand.msmedia.zdf.newmedia.nacamar.net/zdf/data/msmedia/3sat/09/10/091002_klavier_kuz_vh.wmv

Which is really just the contents of the .asx one. Why we need a link to a file that has a link to the media in it, I have no idea.

EdZ says:

October 5, 2009 at 5:01 pm

@Pony
That’s EXACTLY what they’re doing. You could probably increase the output quality by varying the strike velocity (the current implementation appears to be bang-bang). This could also be done to create a midi output, and drive any instrument(s) that can produce enough separate and relatively pure tones. Several guitars tuned to be slightly out of phase, for example, could work.
It works using existing speech, so is not a speech synthesiser. You could feed the output of a speech synthesiser into it though (as well as any other sort of sound file).
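The strike-velocity suggestion could look something like the following Python sketch, which maps the strength of a frequency band to a MIDI velocity instead of a plain on/off decision. Everything here is an assumption for illustration: the function name, the decibel scale, and the -40 dB cutoff are not from the project.

```python
import math


def magnitude_to_velocity(mag, ref_mag, floor_db=-40.0):
    """Map a spectral magnitude to a MIDI velocity; 0 means 'do not strike the key'."""
    if mag <= 0 or ref_mag <= 0:
        return 0
    db = 20 * math.log10(mag / ref_mag)   # level relative to the loudest expected band
    if db <= floor_db:
        return 0                          # too quiet to be worth a keystroke
    db = min(db, 0.0)                     # clip anything louder than the reference
    # Linear ramp in dB: floor_db -> 0, 0 dB -> 127; audible bands get at least 1
    return max(1, round(127 * (1 - db / floor_db)))
```

A bang-bang implementation corresponds to replacing the ramp with a single threshold.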

razor says:

October 5, 2009 at 5:09 pm

Dude! Anybody here StrongBad fans? (www.homestarrunner.com) That sounds just like his Lappy 486 :P hehehe

macegr says:

October 5, 2009 at 5:36 pm

It almost seems that this is less about combining frequencies to get a specific waveform, and more about hitting a lot of keys in rapid succession to get low frequency 1-bit audio. It sounds a lot like the speech samples you could get from the old computers that simply had on-off buzzers. By having all these keys in parallel, you can get a lot of plinks per second and overcome the mechanical limitations of a single key. Then you randomize the keypresses around a central frequency to color the overall sound impression with overtones that appear to follow the sound sample.

Maybe they’re NOT doing it this way, but I can’t read German. :)

Colin says:

October 5, 2009 at 6:59 pm

I’ve been thinking about something like this for a while but in reverse. Using human voice to accurately recreate the sound of other instruments (think a cappella but with a computer automatically creating the sheet music based on a sound recording as the input.)

If someone knows more precisely how they did this, it might help. The problem with simply using a Fourier breakdown is the assumption of pure waves, and I haven’t thought of a good method of taking the overtones into account.

Any ideas?

1. Galane says:
  
  November 12, 2015 at 4:05 pm
  
  Vocaloid?
  
vic says:

October 5, 2009 at 9:01 pm

@Colin: Fourier’s transform is a particular case of Schmidt’s orthogonal projection from the space of periodic functions onto the subspace of sinusoidal functions … I’m pretty sure it is possible to project onto any other subspace. My years of study are a bit far behind me now, so I’ll leave it to you to go on from this point ;)
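One way to write down the projection idea, as a sketch under assumed notation rather than a description of how the piano piece was actually computed: treat the sound of each key, overtones included, as a basis function and solve a least-squares problem for the key weights. The symbols below are all assumptions introduced for illustration.

```latex
% Assumed notation: x(t) is the recorded signal, b_k(t) is the sound of piano key k
% (fundamental plus overtones), and <f, g> is the usual inner product over one analysis frame.
\hat{x}(t) = \sum_{k=1}^{K} w_k \, b_k(t), \qquad
G\,w = c, \quad G_{jk} = \langle b_j, b_k \rangle, \quad c_j = \langle b_j, x \rangle .
% If the b_k are orthonormal (e.g. suitably scaled sines and cosines), G = I and
% w_k = <b_k, x>, which is the ordinary Fourier coefficient.
```

Because piano notes share overtones, the Gram matrix G is not diagonal, so the weights have to be solved for jointly rather than read off one at a time. A real piano also cannot play negative weights, which this plain projection ignores.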

Pilotgeek says:

October 5, 2009 at 10:44 pm

@sly
Yes… this is very glados-y and quite creepy. If there’s ever some sort of AI that gains consciousness and becomes crazy, it damn well better use one of these to vocalize.

1. Galane says:
  
  November 12, 2015 at 4:06 pm
  
  What makes it creepy is they’re using a piano to deliver a manifesto that essentially calls the human race a disease afflicting “mother earth”.
  
KI4MCW says:

October 6, 2009 at 12:03 am

As an experiment a couple of months back, I tried to use MIDI to imitate an SSTV waveform (similar to the audio of a fax transmission, but slower). I used a Perl script to write individual key events to a MID file, then played the file back through Media Player (or whatever). I could not find any waveforms in the Windows General MIDI palette that had a fast enough attack time to render “notes” that were less than a millisecond in duration. Even if it had worked, the output would have needed to be phase-correct across notes, which I don’t think is even possible with MIDI. The whole thing was ridiculous, but the “song” files sure are funny to listen to.
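For anyone wanting to repeat this kind of experiment, writing key events to a Standard MIDI File needs very little code. The sketch below is in Python rather than the Perl mentioned above and is not the commenter's script; the function names and the (start, duration, note, velocity) tuple layout are assumptions made for illustration.

```python
import struct


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)   # continuation bit on all but the last byte
        value >>= 7
    return bytes(reversed(out))


def build_midi(notes, ticks_per_quarter=480):
    """Build a format-0 MIDI file from (start_tick, duration_ticks, midi_note, velocity) tuples."""
    events = []
    for start, dur, note, vel in notes:
        events.append((start, 1, bytes([0x90, note, vel])))      # note on, channel 1
        events.append((start + dur, 0, bytes([0x80, note, 0])))  # note off, channel 1
    events.sort(key=lambda e: (e[0], e[1]))  # at equal ticks, note-offs come first
    track = bytearray()
    now = 0
    for tick, _, msg in events:
        track += vlq(tick - now) + msg       # each event is prefixed by its delta time
        now = tick
    track += b"\x00\xff\x2f\x00"             # end-of-track meta event
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_quarter)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)
```

Writing the returned bytes to a file opened in binary mode gives something a media player or a Disklavier-style instrument can load. Timing resolution is bounded by the tick length, which depends on `ticks_per_quarter` and the tempo.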

astera says:

October 6, 2009 at 3:35 am

I made a transcription and tried to translate it as well as possible (yes, some parts *are* weird – even in German):

Alles klar? Wohl kaum – das lässt sich aber ganz einfach ändern.

Schon erstaunlich, wie genau plötzlich die Worte der Deklaration für einen Internationalen Gerichtshof gegen Umweltverbrechen verständlich werden. ‘Wien Modern’ war eine von zehn kulturellen Institutionen, die um einen künstlerischen Beitrag für die Veranstaltung im Dogenpalast in Venedig gebeten wurde.
Diese Botschaft mit musikalischen Mitteln hörbar zu machen ohne auf eine simple Vertonung zurückzugreifen, das war das ehrgeizige Ziel.

Berno Polzer: Ich glaube, es ist teilweise verständlich, teilweise unverständlich. Und es spielt genau mit der Grenze unserer Konstruktionsleistung. Das heißt, wir hören Klänge, die offensichtlich keine normale Musik sind, aber auch keine Sprache, und manchmal findet sozusagen so eine kleine Überbrückung statt. Ich finde, man hört auch ohne den Text zu kennen einzelne Worte, und das Aha-Erlebnis passiert eigentlich dann, wenn man den Text sieht und dann plötzlich die Sprache da ist.

Ein weiterer Brückenschlag: Miro Markus, ein neunjähriger Schüler aus Berlin, hat den Text für die Performance aufgenommen: Jugend als Hoffnungsträger der älteren Generation.

Der österreichische Komponist Peter Ablinger hat das Frequenzspektrum der Kinderstimme auf sein computergesteuertes mechanisches Klavier übertragen.

Peter Ablinger: Ich löse die eine Phonographie, das bedeutet also eine Aufnahme von irgendetwas – in diesem Falle der Stimme -, in einzelne ‘Pixel’ auf. So könnte man im übertragenen Sinne durchaus sprechen. Und wenn ich die Möglichkeit der Wiedergabe in einer sehr hohen Pixelauflösung habe, und diese habe ich nur mit einem mechanischen Klavier, dann kann ich tatsächlich eine Art von Kontinuität wiederherstellen. Wir können also in einem Klavierklang tatsächlich mit etwas Übung oder Unterstützung oder Untertitelung eine menschliche Stimme hören.

Got it? Probably not – but we can easily change that.

Pretty amazing how precisely the words of the Declaration for an International Court for Environmental Crimes suddenly become understandable. ‘Wien Modern’ was one of ten cultural institutions asked for an artistic contribution to the event at the Palazzo Ducale in Venice.
The ambitious goal was to make this message audible by musical means without falling back on simply setting the text to music.

Berno Polzer: I think it’s partially understandable, partially not. And it plays precisely with the limits of our ability to construct meaning. That is, we hear sounds that obviously aren’t normal music, but aren’t language either, and sometimes a small bridge forms, so to speak. I think you can hear individual words even without knowing the text, and the aha moment actually happens when you see the text and suddenly the language is there.

Yet another bridge: Miro Markus, a nine-year-old schoolboy from Berlin, recorded the text for the performance: youth as the bearer of hope for the older generation.

The Austrian composer Peter Ablinger transferred the frequency spectrum of the child’s voice to his computer-controlled mechanical piano.

Peter Ablinger: I break down this phonography, meaning a recording of something (the voice, in this case), into individual ‘pixels’, as one could say in a figurative sense. And if I have the possibility of playback at a very high pixel resolution, and I only have that with a mechanical piano, then I can in fact restore a kind of continuity. So with a little practice, or help, or subtitling, we actually can hear a human voice in a piano sound.

scholli says:

October 6, 2009 at 3:40 am

google gives a quite good translation of the background of this art project:

http://translate.google.de/translate?u=http%3A%2F%2Fwww.3sat.de%2Fmediathek%2Fframeless.php%3Furl%3D%2Fkulturzeit%2Ftips%2F138237%2Findex.html&sl=de&tl=en&hl=de&ie=UTF-8

Patrick Flanagan says:

October 6, 2009 at 11:33 am

I did nearly the exact same thing for my Master’s Recital in spring 2007.

http://rocketsurgeon.s3.amazonaws.com/PWAP_End.mp4?AWSAccessKeyId=0JC3J24V0Q2JT4S9FR02&Expires=1255212586&Signature=A8bjSj28i4OD8mr8Aoh9Fghghk0%3D

(1.7 MB)

I saved myself a lot of time and money by renting a Disklavier, but hats off to Peter for building his own player piano.

One conceptual difference between my work and Peter’s is that my piece begins at a very slow tempo and gradually accelerates to slightly more than normal speed, at which point the text becomes quasi-understandable. It’s a kind of acoustic time-stretching, DSP without the digital signal, that foregrounds the threshold between music and speech.

Synthesizing phonemes with a noise component is difficult, so I limited the text to words with only vowel sounds and l, m, n, r, w, and y. As it happens, many of the roughly 400 English words that meet that criterion have to do with sex, drugs, or Islam, which made for a politically volatile text, but that was really just a byproduct of the process.

Once the text was prepared, I recorded myself reciting it and did a Fourier analysis in Max/MSP. I wrote my own partial-tracking software in Max, and used that to extract prominent partials, which were converted to notes and saved in a MIDI file. I retouched the MIDI file in Cubase to make the speech more understandable. The final MIDI “score” of the piece resulted from looping the retouched MIDI file while accelerating from a fraction of the original tempo to slightly faster than real-time.
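The analysis step described here (Fourier analysis, pick the prominent partials, convert them to notes) can be approximated for a single frame with the Python sketch below. It is not the Max patch: it uses a naive DFT from the standard library for clarity, does no tracking across frames, and all function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Return magnitudes of the first half of a naive DFT (fine for short frames only)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def prominent_partials(frame, sample_rate, count=3):
    """Return the frequencies (Hz) of the strongest local spectral peaks, strongest first."""
    mags = dft_magnitudes(frame)
    n = len(frame)
    peaks = [
        (mags[k], k * sample_rate / n)
        for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    peaks.sort(reverse=True)
    return [freq for _, freq in peaks[:count]]


def partials_to_notes(freqs):
    """Round each partial to the nearest equal-tempered MIDI note (assumes A4 = 440 Hz)."""
    return sorted({round(69 + 12 * math.log2(f / 440.0)) for f in freqs if f > 0})


# Usage: a synthetic frame with partials at 440 Hz and 880 Hz
rate, size = 8000, 400
frame = [
    math.sin(2 * math.pi * 440 * t / rate) + 0.5 * math.sin(2 * math.pi * 880 * t / rate)
    for t in range(size)
]
notes = partials_to_notes(prominent_partials(frame, rate, count=2))
```

Repeating this for successive frames and joining peaks that persist from one frame to the next is roughly what partial tracking adds on top.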

I’m not staking any claims to originality here; I stole the idea of instrumental speech synthesis from the Indian/English/German composer Clarence Barlow, who I studied with in Cologne in 2002/2003.

draeath says:

October 7, 2009 at 2:59 am

Thank you for the youtube upload. ASX? Really? What were you thinking, there?

dunk says:

October 7, 2009 at 3:32 am

Spar-ky!
http://www.youtube.com/watch?v=s3etiNLAFi0#t=3m17s

dunk says:

October 7, 2009 at 3:51 am

sorry. 3m17s for the talking piano

Alex says:

October 7, 2009 at 1:06 pm

@draeath
asf was pretty much the only option since I had to rip the video off a media player stream. Trust me, it’s not the format I would’ve preferred.

Tim Anderson says:

October 10, 2009 at 10:55 pm

I would like to see the sheet music of the speech.

Stoph Long says:

October 10, 2009 at 10:56 pm

Patrick,

I wasn’t able to view the document you mentioned. I have access to a disklavier and would be fascinated to play the midi file you mentioned if it is available. Also, the auditory research community email list has been discussing the “Talking Piano” project and I’m sure would be interested in hearing about your work too.
It sounds like a very interesting project.

StophLong at
yahoo.co.uk

Galane says:

November 12, 2015 at 4:10 pm

The Voder as demonstrated at the 1939 World’s Fair was much more understandable. https://www.youtube.com/watch?v=0rAyrmm7vv0

Olm-e says:

November 13, 2015 at 3:29 pm

this seems to be programmed with Pure Data (0:38) … neat !

