Sine-wave speech can be thought of as a sort of auditory illusion, a sensory edge case in which one’s experience has a clear “before” and “after” moment, like going through a one-way door.
Sine-wave speech (SWS) is intentionally degraded audio. Here are the samples, and here's what to do:
- Choose a sample and listen to the SWS version. Most people will perceive an unintelligible mix of tones and beeps.
- Listen to the original version of the sentence.
- Now listen to the SWS version again.
Most people will hear only some tones and beeps when first listening to sine-wave speech. But after hearing the original version once, the SWS version suddenly becomes intelligible (albeit degraded-sounding).
These samples were originally part of research by [Chris Darwin] into speech perception, but the way one's experience of an SWS sample can flip is interesting in its own right. The idea is that upon listening to the original sample, the brain — fantastic prediction and learning engine that it is — now knows what to expect, and applies that knowledge without the listener being consciously aware. In fact, if one listens to enough different SWS samples, one begins to gain the ability to understand the SWS versions without having to be exposed to the originals. In his recent book The Experience Machine: How Our Minds Predict and Shape Reality, Andy Clark discusses how this process may be similar to how humans gain fluency in a new language, perceiving things like pauses, breaks, and word forms that are unintelligible to a novice.
This is in some ways similar to the “Green Needle / Brainstorm” phenomenon, in which a viewer hears a voice saying either “green needle” or “brainstorm” depending on which word they are primed to hear. We’ve also previously seen other auditory strangeness in which the brain perceives ever-increasing tempo in music that isn’t actually there (the Accelerando Illusion, about halfway down the list in this post).
Curious about the technical details behind sine-wave speech, and how it was generated? We sure hope so, because we can point you to details on SWS as well as to the (free) Praat software that [Chris] used to generate his samples, and the Praat script he wrote to actually create them.
Kinda reminds me of Silbo Gomero (q.v. wikipedia)
We are also a model based on training.
I got the effect on the first three samples, but after that I was able to understand the last three samples on the first listen, without having heard the undistorted version.
Same here. I guess we’re fast auditory learners? Sharp ears? I’ve always had a good “radio in my head,” and my dad’s been a professional musician my whole life, though I never had any formal music training.
Then now might be a good time to learn telegraphy. That way, the hearing and the brain’s pattern recognition can be trained, maybe. A talent for music and rhythm shouldn’t hurt, either. :)
Someone needs to do it out of order; I caught on by the kettle one.
Yeah, I found the first sample just about understandable myself. I wasn’t exactly right, but it turns out I did pick out some of it correctly (which very much surprised me; I wasn’t confident at all). And after that first one I had way more confidence on the rest.
I wonder if it is because the voice is relatively familiar to us anyway. The accent, tone, and pace of the normal version of the speech sound like somebody I could know, so we are already listening and trying to make it match our expectations.
When I recorded my voice one day on a guy’s recording machine (a music synthesizer; it was portable, by the way), my voice was terrible, but a few hours later it returned to normal. Not sure why. Maybe I was upset, as they say, not right. I guess that’s why stress is bad for a singer’s voice, even if you don’t use it for a while. Maybe I should have just done better in my education when I was younger; then the world could have seen me better.
Likewise: I half got the first one, but after hearing the original for it I could hear the remaining five fine on the first listen.
I was getting my ear in by sample 5 and got the last one perfectly.
Okay. Sounds like a bird trying to speak.
It’s as if merely the changes in pitch are being recorded, rather than full speech.
It’s as if a metallophone is being used to mimic speech.
What makes me wonder is which kind of speech is best suited here.
The stereotypically stiff or more defined British English (BE) accent, or the American English (AE) one?
Also, is a male voice better or a female one?
Say, the male voice is usually deeper and clearer, while the female one is higher and squeakier.
I suspect all of that depends very much on how your brain is already wired. If you spend your time listening to and speaking the local dialect, and this is recorded in that dialect, I expect you have a huge leg up on everyone else. So for global ease of comprehension, I suspect the BBC newsreader style of speaking is probably going to be more familiar than any American accent.
What it makes ME wonder is: how little additional information would be needed to make the speech intelligible? This demonstrates how close you can get with an extremely low bitrate. How much more data does it take to add plosives and sibilants, and would that be enough?
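For a rough sense of scale, here’s a hypothetical back-of-envelope calculation in Python. The frame rate and precision are assumed, illustrative values, not figures from [Chris]’s script:

```python
# Back-of-envelope bitrate for sine-wave speech (all values assumed):
# three formant tracks, each carrying a frequency and an amplitude,
# updated 100 times per second at 8-bit precision.
formants = 3
values_per_formant = 2   # frequency + amplitude
frame_rate = 100         # frames per second
bits_per_value = 8

bitrate = formants * values_per_formant * frame_rate * bits_per_value
print(bitrate, "bit/s")  # 4800 bit/s
```

That lands squarely in classic vocoder territory; LPC-10, for comparison, runs at 2400 bit/s.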
There’s no single British English accent.
Btw, most British accents are non-rhotic. I would half expect that to make them harder to pick out. And most of them aren’t any more defined or stiff.
Woah. Flashbacks. Reminds me of all those hours I spent messing with Dennis Klatt’s speech synthesis code in the late 80s (later made famous by Stephen Hawking). One of the first things I found on the primordial internet. Fun stuff. Apparently Praat’s author Boersma thought so too — his PhD thesis appeared a few years later.
Try the player piano doing this: not sine waves, not electronic. Freaking cool. It’s on the tube.
Try to mumble “Idunno” and you’ll get the same effect :)
Interesting… now suddenly it makes sense why some people can understand what R2D2 is saying.
The original sounds a bit like a young Bill Nighy (the British actor from such great films as The Best Exotic Marigold Hotel, Love Actually, The World’s End, and the amazingly funny Hot Fuzz).
“Most people will hear only some tones and beeps when first listening to sine-wave speech. But after hearing the original version once, the SWS version suddenly becomes intelligible (albeit degraded-sounding).”
Like the little voice in our head reading over the input, filling in the blanks.
What is “Sine Wave Speech?” Why is it named that? How is it made?
https://www.mrc-cbu.cam.ac.uk/people/matt.davis/sine-wave-speech
Sine-wave speech is a form of artificially degraded speech first developed at Haskins Laboratory.
Generating Sine-Wave Speech:
Sine-wave speech is generated by using a formant tracker to detect the formant frequencies found in an utterance, and then synthesising sine waves that track the centre of these formants.
Best regards,
A/P Daniel F. Larrosa
(Montevideo – Uruguay)
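For the curious, that pipeline can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not [Chris]’s Praat script: LPC root-finding stands in for a proper formant tracker, and `utterance.wav` is a placeholder file name.

```python
# Minimal sine-wave speech sketch (assumed approach, not the Praat
# script): estimate a few formant frequencies per frame via LPC,
# then replace the frame with pure tones at those frequencies.
import numpy as np
import librosa

def formant_freqs(frame, sr, order=12, n_formants=3):
    """Crude formant estimate: angles of the LPC poles."""
    a = librosa.lpc(frame, order=order)
    poles = [p for p in np.roots(a) if p.imag > 0]
    freqs = sorted(np.angle(poles) * sr / (2 * np.pi))
    return [f for f in freqs if f > 90][:n_formants]  # skip DC-ish poles

def sine_wave_speech(y, sr, frame_len=1024, hop=256):
    out = np.zeros(len(y))
    phase = np.zeros(3)                     # keep tones continuous
    for start in range(0, len(y) - frame_len, hop):
        frame = y[start:start + frame_len] * np.hanning(frame_len)
        amp = np.sqrt(np.mean(frame ** 2))  # crude per-frame level
        if amp < 1e-4:                      # skip near-silence
            continue
        for i, f in enumerate(formant_freqs(frame, sr)):
            t = np.arange(hop)
            out[start:start + hop] += amp * np.sin(
                phase[i] + 2 * np.pi * f * t / sr)
            phase[i] += 2 * np.pi * f * hop / sr
    return out

y, sr = librosa.load("utterance.wav", sr=16000)  # placeholder file
sws = sine_wave_speech(y, sr)
```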
Thanks for the link!
This sounds like a less robotic version of tiny speech.
I think the more interesting question would be how, algorithmically, to go from this back to something sounding like the original.
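There’s no unique inverse: the sine tracks keep the vocal-tract resonances but throw away the excitation (pitch, voicing, noisiness) entirely. One plausible direction, in the spirit of the Klatt-style formant synthesis mentioned elsewhere in these comments, is to treat the tracks as formant targets for resonators driven by a guessed pulse train. A toy sketch with made-up, static formant values:

```python
# Toy formant resynthesis (assumed approach): tune second-order IIR
# resonators to the formant frequencies the sine tracks encode, and
# excite them with a guessed glottal pulse train.
import numpy as np
from scipy.signal import lfilter

sr = 16000
f0 = 120  # guessed pitch; SWS discards this information entirely
excitation = np.zeros(sr // 2)          # half a second
excitation[::sr // f0] = 1.0            # one impulse per glottal period

def resonator(x, freq, bw, sr):
    """Second-order resonator centred on one formant."""
    r = np.exp(-np.pi * bw / sr)
    theta = 2 * np.pi * freq / sr
    return lfilter([1.0], [1.0, -2 * r * np.cos(theta), r * r], x)

# Cascade three resonators at made-up vowel formants (roughly an /a/).
y = excitation
for freq, bw in [(730, 90), (1090, 110), (2440, 170)]:
    y = resonator(y, freq, bw, sr)
y /= np.abs(y).max()
```

The guessed f0 is exactly what SWS throws away, so the best you can hope for is a plausible-sounding reconstruction, not the original speaker.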
Another kind of degraded speech is produced by cochlear implants. Only a few frequencies, yet it doesn’t take that long to understand.
Would that imply that, as density and computing power on a chip continue to improve, along with power economy/capability, future cochlear implants could have greater resolution, resulting in truer reproduction?
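The implant comparison can actually be simulated in software. Noise-vocoded speech, in the spirit of Shannon et al.’s classic experiments, keeps only a handful of band envelopes, much as an implant’s electrode channels do, and raising the channel count is precisely the “greater resolution” being asked about. A rough sketch, assuming a mono input file and illustrative band edges:

```python
# Noise-band vocoder sketch (assumed parameters throughout): split the
# speech into a few bands, keep only each band's amplitude envelope,
# and use the envelopes to modulate band-limited noise.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt, hilbert

y, sr = sf.read("utterance.wav")        # placeholder; assumes mono
edges = np.geomspace(100, 6000, 5)      # 4 log-spaced channels
out = np.zeros_like(y)
for lo, hi in zip(edges[:-1], edges[1:]):
    sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
    band = sosfilt(sos, y)
    envelope = np.abs(hilbert(band))               # amplitude envelope
    noise = sosfilt(sos, np.random.randn(len(y)))  # band-limited noise
    out += envelope * noise
out /= np.abs(out).max()
sf.write("vocoded.wav", out, sr)
```

Intelligibility climbs steeply with channel count, which matches the comment above about implant users learning to understand speech from only a few frequency channels.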
There’s a similar phenomenon where if you use wav2midi or similar to convert audio of a song to a MIDI file, people who are familiar with the song can discern the lyrics while those who aren’t cannot. Mark Rober did a demonstration of something similar with a MIDI-controlled player piano a few years ago, where by providing subtitles the piano’s “speech” was totally intelligible but without them it was difficult/impossible to tell what it was saying.
I wonder if this is related to the perception aspects of a cochlear implant?
So that is how R2D2 is communicating…
Like most pop music. There are always parts that are ambiguous, and since the lyrics are often poetic and we don’t know the context in which they were written, it can be difficult to figure out what they are. Listening to the music again after reading the lyrics changes it.
It’s the ambiguous parts that make both music and poetry work. These are the parts that transform the consumer into a participant.
You misunderstood. I meant ambiguous in that a word will be heard differently by different people. Some of those interpreted lyrics are incorrect and don’t match the actual lyrics. Ambiguity in meaning is a separate discussion, as that can be done on purpose.
This seems really familiar. Was this same method used to generate the voice of some sort of alien or android in some old 70s or early 80s sci-fi?
What if I retrain my brain on enough distorted recordings of phonemes and phoneme combinations?
Too much information has been removed to recover the original without adding more metadata. You can interpret the sine-wave speech in multiple ways. Only by adding more information, such as lip reading, a script, context, or the original audio, can your brain fill in the gaps.
You probably could train your brain to become better at it. People with cochlear implants can only hear a limited number of frequencies, but they can learn to understand speech.
Then you will have gained a skill that will never serve you.
Like an arts degree from Harvard?
:-)
Remember the old claims of backward masking in songs, and how those who said they existed would always tell you what you were supposed to hear before playing it? This was kind of like that. I did listen to each processed version repeatedly before I listened to the original. A time or two, I got pretty close, but not often. (Wish I’d written them down.)