AlterEgo Listens To Your Internal Voice

Recent news reports have claimed that an MIT headset can read your mind, but it turns out that’s a little bit of fake news. There is a headset, called AlterEgo, but it doesn’t actually read your mind. Rather, it picks up the subtle cues your body produces when you silently vocalize words. We aren’t sure exactly how that works, but the FAQ claims it is similar to the way you experienced reading as a child.

If you read much science fiction, you probably recognize this as subvocalization, which has been studied by both the Army and NASA. However, from what we know, the positioning of the sensor electrodes is crucial and can vary not only from speaker to speaker but also over time for the same speaker. Perhaps the MIT device has found a way around that problem. You can see a video of the system below.
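
We don’t know the details of MIT’s recognition pipeline, but the general shape of the problem is clear enough. Below is a minimal Python sketch of classifying windows of surface EMG against a tiny fixed vocabulary; the channel count, 1 kHz sample rate, filter band, and nearest-template matcher are all our own assumptions, and the real system is surely far more sophisticated.

    import numpy as np
    from scipy.signal import butter, filtfilt

    FS = 1000  # assumed sample rate (Hz); the real hardware may differ

    def bandpass(x, lo=20.0, hi=450.0, fs=FS):
        # Keep the band where surface-EMG energy typically lives.
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x, axis=-1)

    def features(window):
        # Crude per-channel features: RMS energy and zero-crossing rate.
        rms = np.sqrt(np.mean(window ** 2, axis=-1))
        zcr = np.mean(np.abs(np.diff(np.sign(window), axis=-1)) > 0, axis=-1)
        return np.concatenate([rms, zcr])

    def classify(window, templates):
        # Nearest-template match over a small, fixed vocabulary.
        f = features(bandpass(window))
        return min(templates, key=lambda word: np.linalg.norm(f - templates[word]))

    # Toy usage with synthetic 4-channel, 1-second windows. Real electrodes sit
    # on the face and throat, and placement matters a great deal (see above).
    rng = np.random.default_rng(0)
    templates = {w: features(bandpass(rng.normal(size=(4, FS)))) for w in ("yes", "no")}
    print(classify(rng.normal(size=(4, FS)), templates))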

In addition to picking up your silent speech, the headset can reply silently, too. That’s just plain old-fashioned bone conduction, which is a well-understood technique. We have covered several projects that use it in the past.

If this interests you, there has been research going on for years to allow people to send Morse code by manipulating their EEG waveforms. We’ve even seen robots controlled by brain waves.
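
The EEG-to-Morse idea is mostly a thresholding-and-timing problem once you can push some band power up and down on demand. Here is a minimal decoding sketch, assuming the EEG side has already been reduced to key-down/key-up run lengths; the lookup table is truncated to a few letters, and the unit length and run lengths are invented for the example.

    MORSE = {".-": "A", "-...": "B", "-.-.": "C", "...": "S", "---": "O", "..-": "U"}

    def decode(runs, unit):
        # runs: list of (key_down, duration) pairs; unit: dot length in samples.
        out, sym = [], ""
        for key_down, dur in runs:
            if key_down:                      # key down: dot or dash by length
                sym += "." if dur < 2 * unit else "-"
            elif dur >= 3 * unit and sym:     # long gap ends the letter
                out.append(MORSE.get(sym, "?"))
                sym = ""
        if sym:
            out.append(MORSE.get(sym, "?"))
        return "".join(out)

    # "SOS" keyed with a dot length of 10 samples.
    runs = [(1, 10), (0, 10), (1, 10), (0, 10), (1, 10), (0, 30),
            (1, 30), (0, 10), (1, 30), (0, 10), (1, 30), (0, 30),
            (1, 10), (0, 10), (1, 10), (0, 10), (1, 10)]
    print(decode(runs, unit=10))              # prints SOS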

Photo Credit: MIT Media Lab

27 thoughts on “AlterEgo Listens To Your Internal Voice”

  1. I propose a name change to “the voices inside my head.”
    So this uses EMG instead of EEG? Seems to be located around the tongue.
    “deliberate movements of internal speech articulators (when a user intentionally vocalizes internally)”

  2. Yes, I think NASA discovered this, and many scientists are moving forward with it. It seems that sometimes when you read something, the back of your tongue moves slightly as if you were actually talking out loud, but no one in the room can hear you, not even yourself. So if you can detect these small muscle twitches at the back of the throat, you can convert them into real speech and actions. I mean, it is your brain producing this from your own thoughts. It does not use EEG signals.

    They also found that this effect seems to precede actually speaking out loud, so the system can detect what you’re going to say a split second before you actually say it. This was in a Clint Eastwood movie called FIREFOX (1982), in which the fictional MiG-31’s fire control system was activated and controlled by the pilot thinking it and then saying it out loud, not by a voice recognition system. So someone was thinking about it at least as far back as the 1980s.

    https://youtu.be/sYh9_QmNwRA

    1. I remember reading about this years ago. It sounds like they were using neural networks for the recognition back then, but there wasn’t enough computing power to get it working beyond a simple demo. With the current explosion of neural network research and applications in everyday use, it makes sense for someone to give this a try again.

    2. Interesting… I worked on NIR methods for pattern recognition. At the same time I had friends at TRW and Johnson Controls working on audio pattern recognition for voice commands. Hmmm… we never discussed which algorithms were most effective in terms of processing time (latency) and accuracy. For my NIR work, I didn’t find the genetic or neural network algorithms as accurate as what I was using: hierarchical cluster analysis (PCA) to determine the closest match to build against, then an individual PLS model with the target at 100 and the closest match from the PCA model of everything else at 0, to ensure no false positives. Then, depending on the variability of the spectra, the PLS model required more data to ensure no false negatives, which weren’t as risky an issue for health and safety in terms of meeting specifications for food and drugs: if a sample failed the NIR test, we had a specific standard operating procedure for the out-of-specification investigation and standard testing, since the procedures were validated and implemented on the specification as an “alternate test method”.
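
      For the curious, here is a very loose Python sketch of the two-stage scheme described above, using scikit-learn. All spectra and numbers are synthetic, and a simple centroid distance in PCA space stands in for the full hierarchical cluster analysis; it is an illustration of the idea, not the validated method.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cross_decomposition import PLSRegression

        rng = np.random.default_rng(1)
        target = rng.normal(0.0, 1.0, size=(30, 200))       # spectra of the material of interest
        others = {name: rng.normal(m, 1.0, size=(30, 200))   # library of other materials
                  for name, m in [("A", 0.3), ("B", 2.0), ("C", 5.0)]}

        # Stage 1: find the library class closest to the target in PCA space.
        pca = PCA(n_components=5).fit(np.vstack([target, *others.values()]))
        t_mean = pca.transform(target).mean(axis=0)
        closest = min(others, key=lambda k: np.linalg.norm(
            pca.transform(others[k]).mean(axis=0) - t_mean))

        # Stage 2: PLS model with the target at 100 and the closest match at 0,
        # so the model is trained against the most likely false positive.
        X = np.vstack([target, others[closest]])
        y = np.concatenate([np.full(len(target), 100.0),
                            np.zeros(len(others[closest]))])
        pls = PLSRegression(n_components=3).fit(X, y)
        print(closest, float(pls.predict(target[:1])[0, 0]))  # target scores near 100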

    1. I had a similar reaction. It strongly reminds me of the ‘SixthSense’ camera/projector/computer thing that was supposed to be ‘ready’ and released back in 2009 or so.
      This paper might be intended to support a patent application or something; it has a lot of product-y details completely irrelevant to what would otherwise be a pretty good science project. Even the single graph doesn’t present its information the way a scientist would.

      1. The MIT Media Lab, where this system comes from, specializes in the design and implementation of systems that integrate new (or not-so-new) sensing and information-presentation methods, in this case EMG and bone conduction, into functional interfaces. As is the case with most HCI research presenting new interfaces, it shouldn’t be judged in the traditional “science” framework but from a design/user-experience perspective: how would it work, how would it look, and how would it change the way people live with it?

        In that sense, a large proportion of the papers presented at, for example, ACM CHI, UIST, TEI, and CSCW are proofs of concept for new interfaces. In many cases they take scientific discoveries of the past 100 years from a “two or three NASA papers from a dude in 1963” state to a “here is a functional implementation and product design, and here is how regular people would use it and how it would change their daily and professional lives” state.

        Also, considering the sample size, the graph does not stand out as an inappropriate way of presenting that data. However, the fact that they only report accuracy as a metric, without commenting on the types of errors the system makes, is more problematic depending on the application scenario (e.g., the user intends to do the arithmetic operation 3+4 but instead ends up preheating the oven to 350°F).
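
        To make that concrete, here is a small Python illustration, with invented labels and counts, of why a single accuracy number can hide exactly the kind of error that matters:

          import numpy as np

          labels = ["add", "call", "preheat"]
          # Two invented confusion matrices with identical overall accuracy (93%):
          benign = np.array([[18, 2, 0], [2, 18, 0], [0, 0, 20]])  # "add" mistaken for "call"
          risky  = np.array([[18, 0, 2], [0, 20, 0], [2, 0, 18]])  # "add" mistaken for "preheat"

          for name, cm in [("benign", benign), ("risky", risky)]:
              acc = np.trace(cm) / cm.sum()
              off = cm - np.diag(np.diag(cm))            # zero the diagonal, keep errors
              i, j = np.unravel_index(np.argmax(off), off.shape)
              print(f"{name}: accuracy {acc:.0%}, "
                    f"worst confusion {labels[i]} -> {labels[j]}")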

  3. Well, it seems like it only goes up around the ear for the bone-conduction sound output and physical support; the chin piece and the way it digs in by the neck are very deliberate, so it’s far more reading the throat than reading the mind. While I haven’t read the paper, the overview says “without externally observable movements,” which I read as alluding to ‘talking under your breath’, ‘modulated breathing’, or whatever you’d like to call it.

      1. This is certainly not new technology, and MIT is not pioneering it; they are not the inventors. NASA was the first to “play” with it here in the USA, but I suspect the Russians have known about it for decades. I personally always thought something was going on in the back of the tongue and the Adam’s apple when you have a serious internal mental monologue about something or when reading silently. I never thought it was a precursor to speaking aloud, though, and I never thought anybody could detect the muscles flexing there.

        This is NOT bone conduction. This is purely motor-nerve sensing on the pharyngeal and laryngeal-prominence muscles. The only reason the MIT version is on the guy’s ear is to hold it in place; it’s just hanging on the ear loop, not sensing anything there such as the mastoid bone. The one I saw in a TED Talk was just a throat sensor, like those WW2 throat microphones.

        I’m thinking the device may have a hard time working on women, as their laryngeal prominence is almost nonexistent. This is an EMG sensor device, NOT audio. When (or if) you do this, there is no audio component. You can tell you’re doing it when you feel your tongue moving slightly while you are thinking about something. So far you can only pull out trigger words, not whole thought sentences, so you have to know what you’re looking for in the subject’s thoughts beforehand. Also, a new non-invasive device by US DHS called MALINTENT can detect things like what you’re thinking when you’re exposed to particular multimedia stimuli. Some have used it to detect which languages you may be proficient in by flashing foreign words and looking for your internal brain recognition.

  4. Sounds like MIT is looking for more of that grant money and falling prey to marketers and hype. I truly expected more from MIT, at the least a proper representation of the technology. This will not read your mind unless you vocalize all of your thoughts, so if your “internal voice” is truly internal then this will do jack squat.

    1. This is not MIT’s baby; they are just copy-cats of someone else, possibly NASA. Also, NO, you don’t need to “vocalize” anything. You just need to slightly flex your tongue and/or Adam’s apple (or even your lips) when you silently read something or during a really natural internal monologue. Not everybody does it. Like when your Ego and Id are having a good conversation about something that you would never vocalize, lest you be thought of as a crazy tin-foil-hat victim (or someone talking on a Bluetooth ear dongle) :D

  5. I hope it becomes an actual product. Just for voice-to-text I would like something that I could use without disturbing my coworkers. RSI is a sonuva. There are plenty of other applications as well.

    1. Maave – It can’t do dictation. It can only recognize words that it has been trained to detect for YOU. It’s not a speaker-independent system, and it doesn’t even use audio. It uses EMG, in this case from the lip muscles at your chin; the other systems use your Adam’s apple. I gather the earpiece is NOT for sensing; it’s just bone-conduction feedback from the processing computer, according to [Shannon]. So sorry, the MIT method won’t help you. There is ANOTHER mental dictation system (very slow) that uses EEG and was planned for use by Dr. Hawking before he died.

      I downloaded eye-movement PC freeware that does the same thing with your eyeball, the tip of your nose, your finger, etc. You just aim the pointer at a letter or word, wait a second, and it takes. Very slow. It uses your webcam.
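
      That hover-and-wait behavior is a dwell selector, and the core logic is the same regardless of whether a webcam is tracking your eye, nose, or finger. Here is a minimal Python sketch, with an arbitrary dwell time and movement tolerance:

        import time

        DWELL_SECONDS = 1.0   # how long the pointer must hold still
        RADIUS = 20           # pixels of wander still counted as "holding still"

        class DwellSelector:
            def __init__(self):
                self.anchor = None          # where the current hold started
                self.since = None           # when the current hold started

            def update(self, x, y, now=None):
                # Feed pointer positions; returns the held point once per dwell.
                now = time.monotonic() if now is None else now
                if (self.anchor is None or
                        (x - self.anchor[0]) ** 2 + (y - self.anchor[1]) ** 2 > RADIUS ** 2):
                    self.anchor, self.since = (x, y), now   # moved: restart the timer
                    return None
                if now - self.since >= DWELL_SECONDS:
                    self.since = float("inf")               # fire once, then wait for movement
                    return self.anchor
                return None

        # Toy usage: hold near one spot for just over a second and it "clicks".
        sel = DwellSelector()
        for t in (0.0, 0.5, 1.1):
            hit = sel.update(100, 101, now=t)
            if hit:
                print("selected key at", hit)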
