Hacking Amazon Echo Through Its Remote

This one’s crazy… literally one electronic device is talking to another. In spoken English. And it works.

We’ve covered several hacks for the Amazon Echo, but some might be surprised to learn that there is another piece of interesting hardware that comes along with it – a remote control. Wire a Raspberry Pi into it, and you’ve given yourself a way to automate control of the Echo without ever taking the Echo itself apart. [Gamaral] did just this and gave his Echo some significantly enhanced capabilities.

He starts off by identifying the power rails of the remote. Then he wires in a 3.3v voltage regulator and uses a 100 ohm resistor as a voltage divider to bring it down to the 1.8 volt logic level used by the Echo remote. A single wire runs from the Raspi GPIO to one of the tactile switches on the controller.
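For reference, dropping a 3.3 V GPIO signal to roughly 1.8 V takes two resistors, not one. The arithmetic is sketched below. The 100 ohm value comes from the write-up; the 120 ohm bottom resistor and the function name are assumptions chosen to make the numbers work, not values confirmed from the build.

```python
def divider_output(v_in, r_top, r_bottom):
    """Unloaded output voltage of a two-resistor divider."""
    return v_in * r_bottom / (r_top + r_bottom)


V_GPIO = 3.3      # Raspberry Pi GPIO high level, in volts
R_TOP = 100.0     # ohms, the value mentioned in the write-up
R_BOTTOM = 120.0  # ohms, hypothetical value picked for illustration

v_out = divider_output(V_GPIO, R_TOP, R_BOTTOM)
print(f"{v_out:.2f} V")  # prints "1.80 V"
```

Only the ratio matters for the output voltage, so larger resistors in the same proportion would give the same level while drawing less current from the GPIO pin.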

For software, the Raspi runs RPi buildroot with Espeak and a cron scheduler compiled in. This lets him speak commands to the Echo and make it say just about anything he wants, and any other voice command the Echo accepts should work too. If you want to go outside of those boundaries, check out the method of spoofing WeMo devices we saw the other day.
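The control flow amounts to: press the remote's talk button via GPIO, let Espeak read the command aloud, then release the button. Here is a minimal sketch of that sequence, not [Gamaral]'s actual script. The pin number, the function names, the settle delay, and the assumption that driving the pin HIGH counts as a press are all hypothetical.

```python
import subprocess
import time

BUTTON_PIN = 17  # hypothetical BCM pin; the write-up does not name the pin used


def espeak(text):
    """Read text aloud through the Pi's audio output using the espeak CLI."""
    subprocess.run(["espeak", text], check=True)


def make_gpio_setter(pin=BUTTON_PIN):
    """Return a function that presses (True) or releases (False) the button."""
    import RPi.GPIO as GPIO  # imported lazily so the sketch loads off-Pi too

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)
    # Assumption: HIGH means "pressed"; the real wiring may be inverted.
    return lambda pressed: GPIO.output(pin, GPIO.HIGH if pressed else GPIO.LOW)


def send_voice_command(text, set_pin, speak=espeak, settle=0.5):
    """Hold the talk button, speak the command, then always release."""
    set_pin(True)
    try:
        time.sleep(settle)  # give the remote a moment to start listening
        speak(text)
    finally:
        set_pin(False)
```

On a Pi this would be called as `send_voice_command("Simon says hello", make_gpio_setter())`, and a cron entry could run such a script on a schedule. Passing the pin setter and the speech function in as arguments keeps the sequence testable without the hardware attached.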

Be sure to check out [Gamaral]’s entertaining video below to see the hack in action.

27 thoughts on “Hacking Amazon Echo Through Its Remote”

  1. That’s a straightforward way to do it. But since the remote uses Bluetooth, I bet it would be possible to spoof it with just a little protocol sniffing. At worst, a logic analyzer on the bus between the CC2560 radio and the micro.

    I know the non-voice Fire Stick remote is similar hardware-wise, and acts as a straightforward HID profile device.

      1. I believe that chango is saying that by spoofing the remote, the Pi would be able to send voice commands directly to the Echo instead of Pi -> headphones -> Microphone -> Echo. This would hopefully improve the recognition accuracy since there would be no speaker/mic distortions or ambient noises.

    1. Just once, can there be an Echo article without someone posting this (incorrect) assumption?

      The Echo does do the voice processing on Amazon’s servers, but only after the wake word. That is, local voice recognition listens for the wake word only, and then the audio gets handed off.

      Every time, jeez.

      1. The fact that it does the voice processing remotely is silly. I’ve got single-core phones that can easily output video to a screen and stereo audio with local voice processing. That’s an Android phone, which uses a weak interpreted environment that is crap. So why can’t my Echo run a little code locally to do the lifting with its dual-core CPU, no video requirements and dedicated audio processors? Because then Amazon wouldn’t have an excuse to log as much data.
        I’m not saying it’s always listening; a simple test with Wireshark rules that out. But it really irks me that my Echo sends everything over the internet when it really doesn’t need to. Just wish there was a way to “root” it and see what we can get it doing.

        1. Voice recognition in the cloud is nothing new. Android’s voice-to-text was cloud-only before there was a local capability. Google’s “Ok Google” is all processed in the cloud. I’ve tried many voice-to-text programs, on phone and PC, and none of them work as well as the cloud ones. I don’t know if it’s sheer processing power, or the ability to be updated and fine-tuned in real time, but the cloud ones always just work better.

          1. I just remembered a great point for speech recognition in the cloud: contextual awareness. If I talk to some offline voice recognition and say “Skrillex”, what are the odds that it’s going to get the correct word? What if I say “Skrillex Make It Bun Dem”? I just tried with Google Now and it got it exactly right. It gets it right because it can do more than just try to recognize words by their sound: it can compare them to a wide list of artists, news, products, recent and past searches (globally), etc., and update it all in real time. Offline voice recognition cannot possibly compete for accuracy. Frankly, it’s precisely this online processing that makes the Echo feasible in the first place.

  2. “Then he wires in a 3.3v voltage regulator and uses a 100 ohm resistor as a voltage divider to bring it down to the 1.8 volt logic level used by the Echo remote.” No he doesn’t. He uses a 3.3V regulator for power, and then uses a voltage divider for the GPIO. Big difference.

  3. Hello, I am new to Hackaday. I just bought an Echo with the remote and was interested to see this hack. I tried to replicate it, but I have a couple of questions I cannot find the answers to.

    – The wiring on the Echo’s remote is clearly described, but it is not clear which GPIO pin on the RPi it connects to. The project page doesn’t say either: https://hackaday.io/project/6820-amazon-echo-voice-command-automation

    – If I understand correctly from the video, the RPi sends the command (typed in on the keyboard) through the wire to the “voice” pin on the remote, and that makes the Echo speak. If so, why is Espeak needed on the RPi? Does Espeak translate the text to a speech waveform and send it through the GPIO to the “voice” pin of the remote? (I don’t know what input the “voice” pin requires, but that seems to contradict the following statement, which mentions only GPIO HIGH and LOW: “A script is used to toggle the RPi’s GPIO HIGH and LOW before and after espeak reads out the voice command.”)

    – And where can I download the script?

    – Just to confirm, I can still use the two AAA batteries instead of the 3.3v power in, right? (Maybe not wise mechanically.)

    Too many questions for a new guy at Hackaday? There are always more questions when learning. Thanks, pals.

    1. Sr Marconi, obviously the GPIO is used to simulate the button press, and the voice commands are being transmitted as audio from the headphone-out of the RPi to the microphone-in of the remote. There are no commands going over the GPIO.

  4. Also has anyone tried sending audio from a raspberry pi to the echo as a bluetooth speaker device? That would be the best way to push alerts to the echo speaker rather than using the remote controller hack and “simon says”.

    1. I’m sure you could use the Pi and have it as a Bluetooth speaker, but… you can’t initiate a ‘skill’; meaning, I can’t tell my computer to push out “Alexa, flashbriefing” through the Bluetooth Echo and have it go off to Amazon and bring back the news. That is why this hack works. Using the remote, you can utilize all the functions of the Echo.

      1. Right, so the Echo-as-BT-speaker approach won’t work. But as Chango notes in the very first comment, the remote communicates with the Echo via Bluetooth as well. So unless Amazon is using some proprietary and hard-to-crack protocol to transmit audio across the BT connection (whichever profile it’s using), it should be possible to emulate the remote using your own device (Pi, Arduino, whatever) and get the Echo to say what you want, when you want, by sending it audio speech prefaced with the “Simon says” key phrase.

        I find this possibility to be one worth pursuing. Unfortunately I’ve been unable to find anything on the web about how the remote sends audio to the Echo (which BT profile is used, etc.), or about anyone already implementing such a hack. But then, maybe my Google-Fu is weak. Has anyone else pursued this?

  5. So, late to the party on this, but it’s 2018 and apparently there is still no good way to send notifications to Alexa (i.e. in my case have Alexa tell you when a motion sensor is triggered or start a skill when I enter a room without telling her to). Your approach is the best way to go. Still, I feel like this project could go a little further. Has anyone considered removing the MD v1.2 microphone and attempting to send a signal directly to the remote? Is it possible? I purchased a few from ebay, but my skills and equipment are pretty basic. I attempted to reflow the area to remove it without success. That said, can the buttons be customized to elicit responses? In the end this seems like the best way to automate skills without a voice response.

    1. I’m wondering the same thing. I’ve got it all working but really want to feed audio directly into the mic. I’m asking on Adafruit forums because they sell i2s mems mic breakouts and also on the Raspberry Pi forum. If I can get it working I’ll post back here on what to do.
