“Alexa, Order Everyone In San Diego A Doll House”

Every day it seems there is a new Alexa story in the news, as for the moment the Amazon voice assistant is in the ascendant over its rivals from Google, Apple, and Microsoft. Today’s slice of Alexa weirdness comes courtesy of a newsreader in San Diego, who inadvertently triggered Alexa-enabled devices within hearing distance of a television to buy doll houses when he reported on a Dallas child’s accidental purchase.

It’s unclear whether any doll houses were dispatched or whether the Echos and Dots merely started the process and asked their owners for confirmation, but we hope it serves to draw attention to the risks associated with an always-on and always-listening device. We’ve looked at how the technology has seemingly circumvented the normal privacy concerns of our own community, so it’s hardly surprising that this kind of incident catches the greater public completely unprepared. It’s one thing for the denizens of a hackspace to troll the owner of a Dot by adding embarrassing products to their wish list, but against a less-informed user who hasn’t worked out how to lock down the device’s purchasing abilities, it’s not too far-fetched to imagine a criminal attack.

Voice assistants are clearly going to become a ubiquitous feature of our lives, and it is inevitable that there will be more such unfortunate incidents which will serve to educate the public about their privacy before the technology reaches maturity. This particular story is definitely Not A Hack, though as our “Alexa” tag shows the devices have huge potential to bring a new dimension to our work. It’s up to all of us in our community to ensure that the voice assistant owners in our lives are adequately educated about them, and maybe resist the urge to say “Alexa, add all the Hackaday merchandise to my wish list!”.

99 thoughts on “Alexa, Order Everyone In San Diego A Doll House”

  1. The only people I know that are ok with this platform and similar ones are either complete morons or work for the companies that create them. Nothing more than surveillance state modules nothing less.

    1. I agree, and at any rate are we getting so lazy that we can’t even be bothered to enter a few keystrokes now to make a simple purchase? Or even to lift our cellphones to our lips?

      1. I think the same argument could have been made for GUIs on operating systems, computer mice, automatic transmissions and cruise control in cars, etc.

        Technology will evolve. We should embrace that while making sure that the use cases enhance our lives and our society and avoiding uses where it does the opposite. In this case, voice commands are a feature, but phoning home and reporting that data is a drawback. The second part is the questionable technology, not the voice command feature itself.

        1. Perhaps. Nevertheless this ‘always listening’ type of interface IS vulnerable, and in my opinion, unnecessary. By unnecessary, I mean that it does not add any great level of convenience that offsets potential security issues. It really doesn’t matter if it is the issuing company, hackers, or the government, I personally do not care for any entity having an ear in my private spaces regardless.

          1. That’s your opinion, and that’s fine. For me, I’m quite happy with it. It’s not broadcasting 24/7; it only broadcasts when it picks up the command words.

            Just as long as we’re all capable of making an informed choice about the product, then it’s purely up to the user.

    2. “Nothing more than surveillance state modules nothing less” Really? Maybe you should go and learn about the platform before spouting bull crap like this.

      Waiting for trigger words is not “surveillance”. It’s like saying Microsoft is recording my keystrokes because Windows is waiting for me to type something. Git a grip man.

      “Siri, order me a new tin foil hat”.

      1. In a device that lacks the processing capacity to understand its own trigger word, how do you suppose it knows when the trigger word is said?

        Always on, always listening, always sending everything it hears for offsite analysis.

        1. You don’t need a whole lot of processing capacity. I had an 8-bit parallel sampler on an Amiga 1200 (14 MHz 68020EC, 2 MB RAM) running a voice-activated launch menu with multiple keywords. Got bored of it after a couple of weeks, so all this is just so 1996 to me.

        2. If it’s not always listening, I am curious as to why a warrant was issued for any recordings made by an Alexa device that was present at the time of a recent murder.

          I’d like to think that the requesting party (law enforcement) did not fully understand the operation, but the warrant request was still issued (refused, this time).

          The fact that I’ve not yet seen an explicit definition of just what it is that Alexa does pass to the server, and instead a whole load of ambiguity-fuelled arguments on the subject, is worrying.

          It would be in Amazon’s interest to be upfront about the data collected, as all this ambiguity is going to kill sales. At least to our community.

          It’s not at all a stretch to imagine they collect other keyword-triggered recordings, or just all of it, for automated marketing purposes, is it?


        3. “Always on, always listening, always sending everything it hears for offsite analysis.”

          But it’s not. It’s demonstrably not sending everything it hears, only the data that comes after a keyword. The Echo does the keyword ID on-device, then records and sends the info to Amazon. Unless you can provide some evidence that it’s doing otherwise?
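The behaviour described in this comment (spot the keyword locally, upload only what follows it) can be sketched roughly as below. This is an illustration only, not Amazon's implementation: `frames_to_upload`, the use of plain words as stand-ins for audio frames, and the fixed `capture_len` are all assumptions made for the example.

```python
from typing import Iterable, List


def frames_to_upload(frames: Iterable[str],
                     wake_word: str = "alexa",
                     capture_len: int = 3) -> List[str]:
    """Return only the frames that would leave the device.

    Each "frame" is a word standing in for a chunk of audio (an assumption
    for readability). Nothing is kept until the wake word is matched
    locally; after that, the next `capture_len` frames are collected.
    """
    uploaded: List[str] = []
    remaining = 0  # frames still to capture after the last wake word
    for frame in frames:
        if remaining > 0:
            uploaded.append(frame)
            remaining -= 1
        elif frame.lower() == wake_word:
            # Local match: open a short capture window, send nothing yet.
            remaining = capture_len
    return uploaded


heard = ["private", "chat", "Alexa", "order", "a", "dollhouse", "more", "private"]
print(frames_to_upload(heard))
```

Under these assumptions only the three frames after the keyword are collected; whether a real device behaves this way is exactly what the thread is arguing about, and would have to be checked by observing its traffic.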

      2. @ fred, Microsoft isn’t recording your keystrokes because your computer doesn’t send your typing offsite. Your computer however does record your keystrokes, and uses the recordings to allow you to post comments.

        1. It does send back a lot of telemetry though… and as the end user is unable to decode said telemetry, the end user is unable to prove that the telemetry does not contain samples of their keystrokes including passwords.

          It only needs to be a subset that contains a crucial password to lead to a security breach, and it would not be rocket science to send back the contents of a masked text box every time someone tabbed out of one or submitted a form.

  2. I smell bullshit on the dollhouse story – Amazon sell hundreds of different dollhouses – I can’t believe for a second that it’s set up to just order a randomly selected item from a range without further selection or confirmation.

    1. It’s not random. It will order the most popular item on Prime eligible accounts from Prime Now eligible products that are in stock and ready for immediate delivery. That limits the choices tremendously.
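The selection rule claimed in this comment can be written down as a small filter-then-rank step. This is only a sketch of the rule as the commenter states it, with a made-up catalogue; `Product`, `pick_item`, and every field name are hypothetical and say nothing about how Amazon actually ranks items.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Product:
    name: str
    popularity: int           # hypothetical ranking score
    prime_now_eligible: bool
    in_stock: bool


def pick_item(catalog: List[Product], query: str) -> Optional[Product]:
    """Filter to eligible, in-stock matches, then take the most popular."""
    candidates = [
        p for p in catalog
        if query.lower() in p.name.lower()
        and p.prime_now_eligible
        and p.in_stock
    ]
    # default=None covers the case where the filters leave nothing to order.
    return max(candidates, key=lambda p: p.popularity, default=None)


catalog = [
    Product("Deluxe Dollhouse", popularity=90, prime_now_eligible=False, in_stock=True),
    Product("Wooden Dollhouse", popularity=70, prime_now_eligible=True, in_stock=True),
    Product("Mini Dollhouse", popularity=80, prime_now_eligible=True, in_stock=False),
]
choice = pick_item(catalog, "dollhouse")
print(choice.name if choice else "nothing eligible")
```

The point of the sketch is that each filter shrinks the candidate set, so "hundreds of dollhouses" can collapse to one unambiguous pick before any confirmation step.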

    2. I don’t have an echo, but I believe this is exactly the purpose of the voice-ordering capabilities — to select the most likely item based on your request. It reads out the item it has chosen and asks for confirmation that this is what you want. Here’s the relevant documentation for that feature:

      So in this case, the little girl asked for a dollhouse and the Echo asked her to confirm that it had found what she wanted to order. In the case of news reporters triggering orders, the confirmation step would have stopped that from happening.

    3. @mikes electric stuff
      Then you must not believe it can buy anything at all, because absolutely everything on Amazon has multiple versions, quantities, and colors, and many things are sold by multiple Prime-enabled sellers.

  3. Quite a while ago I had the idea to write a piece of malware that does nothing except yell “Alexa! Order me 10000 rolls of toilet paper!” and then uninstall itself :P

  4. Aaaaand!!!!!!! Not long ago some commenters were essentially calling us ‘tin hatters’.
    Well, to all of them…

    TOLD YOU SO!!!!!!!!!!!!!!

    Rant over, I feel better now.

    Oh, quote from the article: “This particular story is definitely Not A Hack, th….”
    This is definitely a hack if and when successful.
    My understanding of ‘hack’ is: 1.) to roughly break down a task a piece at a time to reach a desired goal, e.g. bad programming, hacksaw, axing a log, hacked to death; 2.) to bodge, to make something work the way it wasn’t intended, e.g. this article about tricking Alexa into autopurchasing, a hackjob, hacked together.

        1. My other reply to you seems to have not made it through moderation.

          But at least we found the troll (you, even mentioned in the not yet passed moderation post)

          Anyway, since cars are so green and carbon neutral, park your “green CO2 neutral” car in a garage and close the door with the engine on for a couple of hours.

          You can’t deny govt. facts: it’ll definitely warm up your garage, and there will be no worry about poisoning as your car is “green CO2 neutral”. Also you’ll be doing the world some good by going green.

          If you don’t then you are a conspiracy theorist and you can go and wear your tinfoil hat.

  5. Well, the whole idea of something sitting in my house and listening 24/7 sounds pretty awkward to me. How sure can we be that the thing only reacts to our requests? What if it records everything and keeps the records ready… for whatever reason?

    1. Let’s see here. You could desolder its storage device, throw it into a programmer, and have a look at the data to see if it’s recording everything and keeping the records ready. You could educate yourself on compression algorithms and see if it’s even realistic for a device to be permanently recording everything you say, 24/7. You could use Wireshark and see what sort of data is being sent to Amazon’s servers.

      Just because you’re too lazy to acquire knowledge doesn’t mean you get to fill those gaps with whatever tinfoil-hat bullshit appeals most to your mental-illness-addled mind.

      1. Q: How many days of recordings converted into plain ASCII text (don’t tell me Alexa can’t do that, every other modern computer can) could be stored on a cheap 1mb flash memory module? 16mb? 32mb? 1gb?

        It’s possible. Perhaps not likely, but certainly possible.

        1. Kinda depends on whether this device can convert your speech on the fly (other than that trigger word) or sends it to their servers to be processed (honestly I’m not too sure). Either way, I think you meant MB, not mb.
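The question above about storing speech as plain text is easy to put rough numbers on. The sketch below is back-of-envelope arithmetic only: the speaking rate of 150 words per minute and the 6 bytes per word (five letters plus a space) are assumptions, not measurements, and it says nothing about whether the device can transcribe locally at all.

```python
def hours_of_speech_as_text(flash_bytes: int,
                            words_per_minute: int = 150,
                            bytes_per_word: int = 6) -> float:
    """Hours of continuous talking that fit in `flash_bytes` as ASCII text."""
    bytes_per_hour = words_per_minute * 60 * bytes_per_word
    return flash_bytes / bytes_per_hour


# Decimal megabytes/gigabytes, matching the sizes asked about above.
for label, size in [("1 MB", 1_000_000), ("16 MB", 16_000_000),
                    ("32 MB", 32_000_000), ("1 GB", 1_000_000_000)]:
    hours = hours_of_speech_as_text(size)
    print(f"{label}: about {hours:.1f} hours of continuous speech")
```

With these assumed figures, 1 MB holds roughly 18.5 hours of nonstop talking, so storage size alone would not be the limiting factor.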

      2. I’d say it’s realistic to assume one hour of voice can be compressed to a few megabytes. GSM used something around 8 kbps, and I guess the algorithms have improved since. After all, the voice could be preprocessed.
        I don’t talk 24/7, so any realistic flash memory size would be enough for a month or so. Anyway, I take it the thing is more or less constantly connected to the Internet and could send the recording to Amazon.
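The "few megabytes per hour" estimate follows directly from the bitrate quoted above. A minimal sketch of the arithmetic, taking the commenter's figure of roughly 8 kbps at face value; the 4 GB flash size and 4 hours of speech per day in the usage lines are purely illustrative assumptions.

```python
def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Storage for one hour of audio at a constant bitrate (decimal MB)."""
    bits = bitrate_kbps * 1000 * 3600   # bits in one hour
    return bits / 8 / 1_000_000         # bits -> bytes -> megabytes


def days_of_speech(flash_mb: float, bitrate_kbps: float,
                   talk_hours_per_day: float) -> float:
    """Days of recordings that fit if only actual speech is stored."""
    return flash_mb / (megabytes_per_hour(bitrate_kbps) * talk_hours_per_day)


print(f"{megabytes_per_hour(8):.1f} MB per hour at 8 kbps")
print(f"{days_of_speech(4000, 8, 4):.0f} days on a hypothetical 4 GB part")
```

At 8 kbps an hour of audio comes to 3.6 MB, so a month of a few hours of speech per day fits comfortably in a few hundred megabytes.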

  6. To all the tin hatters out there. My keyboard is connected to my PC 24/7. Does that mean Microsoft is recording everything I type and keeping those records ready.. for whatever reason? NO

    By all means question the security but don’t go down the ‘government / everyone / aliens / big brother / lizard men’ are watching me rabbit hole.

    1. Very true, merely having a keyboard doesn’t mean interception of all keystrokes. But it does if a keylogger is installed on your machine.

      I can’t speak for tinfoil-hat-wearers, but my interest lies not in what the devices do now, but in their potential for future abuse. It’s not always logging but it is always listening, and a change to what it does with what it hears is only a firmware update – official or malicious, it doesn’t matter which – away.

      Somehow I don’t expect shadowy agencies with black helicopters and lizard controllers to feature, but it wouldn’t surprise me if somewhere down the line these devices were compromised for criminal purposes.

      1. Criminal organizations operate very much like normal businesses. If there exists a positive ROI one will make the investment, no matter how high it is. The days of random grab-and-snatch are long gone.

  7. “Voice assistants are clearly going to become a ubiquitous feature of our lives” Yes, I hope so. But I also hope to have the backend server doing the heavy lifting on my own private LAN.
    I wonder if anyone’s hacked an Alexa or similar to communicate with a Jasper or other CMU Sphinx-based local backend so the packets never leave the local subnet…

      1. I can understand why after a night’s fiddling around….
        Pocketsphinx just barfed on Debian 8 after it compiled cleanly, and the Gentoo box won’t even compile it because it’s apparently dependent on PulseAudio, and that box is a Poettering-free zone.
        I think option B sounds more achievable: buy an Orange Pi, run the daemon on that remotely and make it the microphone node, and feed that into the Home Assistant backend running on the main server…

    1. I don’t know, I still haven’t found a voice assistant that does anything useful.

      Cortana’s the best so far because she can tell me the weather, wake me up from a nap, remind me of things, and even launch Notepad++ and Skyrim (but not Visual Studio or any other game?). Still, there’s a lot of things I’d like an assistant to do (notify me of e-mails, launch any program, dictation into any program, etc.) that nothing has figured out how to do.

      I hope a voice assistant comes along soon that can actually be customized and extended. I’d LOVE to have a voice assistant that works and sounds like the Enterprise’s computer.

      “Computer. Tea, Earl Grey, hot.”
      ::Keurig starts::
      “Computer. Primary 3D printer status.”
      “Primary 3D printer is at 67%, ETA 2 hours and 16 minutes.”
      “Computer. Activate program Gann 8.”
      ::Lights dim, TV turns on, Plex client opens::

      1. Allegedly Home Assistant glued to Jasper in the right way can do all this and take voice input via remote computers with compatible browsers and a plugin, so there is plenty of scope for a DIY button-style input device. I say allegedly because I had a quick look at it but found myself working through Jasper’s dependencies on Debian and decided the first priority was to make Home Assistant speak to my existing DIY’d home automation gear, and hours disappeared on a tangent. Liking the look of it so far…

        1. There’s a couple of great voice-based home assistants on GitHub. I haven’t gone far into home automation (leaving that for when I get an actual house!) but my father’s been working on a robot and I was brainstorming the simplest way to add voice commands to it.

  8. This is inspiring me to write a top 10 hit with every device keyword in it repeatedly, followed by undesirable actions. Obviously the message that this is stupid and an inevitable result of mass stupidity to accept it, is not getting through.

    Ok google order semtex, Xbox find pressure cooker….

        1. That reminds me of a Laurel and Hardy clip my grandma has on a Super 8 reel. In the clip they’re destroying each other’s stuff. So when played in reverse you can watch them put everything back together ^^

    1. There was a very old programming book that used EXACTLY this example to teach you basic logic. Can’t remember the book’s name, but must have been late 1960s, or maybe early 70s.

      Scenario: household robot comes out under voice command. You only want it to come out at the correct time, so you program it to only come out if it hears “word1” (can’t remember the actual word) followed by “word2” in X seconds. Teaches you “AND” logic.

      But then your daughter listens to a popular song that has “word1” and “word2” in the same time window, and your cleaning(?) robot keeps needlessly coming out at the wrong time. So now you need to add additional “NOT” logic: “word1” AND “word2”, but NOT if “word3” (which is in the song) also occurs in the same time window.

      Seems like Alexa can’t be programmed so specifically.

      (Really wish I could remember the name of that book. The only other thing I can remember about it is some chapter called “Socrates and the Hound”, but Google fails me so it must have been an obscure book.)
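The robot example from that book maps onto a few lines of code. The sketch below is one reading of the rule as described in this comment ("word1" AND "word2" within X seconds, but NOT if "word3" falls in the same window); the function name, the event format, and the 5-second default are invented for illustration.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (time heard in seconds, recognised word)


def should_trigger(events: List[Event], window: float = 5.0,
                   first: str = "word1", second: str = "word2",
                   veto: str = "word3") -> bool:
    """True if `first` is followed by `second` within `window` seconds
    AND `veto` is NOT heard in that same window."""
    for t1, w1 in events:
        if w1 != first:
            continue
        # Everything heard strictly after `first`, up to the window's end.
        in_window = [w for t, w in events if t1 < t <= t1 + window]
        if second in in_window and veto not in in_window:
            return True
    return False


print(should_trigger([(0.0, "word1"), (2.0, "word2")]))                  # command
print(should_trigger([(0.0, "word1"), (2.0, "word2"), (3.0, "word3")]))  # the song
```

One design consequence worth noting: the NOT clause means the robot cannot act the instant it hears "word2"; it has to wait until the window closes to be sure "word3" never arrives.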

    1. Just visit each of them and say something like “Alexa, buy me a Southbend KEMTL-100”, or any other expensive item found on Amazon. Repeat this until they either disable Alexa or stop inviting you…

  9. 1. Turn Alexa on.
    2. Loop online ISIS recruitment videos so Alexa can hear it.
    3. Wait…
    4. SWAT breaks down your door, shoots your dog, then takes you into custody.
    Are we having fun yet?

    1. I am certain that someday these 24/7 listeners will be triggered by certain words like “bomb” “jihad” and other buzzwords, and start sending everything to someone.

      Just like e-mails will be scanned for those words too.

      I know I won’t be installing microphones and cameras in my house.

    2. Won’t work. Instead, “alexa, buy silver nitrate*, nail varnish remover, lots of fertiliser and some bleach”
      And repeat. Take out housing insurance and leave home.
      Return and claim insurance, rinse, repeat.

      *Source of the joke phrase is the film Four Lions (in case you wondered)

  10. Alexa would become much more intelligent with a simple PIR sensor and a little sensor fusion: “I hear my name. Let’s see if there is a warm body around that could be to blame.”

    1. Wouldn’t have stopped it. There WERE warm bodies watching the news at the time. You need to match the warm bodies with the direction the sound’s coming from. Then you need to only allow certain “registered” warm bodies to count… Have fun working out all the corner cases!
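The extra checks listed in this reply (match the warm body to the direction of the sound, and only count "registered" bodies) can be sketched as a single predicate. This is a thought experiment, not a working design: a plain PIR sensor reports neither bearing nor identity, so the `Body` fields, the bearing of the sound source, and the 20-degree tolerance are all assumed inputs from hypothetical sensors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Body:
    bearing_deg: float   # direction of the warm body, as seen from the device
    registered: bool     # whether this person is allowed to issue commands


def angular_difference(a: float, b: float) -> float:
    """Smallest angle between two bearings, handling the 359 -> 0 wraparound."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def accept_command(sound_bearing_deg: float, bodies: List[Body],
                   tolerance_deg: float = 20.0) -> bool:
    """Accept only if a registered body sits where the sound came from."""
    return any(
        b.registered
        and angular_difference(b.bearing_deg, sound_bearing_deg) <= tolerance_deg
        for b in bodies
    )


viewers = [Body(bearing_deg=270.0, registered=True)]   # owner on the sofa
print(accept_command(90.0, viewers))    # voice came from the TV at 90 degrees
print(accept_command(265.0, viewers))   # voice came from the sofa
```

Even this toy version leaves the corner cases the reply warns about, for example a registered person sitting directly in front of the television.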

  11. Why the heck doesn’t Alexa simply request a confirmation before spending your money? Siri asks for a confirmation before sending a frigging text message.

    Oh, that’s right… It’s not in Amazon’s interests to prevent you from buying stuff from them (whatever the reason).

  12. Many people already have little things that listen to everything they say, remember those things and blurt out those things at inconvenient times. They are called “children.” If you’ve ever dealt with them, you’ll know what I mean.

    If you treat Alexa and other such gadgets as ill-mannered children, you may be able to deal with them effectively.

  13. Right, so I think the UN should mandate all such services to respond to the command ‘Query. Are there any spy devices in range?’ or some such universal statement, so people can test if they can talk freely. And that should apply to Google and Amazon and Facebook but also to cars and CCTV and their audio monitoring.
