Smart Assistants Need To Get Smarter

Science fiction has regularly portrayed smart computer assistants in a fanciful way. HAL from 2001: A Space Odyssey and J.A.R.V.I.S. from the contemporary Iron Man films are both great examples. They’re erudite, wise, and capable of doing just about any reasonable task that is asked of them, short of opening the pod bay doors.

Cut back to reality, and you’ll only be disappointed at how useless most voice assistants are. It’s been twelve long years since Siri burst onto the scene, with Alexa and Google Assistant following a few years later. Despite all that time on the market, their capabilities remain limited and uninspiring. It’s time for voice assistants to level up.

Is There Anything You Can Do?

Alexa allows users to easily purchase common household items via voice queries. It can readily search prior orders to help nail down the correct item. It’s a useful feature many rely on every day. (Image: Amazon)

The modern crop of voice assistants was, in many ways, a game-changer when it first hit the market. These assistants gave us our first real taste of interacting with computers in natural language. No longer did we have to carefully craft exact commands for a simplistic voice recognition system. Instead, the idea was that we could speak almost normally, and the assistant would respond.

These days, voice assistants can handle a broad spectrum of tasks. You can use them to send a message, if you trust the voice recognition not to misrepresent your words, or you can add events to your calendar. You can do basic maths, play songs, and even switch your lights on and off – assuming you’ve knitted your smart home together properly. Google and Amazon will let you make purchases, too, within certain parameters.

Fundamentally, though, these are all pretty basic party tricks. In every case, the voice assistant is really just saving the user a few mouse clicks, or saving them from pulling out their smartphone. What’s missing is the higher-level reasoning that would make them truly useful, like a proper human assistant.

Ask Google Assistant to recommend you a good local restaurant, and you’ll be disappointed. Nine times out of ten, it will just type “restaurants near me” into Google and show you a list. A human assistant would know that you prefer steak and pub food to tapas, do the research, and come back to you accordingly. Big tech companies have all this data on most of us, or are certainly able to collect it, but they’re not employing it in this useful way.

The Flight Booking Test

Picture another scenario. You’re road tripping down the highway towards the airport, and you need to book a flight on the way. Our movie protagonists would surely bark a simple request at their AI assistant, who would respond with a series of convenient flights and prices. The appropriate bookings would then be handled with pre-stored payment information.

Try that with Google Assistant or Bixby today, and you’ll get nowhere. The former will simply dump you into a web search. The latter has a dedicated add-on for looking at flights, but it’s virtually unusable, failing to properly understand the right departing and arriving airports. Siri is similarly weak-minded, faltering when asked to look into available hotels online.

Yes, it’s that bad. You have a powerful smartphone sitting next to you in the car. It can understand what you say perfectly well, but it’s entirely powerless to execute even a simple request.

Contrast that to having a friend in the passenger seat, who could simply read you out a couple of flights and ask which you want to buy. It’s not that hard, but your voice assistant can’t do it.

A user asks Siri to book a hotel in Melbourne, Australia. When that fails, they decide to try Hong Kong instead, with the assistant faring little better. According to the user, at best, Siri would allow the user to make a phone call to the hotel in question. It took over ten attempts just to get that far. Booking directly was impossible.

It’s true that some innovation in this area has been made; Amazon integrated flight bookings with various airlines into Alexa years ago, for example. The problem is that piecemeal efforts don’t cut it. For such a feature to be useful, it has to work properly almost all of the time. Voice recognition technology has been mocked since the 1990s for its poor reliability, and it’s a lesson today’s voice assistants could learn from. It’s all well and good if a user can book flights with a certain airline in the continental US using their voice assistant. If it fails every time they’re in a different country, or want to fly a different airline, users will give up, because the feature is functionally useless most of the time.

It bears noting that many of these situations are regionally variable, too. For example, if you’re in the US, you might find that flight and hotel bookings are more readily available to your smart assistant. Or, in Australia, you might note that the Google Assistant has a good handle on movie session times. But the regional variability and the inconsistency are the big problems that really spoil these features.

What’s The Fix?

Smart speakers are selling en masse, as are smartphones with voice assistants baked in. Building one with a real edge in capability could be a competitive advantage. (Image: Amazon)

These are just a few examples; you can probably think of thousands more. These fundamentally aren’t even technically difficult queries for an assistant to respond to. Not only that, but the required information is already available online. The problem comes down to two factors: integration, and authority.

Solving the integration problem requires a certain level of work on the back end. Companies would need to hook into existing databases and ensure their voice assistants can reliably parse and work with the data. This would require agreements and coordination with external companies in many cases, further complicating the issue.
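To make the integration problem concrete, here is a minimal sketch of the kind of back-end glue involved. Everything in it is hypothetical: the `FlightQuery` fields, the city-to-airport lookup, and the assumption that a partner airline’s API wants IATA codes are invented for illustration, not any real company’s interface.

```python
# Hypothetical sketch: turning a parsed voice request into a structured
# flight-search query. The FlightQuery shape and the IATA lookup table
# are assumptions for illustration, not a real airline's interface.
from dataclasses import dataclass

@dataclass
class FlightQuery:
    origin: str       # IATA code resolved from the user's location
    destination: str  # IATA code resolved from the spoken city name
    date: str         # ISO date resolved from "today", "tomorrow", etc.

def build_query(origin_city: str, destination_city: str, date: str,
                iata: dict) -> FlightQuery:
    """Resolve spoken city names to airport codes before hitting an API."""
    try:
        return FlightQuery(iata[origin_city.lower()],
                           iata[destination_city.lower()], date)
    except KeyError as missing:
        # This is exactly where today's assistants tend to give up and
        # dump the user into a web search instead of asking a question.
        raise ValueError(f"Unknown city: {missing}")

iata = {"melbourne": "MEL", "hong kong": "HKG"}
q = build_query("Melbourne", "Hong Kong", "2023-06-01", iata)
```

None of this is hard in isolation; the work is in the agreements and data plumbing behind each lookup table and endpoint, multiplied across every airline, hotel chain, and region.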

As for authority, that’s something companies have struggled with since the dawn of smart assistants. Amazon, and more recently Google, will allow you to purchase items with your smart assistant. However, that has required protections to be put in place after awkward instances of TV broadcasts inadvertently triggering home devices. Similarly, there are risks for families, where young children might ask a helpful voice assistant to make purchases without prior parental authorization. However, in the case of a user speaking directly into their smartphone, it’s hard to imagine that voice fingerprinting or a simple device unlock wouldn’t be enough to authorize purchases.

Given a greater level of integration, and thus utility, is possible, why aren’t big tech companies rushing to unlock this functionality? The real key may be that it doesn’t serve them any real purpose. Tech companies could certainly put in the work to advance voice assistant capabilities, but it would take time and money. Never mind the greater risk to reputation if newly granted authority allowed smart assistants to do something truly inconvenient or awful for users. The voice assistants we already have aren’t exactly money-spinners as it is, so it’s perhaps no surprise the difficult and expensive problems aren’t being solved.

Many will say that the problems listed here are edge cases, and that nobody uses their voice assistants this way. This author would counter that nobody does because it simply doesn’t work right now. The germ of this article came from a long drive, where it became apparent that I would have to spend half an hour clicking my way through various basic admin tasks because my voice assistant was completely incapable of helping. Twelve years after the first one hit the market, it shouldn’t be that way.

If one voice assistant does begin to crest the integration mountain, things could change. If it performs reliably, it will also earn the authority to act that we don’t currently give to humble smart assistants today. At that point, you can expect rival tech companies to improve their own products to match. Until one company makes the first move, though, we’re out of luck. We’ll all be wishing we had a real assistant to help us out, rather than the impotent disembodied voices that currently live in our smartphones.

81 thoughts on “Smart Assistants Need To Get Smarter”

  1. Voice assistants partnered with cameras should be able to tell me I left my keys, remote, etc. in the oven. Maybe with a smart oven refusing to turn on because there is a foreign object inside.

    I’m waiting.

    1. Does a French baguette count as a foreign object? :P

      More seriously, it’s hard to determine what should or shouldn’t be in an oven. Small child? Definitely not. Something wrapped in tin foil? Impossible to know. Fish? Hard to tell if it’s a pet or food.

        1. You are wasting energy.
          Yes, you don’t put raw bread dough in a cold oven, but plenty of other things are fine, for example heating up or cooking meat or vegetables, or reheating pizza. If you put the food in while the oven is cold, you don’t lose an ovenful of hot air either. Win-win.

          1. Even apart from the things that don’t work that way, it’s also not going to take the same length of time if you start from cold, and it won’t even be the same length of time as someone else’s oven starting from cold. Maybe it’s a purposeful use of energy that decreases the chance of wasting food.

  2. The problem is there’s no way to monetize them. They can’t play ads on your speakers or they’ll end up in the trash in a hot second; they can’t charge a subscription fee for them, so what’s a major corporation to do?

    1. > can’t charge a subscription fee

      Why not? If I had a human assistant I’d pay them. If a virtual assistant were more useful they’d justify a subscription fee.

    2. “so what’s a major corporation to do?”

      Just what they have been doing for years; track 👣 your movements, Contacts info, Internet usage, television watching, etc. In order to build a huge digital dossier which is exploited for the purposes of making your money theirs!

      1. > Now amazon is getting into healthcare and policing instead

        Given that this whole article is about how useless these “AI” assistants are, this should be scaring the shit out of us. It certainly does for me.

    3. They might not make money directly from the assistant. However, if Google, for example, didn’t have an assistant, then they wouldn’t be getting money from me for my YouTube subscription, Nest subscription, Google Drive subscription, and I wouldn’t be buying a Pixel phone, watch, and tablet either…
      If they invested MORE money into their assistant instead of cutting back, they would get more people coming over to their ecosystem.

      You don’t have to charge a fee for everything.

  3. The big problem is with offline or limited-bandwidth situations. The smart device should be smarter about handling local needs like lights on/off, temp up/down, security camera display, etc. My DSL connection (Brightspeed/CenturyLink) has, at times of low bandwidth, caused slow to no responses. A smarter device would only reach out or ‘phone home’ when it needed more in-depth data.

    As mentioned in the article about a TV triggering a response… one night I was watching a comedy show where one person made a suggestion to solve another’s problem with “what you need is to have sex”… my Lenovo/Google display piped up and said “I’m sorry… I can’t help you with that right now.”

    1. The big problem is with online solutions which require a connection in order to work.
      Imagine in the future your light switch isn’t connected to your light but to a server on the internet which sends a message to your light bulb.
      And the internet goes out.
      For some people (fools, frankly) this is already a reality.
      Maybe they are starting to realise that’s dumb and push back?

  4. I remember reading the keen observation that the reason voice assistants end up relegated to either setting timers or playing music is that they’re limited by the fact that they were made to serve their makers, not their users.

    1. And the fact that all the smarts of the device are foiled by search engine optimizations and algorithm chasing by online vendors in attempts to misdirect the results to something you didn’t want.

      Say “Alexa, buy me a light bulb”, and you’ll never know what you will get.

  5. Yes, the assistants could certainly be smarter and do more, and they are getting that way. I think this article is a bit harsh though, I love my assistants and find them very useful. I’m actually [somewhat] impressed with what-all they can do and look forward to them getting even better.
    Part of the problem, as DougM says, is how to monetize them. The difficulty in doing so has caused their developers to slash the R&D budgets. A tiered subscription program would pro’ly be the way to go, but convincing people to pay for it will require more and better skills, and a lot of marketing savvy.

  6. The problem is fundamental.

    “Natural Language” is an awful interface.

    We have this idea that ‘normal’ human conversation is somehow the ideal way to pass information and/or convey ideas.

    It absolutely is not. People CONSTANTLY misunderstand each other. Two people can use the same words, with the same inflection, and the same body language, but mean different things.

    However, there is already a solution to this problem.

    “Technical Language” (TL)

    In contrast with “Conversational Language”(CL), TL seeks to limit a statement to as few meanings as possible. Ideally one meaning.

    Since we’re on the subject of sci-fi voice controls, Star Trek gives us another well-known example.

    “Computer (replicate); Tea, Earl Grey, Hot.”

    Picard didn’t say “Computer, can you make me some tea?” as one might in conversation. That statement can have multiple interpretations and is highly context specific.

    The problem with TL is it requires people to learn it, and it requires effort to use.

    So, to answer your question, “Is there anything you can do?”.
    Nope. Your whole premise is flawed. Using a mutable language that can mean different things to the speaker and listener every time it is used will never reliably produce the desired result. Eliminate the fallacy.

    I’m not saying “give up on voice control”.
    I’m saying that this is an interface. It is not unreasonable to require users to learn things.
    The idea that everything needs to be so simple to use that anyone can use it without training or even familiarity is just wrong.
    The clothes you are wearing required training to put on.
    You had to learn words.
    You had to be taught to read.

    We need to be encouraging engagement, because people are efficient/lazy.

    If you make something too easy, people will refuse to learn anything about it.

    But again, the actual premise of a natural language interface is a fallacy.
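    The slot-ordered command above (“Tea, Earl Grey, Hot”) can be made concrete with a toy parser. This is a sketch only: the slot names (`item`, `variant`, `modifier`) and the comma-separated grammar are invented for illustration.

```python
# Toy parser for a "technical language" command: a fixed, ordered list
# of slots separated by commas, e.g. "Tea, Earl Grey, Hot".
# Slot names and vocabulary are invented for illustration.
SLOTS = ("item", "variant", "modifier")

def parse_tl(command: str) -> dict:
    parts = [p.strip().lower() for p in command.split(",")]
    # Unfilled trailing slots are simply omitted; extra parts are an
    # error. That strictness is the point of TL: each statement has
    # at most one reading.
    if len(parts) > len(SLOTS):
        raise ValueError("Too many slots")
    return dict(zip(SLOTS, parts))

order = parse_tl("Tea, Earl Grey, Hot")
```

    The trade-off is exactly the one described above: the parser is trivially reliable, but the user must memorise the slot order.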

    1. Needing to know specific commands is a big reason I’m not fond of voice control. At least a graphical interface will show me exactly what the program can do. Voice interfaces are opaque like command line interfaces.

      But I think large language models (text input, I know) do a much better job in that respect. I can keep an ongoing conversation, update old information, ask what the voice assistant is capable of, etc.

    2. Having to say “computer replicate; tea, earl grey, hot” is a bit tedious. You should be able to say “make me a drink” and have it know what you are after. You just woke up: here is some coffee. You just came inside and the temperature is below 5 °C: hot chocolate; above 20 °C: water… The information is all there; it can infer a decent statistical match. If you want something specific, you would then ask for it.

      I think that’s the premise of the article. If my home assistant could make a drink, I should be able to just ask for a drink with a decent chance of getting what I want, instead of specifying every small detail for every interaction.

      1. That’s not what you say. You use one word to indicate that the sentence is addressed to the voice assistant, because it can’t make eye contact with you and know who you’re talking to. If you use a button to activate it, you can skip this word. Then you use a word or two to let it know you want it to make you a drink now. If the words for your drink are unambiguous in context, the “I want a drink” can be omitted. To choose what drink, you can either use any valid unambiguous way of identifying the drink, or you can allow it to ask you further questions or make a suggestion for your approval. If it remembers you and your preferences, then asking for “my usual” or asking for a drink and accepting its suggestion may be more possible. That’s because the context and the saved information resolved the ambiguity well enough that most people don’t care it might sometimes do the wrong thing, not because it wasn’t important.

        But in the star trek example? I can *easily* believe that if my magic box was set up to produce just about any conceivable food item for a billion cultures and species, then someone would program in a formal way of addressing it that was chosen to not irritate those people, and that scanning the person asking for food and making obvious assumptions could be taken negatively. Do *you* want to be the one to explain why the machine would automatically include an ingredient which is a necessary supplement for pregnant XYZ’ians if that status might be a secret? I can easily imagine the leader of a ship who needs to think and act diplomatically a lot and simultaneously needs to give orders a lot would get into the habit of asking for what he wants in a somewhat less conversational way, as a combination of efficiency and appearances.

        Therefore, he addresses the device, he starts with the general request “tea”. This (though I haven’t a clue in that fictional universe) would in this universe indicate that water should be heated immediately before the request is finished being decoded. Following, he specifies the type, followed by his personal preference. At a counter, you might say “Hello, I’d like” + “a Mocha” (“medium, with 2% milk”) which is a greeting plus the main request with variations specified in a reasonable order, though multiple phrasings are equivalent.

        Garbage in, garbage out. People don’t realize how much they say that the other person doesn’t actually decode, and just fills in with what they think goes there- and computers aren’t human, so they are less likely to think of the same things unless someone’s done a lot of preparation ahead of time.

        1. “People don’t realize how much they say that the other person doesn’t actually decode”
          I realize this every time I send an email with a specific question, and the supporting information needed to understand the context, and get a reply that indicates the person never even read the email and jumped to a conclusion before they even read to the end of the first sentence.

          Because apparently reading and listening are too hard to do, as is analytical thinking.

          I believe you could tell a human server “Tea, Earl Grey, hot” and you would be asked if you want “sweet tea”… because my diabetic husband routinely requests unsweetened iced tea and they ask if he wants sweet tea, or just give him sweet tea without any clarification. And no, this isn’t in the southern US, where “nobody drinks unsweetened tea”.

          1. I wouldn’t be entirely surprised if it’s easier to get unsweetened tea in the south; the restaurants go through so much iced tea that they have probably gotten used to people asking for it just like lemon/no lemon. On the other hand you’ll have limited options for hot drinks except coffee.

      2. ‘You should be able to say “make me a drink”’

        And the Sirius Cybernetics Corporation Nutrimatic will give you a cup of something almost, but not quite, entirely unlike tea.

        1. Be careful.
          Of the many meanings of “Make me a drink”, some are particularly unpleasant to the requester.

          How big is the confidence interval on “transform me into a liquid intended for imbibing”?

          It might get messy.

          1. (Ford) “you’d better be prepared for the jump into hyperspace. It’s unpleasantly like being drunk.”
            (Arthur) “What’s so unpleasant about being drunk?”
            (Ford) “Ask a glass of water.”

    3. As a voice-only interface, TL is vastly more efficient.

      I imagine in the Star Trek example you gave, if you just asked for Tea, it would then ask what kind of tea, then once you make your choice, it would ask you how hot you want it. Even that’s a fairly efficient way of choosing a drink compared to CL.

      I get frustrated with how conversational Virtual Assistants currently are. I want to be able to quickly reply to a message it’s read to me while driving, not spend two minutes of back and forth to get one seven-word reply sent.

      1. Not really.

        The way he asked for tea was a crappy and annoying way to talk.

        All he had to say was “computer, I’d like a cup of hot Earl grey tea”.
        (Or be really lazy, and just say “computer, hot Earl Grey”)

        Much better because it flows NATURALLY.

        You absolutely don’t need to talk like you’ve had a brain injury to talk clearly. Much better to slightly tweak your phrasing here and there than to waste time learning an unnecessary new way to talk.

    4. The NLP problem occurs between humans too.

      Disambiguation of meaning is achieved through various feedback mechanisms.
      For example, most human-human conversations are two-way where the listener, if confused about the meaning being conveyed, replies back to the speaker to find out more details.
      During the conversation, the speaker “finds the level” of the listener pretty quickly, e.g. is this person intelligent or not, do they know my defaults, are they a native [English] speaker/not. The level and pace of conversation is adjusted accordingly.

      Now let’s consider the state of the art in Computer speech recognition (and then, semantic understanding on top). Currently, speech recognition is used in a unidirectional way that picks out specific utterances, words, phrases etc, with a reasonably useful accuracy (in clean audio conditions with the same speaker/similar accent to training group). What is not happening (much) is a feedback loop where the computer interacts with the speaker in real time to hone the meaning attempted to be conveyed. An upshot of this is that the speaker doesn’t adjust their speech or phraseology to improve the outcome, and then gets frustrated.

      The next step ought to be to add such conversational feedback. Part of which can then pick off the meaning. There’s still a long way to go.
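      A minimal sketch of what such a feedback loop might look like: rather than committing to its top hypothesis, the recognizer asks a clarifying question when the top two interpretations are too close to call. The confidence scores and the margin threshold here are invented for illustration.

```python
# Sketch of a clarification loop: if the top interpretation isn't
# clearly ahead of the runner-up, ask instead of guessing.
# Scores and the margin threshold are invented for illustration.
def respond(hypotheses, margin=0.2):
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if best[1] - runner_up[1] < margin:
        # Ambiguous: hand the decision back to the speaker.
        return ("clarify", f"Did you mean '{best[0]}' or '{runner_up[0]}'?")
    return ("act", best[0])

print(respond([("book a flight", 0.55), ("look up lights", 0.50)]))
```

      The side effect described above falls out naturally: once the assistant asks questions, the speaker starts adjusting their phrasing too.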

      1. Given that humans who make a career out of assisting people over the phone still struggle with “finding the level” of the other person – or determining what they want and what to do about it – I don’t think we’ll be able to make an AI any better. The people who instead chose to make a career out of programming probably won’t be able to correctly express enough of the tactics that the AI should employ, much less successfully teach them to it.

        (The behaviors they keep choosing or neglecting to implement for self driving cars seem like they didn’t involve enough experts on driving, and instead they took something that can operate a car, taught it what the laws are, and let it practice not hitting things. Which isn’t the same as actively participating in traffic, where you do things like realize that someone who is edging to the right of the lane and slowing slightly just before each road sign will turn right when they find the road they are looking for.)

    5. We have precedent for learning technical language. The early Palm PDAs didn’t learn to read our handwriting, they came with their own script, called “Graffiti” that you had to learn. Once you had the hang of it, it was often _faster_ than normal handwriting, and definitely less ambiguous and much less resource-intensive for the machine to process. So I don’t think it would be impossible for people to learn a “technical” syntax that we know we have to use “when talking to a computer.”

      We used to at least have the OPTION of doing something similar when using a search engine. You could use plus and minus signs, quotation marks etc to clarify your search. Google got rid of all that and decided to give us the results Google _thinks_ we want, not what we _actually searched for_.

      Of course, the results we receive on a web search in 2023 are what google thinks will sell more advertising, not what the user actually wants. And that, my friends, is why I:
      A) Don’t expect “voice assistants” to get any better in the current market, and
      B) Won’t let these things in my house or car.

  7. The next step is having the voice assistant always listening and remembering our preferences. Which has huge security and privacy issues. Not that they are not already always listening. We just need more control over how and what data is collected.

    They also need to be strongly keyed to our voices, which Siri seems to do well with. My phone won’t respond to my wife and hers won’t respond to me. But it needs to go further and be more controllable. Purchases, for example, should require a verification or “password” that only works if my voice is the one saying it. Still being able to respond to others should also be permitted with authorization. I should be able to say “hey Siri, respond to my son”.

    ChatGPT and other AI are getting very impressive, and I believe we are getting very close to being able to have JARVIS-comparable systems.

    The major market for this technology is the elderly. Having a conversational computer that can help and entertain elderly people is very much needed: smart enough to identify when they are in distress and automatically call for help. These kinds of systems could also keep people company. These are services that people will pay for if there is enough utility and reliability. My wife’s grandmother, for example, has dementia and lives alone. We are already paying for a bracelet that is supposed to alert us if she has a fall or is unresponsive. But a conversational AI that helps her with her memory while monitoring her status would be much better.

  8. Are we anywhere close to AI using GUIs like a person? It sounds like the solution you want. How would an AI handle popup ads? Hmmm

    The current trend is using an LLM to generate code or markup, and then submitting that to an API. There may be multiple layers like an LLM optimized for chat on the frontend and an LLM optimized for code on the back. My employer has something similar, with an AI chat bot that has access to “skill plugins” that are basically API endpoints wrapped in LLM prompts. It’s pretty effective since LLMs do well at translation, even English to JSON translation.
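    That “skill plugin” arrangement can be sketched in a few lines: the LLM’s only job is to emit structured JSON naming a skill and its arguments, and ordinary code dispatches the call. The registry, the `set_timer` skill, and the JSON shape here are all hypothetical, not any particular vendor’s format.

```python
import json

# Hypothetical skill registry: each "plugin" is just a function the
# assistant can route to once the LLM has emitted structured JSON.
def set_timer(minutes):
    return f"Timer set for {minutes} minutes"

SKILLS = {"set_timer": set_timer}

def dispatch(llm_output: str) -> str:
    """Route the LLM's JSON 'tool call' to the matching skill."""
    call = json.loads(llm_output)  # e.g. produced by the backend LLM
    return SKILLS[call["skill"]](**call["args"])

# Pretend the chat-frontend LLM translated "remind me in ten minutes"
# into this JSON:
result = dispatch('{"skill": "set_timer", "args": {"minutes": 10}}')
```

    The appeal of the pattern is that the hard natural-language problem is isolated in the translation step, while the skills themselves stay deterministic and testable.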

  9. The reason why these voice controlled boxes suck is because the companies don’t produce them to make your life better. They don’t make any profit if you book a flight or go to a restaurant. Improving the voice/authority detection won’t change this fundamental flaw.

    Eventually if these boxes served the buyers and not the makers like Donald P. said above, then a more universal booking system for flights and restaurants etc. would be helpful, like a plugin system to make our lives easier.

    1. I can tell you what they get used for around my house, at least: I use it as a voice control front end for my home automation system, and to set timers to remind me to check on the laundry; roomie uses it to stream music.

      Personally, I’d pay money for an _offline_ voice interaction system instead of having to homebrew one myself. That project’s been on hold for a couple of reasons, chief among them a lack of hardware for the nodes, and time to construct a library of intents and get the various software parts glued together.

      If there were a way to hack or redirect the newer generation Dots to use an on-premise server, I’d love to see it.

    2. Well, they do have limited utility, I think. My late mother, who in her 90s had poor eyesight and poor mobility and lived on her own, had one installed by a carer and set up so Mum could say ‘Alexa, play Classic FM’ and it would do so. But when my brother tried setting up smart bulbs so Mum could turn on the lights without getting up and potentially having a fall, it proved too much for Mum to remember the magic incantations, so she ended up sitting in the dark until a neighbour noticed and came to fix things. The user experience in this case was very poor and potentially a big problem.

    3. My major use is to add reminders of things I need to do that come to mind while I am driving to and from work. I generate lots of to-dos while commuting, and completely forget them all by the time I get to work. I also think about the research I work on, and come up with a lot of ideas for future experiments, which I would completely lose if I couldn’t get even part of the idea in digital form.

  10. Book you a flight? Recommend local restaurants? Jeez, I would be happy if my speaker groups actually worked… I would be happy if my music would play for more than 15 seconds without stopping for no reason. If only Google would stop breaking things in the name of updates.

  11. My personal Google Assistant experience has been in sharp decline for about 6 months. A Google Home in the kitchen and nearly a dozen Nest Minis and Chromecasts means we have full voice command coverage around the house and can stop/start/cue media or lights in every room. Everything *seemed* to be getting better in our home automation environment until the speakers started having trouble with commands, performing the wrong actions, not responding, etc. Now, the same everyday requests to ‘play News’, ‘turn it down’, or even just ‘stop playing’ often turn into shouting matches of 10+ failed interactions before I unplug them or manually execute changes using Home Assistant. I have obviously done the power cycle, factory reset, and voice retraining that should be the obvious fixes, but no improvement. It *feels* like a lackluster firmware update may have made them less responsive (or less NOISY, from Google’s perspective) and a weakening of the back-end interpreting power is saving the mothership a lot of money. My biggest wish at this point is for the Google Home and Chromecast devices to be reverse engineered, jailbroken, and made to work ‘cloudless’ and directly with my local Home Assistant instance.

    Great comments here. Please keep this discussion going <3

  12. Can’t do it. Don’t even use text to speech. If they want to wow me they can start with working text to speech. Been computering these 35 odd years. “Simple commands carefully crafted” describes the current crop very well.

    1. Text to speech can not just sound like a human voice, but can mimic a specific person’s voice pretty well given some sample clips. That capability is the cornerstone of various businesses who’ve built software to do it, though; it’s not everywhere yet.

      Speech to text is good enough to understand my accent overtop of road noise with just the software built into a 4-year-old phone and not even connected to the internet. Doesn’t know what to do with the words I said, but it does recognize the words.

  13. Personally, I think a person should just get smarter and wiser, get some exercise, and get richer by not having ‘voice’ devices planted around the house just to be hip. Get up and shut the light off. Get up and make the coffee, bacon and eggs… And way less electronic maintenance/waste too :) . I really think the planet is headed for an idiocracy :) .

  14. I’m on the older side of life but have been involved with electronics and other technical hobbies most of my existence. I’ve used home automation, in one form or another, for decades. For the last 4 years I’ve used Alexa extensively as an on/off remote control, executing automations, scheduling events, reminders, calendar maintenance, quick/simple information retrieval, etc., etc., etc.

    In my mind, we’ve come a very long way from television remote controls, self-installed relay control, X10 devices, to home automation as we now recognize it. I’ve been quite happy following the journey and feel confident things will improve with time.

  15. What I think would be funny is a spoof of the “2001” scene where HAL has locked David Bowman out of the ship, but replace Dave in the pod with Tony Stark in an Iron Man suit and HAL with JARVIS. Of course played and voiced by Robert Downey Jr. and Paul Bettany.
    Tony: Open the pod bay doors, JARVIS.
    JARVIS: I’m sorry, Tony. I’m afraid I can’t do that.
    Riff on from there. 🙂

    1. “Tony: Open the pod bay doors, JARVIS.
      JARVIS: I’m sorry, Tony. I’m afraid I can’t do that.
      Tony: I need the LOWER pod bay doors opened.. I need to make a leak…
      JARVIS: I’m sorry but you will need to hold it… “

  16. I do industrial automation and would happily set up my home in the same manner.
    Walk up to the front door, it recognizes you and unlocks before your hand pushes it open.
    It’s 6 PM, so it automatically checks the weigh sensor under the kettle before asking whether to turn it on, then alerts you.
    It turns the TV on as you enter the living room.
    At 8 PM it asks if you want a bath before closing the plug and monitoring the bath level and temperature, then alerting you when it’s ready.
    As is, I have my lights on timers and/or PIR so I can wander around.
    Connect it to the net? No effin way. We don’t connect power stations, chemical plants etc to the net and there’s a simple reason for that – Max Headroom (series 1 ep4 I think) where he gets locked out of his own house…
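
    The 6 PM kettle check above can be sketched as a simple local rule. Everything here is a hypothetical stand-in for illustration: the 300 g threshold is assumed, and in a real setup the weight would come from a load cell and the switch-on would drive a smart plug, with no cloud involved.

    ```python
    from datetime import time

    KETTLE_MIN_GRAMS = 300  # assumed: enough water to be worth boiling

    def kettle_routine(now: time, kettle_weight_g: float) -> str:
        """Decide what the house should do at the scheduled check."""
        if now < time(18, 0):
            return "wait"            # not 6 PM yet
        if kettle_weight_g < KETTLE_MIN_GRAMS:
            return "alert: refill"   # don't boil a near-empty kettle
        return "switch kettle on"

    # Example decisions:
    print(kettle_routine(time(17, 30), 500))  # wait
    print(kettle_routine(time(18, 5), 100))   # alert: refill
    print(kettle_routine(time(18, 5), 500))   # switch kettle on
    ```

    The same pattern (time trigger, sensor condition, action or alert) covers the door, TV, and bath examples, which is why industrial-style automation maps onto a home so naturally.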

  17. I dunno, man. Asking real humans to do stuff, even in the correct context, even when accomplishing such a feat is their reason to be employed or engaged in the activity- even that goes wrong enough that I don’t have a lot of hope for an automated version.
    Also, if I rolled out of bed and told (not asked, told, because that is how it is phrased) my wife “make me a drink”, I can assure you, a drink of any variety, let alone a nice hot Earl Grey, would absolutely NOT be what I would get. Again, it feels unreasonable to hold a non-human to that standard.
    Finally, I guess people’s lives are different, but I’ve crafted my career and life such that replying to a text message while driving is something I never need to do, or at least maybe 1-2 times per year, and even at that, the letter “K.” Honestly ask yourself if you are happy constantly “working” or doing whatever it is on your phone while driving (I’m looking at you, 90% of California drivers). Do you reeealllly need to be doing that? Or might the moment while you, I don’t know, careen a 2000 lb killmobile past a school crosswalk deserve a teeny bit more of your attention?

    1. I have a frequent use case

      me: “I’m on my way home”
      1 minute later when I’m already on the road
      SO: “we’re out of onions, can you go to the store?”

      I’d use the voice texting if it was better … and if my phone still had physical controls to activate the damn feature. RIP Active Edge aka squeeze for assistant.

  18. There was a much smarter voice assistant called Viv that could break down and solve complex tasks.

    The startup developing it got bought by Samsung, which promptly buried it inside Bixby.

  19. I’m glad I’m seeing this. Our Google Assistant is literally so dumb that I recently had to look up whether it just hadn’t been updated or was an old model, and found everyone else thought the same.
