Smart Assistants Need To Get Smarter

July 19, 2023

Science fiction has regularly portrayed smart computer assistants in a fanciful way. HAL from 2001: A Space Odyssey and J.A.R.V.I.S. from the contemporary Iron Man films are both great examples. They’re erudite, wise, and capable of doing just about any reasonable task that is asked of them, short of opening the pod bay doors.

Cut back to reality, and you’ll only be disappointed at how useless most voice assistants are. It’s been twelve long years since Siri burst onto the scene, with Alexa and Google Assistant following years later. Despite years on the market, their capabilities remain limited and uninspiring. It’s time for voice assistants to level up.

Is There Anything You Can Do?

Alexa allows users to easily purchase common household items via voice queries. It can readily search prior orders to help nail down the correct item. It’s a useful feature many rely on every day. *Amazon*

The modern voice assistants were, in many ways, game changers when they first hit the market. They gave us our first real taste of interacting with computers in natural language. No longer did we have to carefully craft exact commands for a simplistic voice recognition system. Instead, the idea was that we could speak almost normally, and the assistant would respond.

These days, voice assistants can handle a broad spectrum of tasks. You can use them to send a message, if you trust the voice recognition not to misrepresent your words, or you can add events to your calendar. You can do basic maths, play songs, and even switch your lights on and off – assuming you’ve knitted your smarthome together properly. Google and Amazon will let you make purchases, too, within certain parameters.

Fundamentally, though, these are all pretty basic party tricks. In each case, the voice assistant is just saving the user a few mouse clicks, or saving them from pulling out their smartphone. What’s missing is the higher intelligence and thinking that would make them truly useful, like a proper human assistant.

Ask Google Assistant to recommend you a good local restaurant, and you’ll be disappointed. Nine times out of ten, it will just type “restaurants near me” into Google and show you a list. A human assistant would know that you prefer steak and pub food to tapas, do the research, and come back to you accordingly. Big tech companies have all this data on most of us, or are certainly able to collect it, but they’re not employing it in this useful way.

The Flight Booking Test

Picture another scenario. You’re road tripping down the highway towards the airport, and you need to book a flight on the way. Our movie protagonists would surely bark a simple request at their AI assistant, who would respond with a series of convenient flights and prices. The appropriate bookings would then be handled with pre-stored payment information.

Try that with Google Assistant or Bixby today, and you’ll get nowhere. The former will simply dump you into a web search. The latter has a dedicated add-on for looking at flights, but it’s virtually unusable, failing to properly understand the right departing and arriving airports. Siri is similarly weak-minded, faltering when asked to look into available hotels online.

Yes, it’s that bad. You have a powerful smartphone sitting next to you in the car. It can understand what you say perfectly well, but it’s entirely powerless to execute even a simple request.

Contrast that to having a friend in the passenger seat, who could simply read you out a couple of flights and ask which you want to buy. It’s not that hard, but your voice assistant can’t do it.

A user asks Siri to book a hotel in Melbourne, Australia. When that fails, they decide to try Hong Kong instead, with the assistant faring little better. According to the user, at best, Siri would allow the user to make a phone call to the hotel in question. It took over ten attempts just to get that far. Booking directly was impossible.

It’s true that some innovation in this area has been made; Amazon integrated bookings with various airlines with Alexa years ago, for example. The problem is that piecemeal efforts don’t cut it. For such a feature to be useful, it has to work properly almost all of the time. Voice recognition technology has been the subject of mockery since the 1990s for its poor reliability. It’s a lesson that today’s voice assistants could learn from. It’s all well and good if a user can book flights with a certain airline in the continental US using their voice assistant. If it fails every time they’re in a different country, or wanting to fly a different airline, then the users will give up because the feature is functionally useless most of the time.

It bears noting that many of these situations are regionally variable, too. For example, if you’re in the US, you might find that flight and hotel bookings are more readily available to your smart assistant. Or, in Australia, you might note that the Google Assistant has a good handle on movie session times. But the regional variability and the inconsistency are the big problem that really spoils these features.

What’s The Fix?

Smart speakers are selling en masse, as are smartphones with voice assistants baked in. Building one with a real edge in capability could be a competitive advantage. *Amazon*

These are just a few examples; you can probably think of thousands more. These fundamentally aren’t even technically difficult queries for an assistant to respond to. Not only that, but the required information is already available online. The problem comes down to two factors: integration, and authority.

Solving the integration problem requires a certain level of work on the back end. Companies would need to hook into existing databases and ensure their voice assistants can reliably parse and work with the data. This would require agreements and coordination with external companies in many cases, further complicating the issue.
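
To make the integration problem concrete, here is a minimal sketch of what a common back-end interface might look like. Everything in it is hypothetical: the `FlightProvider` interface, the `DemoAirline` stand-in, and the canned fares are assumptions for illustration, not any real airline's or assistant's API. The point is simply that once every partner answers the same kind of query, the assistant can merge the answers and read out a short list.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Flight:
    airline: str
    origin: str
    destination: str
    price: float


class FlightProvider(Protocol):
    """Hypothetical interface each airline or aggregator integration would implement."""

    def search(self, origin: str, destination: str) -> list[Flight]: ...


class DemoAirline:
    """Stand-in provider with canned data; a real one would call a partner's API."""

    def __init__(self, name, fares):
        self.name = name
        self.fares = fares  # maps (origin, destination) to a price

    def search(self, origin, destination):
        price = self.fares.get((origin, destination))
        if price is None:
            return []  # this provider does not fly the route
        return [Flight(self.name, origin, destination, price)]


def cheapest_flights(providers, origin, destination, limit=3):
    """Query every provider and merge the results, cheapest first."""
    results = []
    for provider in providers:
        results.extend(provider.search(origin, destination))
    return sorted(results, key=lambda f: f.price)[:limit]
```

The hard part in practice is not this loop but the agreements and data quality behind each provider, which is exactly where the coordination cost comes in.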

As for authority, that’s something companies have struggled with since the dawn of smart assistants. Amazon, and more recently Google, will allow you to purchase items with your smart assistant. However, that has required protections to be put in place after awkward instances of TV broadcasts inadvertently triggering home devices. Similarly, there are risks for families, where young children might ask a helpful voice assistant to make purchases without prior parental authorization. However, in the case of a user speaking directly into their smartphone, it’s hard to imagine that voice fingerprinting or a simple device unlock wouldn’t be enough to authorize purchases.
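
As a rough illustration of the authority question, the sketch below gates purchases on where the request came from. The `RequestContext` fields, the `voice_match` score, and the 0.9 threshold are all assumptions made up for this example; no real assistant is known to work exactly this way.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    source: str            # "phone" or "shared_speaker" (hypothetical labels)
    device_unlocked: bool  # e.g. the phone was unlocked by its owner
    voice_match: float     # hypothetical speaker-verification score, 0.0 to 1.0


VOICE_THRESHOLD = 0.9  # illustrative value, not taken from any real product


def may_purchase(ctx: RequestContext) -> bool:
    """Decide whether a spoken purchase request should be allowed."""
    if ctx.source == "phone":
        # A personal device: an unlock or a strong voice match is enough.
        return ctx.device_unlocked or ctx.voice_match >= VOICE_THRESHOLD
    # Shared speakers hear TVs and children, so insist on a strong voice match.
    return ctx.voice_match >= VOICE_THRESHOLD
```

Under these assumptions, a TV broadcast triggering a living-room speaker would fail the check, while the same request spoken into an unlocked phone would pass.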

Given that a greater level of integration, and thus utility, is possible, why aren’t big tech companies rushing to unlock this functionality? The real key may be that it doesn’t serve them any real purpose. Tech companies could certainly put in the work to advance voice assistant capabilities, but it would take time and money. Never mind the greater risks to reputation if the newly granted authority allowed smart assistants to do something truly inconvenient or awful for users. The voice assistants we already have aren’t exactly money spinners as it is, so it’s perhaps no surprise the difficult and expensive problems aren’t being solved.

Many will say that the problems listed here are edge cases, and that nobody uses their voice assistants this way. This author would counter that nobody does because it simply doesn’t work right now. The idea for this article came from a long drive, where it became apparent that I would have to spend half an hour clicking my way through various basic admin tasks because my voice assistant was completely incapable of helping. Twelve years after the first one hit the market, it shouldn’t be that way.

If one voice assistant does begin to crest the integration mountain, things could change. If it performs reliably, it will also earn the authority to act that we don’t currently give to humble smart assistants. At that point, you can expect rival tech companies to improve their own products to match. Until one company makes the first move, though, we’re out of luck. We’ll all be wishing we had a real assistant to help us out, rather than the impotent disembodied voices that currently live in our smartphones.

81 thoughts on “Smart Assistants Need To Get Smarter”

The2dcour says:

July 19, 2023 at 7:19 am

Voice assistants partnered with cameras should be able to tell me I left my keys, remote etc in the oven. Maybe with a smart oven refusing to turn on because there is a foreign object inside.

I’m waiting.

1. TG says:
  
  July 19, 2023 at 9:56 am
  
  Good grief man
  
2. Dan says:
  
  July 19, 2023 at 11:23 am
  
  Does a French baguette count as a foreign object? :P
  
  More seriously, it’s hard to determine what should or shouldn’t be in an oven. Small child? Definitely not. Something wrapped in tin foil? Impossible to know. Fish? Hard to tell if it’s a pet or food.
  
  1. Dude says:
    
    July 19, 2023 at 11:35 am
    
    Usually there should be nothing in the oven when you first turn it on, because you just want it to heat up. Very few food items go in when the oven is cold.
    
    1. abjq says:
      
      July 20, 2023 at 4:19 am
      
      You are wasting energy.
      Yes, you don’t put raw bread dough in a cold oven, but plenty of other things are fine, for example heating up or cooking meat, or vegetables, heating up pizza. If you put it in there when it’s cold, you don’t lose an oven’s full of hot air too. Win-win.
      
      1. spaceminions says:
        
        July 20, 2023 at 6:07 am
        
        Even apart from the things that don’t work that way, it’s also not going to take the same length of time if you start from cold, and it won’t even be the same length of time as someone else’s oven starting from cold. Maybe it’s a purposeful use of energy that decreases the chance of wasting food.
        
3. Lewin Day says:
  
  July 20, 2023 at 4:11 pm
  
  I love how this comment train got entirely into the weeds.
  
  Hope you found the remote.
  
DougM says:

July 19, 2023 at 7:42 am

The problem is there’s no way to monetize them. They can’t play ads on your speakers or they’ll end up in the trash in a hot second; they can’t charge a subscription fee for them, so what’s a major corporation to do?

1. CampGareth says:
  
  July 19, 2023 at 7:53 am
  
  > can’t charge a subscription fee
  
  Why not? If I had a human assistant I’d pay them. If a virtual assistant were more useful they’d justify a subscription fee.
  
2. The Commenter Formerly Known As Ren says:
  
  July 19, 2023 at 7:59 am
  
  “so what’s a major corporation to do?”
  
  Just what they have been doing for years; track 👣 your movements, Contacts info, Internet usage, television watching, etc. In order to build a huge digital dossier which is exploited for the purposes of making your money theirs!
  
  1. Ostracus says:
    
    July 19, 2023 at 10:40 am
    
    Me doing the macarena must make for some interesting data.
    
    1. Garth Bock says:
      
      July 19, 2023 at 8:04 pm
      
      Gag !
      
3. M says:
  
  July 20, 2023 at 11:49 am
  
  Amazon hoped Alexa would lead to frequent AI-assisted Amazon purchases. It did not.
  
  Now amazon is getting into healthcare and policing instead.
  
  1. Nick says:
    
    November 27, 2023 at 5:21 pm
    
    > Now amazon is getting into healthcare and policing instead
    
    Given that this whole article is about how useless these “AI” assistants are, this should be scaring the shit out of us. It certainly does for me.
    
4. Lewin Day says:
  
  July 20, 2023 at 4:11 pm
  
  Yeah, definitely a problem. Why can’t we have nice things? There’s no revenue stream!
  
5. Marty says:
  
  July 20, 2023 at 7:17 pm
  
  They might not make money directly from the assistant, however if Google for example didn’t have an assistant, then they wouldn’t be getting money from me from my YouTube subscription, Nest subscription, Google Drive subscription, and I wouldn’t be buying a Pixel phone, watch and tablet either…
  If they invested MORE money into their assistant instead of cutting back, they would get more people come over to their ecosystem.
  
  You don’t have to charge a fee for everything.
  
  1. Nick says:
    
    November 27, 2023 at 5:25 pm
    
    Not me, Google already has way more info about me from Gmail. No way would I concentrate even more personal data with them.
    
Garth Bock says:

July 19, 2023 at 8:11 am

The big problem is with offline or limited bandwidth situations. The smart device should be smart enough to handle local needs like lights on/off, temp up/down, security camera display, etc. My DSL problems (Brightspeed/CenturyLink) have at times of low bandwidth caused slow to no responses. A smarter device would only reach out or ‘phone home’ when it needed more in-depth data.

As mentioned in the article about a TV triggering a response… one night I was watching a comedy show where one person made a suggestion to solve another’s problem with “what you need is to have sex” … my Lenovo/Google display piped up and said “I’m sorry… I can’t help you with that right now…”

1. Lewin Day says:
  
  July 20, 2023 at 4:12 pm
  
  Yeah, definitely a frustrating edge case.
  
  That said, my current smart assistant sucks all the time, and it’s connected to the Internet 24/7.
  
2. Petter says:
  
  July 20, 2023 at 4:29 pm
  
  “right now” 😂 a sign of future features then?
  
3. dave says:
  
  July 21, 2023 at 8:05 am
  
  The big problem is with online solutions which require a connection in order to work.
  Imagine in the future your light switch isn’t connected to your light but to a server on the internet which sends a message to your light bulb.
  And the internet goes out.
  For some people (fools frankly) this is a reality already.
  Maybe they are starting to realise that is dumb and pushing back ?
  
Donald Papp says:

July 19, 2023 at 8:15 am

I remember reading the keen observation that the reason voice assistants end up relegated to either setting timers or playing music is that they’re limited by the fact that they were made to serve their makers, not their users.

1. Dude says:
  
  July 19, 2023 at 11:27 am
  
  And the fact that all the smarts of the device are foiled by search engine optimizations and algorithm chasing by online vendors in attempts to misdirect the results to something you didn’t want.
  
  Say “Alexa, buy me a light bulb”, and you’ll never know what you will get.
  
  1. Garth Bock says:
    
    July 19, 2023 at 8:07 pm
    
    “Purchasing one million light bulbs with overnight shipping…”
    
DrWizard says:

July 19, 2023 at 8:21 am

Yes, the assistants could certainly be smarter and do more, and they are getting that way. I think this article is a bit harsh though, I love my assistants and find them very useful. I’m actually [somewhat] impressed with what-all they can do and look forward to them getting even better.
Part of the problem, as DougM says, is how to monetize them. The difficulty in doing so has caused their developers to slash the R&D budgets. A tiered subscription program would pro’ly be the way to go but convincing people to pay for it will require more and better skills, and a lot of marketing savvy.

Ian says:

July 19, 2023 at 8:46 am

The problem is fundamental.

“Natural Language” is an awful interface.

We have this idea that ‘normal’ human conversation is somehow the ideal way to pass information and/or convey ideas.

It absolutely is not. People CONSTANTLY misunderstand each other. Two people can use the same words, with the same inflection, and the same body language, but mean different things.

However, there is already a solution to this problem.

“Technical Language” (TL)

In contrast with “Conversational Language”(CL), TL seeks to limit a statement to as few meanings as possible. Ideally one meaning.

Since the article brings up the voice controls of Star Trek, we already have a well known example.

“Computer (replicate); Tea, Earl Grey, Hot.”

Picard didn’t say “Computer, can you make me some tea?” as one might in conversation. That statement can have multiple interpretations and is highly context specific.

The problem with TL is it requires people to learn it, and it requires effort to use.

So, to answer your question, “Is there anything you can do?”.
Nope. Your whole premise is flawed. Using a mutable language that can mean different things to the speaker and listener every time it is used, will never reliably reach the desired results. Eliminate the fallacy.

Note:
I’m not saying “give up on voice control”.
I’m saying that this is an interface. It is not unreasonable to require users to learn things.
The idea that everything needs to be so simple to use that anyone can use it without training or even familiarity is just wrong.
The clothes you are wearing required training to put on.
You had to learn words.
You had to be taught to read.

We need to be encouraging engagement, because people are efficient/lazy.

If you make something too easy, people will refuse to learn anything about it.

But again, the actual premise of a natural language interface is a fallacy.

1. Maave says:
  
  July 19, 2023 at 9:18 am
  
  Needing to know specific commands is a big reason I’m not fond of voice control. At least a graphical interface will show me exactly what the program can do. Voice interfaces are opaque like command line interfaces.
  
  But I think large language models (text input, I know) do a much better job in that respect. I can keep an ongoing conversation, update old information, ask what the voice assistant is capable of, etc.
  
2. GEO says:
  
  July 19, 2023 at 10:56 am
  
  Having to say “computer replicate; tea, earl grey, hot” is a bit tedious. You should be able to say “make me a drink” and it know what you are after. you just woke up, here is some coffee, you just came inside and the temperature is below 5c-hot chocolate, or above 20c-water…. The information is all there, it can infer a decent statistical match. If you want something specific you would then ask for it.
  
  I think thats the premise of the article. If my home assistant could make a drink I should be able to just ask for a drink with a decent chance of getting what I want instead of specifying every small detail for every interaction.
  
  1. spaceminions says:
    
    July 19, 2023 at 11:55 am
    
    That’s not what you say. You use one word to indicate that the sentence is addressed to the voice assistant, because it can’t make eye contact with you and know who you’re talking to. If you use a button to activate it, you can skip this word. Then you use a word or two to let it know you want it to make you a drink now. If the words for your drink are unambiguous in context, the “I want a drink” can be omitted. To choose what drink, you can either use any valid unambiguous way of identifying the drink, or you can allow it to ask you further questions or make a suggestion for your approval. If it remembers you and your preferences, then asking for “my usual” or asking for a drink and accepting its suggestion may be more possible. That’s because the context and the saved information resolved the ambiguity well enough that most people don’t care it might sometimes do the wrong thing, not because it wasn’t important.
    
    But in the star trek example? I can *easily* believe that if my magic box was set up to produce just about any conceivable food item for a billion cultures and species, then someone would program in a formal way of addressing it that was chosen to not irritate those people, and that scanning the person asking for food and making obvious assumptions could be taken negatively. Do *you* want to be the one to explain why the machine would automatically include an ingredient which is a necessary supplement for pregnant XYZ’ians if that status might be a secret? I can easily imagine the leader of a ship who needs to think and act diplomatically a lot and simultaneously needs to give orders a lot would get into the habit of asking for what he wants in a somewhat less conversational way, as a combination of efficiency and appearances.
    
    Therefore, he addresses the device, he starts with the general request “tea”. This (though I haven’t a clue in that fictional universe) would in this universe indicate that water should be heated immediately before the request is finished being decoded. Following, he specifies the type, followed by his personal preference. At a counter, you might say “Hello, I’d like” + “a Mocha” (“medium, with 2% milk”) which is a greeting plus the main request with variations specified in a reasonable order, though multiple phrasings are equivalent.
    
    Garbage in, garbage out. People don’t realize how much they say that the other person doesn’t actually decode, and just fills in with what they think goes there- and computers aren’t human, so they are less likely to think of the same things unless someone’s done a lot of preparation ahead of time.
    
    1. J. ODell says:
      
      July 20, 2023 at 12:30 pm
      
      “People don’t realize how much they say that the other person doesn’t actually decode”
      I realize this every time I send an email with a specific question, and the supporting information needed to understand the context, and get a reply that indicates the person never even read the email and jumped to a conclusion before they even read to the end of the first sentence.
      
      Because apparently reading and listening are too hard to do, as is analytical thinking.
      
      I believe you could tell a human server “Tea, Earl Grey, hot” and you would be asked if you want “sweet tea”… because my diabetic husband routinely requests unsweetened iced tea and they ask if he wants sweet tea, or just give him sweet tea without any clarification. And no, this isn’t in the southern US, where “nobody drinks unsweetened tea”.
      
      1. spaceminions says:
        
        July 21, 2023 at 8:16 am
        
        I wouldn’t be entirely surprised if it’s easier to get unsweetened tea in the south; the restaurants go through so much iced tea that they have probably gotten used to people asking for it just like lemon/no lemon. On the other hand you’ll have limited options for hot drinks except coffee.
        
  2. mike stone says:
    
    July 19, 2023 at 5:59 pm
    
    ‘You should be able to say “make me a drink”’
    
    And the Sirius Cybernetics Corporation Nutrimatic will give you a cup of something almost, but not quite, entirely unlike tea.
    
    1. The Commenter Formerly Known As Ren says:
      
      July 19, 2023 at 6:35 pm
      
      Finally, someone mentioned SCC’s Nutramatic!
      It is almost as if the writer wanted someone to comment about that!
      
    2. Ian says:
      
      July 19, 2023 at 8:02 pm
      
      Be careful.
      Of the many meanings of “Make me a drink”, some are particularly unpleasant to the requester.
      
      How big is the confidence interval on “transform me into a liquid intended for imbibing”?
      
      It might get messy.
      
      1. ONV says:
        
        July 21, 2023 at 3:00 am
        
        “Tea; earl grey; hot”…. “oh and a pan-galactic gargle blaster chaser”
        
      2. Matt Brunton says:
        
        July 21, 2023 at 6:10 am
        
        (Ford) “you’d better be prepared for the jump into hyperspace. It’s unpleasantly like being drunk.”
        (Arthur) “What’s so unpleasant about being drunk?”
        (Ford) “Ask a glass of water.”
        
3. spaceminions says:
  
  July 19, 2023 at 11:11 am
  
  And things like procedure words help a lot – some voice assistants do at least understand that they need to require specific key words, but not to the extent that they are used when people working together want to get things right with an imperfect communication channel. https://en.wikipedia.org/wiki/Procedure_word
  
4. Dude says:
  
  July 19, 2023 at 11:30 am
  
  >“Computer (replicate); Tea, Earl Grey, Hot.”
  
  Lipton or Twinings?
  
5. Z00111111 says:
  
  July 19, 2023 at 5:30 pm
  
  As a voice only interface, TL is vastly more efficient.
  
  I imagine in the Star Trek example you gave, if you just asked for Tea, it would then ask what kind of tea, then once you make your choice, it would ask you how hot you want it. Even that’s a fairly efficient way of choosing a drink compared to CL.
  
  I get frustrated with how conversational Virtual Assistants currently are. I want to be able to quickly reply to a message it’s read to me while driving, not spend 2 minutes of back and forth to get one 7-word reply sent.
  
  1. Robert Pruitt says:
    
    July 24, 2023 at 10:11 am
    
    Not really.
    
    The way he asked for tea was a crappy and annoying way to talk.
    
    All he had to say was “computer, I’d like a cup of hot Earl grey tea”.
    (Or been really lazy, and just said “computer, hot Earl Grey”)
    
    Much better because it flows NATURALLY.
    
    You absolutely don’t need to talk like you’ve had a brain injury to talk clearly. Much better to slightly tweak your phrasing here and there, than wasting time learning an unnecessary new way to talk.
    
6. Garth Bock says:
  
  July 19, 2023 at 8:08 pm
  
  Natural Language…
  
  https://youtu.be/Q3bdXctq7DM
  
7. KJ says:
  
  July 20, 2023 at 5:35 am
  
  I vote for mandating Lojban as a second language.
  
8. abjq says:
  
  July 20, 2023 at 6:23 am
  
  The NLP problem occurs between humans too.
  
  Disambiguation of meaning is achieved through various feedback mechanisms.
  For example, most human-human conversations are two-way where the listener, if confused about the meaning being conveyed, replies back to the speaker to find out more details.
  During the conversation, the speaker “finds the level” of the listener pretty quickly, e.g. is this person intelligent or not, do they know my defaults, are they a native [English] speaker/not. The level and pace of conversation is adjusted accordingly.
  
  Now let’s consider the state of the art in Computer speech recognition (and then, semantic understanding on top). Currently, speech recognition is used in a unidirectional way that picks out specific utterances, words, phrases etc, with a reasonably useful accuracy (in clean audio conditions with the same speaker/similar accent to training group). What is not happening (much) is a feedback loop where the computer interacts with the speaker in real time to hone the meaning attempted to be conveyed. An upshot of this is that the speaker doesn’t adjust their speech or phraseology to improve the outcome, and then gets frustrated.
  
  The next step ought to be to add such conversational feedback. Part of which can then pick off the meaning. There’s still a long way to go.
  
  1. spaceminions says:
    
    July 20, 2023 at 12:51 pm
    
    Given that humans who make a career out of assisting people over the phone still struggle with “finding the level” of the other person, or determining what they want and what to do about it, I don’t think we’ll be able to make an AI any better. The people who instead chose to make a career out of programming probably won’t be able to correctly express enough of the tactics that the AI should employ, much less successfully teach them to it.
    
    (The behaviors they keep choosing or neglecting to implement for self driving cars seem like they didn’t involve enough experts on driving, and instead they took something that can operate a car, taught it what the laws are, and let it practice not hitting things. Which isn’t the same as actively participating in traffic, where you do things like realize that someone who is edging to the right of the lane and slowing slightly just before each road sign will turn right when they find the road they are looking for.)
    
9. Lewin Day says:
  
  July 20, 2023 at 4:12 pm
  
  Interesting point, but I’d have to disagree. More often than not, my phone understands exactly what I’m asking for, it just has no way to execute
  
10. Nick says:
  
  November 27, 2023 at 5:57 pm
  
  We have precedent for learning technical language. The early Palm PDAs didn’t learn to read our handwriting, they came with their own script, called “Graffiti” that you had to learn. Once you had the hang of it, it was often _faster_ than normal handwriting, and definitely less ambiguous and much less resource-intensive for the machine to process. So I don’t think it would be impossible for people to learn a “technical” syntax that we know we have to use “when talking to a computer.”
  
  We used to at least have the OPTION of doing something similar when using a search engine. You could use plus and minus signs, quotation marks etc to clarify your search. Google got rid of all that and decided to give us the results Google _thinks_ we want, not what we _actually searched for_.
  
  Of course, the results we receive on a web search in 2023 are what google thinks will sell more advertising, not what the user actually wants. And that, my friends, is why I:
  A) Don’t expect “voice assistants” to get any better in the current market, and
  B) Won’t let these things in my house or car.
  
thearduinoguy says:

July 19, 2023 at 8:47 am

Surely a voice assistant that is powered by ChatGPT would be the obvious answer?

1. Piotrsko says:
  
  July 19, 2023 at 9:01 am
  
  Only if you don’t care what the results are that get returned.
  
Chris McDonald says:

July 19, 2023 at 9:11 am

The next step is having the voice assistant always listening and remembering our preferences. Which has huge security and privacy issues. Not that they are not already always listening. We just need more control over how and what data is collected.

They also need to be strongly keyed to our voices, which Siri seems to do well with. My phone won’t respond to my wife and hers won’t respond to me. But it needs to go further and be more controllable. Purchases, for example, should require a verification or “password” that only works if my voice is the one saying it. Still being able to respond to others should also be permitted with authorization. I should be able to say “hey siri respond to my son”.

ChatGPT and other AI are getting very impressive and I believe we are getting very close to being able to have JARVIS-comparable systems.

The major market for this technology is the elderly. Having a conversational computer that can help and entertain elderly people is very much needed. Smart enough to identify when they are in distress and automatically call for help. These kinds of systems could also keep people company. These are services that people will pay for if there is enough utility and reliability. My wife’s grandmother for example has dementia and lives alone. We are already paying for a bracelet that is supposed to alert us if she has a fall or is unresponsive. But a conversational AI that helps her with her memory while monitoring her status would be much better.

Maave says:

July 19, 2023 at 9:36 am

Are we anywhere close to AI using GUIs like a person? It sounds like the solution you want. How would an AI handle popup ads? Hmmm

The current trend is using an LLM to generate code or markup, and then submitting that to an API. There may be multiple layers like an LLM optimized for chat on the frontend and an LLM optimized for code on the back. My employer has something similar, with an AI chat bot that has access to “skill plugins” that are basically API endpoints wrapped in LLM prompts. It’s pretty effective since LLMs do well at translation, even English to JSON translation.
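
To make the “skill plugin” idea concrete, here is a minimal sketch of the pattern as I understand it, not anyone’s actual implementation. Everything named here is hypothetical: `fake_llm` is a canned stand-in for a real model call, and `set_light` and `set_timer` stand in for real API endpoints. The point is only the shape: the model translates English to JSON, and plain code validates that JSON before anything gets called.

```python
import json
from typing import Callable, Dict

# Hypothetical skill registry: each "skill" stands in for a real API endpoint.
SKILLS: Dict[str, Callable[..., str]] = {}


def skill(name: str):
    """Register a function under a skill name the model is allowed to pick."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        SKILLS[name] = fn
        return fn
    return register


@skill("set_light")
def set_light(room: str, on: bool) -> str:
    return f"{room} light {'on' if on else 'off'}"


@skill("set_timer")
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"


def fake_llm(utterance: str) -> str:
    """Assumption: a real LLM call would go here. This stub only knows two phrases."""
    canned = {
        "turn on the kitchen light": {
            "skill": "set_light",
            "args": {"room": "kitchen", "on": True},
        },
        "set a timer for 10 minutes": {
            "skill": "set_timer",
            "args": {"minutes": 10},
        },
    }
    fallback = {"skill": "unknown", "args": {}}
    return json.dumps(canned.get(utterance.lower().strip(), fallback))


def dispatch(llm_output: str) -> str:
    """Never trust the model's output: parse and validate before calling a skill."""
    try:
        request = json.loads(llm_output)
    except json.JSONDecodeError:
        return "error: model did not return valid JSON"
    if not isinstance(request, dict):
        return "error: model did not return a JSON object"
    handler = SKILLS.get(request.get("skill"))
    if handler is None:
        return "error: unknown skill"
    try:
        return handler(**request.get("args", {}))
    except TypeError:
        # Raised when the arguments are missing, extra, or not a mapping.
        return "error: bad arguments"


def handle(utterance: str) -> str:
    return dispatch(fake_llm(utterance))
```

Calling `handle("Turn on the kitchen light")` routes to `set_light`, while anything the stub doesn’t recognise comes back as an unknown-skill error instead of a guessed action.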

1. Lewin says:
  
  July 20, 2023 at 4:52 am
  
  An AI shouldn’t have to use a GUI; big tech and big companies can work together via APIs as usual.
  
m1ke says:

July 19, 2023 at 9:38 am

The reason why these voice controlled boxes suck is because the companies don’t produce them to make your life better. They don’t make any profit if you book a flight or go to a restaurant. Improving the voice/authority detection won’t change this fundamental flaw.

Eventually if these boxes served the buyers and not the makers like Donald P. said above, then a more universal booking system for flights and restaurants etc. would be helpful, like a plugin system to make our lives easier.

TG says:

July 19, 2023 at 9:56 am

Why do people still use these crappy gimmick machines? Throw it in the trash.

1. J.Cook says:
  
  July 19, 2023 at 10:13 am
  
  I can tell you what they get used for around my house, at least: I use it as a voice control front end for my home automation system, and to set timers to remind me to check on the laundry; roomie uses it to stream music.
  
  Personally, I’d pay money for an _offline_ voice interaction system instead of having to homebrew one myself- that project’s been on hold for a couple reasons, most of which is a lack of hardware for the nodes, and time to construct a library of intents and getting the various software parts glued together.
  
  If there was a way to hack or redirect the newer generation Dots to use an on-premise server, I’d love to see it.
  
2. Sailingfree says:
  
  July 19, 2023 at 10:34 am
  
  Well, they do have limited utility I think; my late mother who in her 90s had poor eyesight and poor mobility and lived on her own had one installed by a carer and setup so mum could say ‘Alexa play classic FM’ and it would do so. But, when my brother tried setting up smart bulbs so mum could turn on the lights without getting up and potentially having a fall, it proved too much for mum to remember the magic incantations, so she ended up sitting in the dark until a neighbour noticed and came to fix things, so the user experience in this case was very poor and potentially a big problem.
  
  1. Ostracus says:
    
    July 19, 2023 at 5:04 pm
    
    Seems physical sensors would have been better, tied maybe to her Lifealert pendant.
    
3. J ODell says:
  
  July 20, 2023 at 12:39 pm
  
  My major use is to add reminders of things I need to do that come to mind while I am driving to and from work. I generate lots of to-dos while commuting, and completely forget them all by the time I get to work. I also think about the research I work on, and come up with a lot of ideas for future experiments, which I would completely lose if I couldn’t get even part of the idea in digital form.
  
C Lee says:

July 19, 2023 at 12:06 pm

Book you a flight? Recommended local restaurants? Jeez I would be happy if my speaker groups actually worked…… I would be happy if my music would play for more than 15 seconds without stopping for no reason. If only google would stop breaking things in the name of update.

1. Lewin says:
  
  July 20, 2023 at 4:53 am
  
  Worse case: try getting Google Assistant to play your choice of Weezer album, all of which are self titled in the metadata.
  
  1. Nick says:
    
    November 27, 2023 at 5:43 pm
    
    You think that’s bad? Try asking an American voice assistant (or a human!) if it’s OK to smoke a fag in this bar.
    
frolix says:

July 19, 2023 at 1:46 pm

My personal Google Assistant experience has been in sharp decline for about 6 months. A Google Home in the kitchen and nearly a dozen Nest Minis and Chromecasts means we have full voice command coverage around the house and can stop/start/cue media or lights in every room. Everything *seemed* to be getting better in our home automation environment until the speakers started having trouble with commands, performing the wrong actions, not responding, etc. Now, the same everyday requests to ‘play News’, ‘turn it down’, or even just ‘stop playing’ often turn into shouting matches of 10+ failed interactions before I unplug them or manually execute changes using Home Assistant. I have obviously done the power cycle, factory reset, and voice retraining that should be obvious fixes, but no improvement. It *feels* like a lackluster firmware update may have made them less responsive (or less NOISY from Google’s perspective) and a weakening of the back-end interpreting power is saving the mothership a lot of money. My biggest wish at this point is for the Google Home and Chromecast devices to be reverse engineered, jailbroken, and made to work ‘cloudless’ and directly with my local Home Assistant instance.

Great comments here. Please keep this discussion going <3

1. Lewin says:
  
  July 20, 2023 at 4:56 am
  
  Yeah, my Pixel in 2016 was great at voice commands. Fast forward to last year and it couldn’t even respond to its own wake command on a Pixel 5.
  
Miles says:

July 19, 2023 at 8:01 pm

Can’t do it. Don’t even use text to speech. If they want to wow me they can start with working text to speech. Been computering these 35 odd years. “Simple commands carefully crafted” describes the current crop very well.

1. spaceminions says:
  
  July 20, 2023 at 12:55 pm
  
  Text to speech can not just sound like a human voice, but it can mimic a specific person’s voice pretty well given some sample clips. Though that capability is the cornerstone of various businesses who’ve made software that can do it; it’s not everywhere.
  
  Speech to text is good enough to understand my accent overtop of road noise with just the software built into a 4-year-old phone and not even connected to the internet. Doesn’t know what to do with the words I said, but it does recognize the words.
  
rclark says:

July 19, 2023 at 9:06 pm

Personally, I think a person should just get smarter and wiser, get some exercise, and richer by not having ‘voice’ devices planted around the house just to be hip. Get up and shut the light off. Get up and make the coffee, bacon and eggs…. And way less electronic maintenance/waste too :) . I really think the planet is headed for an idiocracy society :) .

1. Lewin says:
  
  July 20, 2023 at 4:55 am
  
  You’re welcome to get out of your bed to turn off the lights. I’ma stay under my blanket snug and warm.
  
2. AwD says:
  
  July 20, 2023 at 6:47 am
  
  We should get rid of those cellular telephones, rap music, hula hoops, fax machines and everything else the kids these days are into too!
  
MmmDee says:

July 19, 2023 at 10:12 pm

I’m on the older side of life but have been involved with electronics and other technical hobbies most of my existence. I’ve used home automation, in one form or another, for decades. For the last 4 years I’ve used Alexa extensively as an on/off remote control, executing automations, scheduling events, reminders, calendar maintenance, quick/simple information retrieval, etc, etc, etc.

In my mind, we’ve come a very long way from television remote controls, self-installed relay control, X10 devices, to home automation as we now recognize it. I’ve been quite happy following the journey and feel confident things will improve with time.

Gregg Eshelman says:

July 20, 2023 at 1:43 am

What I think would be funny is a spoof of the “2001” scene where HAL has locked David Bowman out of the ship, but replace Dave in the pod with Tony Stark in an Iron Man suit and HAL with JARVIS. Of course played and voiced by Robert Downey Jr and Paul Bettany.
Tony: Open the pod bay doors, JARVIS.
JARVIS: I’m sorry, Tony. I’m afraid I can’t do that.
Riff on from there. 🙂

1. Garth Bock says:
  
  July 20, 2023 at 5:33 am
  
  “Tony: Open the pod bay doors, JARVIS.
  JARVIS: I’m sorry, Tony. I’m afraid I can’t do that.
  Tony: I need the LOWER pod bay doors opened.. I need to make a leak…
  JARVIS: I’m sorry but you will need to hold it… “
  
rtyh4rhr4 says:

July 20, 2023 at 5:06 am

please test llama.cpp and mycroft

The Commenter Formerly Known As Ren says:

July 20, 2023 at 5:39 am

“Computer, list a number of locations to hide this body”

Gareth says:

July 20, 2023 at 6:27 am

I do industrial automation and would happily set up my home in the same manner.
Walk up to the front door, it recognizes you and unlocks before your hand pushes it open.
It’s 6PM so it automatically checks the weight sensor under the kettle before asking to turn it on then alerting you.
It turns the TV on as you enter the living room.
At 8PM it asks if you want a bath before closing the plug and monitoring the bath level and temperature then alerting you it’s ready.
etc
As is, I have my lights on timers and/or PIR so I can wander around.
Connect it to the net? No effin way. We don’t connect power stations, chemical plants etc to the net and there’s a simple reason for that – Max Headroom (series 1 ep4 I think) where he gets locked out of his own house…

craig says:

July 20, 2023 at 9:06 am

I dunno, man. Asking real humans to do stuff, even in the correct context, even when accomplishing such a feat is their reason to be employed or engaged in the activity- even that goes wrong enough that I don’t have a lot of hope for an automated version.
Also, if I rolled out of bed and told (not asked, told because that is how it is phrased) my wife “make me a drink” I can assure you, a drink of any variety let alone a nice hot Earl Grey would absolutely NOT be what I would get. Again, it feels unreasonable to hold a non-human to that standard.
Finally, I guess lives of people are different but I’ve crafted my career and life such that replying to text message while driving is something I never need to do, or at least maybe 1-2 per year and even at that, the letter “K.” Honestly ask yourself if you are happy “working” or doing whatever it is on your phone while driving (I’m looking at you, 90% of California drivers) constantly. Do you reeealllly need to be doing that, or maybe just maybe a little mental health break while you, I don’t know, careen a 2000lb killmobile past a school crosswalk could use a teeny bit more of your attention.

1. Maave says:
  
  July 20, 2023 at 11:50 am
  
  I have a frequent use case
  
  me: “I’m on my way home”
  1 minute later when I’m already on the road
  SO: “we’re out of onions, can you go to the store?”
  
  I’d use the voice texting if it was better … and if my phone still had physical controls to activate the damn feature. RIP Active Edge aka squeeze for assistant.
  
2. Lewin Day says:
  
  July 20, 2023 at 4:14 pm
  
  I’ll absolutely hold a machine to that standard! I buy a coffee maker to make me coffee!
  
M says:

July 20, 2023 at 11:50 am

There was a much smarter voice assistant called Vi that could break down and solve complex tasks.

The startup developing it got bought by apple, who promptly killed it to protect alexa.

1. M says:
  
  July 20, 2023 at 11:52 am
  
  sorry, to protect *siri*
  too many voice assistants, too little innovation
  
William Payne says:

July 20, 2023 at 3:15 pm

Sci fi writer Theodore Sturgeon advises that 10% of everything is good stuff.

Assistants ‘good stuff’ or the other stuff?

Achille says:

July 21, 2023 at 12:12 pm

I’m glad I’m seeing this. Our Google assistant is literally so dumb that I recently had to look up if it just hadn’t been updated or was an old model and everyone else thought the same.

Cyna says:

July 23, 2023 at 2:13 pm

Unless the companies behind them start taking privacy seriously, I see no use for them.
