Science fiction has regularly portrayed smart computer assistants in a fanciful way. HAL from 2001: A Space Odyssey and J.A.R.V.I.S. from the contemporary Iron Man films are both great examples. They’re erudite, wise, and capable of doing just about any reasonable task that is asked of them, short of opening the pod bay doors.
Cut back to reality, and you’ll only be disappointed at how useless most voice assistants are. It’s been twelve long years since Siri burst onto the scene, with Alexa and Google Assistant following a few years later. Despite all that time on the market, their capabilities remain limited and uninspiring. It’s time for voice assistants to level up.
Is There Anything You Can Do?
The modern crop of voice assistants was, in many ways, a game changer when it first hit the market. These assistants gave us our first real taste of interacting with computers in natural language. No more did we have to carefully craft exact commands for a simplistic voice recognition system. Instead, the idea was that we could speak almost normally, and the assistant would respond.
These days, voice assistants can handle a broad spectrum of tasks. You can use them to send a message, if you trust the voice recognition not to misrepresent your words, or you can add events to your calendar. You can do basic maths, play songs, and even switch your lights on and off – assuming you’ve knitted your smart home together properly. Google and Amazon will let you make purchases, too, within certain parameters.
Fundamentally, though, these are all pretty basic party tricks. In every one of these cases, the voice assistant is just saving the user a few mouse clicks, or saving them from pulling out their smartphone. What’s missing is the higher intelligence and reasoning that would make them truly useful, like a proper human assistant.
Ask Google Assistant to recommend you a good local restaurant, and you’ll be disappointed. Nine times out of ten, it will just type “restaurants near me” into Google and show you a list. A human assistant would know that you prefer steak and pub food to tapas, do the research, and come back to you accordingly. Big tech companies have all this data on most of us, or are certainly able to collect it, but they’re not employing it in this useful way.
The Flight Booking Test
Picture another scenario. You’re road tripping down the highway towards the airport, and you need to book a flight on the way. Our movie protagonists would surely bark a simple request at their AI assistant, who would respond with a series of convenient flights and prices. The appropriate bookings would then be handled with pre-stored payment information.
Try that with Google Assistant or Bixby today, and you’ll get nowhere. The former will simply dump you into a web search. The latter has a dedicated add-on for looking at flights, but it’s virtually unusable, failing to properly understand the right departing and arriving airports. Siri is similarly weak-minded, faltering when asked to look into available hotels online.
Yes, it’s that bad. You have a powerful smartphone sitting next to you in the car. It can understand what you say perfectly well, but it’s entirely powerless to execute even a simple request.
Contrast that with having a friend in the passenger seat, who could simply read you out a couple of flights and ask which you want to buy. It’s not that hard, but your voice assistant can’t do it.
A user asked Siri to book a hotel in Melbourne, Australia. When that failed, they tried Hong Kong instead, with the assistant faring little better. According to the user, the best Siri could do was let them place a phone call to the hotel in question, and it took over ten attempts just to get that far. Booking directly was impossible.
It’s true that some progress has been made in this area; Amazon integrated bookings with various airlines into Alexa years ago, for example. The problem is that piecemeal efforts don’t cut it. For such a feature to be useful, it has to work properly almost all of the time. Voice recognition technology has been mocked since the 1990s for its poor reliability, and that’s a lesson today’s voice assistants could learn from. It’s all well and good if a user can book flights with a certain airline in the continental US using their voice assistant. If it fails every time they’re in a different country, or want to fly a different airline, then users will give up because the feature is functionally useless most of the time.
It bears noting that many of these situations are regionally variable, too. For example, if you’re in the US, you might find that flight and hotel bookings are more readily available to your smart assistant. Or, in Australia, you might note that the Google Assistant has a good handle on movie session times. But that regional variability and inconsistency are exactly what spoils these features.
What’s The Fix?
These are just a few examples; you can probably think of thousands more. Fundamentally, these aren’t even technically difficult queries for an assistant to respond to. Not only that, but the required information is already available online. The problem comes down to two factors: integration and authority.
Solving the integration problem requires a certain level of work on the back end. Companies would need to hook into existing databases and ensure their voice assistants can reliably parse and work with the data. This would require agreements and coordination with external companies in many cases, further complicating the issue.
As for authority, that’s something companies have struggled with since the dawn of smart assistants. Amazon, and more recently Google, will allow you to purchase items with your smart assistant. However, that has required protections to be put in place after awkward instances of TV broadcasts inadvertently triggering home devices. Similarly, there are risks for families, where young children might ask a helpful voice assistant to make purchases without prior parental authorization. In the case of a user speaking directly into their smartphone, though, it’s hard to imagine that voice fingerprinting or a simple device unlock wouldn’t be enough to authorize purchases.
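To make the authority question concrete, here is a minimal sketch of how such a rule might look. Everything in it is an assumption for illustration: the `Device` categories, the `PurchaseRequest` fields, the `may_purchase` function, and the spending limit are all hypothetical, and no real assistant is known to work this way.

```python
from dataclasses import dataclass
from enum import Enum


class Device(Enum):
    # Hypothetical device categories, invented for this sketch.
    PERSONAL_PHONE = "personal_phone"
    SHARED_SPEAKER = "shared_speaker"


@dataclass(frozen=True)
class PurchaseRequest:
    device: Device
    device_unlocked: bool  # the owner has unlocked the phone
    voice_match: bool      # the speaker's voice matched the enrolled owner
    amount: float          # price in the user's currency


def may_purchase(req: PurchaseRequest, spending_limit: float = 500.0) -> bool:
    """Return True if the assistant may complete the purchase on its own."""
    # The limit is an arbitrary placeholder, not a real product setting.
    if req.amount > spending_limit:
        return False
    if req.device is Device.PERSONAL_PHONE:
        # On a personal phone, either signal is treated as sufficient.
        return req.device_unlocked or req.voice_match
    # A shared speaker can hear the TV or the kids, so insist on a voice match.
    return req.voice_match
```

Under these assumptions, a request from an unlocked phone goes through, while a stray command picked up by a living room speaker with no voice match does not.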
Given that a greater level of integration, and thus utility, is possible, why aren’t big tech companies rushing to unlock this functionality? The key may be that it doesn’t serve them any real purpose. Tech companies could certainly put in the work to advance voice assistant capabilities, but it would take time and money. Never mind the greater risks to reputation if the newly granted authority allowed smart assistants to do something truly inconvenient or awful for users. The voice assistants we already have aren’t exactly money spinners as it is, so it’s perhaps no surprise the difficult and expensive problems aren’t being solved.
Many will say that the problems listed here are edge cases, and that nobody uses their voice assistants this way. This author would counter that nobody does because it simply doesn’t work right now. The very idea for this article was spawned on a long drive, where it became apparent that I would have to spend half an hour clicking my way through various basic admin tasks because my voice assistant was completely incapable of helping. Twelve years after the first one hit the market, it shouldn’t be that way.
If one voice assistant does begin to crest the integration mountain, things could change. If it performs reliably, it will also earn the authority to act that we don’t currently give to humble smart assistants. At that point, you can expect rival tech companies to improve their own products to match. Until one company makes the first move, though, we’re out of luck. We’ll all be wishing we had a real assistant to help us out, rather than the impotent disembodied voices that currently live in our smartphones.