How to Upgrade Jasper’s Voice Recognition with AT&T’s Speech-to-Text API

Jasper is an open-source platform for developing always-on voice-controlled applications — you talk and your electronics listen! It’s designed to run on a Raspberry Pi. [Zach] has been playing around with it and wasn’t satisfied with Jasper’s built-in speech-to-text recognition system. He decided to take the advice of the Jasper development team and modify the system to use AT&T’s speech-to-text engine.

The built-in system works, but it has limitations. Mainly, you have to specify exactly which keywords you want Jasper to listen for. This is a problem if you aren’t sure what the user is going to say, or if there are many things the user might say. For example, if the user is going to say a number between one and one hundred, you don’t want to have to type all one hundred numbers into the voice recognition system to make it work.

The Jasper FAQ does recommend using AT&T’s speech-to-text engine in this situation, but this has its own downsides. You are limited to one request per second, and recognition is also slower. [Zach] was fine with these restrictions, but he couldn’t find much information online about how to modify Jasper to make the AT&T engine work. Now that he’s gotten it functional, he has shared his work to make it easier for others.
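To make the one-request-per-second limit concrete, here is a minimal Python sketch of a client-side throttle. It is not taken from [Zach]’s scripts or from Jasper itself; the class name MinIntervalLimiter and its parameters are made up for illustration, and it assumes the limit can be respected simply by spacing calls at least one second apart.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: keeps successive calls at least
    `min_interval` seconds apart (assumed here to be 1 second)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first one

    def wait(self):
        """Sleep if the previous call was too recent. Returns the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

In use, you would create one limiter and call its wait() method immediately before each request to the speech service, so that back-to-back commands are delayed rather than rejected.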

The modification first requires that you have an AT&T developer account. Once that’s set up, you need to make some changes to Jasper’s mic.py module. That’s the only part of Jasper’s core that must be changed, and it’s only a few lines of code. Outside of that, there are a couple of other Python scripts that need to be added. We won’t go into the finer details here since [Zach] goes into great detail on his own page, including the complete scripts. If you are interested in using the AT&T module with your Jasper installation, be sure to check out [Zach's] work. He will likely save you a lot of time.

 

PhoneTag helps you read your voicemail

Have you ever been too busy to check in with your voicemail service? PhoneTag might have the solution for you.

Some of us have done it before: letting voicemails pile up when we know nothing urgent or important is coming down the pipe. Wouldn’t it be much simpler and more convenient if those voicemails played by our rules? PhoneTag is a speech-to-text service that converts a voicemail into text and sends it via email or SMS, which you can read through and reference at will. The accuracy of this type of service is usually pretty good, but some interpretation is required, as spoken words can be misinterpreted depending on the clarity of the call. On the security side of things, we tend to be a little hesitant about personal and business voicemails running through an extra service. PhoneTag does state that they use some kind of “special algorithm” that will guarantee voicemails are secure and private.

While there is a free trial period, this service is going to cost you. You can sign up for anything from a per-message price of $0.35 to an unlimited plan at $29.95/month. You will have to do your own calculations to see if this is the best way to go, but it will save you from using your monthly minutes to check the voicemails in your mailbox. As alternatives, Google Voice offers the same service for free and SpinVox charges a fee per use.
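As a rough guide for that calculation, the short Python sketch below compares the two prices quoted above. It assumes the per-message plan has no other fees and works in whole cents to avoid rounding surprises; the function names are ours, not PhoneTag’s.

```python
PER_MESSAGE_CENTS = 35     # $0.35 per transcribed voicemail, as quoted above
UNLIMITED_CENTS = 2995     # $29.95 per month, as quoted above


def monthly_cost_cents(messages):
    """Return (per-message total, unlimited plan price) in cents for one month."""
    return messages * PER_MESSAGE_CENTS, UNLIMITED_CENTS


def break_even_messages():
    """Smallest monthly message count at which the unlimited plan is strictly cheaper."""
    return UNLIMITED_CENTS // PER_MESSAGE_CENTS + 1


if __name__ == "__main__":
    n = break_even_messages()
    pay_per_use, unlimited = monthly_cost_cents(n)
    print(f"At {n} messages: ${pay_per_use / 100:.2f} per-message vs ${unlimited / 100:.2f} unlimited")
```

Under these assumptions, 85 voicemails a month cost $29.75 on the per-message plan, so the unlimited plan only pays off from the 86th message onward.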
