How To Upgrade Jasper’s Voice Recognition With AT&T’s Speech-to-Text API

June 7, 2014

Jasper is an open-source platform for developing always-on voice-controlled applications — you talk and your electronics listen! It’s designed to run on a Raspberry Pi. [Zach] has been playing around with it and wasn’t satisfied with Jasper’s built-in speech-to-text recognition system. He decided to take the advice of the Jasper development team and modify the system to use AT&T’s speech-to-text engine.

The built-in system works, but it has limitations. Mainly, you have to specify exactly which keywords you want Jasper to listen for. This is a problem if you aren’t sure what the user is going to say, or if there are many things the user might say. For example, if the user is going to say a number between one and one hundred, you don’t want to have to type all one hundred numbers into the voice recognition system to make it work.
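
To get a feel for the size of that list, here is a minimal Python sketch that generates the spoken form of every number from one to one hundred. The names `number_words` and `KEYWORDS` are made up for illustration and are not part of Jasper. The point is only that a fixed-keyword recognizer would need all one hundred phrases spelled out in advance.

```python
# Illustrative only: number_words and KEYWORDS are hypothetical names,
# not part of Jasper's code base.
ONES = ["one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_words(n):
    """Return the spoken English form of an integer from 1 to 100."""
    if not 1 <= n <= 100:
        raise ValueError("n must be between 1 and 100")
    if n < 10:
        return ONES[n - 1]
    if n < 20:
        return TEENS[n - 10]
    if n == 100:
        return "one hundred"
    tens, ones = divmod(n, 10)
    word = TENS[tens - 2]
    if ones == 0:
        return word
    return "{} {}".format(word, ONES[ones - 1])


# Every phrase a keyword-based recognizer would have to be told about.
KEYWORDS = [number_words(n) for n in range(1, 101)]
```

That is one hundred entries for a single, very simple question, which is why an open-vocabulary engine is attractive for this kind of input.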

The Jasper FAQ does recommend using AT&T’s speech-to-text engine in this situation, but that has its own downsides. You are limited to one request per second, and it’s also slower to recognize speech. [Zach] was fine with these restrictions, but he couldn’t find much information online about how to modify Jasper to work with the AT&T engine. Now that he has it working, he has shared his work to make it easier for others.
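
If you do go the AT&T route, the one-request-per-second cap is easy to respect in your own code. The following is a rough sketch, not anything taken from Jasper or from [Zach]’s scripts: the `MinIntervalThrottle` class is hypothetical, and it simply sleeps until the minimum interval since the previous request has passed.

```python
import time


class MinIntervalThrottle(object):
    """Hypothetical helper: keep calls at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.time, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Block until a new request is allowed; return seconds slept."""
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited
```

Creating one `MinIntervalThrottle()` and calling its `wait()` method right before each speech-to-text request would keep consecutive calls at least one second apart.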

The modification first requires that you have an AT&T developer account. Once that’s set up, you need to make some changes to Jasper’s mic.py module. That’s the only part of Jasper’s core that must be changed, and it’s only a few lines of code. Beyond that, a couple of other Python scripts need to be added. We won’t go into the finer details here since [Zach] goes into great detail on his own page, including the complete scripts. If you are interested in using the AT&T module with your Jasper installation, be sure to check out [Zach’s] work. It will likely save you a lot of time.

9 thoughts on “How To Upgrade Jasper’s Voice Recognition With AT&T’s Speech-to-Text API”

rasz_pl says:

June 7, 2014 at 11:21 am

next step is using this:
http://honnibal.wordpress.com/2013/12/18/a-simple-fast-algorithm-for-natural-language-dependency-parsing/
to expand capabilities even further, this python syntactic parser should be able to help with figuring out meaning of whole sentences without hardcoding everything.

1. notabena4us says:
  
  June 7, 2014 at 12:41 pm
  
  +1… It just works better!
  
2. zachb1121 says:
  
  June 7, 2014 at 1:17 pm
  
  It’d be another fun thing to try, I simply went with the Speech API because it’s backed by AT&T
  
notabena4us says:

June 7, 2014 at 12:39 pm

+1 ~ Speechless what more is there to say ;^)

Eirinn says:

June 7, 2014 at 3:49 pm

I tried Jasper and… it took me hours to get up and running. The guide is not up to date and some packages are different. When I finally had it up and running the recognition was exceptionally poor :( Experiences may vary though!

Mac Cartier says:

June 7, 2014 at 4:03 pm

I feel like there is a way to use Google voice recognition using the site interface to get the live word display from the search bar online. You could also try to port something from the android app… anyone know how feasible this is?

1. Kerimil says:
  
  June 7, 2014 at 10:44 pm
  
  https://www.youtube.com/watch?v=FtWM7M-rfus
  
  This is a slightly different app, as its main purpose is to translate speech from one language to another, but it relies on Google speech to text and Microsoft text to speech APIs. Getting this to work with Google’s API is just waaaaay too easy
  
Ben says:

June 11, 2014 at 3:06 am

I keep receiving a 500 error from this Speech to text API :

http://stackoverflow.com/questions/24159867/500-error-with-att-att-speech-to-text-api-python

The information returned by the API is very vague as to the problem,
if anyone knows the answer – Thanks! :)

x-hamiltonian says:

January 29, 2015 at 4:18 am

Fixed by changing:
r = requests.post('https://api.att.com/oauth/token'
to
r = requests.post('https://api.att.com/oauth/v4/token'

in STT.py in the jasper/client folder.

I’ve submitted a pull request to include this into the latest build of jasper
