Create Your Own J.A.R.V.I.S. Using Jasper

JARVIS

Tony Stark’s J.A.R.V.I.S. needs no introduction. With [Shubhro]’s and [Charlie]’s recent release of Jasper, an always-on, voice-controlled development platform for the Raspberry Pi, you too can start making your own J.A.R.V.I.S.

Both [Shubhro] and [Charlie] are undergraduate students at Princeton University, and they decided to make their voice-controlled project open-source (code is available on GitHub). Jasper is built on inexpensive off-the-shelf hardware, making it very simple to get started. All you really need is an internet-connected Raspberry Pi with a microphone and speaker. Simply install Jasper and start using the built-in functionality, which lets you interface with Spotify, Facebook, Gmail, knock-knock jokes, and more. Be sure to check out the demo video after the break!

With the easy-to-use developer API, you can integrate Jasper into any of your existing Raspberry Pi projects with little effort. We could see Jasper integrated with wireless microphones and speakers to enable advanced voice control from anywhere in your home. What a great project! Thanks to both [Shubhro] and [Charlie] for making this open-source.
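To give a flavor of that developer API: per the project's documentation, a new voice command is a small Python module exposing a keyword list and a pair of functions. The module below is a hypothetical example written in that style, not a file from the repo; `mic.say()` is Jasper's text-to-speech call.

```python
# Hypothetical "time" module in the documented Jasper style
# (WORDS / isValid / handle). Treat this as a sketch, not a drop-in file.
import datetime
import re

WORDS = ["TIME"]  # keywords Jasper compiles into its speech model

def isValid(text):
    """Return True if this module should handle the transcribed phrase."""
    return bool(re.search(r"\btime\b", text, re.IGNORECASE))

def handle(text, mic, profile):
    """Speak the current time through the attached speaker."""
    now = datetime.datetime.now().strftime("%I:%M %p")
    mic.say("It is %s right now." % now)
```

Jasper calls `isValid()` on each module in turn and hands the transcription to the first one that claims it, so a module only needs to know its own keywords.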

Comments

  1. hemalchevli says:

    Looks cool, I’m gonna make it this weekend!

  2. No Hack says:

    Looks cool, I’m gonna make it after I get back from the toilet :)

    Seriously, this is cool. The project I mean…

  3. Eirinn says:

    Ok that’s amazing… I was working on a web based Virtual Intelligence called Lydia at one point with commands through XML and retro representation of answers (black screen, green text and blinky cursor). This will fit right in.

  4. Felix says:

    Would love to see this ported to ARMv7 platforms like UDOO. Wonder how much effort it would be?

    • DBommarito says:

      You should be able to follow the Method #2 steps in the software documentation on the project’s site and, instead of downloading the Jasper binaries and transferring them to the Pi, compile the Jasper project from source for the UDOO. You could clone the repo to the UDOO and compile there, or cross-compile from another computer.

      (All stated without testing first. Nothing looked out of place on the dependencies and Jasper itself is Python)

    • nope says:

      It should work as is…….

  5. Lwatcdr says:

    It looks like it is all software. If so, it should be portable to, say, a PC or any other system running Linux — as long as it has enough memory and CPU, that is.

    • Rollyn01 says:

      I was thinking the same thing. It almost makes me want to take my old laptop and turn it into a Linux machine. Though I would see if I can use a different name instead of Jasper. No offense to the Jasper crew.

      • JE Carter II says:

        I was thinking “Computer”, à la Star Trek, naturally.

      • Lwatcdr says:

        Depending on the hardware requirements, I wonder if you could run it on something like a Nexus S, HTC Evo, or some other older hackable cell phone. Just thinking what great remotes they would make. Another option is the first-gen Nexus 7s. For me, the home control possibilities are what is most interesting.

        • kerimil says:

          Hate to post my own stuff (nah, actually I don’t), but how is this simpler than using an Android device and a simple app created using MIT’s App Inventor? You just use one block to change from speech to text and that’s it. Here’s my app that uses it and the Microsoft translation API to do real translation and voice production.

      • JesseD says:

        If you look at main.py on their GitHub, it has

        conversation = Conversation("JASPER", mic, profile)

        which is what sets up the name it listens for. Beyond changing it there, the languagemodel_persona.lm and dictionary_persona.dic files possibly need editing too. I’m assuming they are somehow used by the voice recognition.

        I plan on playing with this over the weekend, my goal is to get it running for my 4yo son so he can request movies.

        • nope says:

          Their static/audio/jasper.wav is decoded by Python in their main mic function as the “persona”, so I assume that’s also part of the listen process. It would be very easy to change the Conversation("namehere", …) bit and then make your own WAV with whatever name; that seems to be all you’d need to do?
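A minimal sketch of the wake-word check being described here. The persona string and matching logic are illustrative, not the project's exact code:

```python
# Illustrative wake-word check: transcribe audio, then look for the persona
# token in the text. "COMPUTER" stands in for whatever name you record.
PERSONA = "COMPUTER"

def heard_persona(transcription):
    """True if the wake word appears as a whole word in the transcription."""
    return PERSONA in transcription.upper().split()

print(heard_persona("hey computer what time is it"))   # True
print(heard_persona("communicate with the server"))    # False
```

Matching on whole tokens (rather than substrings) is what keeps "communicate" from triggering a persona named "computer".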

      • Dan Netwalker says:

        Actually, I was looking at the custom install method (as opposed to getting a full SD card image) and it’s all apt-get installs, getting source code, and compiling. Maybe you can get away with anything loosely similar to Debian or Ubuntu directly, or “translate” those installs to your distro’s package manager equivalents.

  6. Sam says:

    Does this have to be run from a Pi? Is there a possibility of using it from a laptop? Emulating the pi for example?

    • qwerty says:

      Running it in an emulated environment would be a useless waste of resources. Anyway, it uses PocketSphinx, which is cross-platform, so at least the core part of it can be ported easily.

  7. asdf_the_third says:

    Are there any OpenSource TTS that have a more human sounding voice?

    If Vocaloids can be made, I don’t see why we can’t make a more natural voice.

  8. mipi says:

    Could this be used to control XBMC?

  9. Jasper Always-Running Virtual Information System

  10. I already have a raspberry pi handling X10 lighting. I wonder if I could use my android phone as the microphone and speaker over a local network.

  11. Chris C. says:

    Is it speaker independent? Does it recognize only prerecorded phrases or vocabularies, or arbitrary speech? How is accuracy? Is it tolerant of some ambient room noise when using an open microphone, or does it have to be dead silent?

    None of these BASIC details are covered here, in the video, or in any obvious location on the creators’ website… FAIL.

    But SQUEEE! It integrates with social networking! And it’s been compared to an impossibly intelligent AI in a popular movie! Good enough for the mindless masses I suppose.

    • Aaron B says:

      If you read the documentation on the linked page, you would see the answers to your questions. It uses arbitrary speech recognition. Commands and functions can be defined by the user.
      Why make a wild accuracy claim? If you enunciate clearly into a decent microphone, it should work. If you want to believe claims of amazing voice recognition, go buy Dragon.

      • Chris C. says:

        Which part of the documentation?

        It’s certainly not on the home page, “FAQ”, “Software”, or “Usage”. All that’s left is the “Developer API”, which I admit I didn’t go through; but if I have to dig through that just to find out basic info like this, that’s utter crap.

        And as for the responses here, they conflict as well, so I’m not the only one confused. Funny, we just featured an article on Mr. Widlar; I wish he were around, as I’m sure he’d have some particularly choice words about the quality of this “documentation”. ;)

        I certainly don’t need or believe marketing-department-generated accuracy claims either. But I would like to have at least some idea what to expect before investing too much time looking into it. Is it closer to Dragon, or an old IC-based speech recognition chip? Could I expect the actions depicted in the carefully edited video to work at least 50% of the time in the real world? Was careful training (of both software and speaker) and a dead silent room required? Did it take 100 attempts to film this success? Did the software already have the particular artist name programmed and *spoken* to it in advance, or was it able to recognize it without hearing it before?

        • Lion XL says:

          Wow… sounds like you should be developing your own speech package, since you definitely seem well versed in the subject. I mean, I’d definitely prefer to use yours, as you seem so confident and sound very intelligent; your version would probably kill theirs.

          I mean, they’re only undergrads, so they couldn’t have made as good a device as yours would be. And I also agree that the marketing-machine BS on their website is soooo misleading. It was probably all bought and paid for; I would never believe any of it.

          YET AGAIN SOMEONE FEELS THE WHOLE WORLD SHOULD CATER TO THEIR WANTS AND NEEDS, AND WHEN THAT DOESN’T HAPPEN IT MUST MEAN THE OTHER PERSON IS INFERIOR OR SOME SUCH.

          If you’re too lazy to read the documentation as provided, then you FAIL. Everything you speak of might have validity if this were being sold on Amazon or something, but it’s not. It’s an undergrad project that was probably used for learning, which probably means it’s at a beta (if not alpha) level of development. Don’t want to use it? Don’t. But don’t piss on their work because you think it makes you look cool to do so. You actually look pretty stupid to me!

          • Chris C. says:

            Because expecting a decent basic description of WHAT SOMETHING DOES is a special need? Because undergraduates and site editors can’t be expected to tackle the herculean task of writing up a paragraph or two that provides this information, and placing it somewhere prominent?

            I spent about 15 minutes reading, which should be more than enough to find such basic info. If you’re going to assume I’m lazy, I’m going to assume I spent 15 minutes more than you, and you have absolutely no idea what you’re talking about.

            And I’m only criticizing the documentation. Nowhere did I criticize their software or programming skills, or infer I could do better. Which makes you the stupid one for claiming otherwise, no assumptions necessary there.

        • Jerry says:

          Hi Chris,

          I’ll respond to your comment below here, as it seems it is too deep to reply there.

          You said:

          I spent about 15 minutes reading, which should be more than enough to find such basic info. If you’re going to assume I’m lazy, I’m going to assume I spent 15 minutes more than you, and you have absolutely no idea what you’re talking about

          I spent about 10 minutes, and I found out it uses the CMU speech recognition engine, and found lots of information about that engine as well, including the answers to all your questions.

          I did not spend as much time as you, so I would certainly not think you lazy…however I’m not sure you should be calling others stupid.

    • Lwatcdr says:

      Is it speaker independent? Yes
      Does it recognize only prerecorded phrases or vocabularies? Yes
      Is it tolerant of some ambient room noise when using an open microphone, or does it have to be dead silent? Don’t know.
      It uses CMUSphinx for the speech engine.
      The website for the project is actually pretty nice.

    • Rasekov says:

      I was thinking something similar (though not so critical; it’s still a nice project).

      It lacks information, and the Raspberry Pi isn’t a very powerful system to begin with.

      Apple, Google, and Microsoft get away with natural language processing on smartphones by sending the query as audio to be analyzed on their servers (in Apple’s case, Wolfram’s), but if this depends on the CPU available to the rPi, the accuracy might not be so great.

      I’m guessing it works by keyword recognition, so its usefulness will be way less “Jarvis” or “Siri” and more “predefined list of commands”.

      Also, they don’t mention anything about support for languages other than English (not all speech recognition frameworks have the same level of accuracy for every language).

      As I said, nice project, but the presentation might give people an incorrect impression of the scope of the project.

      PS: Now I’m wondering how well the Natural Language Toolkit ( http://www.nltk.org/ ) would work on the rPi.
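The "predefined list of commands" style of keyword recognition described above can be sketched in a few lines. The command names and keywords below are invented for illustration:

```python
# Sketch of keyword-based command dispatch: no natural-language understanding,
# just a lookup from trigger words to actions. All names here are hypothetical.
COMMANDS = {
    "weather": "fetch_forecast",
    "music": "play_spotify",
    "lights": "toggle_x10",
}

def dispatch(transcription):
    """Return the first action whose keyword appears in the utterance."""
    words = transcription.lower().split()
    for keyword, action in COMMANDS.items():
        if keyword in words:
            return action
    return None  # no keyword matched; the assistant would ask to repeat

print(dispatch("jasper turn on the lights"))  # prints "toggle_x10"
```

The upside is that a small fixed vocabulary keeps the recognizer's search space tiny, which is exactly what makes on-device recognition plausible on hardware like the Pi.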

      • mik says:

        NLTK is a framework/toolset for computational linguistics on a body of text (corpus linguistics). It’s fun to play with but is not related to this story in any appreciable way.

        As far as running it, you shouldn’t have any trouble. NLTK is written in Python, and its dependencies (Python 2.6–2.7, NumPy, PyYAML) are all available for the Raspberry Pi.

        NLTK processing can be slow (seconds) even on a Core 2 Quad at 2.5 GHz, but that’s not a problem considering what it’s used for. “Jarvis, what are the hapaxes for Obama’s speech?”
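For what it's worth, hapaxes (words that occur exactly once in a text) don't even need NLTK; a minimal sketch in plain Python:

```python
# Hapax extraction with the standard library: count word frequencies,
# then keep the words seen exactly once.
from collections import Counter

def hapaxes(text):
    """Return the sorted list of words occurring exactly once in text."""
    counts = Counter(text.lower().split())
    return sorted(word for word, n in counts.items() if n == 1)

print(hapaxes("to be or not to be that is the question"))
# ['is', 'not', 'or', 'question', 'that', 'the']
```

NLTK's value is everything around this (tokenizers, corpora, taggers), not the counting itself.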

        • Rasekov says:

          I know it’s not related to the story, and I know it’s slow. I just mentioned it because the story made me wonder how well it would work on the rPi, and whether it would be even barely usable to create something more “Jarvis-like” than what’s mentioned in the article.

          As you say, it can be slow even on a modern x86 computer, so I don’t have high hopes for it being useful on any current ARM board.

    • Wim says:

      The recognition is all done by the CMU Sphinx package, which has been around for ages; what Jasper/JARVIS does is use Sphinx to recognize a specific handful of commands. The social-networking callouts and stuff like that are what’s new here.

  12. I do this for a living says:

    This says “internet connected pi”, so it isn’t stand-alone, right? It sounds as though you are sending your recorded (or live) audio off to the cloud to be processed. What ever could go wrong with that?

    • asdf_the_third says:

      I get that you’re trying to troll, but in all honesty: what really could go wrong?

      It’s not like your voice could be sold for marketing reasons and as long as you aren’t saying blatantly illegal (“Jasper, buy 100 pounds of C4.”) things there shouldn’t be a prob.

      Then again, you could route it through TOR.. or not. Thanks OpenSSL.

    • Yarr says:

      Fucking arrogant twat.

      • Greenaum says:

        Dude, control yourself. At least provide a counter-argument or some point of disagreement. What’s arrogant about a gentle reminder of the wild Internet’s lack of inherent safeness? Personally I think he’s being over-cautious, but that’s not arrogant, fucking, or twat.

        If you’re going to be unpleasant, at least provide a basic level of intellectual justification with it. If you just want someone to say “fucking arrogant twat” to all day, I’m sure there’s a gutter full of unmet friends waiting for you somewhere round the back of a cheap booze shop.

    • Jerry says:

      It uses PocketSphinx, which does not seem to require sending the audio to the cloud for processing.

      Spending two seconds looking something up before complaining… what could go wrong with that?

      I agree with Yarr….

    • JesseD says:

      After setting it up, I think the internet connection is for the email notification and things like Spotify, not something it absolutely has to have to work.

      The thing I like least about it is that they have the music/Spotify integrated into a main program component instead of in one of the modules like the other options. I don’t know Python well enough to know if they had a valid reason for that.

  13. Darren says:

    I had to Google J.A.R.V.I.S. Tony Stark didn’t ring any bells either. Cool tech but am I the only person annoyed by voice activated gadgets?

  14. Nick says:

    I’m a newcomer to the Raspberry Pi and would really like to try this project. I see that the microphone they recommend is no longer available on Amazon. Does anybody have a recommendation for one that would work for this project?

    • Greenaum says:

      This is recommended by Amazon on the link from the sold-out one. The recommended one is only a $15–$25 USB microphone. At that price it’s not going to be anything special, so presumably any old USB mic would do. Or maybe another source of sound altogether: a USB soundcard, say, with any old mic plugged in. That depends on what inputs the software accepts; it should be pretty tweakable, being a Linux project.

      From what I can tell, any USB mic should do. That’s Linux again; they like to get everything connected to a standardised interface on the software side, so if your mic works with Linux, its driver should present the same interface to Jasper as any other mic would. The “quirks” you have to put up with in Windows usually aren’t permitted.

      That’s all I can say without experience. You could always try out whatever bits and pieces you have lying around, or can borrow. I’d bet that it doesn’t really matter and any simple USB mic would do.

  15. sparerobot says:
  16. Bhawani Bhateja says:

    Hi,
    According to the documentation provided by the two, I’ve managed to get it running on my Pi within a few hours, but I’m stuck at a very funny point.
    I am new to the Raspberry Pi and the Linux environment, so I don’t know much.
    The problem I am facing is: how do I run Jasper after configuring the Jasper client?
    I’ve done all the coding and everything has gone fine, but I don’t know how to start Jasper!
    I am at the shell right now [pi@raspberry -$]; which command shall I run so that Jasper listens to my commands?
    Also, in the documentation we had to add the Facebook API key in the ‘Profile.YML’ file; how shall I do that?
    Where is that file and how do I edit it?

    Thanks!

    • Mike says:

      This is directly from the documentation and should run Jasper when the OS boots up.
      “””
      Run crontab -e, then add the following line, if it’s not there already:

      @reboot /home/pi/jasper/boot/boot.sh;
      Set permissions inside the home directory:

      sudo chmod 777 -R *
      “””
      I have to say, this is a nice project and yet another addition to the list of projects to build with my kids.

  17. Gabe says:

    I’ve got Jasper up and running on my Pi… mostly.
    Upon startup, I get the voice prompt “This is Jasper, please wait a moment.”
    Even after a few minutes, Jasper doesn’t respond to his name with the beep.
    I’ve verified the mic works with an arecord test file, and playback of the file with aplay is good also.
    I’ve run main.py --local to try and see what’s happening, and my spoken command of “Jasper” doesn’t register as input in the main.py test. It’s as though Jasper is running properly but isn’t listening on the mic. But that’s just a guess.

    Does anyone have some ideas that may help?

  18. Ben says:

    Are we able to edit commands and answers?

  19. 26ct2957 says:

    Gabe,
    I had a similar problem. Being a noob, I followed the documentation literally. It instructs that some of the commands are carried out in the ‘home’ directory, so I changed to cd home. I could not get it working. I then formatted the SD card and started again, but carried out all the commands in the directory that the SD card booted into. I think the key point is that all the core code and modules, and any configuration commands you make, must be in the same directory. Other than that, make sure your syntax and indentation are correct (sorry if that sounds condescending; it’s not meant to be — as I said, I’m a complete noob).
    Gareth.
