Google AIY: Artificial Intelligence Yourself

When Amazon released the API to their voice service Alexa, they basically forced any serious players in this domain to bring their offerings out into the hacker/maker market as well. Now Google and Raspberry Pi have come together to bring us ‘Artificial Intelligence Yourself’ or AIY.

A free hardware kit made by Google was distributed with Issue 57 of the MagPi Magazine, which is targeted at makers and hobbyists. You can see the kit in the video after the break. It contains a Raspberry Pi Voice Hat, a microphone board, a speaker and a number of small bits to mount the kit on a Raspberry Pi 3. Putting all of it together and following the instructions on the official site gets you a Google Voice Interaction Kit with a bunch of IOs just screaming to be put to good use.

The source code for the Python app can be downloaded from GitHub and consists of a loop that awaits a trigger. This trigger can be a press of a button or a clap near the microphones. When a trigger is detected, the recorder function takes over, sending the audio stream to the Google Cloud. Speech-to-Text conversion happens there and the result is returned via a Text-To-Speech engine that helps the system talk back. The repository suggests that the official Voice Kit SD image (893 MB download) is based on Raspbian, so don’t go reflashing a memory card right away; you should be able to add this to an existing install.
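The control flow described above can be sketched roughly as follows. This is not the code from the AIY repository: the function `run_assistant_loop` and its parameters `record_audio`, `ask_cloud` and `speak` are hypothetical stand-ins for the recorder, the cloud round trip and the Text-To-Speech engine, and the trigger source is reduced to a simple sequence of booleans so the sketch runs without any hardware.

```python
from typing import Callable, Iterable, List


def run_assistant_loop(
    triggers: Iterable[bool],
    record_audio: Callable[[], bytes],
    ask_cloud: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> List[str]:
    """Handle a stream of trigger events (hypothetical sketch).

    Each True in `triggers` stands for a button press or a clap.
    For every trigger we record audio, hand it to the cloud
    callback, speak the reply and remember it.
    """
    replies: List[str] = []
    for fired in triggers:
        if not fired:
            continue  # nothing happened, keep waiting
        audio = record_audio()     # recorder takes over
        reply = ask_cloud(audio)   # assumed cloud round trip
        speak(reply)               # text-to-speech talks back
        replies.append(reply)
    return replies


if __name__ == "__main__":
    # Stub callbacks so the loop can be exercised on any machine.
    log = run_assistant_loop(
        triggers=[False, True, False],
        record_audio=lambda: b"fake-pcm-data",
        ask_cloud=lambda audio: "heard %d bytes" % len(audio),
        speak=print,
    )
    print(log)
```

Passing the three stages in as callbacks keeps the loop itself independent of any particular SDK, so the stubs could later be swapped for real recorder and cloud calls.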

And if you don’t have access to the official kit yet but are just itching to give it a try then look no further. Google was kind enough to put up a guide to add Google Assistant support to the Raspberry Pi 3. The single board computer already has a speaker output and there is a plethora of USB microphones out there that will do the job. USB sound cards work just fine as well, and after you follow the instructions to set up the Google SDK, you’ve got yourself an Assistant.

If you want to complete the Google AIY Kit experience, you will have to do a bit of hacking. Adding a push button to trigger the Assistant Script is pretty simple and if someone wants to add a DIY Clap Trigger instead, go right ahead.
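For the DIY clap trigger, one simple approach is to look for a sudden jump in loudness between consecutive audio frames. The sketch below is only an illustration of that idea, not the method used by the AIY kit: the names `frame_rms` and `detect_claps`, the frame size and both thresholds are assumptions chosen for signed 16-bit samples and would need tuning against a real microphone.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[int]) -> float:
    """Root-mean-square loudness of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_claps(
    samples: Sequence[int],
    frame_size: int = 256,      # assumed frame length in samples
    threshold: float = 8000.0,  # assumed absolute loudness floor
    ratio: float = 4.0,         # assumed jump over the previous frame
) -> List[int]:
    """Return indices of frames that look like a clap.

    A frame counts as a clap when it is loud in absolute terms
    and much louder than the frame immediately before it.
    """
    claps: List[int] = []
    previous = None
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        level = frame_rms(samples[start:start + frame_size])
        loud = level >= threshold
        sudden = previous is not None and level >= ratio * max(previous, 1.0)
        if loud and sudden:
            claps.append(start // frame_size)
        previous = level
    return claps


if __name__ == "__main__":
    quiet = [100] * 256
    clap = [20000, -20000] * 128
    print(detect_claps(quiet + clap + quiet))
```

Requiring both an absolute level and a relative jump means steady loud noise, such as music, does not keep firing the trigger.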

35 thoughts on “Google AIY: Artificial Intelligence Yourself”

        1. The best fully open-source one I know of is Pocketsphinx, which is a full speech recognition engine, or if you’re willing to use a partially open-source one then Snowboy is supposed to be quite good – the only issue is that you have to train the voice model online, but after that the recognition runs completely offline.

    1. I’d also prefer to have something that can run outside of Google’s servers and therefore out of their sight. It creeps me out enough that they cache all queries you make via the Google Now voice questions.

    2. Yeah I’m not a fan of all these AI products that run in a black box on somebody else’s machines. I’ve already got enough devices reporting on my behavior, I don’t need to add more with a hobby project.

      Surely there’s a voice recognition approach that’s lightweight enough to run on a Pi or some other hobby-sized machine?

    3. Perhaps you can use Google cloud to fast train your own ANN? If an embedded machine can encompass it, that is. Feed it broadcasts and Google obtained records of the same, so that you don’t compromise your own privacy.

    4. Not even for privacy; sometimes you want to install something using this in a place without an internet connection.

      Also, about running on the Pi: we have had reasonably workable voice recognition since the OS/2 days. Dragon Dictate and others worked on 386s and 486s. Given the processing power of the Pi, I would hope it is more than capable of running something like this. Maybe if Google would really code something for it, instead of piling some bloated Java libraries onto the task.

    5. Can’t seem to find how to do it in the instructions, but their developer blog page does say “instructions to build a Voice User Interface (VUI) that can use cloud services (like the new Google Assistant SDK or Cloud Speech API) or run completely on-device.” So I’m guessing that there is a way to run it offline, just with less functionality depending on your device, or maybe I’m wrong…

        1. Cool, thanks for not telling me I was wrong, I hadn’t looked that hard yet. This might make a better way to access the security feed of the driveway on the TV, ’cause you know, finding the remote and pressing 3 buttons or just looking out the window when I hear a car pull up is way too hard sometimes.

    1. It can’t hear you until you press enter or use something to trigger the detection. As of right now the only standalone device that can OK Google or run “always on” is the Google Home.

    1. It is an educational toy; if you need security and privacy, such systems are not appropriate. However, you can use voice APIs to help train your own voice-to-text-to-action system and then just run the local neural network when it is competent enough.

      1. The Voice Kit ships out to all MagPi Magazine subscribers on May 4, 2017, and we’ve published a parts list, assembly instructions, source code and suggested extensions to our website: aiyprojects.withgoogle.com. The complete kit is also for sale at over 500 Barnes & Noble stores nationwide, as well as UK retailers WH Smith, Tesco, Sainsburys, and Asda.

        1. I think they may mean those UK retailers as stocking the magpi magazine, as they don’t stock raspi, or any other electronics stuff. I just returned from a couple that had empty spaces where magpi should have been, and all online places have sold out. £20 start price on ebay.

  1. So how much data is sent to Google? Is Google only used for speech-to-text, or also for the complete AI task? If it were only the STT, then we could simply use a keyboard as input. But currently this sounds like: speech goes to Google, Google turns it into text to feed an AI, the AI does its magic, Google sends back text, and text-to-speech on the Raspi talks to you.

  2. You can do all this, and a lot more with a $50 Android 6.0 phone, a custom App and any number of ESP8266 enabled “things”. Google’s offering is for children, not hardware hackers.

      1. I’m not “dumping” anything. You are just kidding yourself about the fact that you can get everything and more already packaged for less money. This is a matter of verifiable facts and only a complete dickhead would deny it.

  3. There are a ton of speech recognition module(s) (<- search it) on Aliexpress. They are mostly limited to phrase recognition and have hard limits of around 20 to 170 phrases for ~$15-$80. Probably good enough for most uses but definitely not an AI interface. Need more phrases? get a second board ;)

  4. Got a Mag Pi and the kit this afternoon at an out-of-the-way WH Smiths near where I was working. All works quite nicely.
    Was very easy to get it up and running.

    The Voice Hat is really interesting though – lots of other breakouts on it – not just a speaker output, mic input and button input. The kit comes with a header strip to let you populate :

    I2C and SPI breakouts

    What look like 6 servo outputs and 4 ‘Drivers’ (motor drivers?). There are unpopulated pads for a DC-in barrel jack (the boards in the Mag Pi article show it populated) and some other jumpers too.

    Not bad for a giveaway with a GBP £5.99 magazine.

  5. All good, but the limitation of having to allow a connection to Google’s cloud is a “pain in the ass”, especially when you want to make use of it on a phone in a country back lane; it can’t even send a text without being online.

    They definitely need to make an offline version.
