Google has announced its soon-to-be-available Vision Kit, the next easy-to-assemble Artificial Intelligence Yourself (AIY) product. You’ll have to provide your own Raspberry Pi Zero W, but that’s okay, since what makes this kit special is Google’s VisionBonnet board, which they do provide: basically a low-power neural network accelerator board running TensorFlow.
The VisionBonnet is built around the Intel® Movidius™ Myriad 2 (aka MA2450) vision processing unit (VPU) chip. See the video below for an overview of this chip, but what it allows is the rapid processing of compute-intensive neural networks. We don’t think you’d use it for training the neural nets, just for doing the inference, or in human terms, for making use of the trained neural nets. It may be worth getting the kit for this board alone to use in your own hacks. An alternative is to get Movidius’s Neural Compute Stick, which has the same chip on a USB stick for around $80, not quite double the Vision Kit’s $45 price tag.
The Vision Kit isn’t out yet so we can’t be certain of the details, but based on the hardware it looks like you’ll point the camera at something, press a button, and it will speak. We’ve seen this before with this talking object recognizer on a Pi 3 (full disclosure, it was made by yours truly), but without the hardware acceleration a single object recognition took around 10 seconds. With the Vision Kit we expect recognition to happen in real time, so it may be much more dynamic than that. And in case it wasn’t clear, a key feature is that nothing is done in the cloud here; all processing is local.
The kit comes with three different applications: an object recognition one that can recognize up to 1000 different classes of objects, another that recognizes faces and their expressions, and a third that detects people, cats, and dogs. While you can get up to a lot of mischief with just that, you can run your own neural networks too. If you need a refresher on TensorFlow then check out our introduction. And be sure to check out the Myriad 2 VPU video below the break.
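To make the "1000 classes" idea concrete: a classifier of this kind typically outputs one score per class, and the application reports the best few. The following is a minimal sketch of that last step only, in plain Python. The function name `top_k`, the label list, and the scores are made up for illustration; this is not the Vision Kit's API, which had not been published at the time of writing.

```python
from typing import List, Sequence, Tuple


def top_k(scores: Sequence[float], labels: Sequence[str], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, score) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("need exactly one score per label")
    # Pair each label with its score and sort by score, highest first.
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical four-class example; a real model would emit 1000 scores.
labels = ["cat", "dog", "teapot", "banana"]
scores = [0.05, 0.10, 0.80, 0.05]
for label, score in top_k(scores, labels, k=2):
    print(f"{label}: {score:.2f}")
```

The same function works unchanged on a 1000-element score list, as long as the label list has the same length and order as the model's output.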
Google’s voice assistant has been around for a while now, and when Amazon released its Alexa API and ported the PaaS Cloud code to the Raspberry Pi 2, it was just a matter of time before everyone else jumped on the fast train to maker kingdom. Google just did it in style.
Few know that the Google Assistant API for the Raspberry Pi 3 has been out there for some time now, but when Google decided to give away a free kit with the May 2017 issue of MagPi magazine, they made an impression on everyone. Unfortunately, the world has more makers and hackers than there are copies of the magazine.
In this writeup, I lay out the DIY version of the AIY kit for everyone else who wants to talk to a cardboard box. I take a closer look at the free kit, take it apart, put it together, and replace it with DIY magic. To make things more convenient, I also designed an enclosure that you can 3D print to complete the kit. Let’s get started.
When Amazon released the API to their voice service Alexa, they basically forced any serious players in this domain to bring their offerings out into the hacker/maker market as well. Now Google and Raspberry Pi have come together to bring us ‘Artificial Intelligence Yourself’ or AIY.
A free hardware kit made by Google was distributed with Issue 57 of the MagPi Magazine, which is targeted at makers and hobbyists; you can see it in the video after the break. The kit contains a Raspberry Pi Voice Hat, a microphone board, a speaker, and a number of small bits to mount the kit on a Raspberry Pi 3. Putting all of it together and following the instructions on the official site gets you a Google Voice Interaction Kit with a bunch of IOs just screaming to be put to good use.
The source code for the Python app can be downloaded from GitHub and consists of a loop that awaits a trigger. This trigger can be a press of a button or a clap near the microphones. When a trigger is detected, the recorder function takes over, sending the audio stream to the Google Cloud. Speech-to-Text conversion happens there, and the result is returned via a Text-To-Speech engine that helps the system talk back. The repository suggests that the official Voice Kit SD Image (893 MB download) is based on Raspbian, so don’t go reflashing a memory card right away; you should be able to add this to an existing install.
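The trigger loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the code from the repository: `run_assistant_loop` and its parameters `triggers`, `record`, `ask_cloud`, and `speak` are hypothetical names, and the cloud round trip and speech output are replaced by stand-in functions so the sketch runs anywhere.

```python
from typing import Callable, Iterable, List


def run_assistant_loop(
    triggers: Iterable[bool],
    record: Callable[[], bytes],
    ask_cloud: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> List[str]:
    """For each trigger that fires, record audio, send it off, and speak the reply."""
    replies = []
    for fired in triggers:
        if not fired:
            continue  # no button press or clap yet; keep waiting
        audio = record()          # capture the user's request
        reply = ask_cloud(audio)  # stand-in for the remote speech service
        speak(reply)              # stand-in for the text-to-speech step
        replies.append(reply)
    return replies


# Stand-ins for the hardware and cloud pieces (assumptions for this demo).
spoken: List[str] = []
replies = run_assistant_loop(
    triggers=[False, True, False, True],
    record=lambda: b"fake-audio",
    ask_cloud=lambda audio: f"heard {len(audio)} bytes",
    speak=spoken.append,
)
print(replies)
```

Passing the recorder, cloud call, and speaker in as functions keeps the loop itself testable without a microphone or network; on real hardware each stand-in would be swapped for the corresponding piece of the kit's software.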