This Cardboard Box Can Tell You What It Sees

It wasn’t that long ago that talking to computers was the preserve of movies and science fiction. Slowly, voice recognition improved, and these days it’s getting to be pretty usable. The technology has moved beyond basic keywords, and can now parse sentences in natural language. [Liz Meyers] has been working with the technology, creating WhatIsThat – an AI that can tell you what it’s looking at.

Adding a camera to Google’s AIY Voice Kit makes for a versatile object identification system.

The device is built around Google’s AIY Voice Kit, which consists of a Raspberry Pi with some additional hardware and software to enable it to process voice queries. [Liz] combined this with a Raspberry Pi camera and the Google Cloud Vision API. This allows WhatIsThat to respond to users asking questions by taking a photo, and then identifying what it sees in the frame.

It may seem like a frivolous project to those with working vision, but there is serious potential for this technology in the accessibility space. The device can not only describe things like animals or other objects, it can also read text aloud and even identify logos. The ability of the software to go beyond is impressive – a video demonstration shows the AI correctly identifying a Boston Terrier, and attributing a quote to Albert Einstein.

Artificial intelligence has made a huge difference to the viability of voice recognition – because it’s one thing to understand the words, and another to understand what they mean when strung together. Video after the break.

[Thanks to Baldpower for the tip!]

Continue reading “This Cardboard Box Can Tell You What It Sees”

Control Anything With A Chat Bot

In the world of Internet of Things, it’s easy enough to get something connected to the Internet. But what should you use to communicate with and control it? There are many standards and tools available, but the best choice is always to use the tools you have on hand. [Victor] found himself in this situation, and found that the best way to control an Internet-connected car was to use the Flask server he already had.

The remote controlled car was originally supposed to come with an Arduino, but the microcontroller was missing upon arrival. He had a Raspberry Pi around, and was able to set that up to replace the Arduino. He also took the opportunity to use the expanded functionality of the Pi compared to the Arduino and wrote a Flask server to control it, which is accessed as if you are communicating with a chat bot. Sending the words “go left/forward” to the Flask server will control the car accordingly, for example.

The chat bot itself contains some gems as well, and would be useful for any project that makes use of regular expressions. It also seems to be easily expandable. The project also uses voice commands, and does so by making extensive use of Mozilla’s voice recognition suite. If you want to get deep in the weeds of voice recognition on your own though, you can also explore TensorFlow at your leisure.

Robot: Do My Bidding!

Remote control robots are nothing new. Using Bluetooth isn’t all that unusual, either. What [SayantanM4] did was make a Bluetooth robot that accepts voice commands via his phone. The robot itself isn’t very remarkable. An Arduino and an HC05 module make up most of the electronics. A standard motor driver runs the two wheels.

The Arduino doesn’t usually do much voice processing, and the trick is–of course–in the phone application. BT Voice Control for Arduino is a free download that simply sends strings to a host computer via Bluetooth. If you say “Hello” into your phone, the robot receives *Hello# and that string could be processed by any computer that can receive Bluetooth data.

Continue reading “Robot: Do My Bidding!”

Siri Controls Your PC Through Python and Gmail

Voice-based assistants are becoming more common on devices these days. Siri is known for being particularly good at responding to natural language and snarky responses. In comparison, Google’s Assistant is only capable of the most obvious commands, and this writer isn’t even sure Microsoft’s Cortana can understand English at all. So it makes sense then, if you want voice control for your PC, to choose Siri as your weapon of choice. [Sanjeet] is here to help, enabling Siri to control a PC through Python.

The first step is hooking up the iPhone’s Notes app to a Gmail account. [Sanjeet] suggests using a separate account for security reasons, as you’ll need to place the username and password in a Python script. The Python script checks the Gmail account every second, looking for new Notes from the iPhone. Then, it’s as simple as telling Siri to make a Note (for example, “Siri, Note shutdown”) and the Python script can then pick up the command, and act accordingly.

It’s a quick and easy way to get Siri to do your bidding. There’s other fancy ways to do it, too — like capturing Siri’s WiFi data on your home network.

A Smart Wand for all us Muggles

Arthur C. Clarke said that “any sufficiently advanced technology is indistinguishable from magic.” Even though we know that something isn’t “magic”, it’s nice to see how close we can get. [Dofl] and his friends, big fans of the magic in Harry Potter, thought the same thing, and decided to create a magic wand that they could use themselves.

muggle-wand-internalsThe wand itself is 3D printed and has a microcontroller and WiFi board, a voice recognition board, a microphone, and a vibrating motor stuffed inside. The wand converts the voice into commands and since the wand is connected to WiFi, the commands can be used to communicate with your WiFi connected lights (or your WiFi connected anything, really.) Five voice commands are recognized to turn on and off music, the lights, and a “summon” command which is used in the video to request a hamburger from For feedback, the motor is vibrated when a command is recognized.

There’s not much technical information in the original article, but I’m sure our readers could figure out the boards used and could suggest some alternatives to get the wand’s form factor down a bit.  Over the years, other wands have appeared on our pages, using some different technologies.  It’s a fun way to interact with the environment around you, even if you know the “magic” involved is just boring old technology.

Continue reading “A Smart Wand for all us Muggles”

Seeed Studio’s ReSpeaker Speaks All the Voice Recognition Languages

Seeed Studio recently launched its third Kickstarter campaign: ReSpeaker, an open hardware voice interface. After their previous Kickstarted IoT hardware, such as the RePhone, mostly focused on connectivity, the electronics manufacturer from Shenzhen now tackles another highly contested area of IoT: Voice recognition.

The ReSpeaker Core is a capable development board based on Mediatek’s MT7688 WiFi module and runs OpenWrt. Onboard is a WM8960 stereo audio codec with integrated 1W speaker/headphone driver, a microphone, an ATMega32U4 coprocessor, 12 addressable RGB LEDs and 8 touch sensors. There are also two expansion headers with GPIOs, I2S, I2C, analog audio and USB 2.0 and an onboard microSD card slot.

The latter is especially useful to feed the ReSpeaker’s integrated speech recognition engine PocketSphinx with a vocabulary and audio file library, enabling it to respond to keywords and commands even when it’s not hooked up to the internet. Once it’s online, ReSpeaker also supports most of the available cloud based cognitive speech recognition services, such as Microsoft Cognitive Service, Amazon Alexa Voice Service, Google Speech API, and Houndify. It also comes with an SDK and Python API, supports JavaScript, Lua and C/C++, and it looks like the coprocessor features an Arduino-compatible bootloader.

The expansion header accepts shield-like hardware add-ons. Some of them are also available through the campaign. The most important one is the circular, far-field microphone array. Based on 7 XVSM-2000 respeaker_meow2digital microphones, the extension board enhances the device’s hearing with sound localization, beam forming, reverb and noise suppression. A Grove extension board connects the ReSpeaker to the Seeed’s current lineup on ready-to-use sensors, actuators and other peripherals.

Seeed also cooperates with the Meow King Audio Electronic Company to develop a nice tower-shaped enclosure with built-in speaker, 5W amplifier and battery. As a portable speaker, the Meow King Drive Unit (shown on the right) certainly doesn’t knock your socks off, but it practically turns the ReSpeaker into an open source version of the Amazon Echo — including the ability to run offline instead of piping everything you say to Big Brother.

According to Seeed, the freshly baked hardware will ship to backers in November 2016, and they do have a track-record of on-schedule shipped Kickstarter rewards. At the time of writing, some of the Crazy Early Birds are still available for $39. Enjoy the campaign video below and let us know what you think of think hardware in the comments!

Talking Star Trek

Speech generation and recognition have come a long way. It wasn’t that long ago that we were in a breakfast place and endured 30 minutes of a teenaged girl screaming “CALL JUSTIN TAYLOR!” into her phone repeatedly, with no results. Now speech on phones is good enough you might never use the keyboard unless you want privacy. Every time we ask Google or Siri a question and get an answer it makes us feel like we are living in Star Trek.

[Smcameron] probably feels the same way. He’s been working on a Star Trek-inspired bridge simulator called “Space Nerds in Space” for some time. He decided to test out the current state of Linux speech support by adding speech commands and response to it. You can see the results in the video below.

Continue reading “Talking Star Trek”