Real-time object detection, which uses neural networks and deep learning to rapidly identify and tag objects of interest in a video feed, is a handy feature with great hacker potential. Happily, it’s also possible to make customized CNNs (convolutional neural networks) tailored for one’s own needs, and that process just got easier thanks to some new documentation for the Vizy “AI camera” by Charmed Labs.
Charmed Labs has been making hacker-friendly machine vision devices for a long time, and the Vizy camera impressed us mightily when we checked it out last year. Out of the box, Vizy has a perfectly functional object detector application that runs locally on the device, and can detect and tag many common everyday objects in real time. But what if that default application doesn’t quite meet one’s project needs? Good news: it’s possible to create a custom-trained CNN, and that process got a lot more accessible thanks to step-by-step examples of training a model to recognize hands playing rock-paper-scissors.
The basic process is this: Start with a variety of images that show the item of interest, then identify and label that item in each photo. These photos (the “training set”) are uploaded to Google Colab, where they’re used to train a neural network. The resulting CNN model can then be downloaded and tried out to see how well it performs.
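That last step is easy to try on a laptop or a Pi. Here’s a minimal sketch of running a downloaded model with the TensorFlow Lite interpreter; the file names and input handling are our assumptions, not something specific to Vizy’s workflow:

```python
# Minimal sketch: run a downloaded TensorFlow Lite model on one image.
# "model.tflite" and "test.jpg" are placeholder names for whatever the
# Colab notebook hands back and whatever photo you want to check.
import numpy as np
from PIL import Image
import tflite_runtime.interpreter as tflite  # or: tf.lite.Interpreter

interpreter = tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Resize the test photo to the model's expected input shape.
_, height, width, _ = inp["shape"]
image = Image.open("test.jpg").resize((width, height))
data = np.expand_dims(np.asarray(image, dtype=inp["dtype"]), axis=0)

interpreter.set_tensor(inp["index"], data)
interpreter.invoke()
# Raw scores/boxes; the exact output format depends on the model head.
print(interpreter.get_tensor(out["index"]))
```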
Of course things rarely work perfectly the first time around, so at this point it’s pretty common for some refinement to be needed to increase accuracy. Luckily there are a number of tools to help do this without creating a new model from scratch, so it’s just a matter of tweaking until things perform acceptably.
Google Colab is free and the resulting CNNs are implemented in the TensorFlow Lite framework, meaning it’s possible to use them elsewhere. So if custom object detection has been holding up a project idea of yours, this might be what gets you over that hump.
For those who choose to let their cats live a more or less free-range life, there are usually two choices. One, you can adopt the role of servant and run for the door whenever the cat wants to get back inside from their latest bird-murdering jaunt. Or two, install a cat door and let them come and go as they please, sometimes with a “present” for you in their mouth. Heads you win, tails you lose.
There’s another way, though: just let the cat ask to be let back in. That’s the approach that [Tennis Smith] took with this machine-learning kitty doorbell. It’s based on a Raspberry Pi 4, which lives inside the house, and a USB microphone that’s outside the front door. The Pi uses Tensorflow Lite to classify the sounds it picks up outside, and when one of those sounds fits the model of a cat’s meow, a message is dispatched to AWS Lambda. From there a text message is sent to alert [Tennis] that the cat is ready to come back in.
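The repo has the real implementation, but the shape of the loop is easy to picture. Here’s a hedged sketch of the classify-then-notify logic, not [Tennis Smith]’s actual code; the model file, class index, threshold, and Lambda function name are all made-up placeholders:

```python
# Hedged sketch of the meow-detection loop (not the actual project code).
# Assumes a 1-second, 16 kHz TFLite audio classifier saved as
# "meow_model.tflite" and a Lambda function named "cat-doorbell".
import json
import boto3
import numpy as np
import sounddevice as sd
import tflite_runtime.interpreter as tflite

TARGET_LABEL = 42  # hypothetical index of the "meow" class in the labels

interpreter = tflite.Interpreter(model_path="meow_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
lam = boto3.client("lambda")

while True:
    # Grab one second of audio from the USB microphone outside the door.
    audio = sd.rec(16000, samplerate=16000, channels=1, dtype="float32")
    sd.wait()
    interpreter.set_tensor(inp["index"], audio.reshape(inp["shape"]))
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    if scores[TARGET_LABEL] > 0.8:  # confidence threshold is a guess
        lam.invoke(FunctionName="cat-doorbell",
                   Payload=json.dumps({"event": "meow detected"}))
```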
There’s a ton of useful information included in the repo for this project, including step-by-step instructions for getting Amazon Web Services working on the Pi. If you’re a dog person, fear not: changing from meows to barks is as simple as tweaking a single line of code. And if you’d rather not be at the beck and call of a cat but still want to avoid the evidence of a prey event on your carpet, machine learning can help with that too.
TensorFlow is a machine learning and AI library that has enabled so much and brought AI within the reach of most developers, but it’s fair to say it’s not meant for less powerful computers. For them there’s TensorFlow Lite, in which a model is created on a more powerful machine and exported to run on a microcontroller or similarly resource-constrained device. [Nick Bild] has probably taken this to its extreme, though, by achieving the feat on a Commodore 64. Not just that, but he’s done it using Commodore BASIC.
TensorFlow Lite normally works by exporting the model as a C array, which is then parsed and run by an interpreter on the microcontroller. That’s a little beyond the capabilities of the mighty 64, so he has instead created a Python script that does the job of the interpreter and produces Commodore BASIC code the 64 can run. The trusty Commodore was one of the more powerful home computers of its day, but we’re fairly certain its designers never in their wildest dreams expected it to be capable of this!
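We can only guess at the internals of [Nick Bild]’s script, but a toy version of the codegen idea might look like this: a Python script that bakes a tiny dense layer’s weights into BASIC DATA statements and emits the forward-pass loop. Everything here (layer sizes, random weights) is made up purely for illustration:

```python
# Toy illustration of the idea (not [Nick Bild]'s actual script): take a
# tiny dense layer's weights and emit Commodore BASIC for the forward pass.
import numpy as np

weights = np.round(np.random.randn(3, 2), 2)  # 3 inputs -> 2 outputs, made up
biases = np.round(np.random.randn(2), 2)

lines = ["10 DIM W(2,1):DIM B(1):DIM X(2):DIM Y(1)"]
lines.append("20 DATA " +
             ",".join(str(v) for v in np.append(weights.flatten(), biases)))
lines += [
    "30 FOR I=0 TO 2:FOR J=0 TO 1:READ W(I,J):NEXT J:NEXT I",
    "40 FOR J=0 TO 1:READ B(J):NEXT J",
    "50 REM X() HOLDS THE INPUT; Y() GETS THE OUTPUT",
    "60 FOR J=0 TO 1:Y(J)=B(J):FOR I=0 TO 2:Y(J)=Y(J)+X(I)*W(I,J):NEXT I:NEXT J",
]
print("\n".join(lines))  # paste (or LOAD) the result on the 64
```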
“With the power of edge AI in the palm of your hand, your business will be unstoppable.”
That’s what the marketing seems to read like for artificial intelligence companies. Everyone seems to have cloud-scale AI-powered business intelligence analytics at the edge. It sounds impressive, but we’re not convinced the marketing mumbo jumbo actually means anything. So what does AI on edge devices look like these days?
Being on the edge just means that the actual AI evaluation and maybe even fine-tuning runs locally on a user’s device rather than in some cloud environment. This is a double win, both for the business and for the user. Privacy can more easily be preserved as less information is transmitted back to a central location. Additionally, the AI can work in scenarios where a server somewhere might not be accessible or provide a response quickly enough.
Google and Apple have their own AI libraries, ML Kit and Core ML, respectively. There are tools to convert TensorFlow, PyTorch, XGBoost, and LibSVM models into formats that Core ML and ML Kit understand. But other solutions try to provide a platform-agnostic layer for training and evaluation. We’ve also previously covered TensorFlow Lite (TFL), a trimmed-down version of TensorFlow, which has matured considerably since 2017.
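To make the Core ML side of that concrete, here’s a sketch using Apple’s coremltools package; the Keras model is a throwaway stand-in, not anything from a real app:

```python
# Sketch of converting a Keras/TensorFlow model to Core ML with Apple's
# coremltools package; the model here is a placeholder.
import coremltools as ct
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

mlmodel = ct.convert(model)          # unified converter picks the TF frontend
mlmodel.save("classifier.mlmodel")   # ready to drop into an Xcode project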
For this article, we’ll be looking at PyTorch Live (PTL), a slimmed-down framework for adding PyTorch models to smartphones. Unlike TFL (which can run on a Raspberry Pi and in a browser), PTL is focused entirely on Android and iOS and offers tight integration there. It’s built on a React Native environment, which means it’s heavily geared towards the Node.js world.
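The app side is JavaScript, but the model side is still Python. Here’s a hedged sketch of preparing a model for PyTorch’s mobile “lite interpreter” format, which, as far as we can tell, is what a PTL app bundles; the model itself is a placeholder:

```python
# Hedged sketch: export a PyTorch model for the mobile lite interpreter.
# The tiny model is a placeholder, not anything from a real PTL project.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

model = torch.nn.Sequential(
    torch.nn.Linear(8, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 2),
).eval()

scripted = torch.jit.script(model)               # TorchScript the model
mobile = optimize_for_mobile(scripted)           # fuse/fold ops for phones
mobile._save_for_lite_interpreter("model.ptl")   # .ptl ships with the app
```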
Have you ever tried waving your hand around like a magic wand and summoning a calculator? We would guess not, since you’d probably look a little silly doing so. That is, unless you had [Andrei]’s cool gesture-controlled calculator. [Andrei] thought it would be helpful to use a calculator in his research lab without having to take his gloves off, and the results are pretty cool.
His hardware consists of a PocketBeagle, an OLED, and an MPU6050 inertial measurement unit for capturing his hand motions using an accelerometer and gyroscope. The hardware is pretty straightforward, so the beauty of this project lies in its machine learning implementation.
[Andrei] first captured a few example datasets to train his algorithm by recreating the hand gestures for each number, 0-9, and recording the resulting accelerometer and gyroscope outputs. He then processed the data with a wavelet transform, with a twofold intent. First, the transform let him reduce the number of samples in his datasets while preserving the shape of the accelerometer and gyroscope signals, the key features for the machine learning classification. Second, it increased the number of features available to the classifier, since the wavelet transform produces both approximation and detail coefficients, and both can be fed into the algorithm.
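A minimal sketch of that step with PyWavelets shows both effects at once; the ‘db4’ wavelet and the 128-sample window are our guesses, not necessarily [Andrei]’s choices:

```python
# Sketch of the wavelet step using PyWavelets; 'db4' and the 128-sample
# window are assumptions for illustration.
import numpy as np
import pywt

accel_x = np.random.randn(128)  # stand-in for one recorded accelerometer axis

# One DWT level roughly halves the sample count while keeping the signal's
# shape (cA), and the detail coefficients (cD) come along as extra features.
cA, cD = pywt.dwt(accel_x, "db4")
features = np.concatenate([cA, cD])  # both coefficient sets feed the classifier
print(len(accel_x), "->", len(features))
```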
Because he had a small dataset, he used the Stratified Shuffle Split technique instead of the usual train/test split, which is generally better suited to larger datasets. The stratified shuffle split ensures approximately the same number of training and test samples for each gesture. He was also very conscious of optimizing his model for running on a portable processing unit like the PocketBeagle, so he spent some time tuning the parameters of his algorithm and ultimately converted his model to a TensorFlow Lite model using the built-in “TFLiteConverter” function within TensorFlow.
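Those two steps look roughly like this with scikit-learn and TensorFlow; the feature array, labels, and tiny classifier are placeholders for [Andrei]’s actual gesture data and network:

```python
# Sketch of the split-and-convert steps; X, y, and the Keras model are
# placeholders for the gesture features and [Andrei]'s real classifier.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedShuffleSplit

X = np.random.randn(100, 134)            # wavelet features, one row per gesture
y = np.random.randint(0, 10, size=100)   # digit labels 0-9

# Stratified splitting keeps each digit equally represented in train and test.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(sss.split(X, y))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(134,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X[train_idx], y[train_idx], epochs=10, verbose=0)

# Convert for the PocketBeagle with the built-in TFLiteConverter.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
open("gestures.tflite", "wb").write(converter.convert())
```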
Mind control might seem like something out of a sci-fi show, but like the tablet computer, universal translator, or virtual reality device, it’s actually a technology that has made it into the real world. While it often requires advanced and expensive equipment to interpret brain waves properly, with the right machine learning system it’s possible to build something like this mind-controlled flamethrower on a much smaller budget.
[Nathaniel F] was already experimenting with brain-computer interfaces and machine learning, and wanted to see if he could build something practical combining the two technologies. Instead of turning to an EEG machine to read brain patterns, he picked up a much less expensive Mindflex and paired it with a machine learning system running TensorFlow to make up for some of its shortcomings. The processing is done by a Raspberry Pi 4, which sends commands to an Arduino to fire the flamethrower when it detects the proper thought patterns. Don’t forget the flamethrower part of this build either: it was designed and built entirely by [Nathaniel F] as well, using gas and an arc lighter.
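We haven’t seen the glue code, but the Pi-to-Arduino handoff plausibly boils down to something like this; the model file, serial port, and threshold are all our inventions:

```python
# Speculative sketch of the glue logic, not [Nathaniel F]'s code: classify
# a window of Mindflex readings, tell the Arduino to fire over serial.
import numpy as np
import serial
import tflite_runtime.interpreter as tflite

arduino = serial.Serial("/dev/ttyUSB0", 9600)    # port and baud are guesses
interpreter = tflite.Interpreter(model_path="eeg_model.tflite")  # hypothetical
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def fire_decision(eeg_window: np.ndarray) -> bool:
    """True when the model thinks the 'fire' thought pattern is present."""
    interpreter.set_tensor(inp["index"], eeg_window.astype(np.float32))
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])[0][0] > 0.9

# Elsewhere, a loop reads EEG frames from the Mindflex and calls:
#   if fire_decision(window): arduino.write(b"F")
```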
The build took many hours of training to gather the proper amount of data for the neural network, and it works as the proof of concept he was hoping for, but [Nathaniel F] notes that it could be improved by replacing the outdated Mindflex with a better EEG. For now, though, we appreciate seeing sci-fi in the real world in projects like this, or in other mind-controlled projects like this one, which converts a prosthetic arm into a mind-controlled music synthesizer.
The idea is simple enough: just place an INA219 current sensor between the power supply and the microcontroller under observation, and record the resulting measurements as it goes about its business. Of course, in this case [Mark] knew what the target Arduino Nano was doing, because he wrote the code that blinks its onboard LED.
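One way to capture that training data is with Adafruit’s CircuitPython INA219 library on a Pi; the sample rate, duration, and file name below are our choices, not [Mark]’s:

```python
# One possible logger for the training data, using Adafruit's CircuitPython
# INA219 library; sampling details are our assumptions, not [Mark]'s setup.
import time
import board
import adafruit_ina219

i2c = board.I2C()
sensor = adafruit_ina219.INA219(i2c)

with open("current_log.csv", "w") as log:
    log.write("time_s,current_mA\n")
    start = time.monotonic()
    while time.monotonic() - start < 60:   # one minute of samples
        log.write(f"{time.monotonic() - start:.3f},{sensor.current:.3f}\n")
        time.sleep(0.01)                   # ~100 samples per second
```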
This allowed him to create training data for TensorFlow, which was ultimately optimized into a model that could fit onto the Arduino Nano 33 BLE Sense which stands in for our magic wand. The end result is that the model can accurately predict when the Nano has fired up its LED based on the amount of power it’s using. [Mark] has done a fantastic job of documenting the whole process, which also doubles as a great intro for putting machine learning to work on a microcontroller.
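Squeezing a model onto a board like the Nano 33 BLE Sense generally means quantizing it; here’s a hedged sketch of TensorFlow’s full-integer quantization path, with a placeholder model and data standing in for [Mark]’s:

```python
# Sketch of full-integer quantization for a microcontroller target; the
# model and training samples are placeholders, not [Mark]'s actual ones.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
train_samples = np.random.randn(200, 64).astype(np.float32)  # stand-in data

def representative_data():
    # The converter calibrates int8 ranges from real-looking inputs.
    for sample in train_samples[:100]:
        yield [sample.reshape(1, 64)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

open("model.tflite", "wb").write(converter.convert())
# `xxd -i model.tflite > model.h` then yields the C array the Nano compiles in.
```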
Now we already know what you’re thinking: obviously the current would go up when the LED was lit, so the machine learning aspect is completely unnecessary. That may be true in this limited context, but remember, this is just a proof of concept to base further work on. In the future, with more training data, this technique could potentially be used to identify a whole range of nuanced activities. You’d be able to see when the MCU was sitting idle, when it was writing to flash, or when it was reading from sensors. In fact, with a good enough model, it might even be possible to identify the individual sensors that are being polled.