Machine learning is starting to come online in all kinds of arenas lately, and the trend is likely to continue for the forseeable future. What was once only available for operators of supercomputers has found use among anyone with a reasonably powerful desktop computer. The downsizing isn’t stopping there, though, as Microsoft is pushing development of machine learning for embedded systems now.
The Embedded Learning Library (ELL) is a set of tools for allowing Arduinos, Raspberry Pis, and the like to take advantage of machine learning algorithms despite their small size and reduced capability. Microsoft intended this library to be useful for anyone, and has examples available for things like computer vision, audio keyword recognition, and a small handful of other implementations. The library should be expandable to any application where machine learning would be beneficial for a small embedded system, though, so it’s not limited to these example applications.
How do you tell how much load is on a CPU? On a desktop or laptop, the OS usually has some kind of gadget to display the basics. On a microcontroller, though, you’ll have to roll your own CPU load meter with a few parts, some code, and a voltmeter.
We like [Dave Marples]’s simple approach to quantifying something as complex as CPU load. His technique relies on the fact that most embedded controllers are just looping endlessly waiting for something to do. By strategically placing commands that latch an output on while the CPU is busy and then turn it off again when idle, a PWM signal with a duty cycle proportional to the CPU load is created. A voltage divider then scales the maximum output to 1.0 volt, and a capacitor smooths out the signal so the load is represented by a value between 0 and 1 volt. How you display the load is your own choice; [Dave] just used a voltmeter, but anything from an LED strip to some kind of audio feedback would work too.
Every once in a while a project comes along with that magical power to consume your time and attention for many months. When you finally complete it, you feel sorry that you don’t have to do anything more.
What is so special about this Bingo ball reader? It may seem like an ordinary OCR project at first glance; a camera captures the image and OCR software recognizes the number. Simple as that. And it works without problems, like every simple gadget should.
But then again, maybe it’s not that simple. Numbers are scattered all over the ball, so they have to be located first, and the best candidate for reading must be selected. Then, numbers are painted onto a sphere rather than a flat surface, sometimes making them deformed to the point where their shape has to be recovered first. Also, the angle of reading is not fixed but somewhere on a 360° scale. And then we have the glare problem to boot, as Bingo balls are so shiny that every light source reflects as a saturated bright spot.
So, is that all of it? Well, almost. The task is supposed to be performed by an embedded microcontroller, with limited speed and memory, yet the recognition process for one ball has to be fast — 500 ms at worst. But that’s just one part of the process. The project includes the pipelined mechanism which accepts the ball, transports it to be scanned by the OCR and then shot by the public broadcast camera before it gets dumped. And finally, if the reading was not reliable enough, the ball has to be subtly rotated so that the numbers would be repositioned for another reading attempt.
Despite these challenges I did manage to build this system. It’s fast and reliable, and I discovered some very interesting tricks along the way. Take a look at the quick demo video below to get a feel for the speed, and what the system “sees”. Then join me after the break to dive into the details of this interesting embedded build.
JeVois is a small, open-source, smart machine vision camera that was funded on Kickstarter in early 2017. I backed it because cameras that embed machine vision elements are steadily growing more capable, and JeVois boasts an impressive range of features. It runs embedded Linux and can process video at high frame rates using OpenCV algorithms. It can run standalone, or as a USB camera streaming raw or pre-processed video to a host computer for further action. In either case it can communicate to (and be controlled by) other devices via serial port.
But none of that is what really struck me about the camera when I received my unit. What really stood out was the demo mode. The team behind JeVois nailed an effective demo mode for a complex device. That didn’t happen by accident, and the results are worth sharing.
The first of the BBC Micro Bits are slowly making their ways into hacker circulation, as is to be expected for any inexpensive educational gadget (see: Raspberry Pi). [Martin] was able to get his hands on one and created the “hello world” of LED displays: he created a playable game of snake that runs on this tiny board.
For those new to the scene, the Micro Bit is the latest in embedded ARM systems. It has a 23-pin connector for inputs and outputs, it has Bluetooth and USB connectivity, a wealth of sensors, and a 25-LED display. That’s small for a full display but it’s more than enough for [Martin]’s game of snake. He was able to create a hex file using the upyed tool from [ntoll] and upload it to the Micro Bit. Once he worked out all the kinks he went an additional step further and ported the game to Minecraft and the Raspberry Pi Sense HAT.
[Martin] has made all of the code available if you’re lucky enough to get your hands on one of these. Right now it seems that they are mostly in the hands of some UK teachers and students, but it’s only a matter of time before they become as ubiquitous as the Raspberry Pi or the original BBC Micro. It already runs python, so the sky’s the limit on these new boards.
Embedded C developers shy away from C++ out of concern for performance. The class construct is one of their main concerns. My previous article Code Craft – Embedding C++: Classes explored whether classes cause code bloat. There was little or no bloat and what is there assures that initialization occurs.
Using classes, and C++ overall, is advantageous because it produces cleaner looking code, in part, by organizing data and the operations on the data into one programming structure. This simple use of classes isn’t the raison d’etre for them but to provide inheritance, or more specifically polymorphism, (from Greek polys, “many, much” and morphē, “form, shape”).
Skeptics feel inheritance simply must introduce nasty increases in timing. Here I once more bravely assert that no such increases occur, and will offer side-by-side comparison as proof.
Today, Nvidia announced their latest platform for advanced technology in autonomous machines. They’re calling it the Jetson TX1, and it puts modern GPU hardware in a small and power efficient module. Why would anyone want GPUs in an embedded format? It’s not about frames per second; instead, Nvidia is focusing on high performance computing tasks – specifically computer vision and classification – in a platform that uses under 10 Watts.
For the last several years, tiny credit card sized ARM computers have flooded the market. While these Raspberry Pis, BeagleBones, and router-based dev boards are great for running Linux, they’re not exactly very powerful. x86 boards also exist, but again, these are lowly Atoms and other Intel embedded processors. These aren’t the boards you want for computationally heavy tasks. There simply aren’t many options out there for high performance computing on low-power hardware.
Tiny ARM computers the size of a credit card have served us all well for general computing tasks, and this leads to the obvious question – what is the purpose of putting so much horsepower on such a small board. The answer, at least according to Nvidia, is drones, autonomous vehicles, and image classification.
Image classification is one of the most computationally intense tasks out there, but for autonomous robots, there’s no other way to tell the difference between a cyclist and a mailbox. To do this on an embedded platform, you either need to bring a powerful general purpose CPU that sucks down 60 or so Watts, or build a smaller, more efficient GPU-based solution that sips a meager 10 Watts.