Tiny Machine Learning On As Little As 2 KB Of RAM

All of the machine language stuff coming out lately doesn’t affect you if you are developing with embedded microcontrollers, right? Perhaps not. Microsoft Research India wants you to use their EdgeML tool to do machine learning tasks such as gesture recognition in tiny devices like an Arduino Uno. According to the developers, you might need as little as 2 KB of RAM. There’s no network connection required and the work is using Tensorflow underneath, so it is compatible with much of what you’ll find for bigger computers.

If you add processing power, you can get more capability. For example, one of the demonstrations is a wake-word recognizer on a Raspberry Pi Zero (although the page for that demo seems to be missing at the moment; try the GesturePod, instead).

The system generally uses Python, but there are efficient C++ implementations for selected algorithms. The code lives on GitHub. There are also a number of research papers about each tool that you can find on the GitHub page. There’s also a recent paper on MinUn, an attempt to make things even more efficient for ARM microcontrollers. In particular, MinUn can store approximate numbers to save space, allows for variable precision of tensors, and tries to reduce memory fragmentation, an important feature for CPUs that don’t have memory management units.

If you haven’t studied TensorFlow yet, start here. Why use something like this with a microcontroller? How about smarter robots?

How To Roll Your Own Custom Object Detection Neural Network

Real-time object detection, which uses neural networks and deep learning to rapidly identify and tag objects of interest in a video feed, is a handy feature with great hacker potential. Happily, it’s also possible to make customized CNNs (convolutional neural networks) tailored for one’s own needs, and that process just got easier thanks to some new documentation for the Vizy “AI camera” by Charmed Labs.

Raspberry Pi-based Vizy camera

Charmed Labs has been making hacker-friendly machine vision devices for a long time, and the Vizy camera impressed us mightily when we checked it out last year. Out of the box, Vizy has a perfectly functional object detector application that runs locally on the device, and can detect and tag many common everyday objects in real time. But what if that default application doesn’t quite meet one’s project needs? Good news, because it’s possible to create a custom-trained CNN, and that process got a lot more accessible thanks to step-by-step examples of training a model to recognize hands doing rock-paper-scissors.

Person and cat with machine-generated tags identifying them
Default object detection works well, but sometimes one needs custom results.

The basic process is this: Start with a variety of images that show the item of interest. Then identify and label the item of interest in each photo. These photos (a “training set”) are then sent to Google Colab, which will be used to generate a neural network. The resulting CNN model can then be downloaded and used, to see how well it performs.

Of course things rarely work perfectly the first time around, so at this point it’s pretty common for some refinement to be needed to increase accuracy. Luckily there are a number of tools to help do this without creating a new model from scratch, so it’s just a matter of tweaking until things perform acceptably.

Google Colab is free and the resulting CNNs are implemented in the TensorFlow Lite framework, meaning it’s possible to use them elsewhere. So if custom object detection has been holding up a project idea of yours, this might be what gets you over that hump.

Getty Images Is Suing An AI Image Generator For Using Its Images

As per the Getty Images legal complaint, the Stable Diffusion AI seems to reproduce gooey versions of the Getty Images watermark in some of its output. Credit: Getty Images

Many AI systems require huge training datasets in order to achieve their impressive feats. This applies whether or not you’re talking about an AI that works with images, natural language, or just about anything else. AI developers are starting to come under scrutiny for where they’re sourcing their datasets. Unsurprisingly, stock photo site Getty Images is at the forefront of this, and is now suing the creators of Stable Diffusion over the matter, as reported by The Verge.

Stability AI, the company behind Stable Diffusion, is the target of the lawsuit for one good reason: there’s compelling evidence the company used Getty Images content without permission. The Stable Diffusion AI has been seen to generate output images that actually include blurry approximations of the Getty Images watermark. This is somewhat of a smoking gun to suggest that Stability AI may have scraped Getty Images content for use as training material.

The copyright implications are unclear, but using any imagery from a stock photo database without permission is always asking for trouble. Various arguments will likely play out in court. Stability AI may make claims that their activity falls under fair use guidelines, while Getty Images may claim that the appearance of perverted versions of their watermark may break trademark rules. The lawsuit could have serious implications for AI image generators worldwide, and is sure to be watched closely by the nascent AI industry. As with any legal matter, just don’t expect a quick answer from the courts.

[Thanks to Dan for the tip!]

Does Programming A Robot With ChatGPT Work At All?

ChatGPT has been put to all manner of silly uses since it first became available online. [Engineering After Hours] decided to see if its coding skills were any chop, and put it to work programming a circular saw. Pun intended.

The aim was to build a line following robot armed with a circular saw to handle lawn edging tasks.  The circular saw itself consists of a motor with a blade on it, and precisely no safety features. It’s mounted on the front of a small RC car with a rack and pinion to control its position. [Engineering After Hours] has some sage advice in this area: don’t try this at home.

ChatGPT was not only able to give advice on what parts to use, it was able to tell [Engineering After Hours] on how to hook everything up to an Arduino and even write the code. The AI language model even recommended a PID loop to control the position of the circular saw. Initial tests were messy, but some refinement got things impressively functional.

As a line following robot, the performance is pretty crummy. However, as a robot programmed by an AI, it does pretty okay. Obviously, it’s hard to say how much help the AI had, and how many corrections [Engineering After Hours] had to make to the code to get everything working. But the fact that this kind of project is even possible shows us just how far AI has really come.

Continue reading “Does Programming A Robot With ChatGPT Work At All?”

Understanding AI Chat Bots With Stanford Online

The news is full of speculation about chatbots like GPT-3, and even if you don’t care, you are probably the kind of person that people will ask about it. The problem is, the popular press has no idea what’s going on with these things. They aren’t sentient or alive, despite some claims to the contrary. So where do you go to learn what’s really going on? How about Stanford? Professor [Christopher Potts] knows a lot about how these things work and he shares some of it in a recent video you can watch below.

One of the interesting things is that he shows some questions that one chatbot will answer reasonably and another one will not. As a demo or a gimmick, that’s not a problem. But if you are using it as, say, your search engine, getting the wrong answer won’t amuse you. Sure, you can do a conventional search and find wrong things, but it will be embedded in a lot of context that might help you decide it is wrong and, hopefully, some other things that are not wrong. You have to decide.
Continue reading “Understanding AI Chat Bots With Stanford Online”

With ChatGPT, Game NPCs Get A Lot More Interesting

Not only is AI-driven natural language processing a thing now, but you can even select from a number of different offerings, each optimized for different tasks. It took very little time for [Bloc] to mod a computer game to allow the player to converse naturally with non-player characters (NPCs) by hooking it into ChatGPT, a large language model AI optimized for conversational communication.

If you can look past the painfully-long loading times, even buying grain (7:36) gains a new layer of interactivity.

[Bloc] modified the game Mount & Blade II: Bannerlord to reject traditional dialogue trees and instead accept free-form text inputs, using ChatGPT on the back end to create more natural dialogue interactions with NPCs. This is a refinement of an earlier mod [Bloc] made and shared, so what you see in the video below is quite a bit more than a proof of concept. The NPCs communicate as though they are aware of surrounding events and conditions in the game world, are generally less forthcoming when talking to strangers, and the new system can interact with game mechanics and elements such as money, quests, and hirelings.

Starting around 1:08 into the video, [Bloc] talks to a peasant about some bandits harassing the community, and from there demonstrates hiring some locals and haggling over prices before heading out to deal with the bandits.

The downside is that ChatGPT is currently amazingly popular. As a result, [Bloc]’s mod is stuck using an overloaded service which means some painfully-long load times between each exchange. But if you can look past that, it’s a pretty fascinating demonstration of what’s possible by gluing two systems together with a mod and some clever coding.

Take a few minutes to check out the video, embedded below. And if you’re more of a tabletop gamer? Let us remind you that it might be fun to try replacing your DM with ChatGPT.

Continue reading “With ChatGPT, Game NPCs Get A Lot More Interesting”

Now ChatGPT Can Make Breakfast For Me

The world is abuzz with tales of the ChatGPT AI chatbot, and how it can do everything, except perhaps make the tea. It seems it can write code, which is pretty cool, so if it can’t make the tea as such, can it make the things I need to make some tea? I woke up this morning, and after lying in bed checking Hackaday I wandered downstairs to find some breakfast. But disaster! Some burglars had broken in and stolen all my kitchen utensils! All I have is my 3D printer and laptop, which curiously have little value to thieves compared to a set of slightly chipped crockery. What am I to do!

Never Come Between A Hackaday Writer And Her Breakfast!

OK Jenny, think rationally. They’ve taken the kettle, but I’ve got OpenSCAD and ChatGPT. Those dastardly miscreants won’t come between me and my breakfast, I’m made of sterner stuff! Into the prompt goes the following query:

"Can you write me OpenSCAD code to create a model of a kettle?" Continue reading “Now ChatGPT Can Make Breakfast For Me”