Robot Dinosaur YOLOs Colors And Shapes For Kids

YOLO can mean many things, but in the context of [be_riddickulous]’s AI Talking Robot Dinosaur it refers to the “You Only Look Once” YOLOv11 object-detection algorithm by Ultralytics, the method by which this adorable dino recognizes colors and shapes to teach them to children.

If you’re new to using YOLO or object recognition more generally, [be_riddickulous]’s tutorial is not a bad place to get started. She goes through how many images you’ll need, and of what types, to get the shape-and-color recognition needed for this project, as well as how to annotate them and train the model, either locally or in the cloud.
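
If you just want a feel for the workflow, the Ultralytics Python API keeps it short. Here’s a minimal sketch of training and running a custom YOLOv11 detector; the dataset YAML, epoch count, and test image are placeholders rather than [be_riddickulous]’s actual settings:

```python
# Minimal Ultralytics YOLOv11 training/inference sketch.
# "shapes.yaml", the epoch count, and the test image are placeholders,
# not the settings used in the dinosaur project.
from ultralytics import YOLO

# Start from a small pretrained checkpoint and fine-tune it on the
# annotated shape-and-color dataset described in the tutorial.
model = YOLO("yolo11n.pt")
model.train(data="shapes.yaml", epochs=100, imgsz=640)

# Run detection on a single frame; each result carries boxes,
# class indices, and confidences.
results = model("red_triangle.jpg")
for box in results[0].boxes:
    print(model.names[int(box.cls)], float(box.conf))
```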

The project itself is an adorable paper-mache dinosaur with a servo-actuated mouth hiding some LEDs and a Raspberry Pi camera module to provide images. In operation, the dinosaur “talks” to children using pre-recorded voice lines, inviting them to play a game and put a specific shape, a shape of a specific color, or both in its mouth. Then the aforementioned object detection (running on a laptop) goes “YOLO” and identifies the shape so the toy can provide feedback on the child’s choice via a speaker in the belly of the beast.

The link to the game code is currently broken, but it looks like they used PyGame for the audio output. A servo motor controls the mouth, though without that code it’s not entirely clear to us how it’s driven. There are good odds that by the time you read this, [be_riddickulous] will have fixed the link so you can see for yourself.
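
For the curious, playing canned voice lines with PyGame usually comes down to the mixer module; something like this hypothetical snippet (the file name is invented, not from the project) would cover the talking half:

```python
# Hypothetical sketch of PyGame audio playback for pre-recorded voice
# lines; the file name and prompt are invented, not from the project.
import time
import pygame

pygame.mixer.init()
pygame.mixer.music.load("put_a_red_circle_in_my_mouth.ogg")
pygame.mixer.music.play()
while pygame.mixer.music.get_busy():   # wait for the line to finish
    time.sleep(0.1)
```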

The only thing that holds this back from being a great toy to put in every Kindergarten class is the need to have a laptop close by to plug the webcam into. A Raspberry Pi 5 ought to have the horsepower to run YOLOv11, so with a little extra effort the whole thing could be standalone — there might even be room in there for batteries.

We’ve had other hacks aimed at little ones, like a kid-friendly computer to relive the glory days of the school computer lab or one of the many iterations of the RFID jukebox idea. If you want to wow the kiddos with AI, perhaps take a look at this talking Santa plush.

Got a cool project, AI, kid-related, or otherwise? Don’t forget to toss us a tip!

2025 Pet Hacks Contest: Fort Bawks Is Guarded By Object Detection

One of the difficult things about raising chickens is that you aren’t the only thing that finds them tasty. Foxes, raccoons, hawks — if it can eat meat, it probably wants a bite of your flock. [donutsorelse] wanted to protect his flock and know when predators were about without staying up all night next to the hen-house. What to do but outsource the role of Chicken Guardian to a Raspberry Pi?

Object detection is done using a YOLOv8 model trained on images of the various predators local to [donutsorelse]. The model runs on a Raspberry Pi and gets images from a standard webcam. Since the webcam has no low-light capability, the system also has a motion-activated light that arguably goes a long way towards spooking predators away on its own. To help with the spooking, a speaker module plays specific sound files for each detected predator — presumably different sounds might work better at scaring off different predators.
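
With the Ultralytics API, the detection loop for a setup like this stays compact. The weights file name, class names, sound files, and confidence threshold below are our own illustrative guesses, not [donutsorelse]’s code:

```python
# Hedged sketch of a predator-detection loop on a Raspberry Pi:
# Ultralytics YOLOv8 on webcam frames, with a per-class scare sound.
# Weights name, class names, file names, and the 0.6 threshold are assumptions.
import cv2
from ultralytics import YOLO
from playsound import playsound

model = YOLO("predators.pt")          # custom-trained weights (hypothetical name)
sounds = {"fox": "dog_bark.wav", "raccoon": "banging.wav", "hawk": "airhorn.wav"}

cap = cv2.VideoCapture(0)             # standard USB webcam
while True:
    ok, frame = cap.read()
    if not ok:
        continue
    for box in model(frame)[0].boxes:
        name = model.names[int(box.cls)]
        if float(box.conf) > 0.6 and name in sounds:
            playsound(sounds[name])   # play the scare sound for this predator
```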

If that doesn’t work, the system phones home to activate a siren inside [donutsorelse]’s house, using a Blues Wireless Notecarrier F as a cellular USB modem. The siren is just a dumb unit; activation is handled via a TP-Link smart plug that’s hooked into [donutsorelse]’s custom smart home setup. Presumably the siren cues [donutsorelse] to take action against the predator assault on the chickens.
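
On the house side, flipping a Kasa plug from Python takes only a few lines with the python-kasa library. The IP address here is a placeholder, and this may not match how [donutsorelse]’s smart home actually triggers it:

```python
# Hedged sketch: switching a TP-Link Kasa smart plug (and the dumb siren
# wired through it) on from Python. The IP address is a placeholder.
import asyncio
from kasa import SmartPlug

async def sound_the_alarm():
    plug = SmartPlug("192.168.1.50")
    await plug.update()        # fetch current state before acting
    await plug.turn_on()       # the siren is powered through this plug

asyncio.run(sound_the_alarm())
```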

Weirdly enough, this isn’t the first time we’ve seen an AI-enabled chicken coop, but it is the first one to make it into our ongoing challenge, which incidentally wraps up today.

Meta Doesn’t Allow Camera Access On VR Headsets, So Here’s A Workaround

The cameras at the front of Meta’s Quest VR headsets are off-limits to developers, but developer [Michael Gschwandtner] created a workaround (LinkedIn post) and shared the implementation details with a VR news site.

The view isn’t a pure camera feed (it includes virtual and UI elements) but it’s a clever workaround.

The demo shows object detection via MobileNet V2, which we’ve seen used for machine vision on embedded systems like the Raspberry Pi. In this case it is running locally on the VR headset, automatically identifying objects even though the app cannot directly access the front-facing cameras to see what’s in front of it.
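
For comparison, here’s roughly what MobileNet V2-based detection looks like off the headset, sketched with TensorFlow Hub’s SSD MobileNet V2 model. This is just a reference point, not the code running on the Quest:

```python
# Rough sketch of SSD MobileNet V2 object detection via TensorFlow Hub,
# shown for comparison only; it is not the code running on the headset.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
from PIL import Image

detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

# The model expects a batched uint8 tensor of shape (1, H, W, 3).
img = np.array(Image.open("frame.jpg"))[np.newaxis, ...]
result = detector(tf.convert_to_tensor(img))

scores = result["detection_scores"][0].numpy()
classes = result["detection_classes"][0].numpy()
for score, cls in zip(scores, classes):
    if score > 0.5:
        print(int(cls), score)   # COCO class id and confidence
```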

The workaround is conceptually simple, and leverages the headset’s ability to cast its video feed over Wi-Fi to other devices. This feature is normally used for people to share and spectate VR gameplay.

First, [Gschwandtner]’s app sets up passthrough video, which means that the camera feed from the front of the headset is used as background in VR, creating a mixed-reality environment. Then the app essentially spawns itself a Chromium browser, and casts its video feed to itself. It is this video that is used to — in a roundabout way — access what the cameras see.

The resulting view isn’t really a direct camera feed; it’s akin to snapshotting a through-the-headset view, which means it contains virtual elements like the UI. Still, with passthrough turned on it is a pretty clever workaround that is contained entirely on-device.

Meta is hesitant to give developers direct access to camera views on its VR headsets, and while John Carmack (former Meta consulting CTO) thinks it’s worth opening up and can be done safely, the company isn’t there yet.

Need To Pick Objects Out Of Images? Segment Anything Does Exactly That

Segment Anything, recently released by Facebook Research, does something that most people who have dabbled in computer vision have found daunting: reliably figuring out which pixels in an image belong to an object. Making that easier is the goal of the Segment Anything Model (SAM), released under the Apache 2.0 license.

The online demo has a bank of examples, but also works with uploaded images.

The results look fantastic, and there’s an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be segmented automatically. It’s frankly very impressive to see SAM make masking out the different objects in an image look so effortless. What makes this possible is machine learning, and in particular the fact that the model behind the system was trained on a huge dataset of high-quality images and masks, making it very effective at what it does.
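
Using the released model is pleasantly direct. Here’s a minimal sketch with the segment-anything package, where the checkpoint path, input image, and click coordinates are placeholders:

```python
# Minimal Segment Anything sketch: one foreground click produces masks.
# The checkpoint path, image, and click coordinates are placeholders.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# A single positive click (label 1) at pixel (x=500, y=375).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,   # return a few candidate masks to choose from
)
print(masks.shape, scores)   # (3, H, W) boolean masks with quality scores
```

The package also includes a SamAutomaticMaskGenerator class for the fully automatic mode the demo shows.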

Continue reading “Need To Pick Objects Out Of Images? Segment Anything Does Exactly That”

Render Yourself Invisible To AI With This Adversarial Sweater Of Doom

Ugly sweater season is rapidly approaching, at least here in the Northern Hemisphere. We’ve always been a bit baffled by the tradition of paying top dollar for a loud, obnoxious sweater that gets worn to exactly one social event a year. We don’t judge, of course, but that’s not to say we wouldn’t look a little more favorably on someone’s fashion choice if it were more like this AI-defeating adversarial ugly sweater.

The idea behind this research from the University of Maryland is not, of course, to inform fashion trends, nor is it to create a practical invisibility cloak. It’s really to probe machine learning systems for vulnerabilities by making small changes to the input while watching for changes in the output. In this case, the ML system was a YOLO-based vision system which has little trouble finding humans in an arbitrary image. The adversarial pattern was generated by using a large set of training images, some of which contain the objects of interest — in this case, humans. Each time a human is detected, a random pattern is rendered over the image, and the data is reassessed to see how much the pattern lowers the object’s score. The adversarial pattern eventually improves to the point where it mostly prevents humans from being recognized. Much more detail is available in the research paper (PDF) if you want to dig into the guts of this.
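
The paper’s actual optimization is gradient-based and far more involved, but the basic loop of rendering a pattern, re-scoring, and keeping whatever hurts detection can be caricatured with an off-the-shelf detector. Everything in this toy sketch (the model, patch size, placement, and iteration count) is a stand-in for illustration, not the researchers’ method:

```python
# Toy score-based sketch of the idea: paste a candidate patch over the
# image and keep mutations that lower the detector's "person" confidence.
# This is NOT the paper's gradient-based optimization, just an illustration.
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # stand-in detector, not the paper's exact model
PERSON = 0                              # COCO class index for "person"

def person_score(image):
    """Highest 'person' confidence the detector reports for an image."""
    boxes = model(image, verbose=False)[0].boxes
    confs = [float(b.conf) for b in boxes if int(b.cls) == PERSON]
    return max(confs, default=0.0)

def apply_patch(image, patch, x=200, y=200):
    """Paste the candidate patch onto a copy of the image at a fixed spot."""
    out = image.copy()
    h, w = patch.shape[:2]
    out[y:y + h, x:x + w] = patch
    return out

image = cv2.imread("person.jpg")        # placeholder training image containing a person
patch = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
best = person_score(apply_patch(image, patch))

for _ in range(500):                    # crude random search over patch pixels
    jitter = np.random.randint(-16, 17, patch.shape)
    candidate = np.clip(patch.astype(int) + jitter, 0, 255).astype(np.uint8)
    score = person_score(apply_patch(image, candidate))
    if score < best:                    # keep mutations that suppress the detection
        patch, best = candidate, score
```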

The pattern, which looks a little like a bad impressionist painting of people buying pumpkins at a market and bears some resemblance to one we’ve seen before in similar work, is said to work better from different viewing angles. It also makes a spiffy pullover, especially if you’d rather blend in at that Christmas party.

A Label Maker That Uses AI Really Poorly

Box with a hole. Camera and Raspberry Pi inside.

[8BitsAndAByte] found herself obsessively labeling items around her house, and, like the rest of the world, wanted to see what simple, routine tasks could be made unnecessarily complicated by using AI. Instead of manually identifying objects using human intelligence, she thought it would be fun to offload that task to our AI overlords and the results are pretty amusing.

She constructed a cardboard enclosure that housed a Raspberry Pi 3B+, a Pi Camera Module V2, and a small thermal printer for making the labels. The enclosure included a hole for the camera and a button for taking the picture. The image taken by the Pi is analyzed by the DeepAI DenseCap API which, in theory, should create a label for each object detected within the image. Unfortunately, it doesn’t seem to do that very well, and [8BitsAndAByte] is left with labels that don’t match any of the objects she took pictures of. In some cases it didn’t even get close; for example, the model thought an apple was a person’s head and a rotary dial phone was a cup. Go figure. It didn’t really seem to bother her though, and she got a pretty good laugh from the whole thing.

It appears the model detects all objects in the image but only prints the label for the object it was most certain about. So maybe part of her problem is that there were just too many objects in the background? If that were the case, you could probably improve the results by placing the object against a neutral background. That would confuse the AI a lot less and possibly give better labels. Or maybe try a different classifier altogether? Or don’t. Then you could just use it as a fun gag project at your next get-together. That works too.
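
The cloud half of the project is a single HTTP call. Here’s a hedged sketch of the request and the pick-the-top-caption step, with a placeholder API key and response fields matching DeepAI’s documented format as we understand it:

```python
# Hedged sketch of calling DeepAI's DenseCap endpoint and keeping only the
# caption the model is most confident about. The API key is a placeholder,
# and the response fields follow DeepAI's documented format as we understand it.
import requests

resp = requests.post(
    "https://api.deepai.org/api/densecap",
    files={"image": open("snapshot.jpg", "rb")},
    headers={"api-key": "YOUR_DEEPAI_KEY"},
)
captions = resp.json()["output"]["captions"]

# Keep the caption the model believes in most and send it to the printer.
best = max(captions, key=lambda c: c["confidence"])
print(best["caption"])
```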

Cool project [8BitsAndAByte]! Hey, maybe this is a sign the world will still need some human intelligence after all. Who knows?

Continue reading “A Label Maker That Uses AI Really Poorly”

Real Time Object Detection For $59

There was a time when making a machine to identify objects in a camera feed was difficult, even without trying to do it in real time. But now you can do it with a Jetson Nano board for under $60. How well does it work? Watch [Murtaza’s] video below and see what you think.

The first few minutes of the video piqued our interest, and good thing, too, because those 50 lines of code get a 50-plus-minute video of explanation! It is worth watching, though, because there’s a lot of good information about how to apply this technique in your own projects.
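
We don’t know exactly which stack the video settles on, but on the Nano the canonical short version uses NVIDIA’s jetson-inference bindings, and it really does fit in about a dozen lines:

```python
# Hedged sketch of real-time detection on a Jetson Nano with NVIDIA's
# jetson-inference bindings; this may not be the exact stack used in the video.
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("csi://0")      # or "/dev/video0" for a USB webcam
display = jetson.utils.videoOutput("display://0")

while display.IsStreaming():
    img = camera.Capture()
    detections = net.Detect(img)                  # boxes are drawn onto img in place
    display.Render(img)
    display.SetStatus("Object Detection | {:.0f} FPS".format(net.GetNetworkFPS()))
```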

Continue reading “Real Time Object Detection For $59”