Meta Doesn’t Allow Camera Access On VR Headsets, So Here’s A Workaround

The cameras at the front of Meta’s Quest VR headsets are off-limits to developers, but developer [Michael Gschwandtner] created a workaround (LinkedIn post) and shared implementation details with a VR news site.

The view isn’t a pure camera feed (it includes virtual and UI elements), but it’s a clever workaround.

The demo shows object detection via MobileNet V2, which we’ve seen used for machine vision on embedded systems like the Raspberry Pi. In this case it runs locally on the VR headset, automatically identifying objects even though the app cannot directly access the front-facing cameras to see what’s in front of it.
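We don’t have [Gschwandtner]’s code, but the recognition side is fairly standard. Here’s a minimal Python sketch of classifying a captured frame with stock MobileNet V2 weights; note the demo proper does detection (MobileNet V2 typically serves as the backbone of an SSD-style detector there), and the headset app’s actual model and preprocessing are assumptions on our part:

```python
# Minimal sketch: identify what's in a single video frame with MobileNet V2.
# The headset app's real pipeline isn't public; TensorFlow's stock ImageNet
# weights stand in for it here.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, preprocess_input, decode_predictions)

model = MobileNetV2(weights="imagenet")

def identify(frame_rgb):
    """frame_rgb: HxWx3 uint8 array, e.g. one frame grabbed from the cast video."""
    img = tf.image.resize(frame_rgb, (224, 224)).numpy()  # network input size
    batch = preprocess_input(img[np.newaxis, ...])        # scales to [-1, 1]
    preds = model.predict(batch)
    _, label, score = decode_predictions(preds, top=1)[0][0]
    return f"{label} ({score:.0%})"

# A random frame stands in for a real captured frame.
print(identify(np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)))
```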

The workaround is conceptually simple, and leverages the headset’s ability to cast its video feed over Wi-Fi to other devices. This feature normally lets people share and spectate VR gameplay.

First, [Gschwandtner]’s app sets up passthrough video, which means the camera feed from the front of the headset is used as the background in VR, creating a mixed-reality environment. Then the app spawns a Chromium browser and casts the headset’s video feed to that browser, all on the same device. It is this cast video that gives the app, in a roundabout way, access to what the cameras see.

The resulting view isn’t taken directly from the cameras; it’s more akin to snapshotting a through-the-headset view, which means it contains virtual elements like the UI. Still, with passthrough turned on, it is a pretty clever workaround that is contained entirely on-device.

Meta is hesitant to give developers direct access to camera views on its VR headsets, and while John Carmack (former Meta consulting CTO) thinks opening them up is worthwhile and can be done safely, that access hasn’t arrived yet.

Need To Pick Objects Out Of Images? Segment Anything Does Exactly That

Segment Anything, recently released by Facebook Research, does something that most people who have dabbled in computer vision have found daunting: reliably figuring out which pixels in an image belong to an object. Making that easier is the goal of the Segment Anything Model (SAM), released under the Apache 2.0 license.

The online demo has a bank of examples, but also works with uploaded images.

The results look fantastic, and there’s an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be segmented automatically. It’s frankly very impressive to see SAM make masking out the different objects in an image look so effortless. What makes this possible is machine learning: the model behind the system was trained on a huge dataset of high-quality images and masks, which makes it very effective at what it does.
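If you’d rather drive it from code than the demo page, the released Python package is pleasantly small. Here’s a minimal sketch of the point-and-click style of prompting (the checkpoint filename and click coordinates are just placeholders):

```python
# Minimal sketch: point-prompted segmentation with the segment_anything package.
# Assumes the ViT-H checkpoint has been downloaded separately.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # computes the image embedding once per image

# The "click": a single foreground point at pixel (x=500, y=375).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),   # 1 = foreground point, 0 = background
    multimask_output=True,        # SAM proposes several candidate masks
)
best_mask = masks[np.argmax(scores)]  # boolean HxW mask of the clicked object
```

The automatic mode comes from the same package’s SamAutomaticMaskGenerator, which prompts the model with a grid of points and hands back one mask per object it finds.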

Continue reading “Need To Pick Objects Out Of Images? Segment Anything Does Exactly That”

Render Yourself Invisible To AI With This Adversarial Sweater Of Doom

Ugly sweater season is rapidly approaching, at least here in the Northern Hemisphere. We’ve always been a bit baffled by the tradition of paying top dollar for a loud, obnoxious sweater that gets worn to exactly one social event a year. We don’t judge, of course, but that’s not to say we wouldn’t look a little more favorably on someone’s fashion choice if it were more like this AI-defeating adversarial ugly sweater.

The idea behind this research from the University of Maryland is not, of course, to inform fashion trends, nor is it to create a practical invisibility cloak. It’s really to probe machine learning systems for vulnerabilities by making small changes to the input while watching for changes in the output. In this case, the ML system was a YOLO-based vision system, which has little trouble finding humans in an arbitrary image. The adversarial pattern was generated using a large set of training images, some of which contain the objects of interest (in this case, humans). Each time a human is detected, the pattern is rendered over the image, and the detection is reassessed to see how much the pattern lowers the person’s score. The pattern is refined over many iterations until it mostly prevents humans from being recognized. Much more detail is available in the research paper (PDF) if you want to dig into the guts of it.
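The paper’s setup is more involved than we can do justice to here, but the core loop is easy to caricature. Below is a toy, self-contained PyTorch sketch of the idea, with a tiny frozen network standing in for the real YOLO detector and random tensors standing in for the training photos:

```python
# Toy sketch of adversarial-pattern training (not the paper's code): optimize
# a patch so a fixed "detector" assigns a lower person score. A real attack
# would swap in a YOLO model and real photos of people for the stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "person detector": a tiny frozen conv net emitting a score in [0, 1].
detector = nn.Sequential(
    nn.Conv2d(3, 8, 5), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 1), nn.Sigmoid(),
)
for p in detector.parameters():
    p.requires_grad_(False)  # only the patch is trained, never the detector

images = torch.rand(16, 3, 64, 64)                 # stand-in "people" photos
patch = torch.rand(3, 24, 24, requires_grad=True)  # the learnable pattern
opt = torch.optim.Adam([patch], lr=0.05)

for step in range(200):
    patched = images.clone()
    patched[:, :, 20:44, 20:44] = patch.clamp(0, 1)  # paste patch on the "torso"
    score = detector(patched).mean()                 # mean detection confidence
    opt.zero_grad()
    score.backward()  # descend on the score itself to make people "disappear"
    opt.step()

print(f"person score after attack: {detector(patched).mean().item():.3f}")
```

In the real version the patch is pasted over each detected person with random scale, rotation, and lighting jitter every iteration, which is what pushes the final pattern to keep working once it’s printed on fabric.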

The pattern, which looks a little like a bad impressionist painting of people buying pumpkins at a market and bears some resemblance to one we’ve seen before in similar work, is said to keep working across different viewing angles. It also makes a spiffy pullover, especially if you’d rather blend in at that Christmas party.


A Label Maker That Uses AI Really Poorly

[8BitsAndAByte] found herself obsessively labeling items around her house, and, like the rest of the world, wanted to see what simple, routine tasks could be made unnecessarily complicated by using AI. Instead of manually identifying objects using human intelligence, she thought it would be fun to offload that task to our AI overlords and the results are pretty amusing.

She constructed a cardboard enclosure housing a Raspberry Pi 3B+, a Pi Camera Module V2, and a small thermal printer for making the labels. The enclosure included a hole for the camera and a button for taking the picture. The image taken by the Pi is analyzed by the DeepAI DenseCap API, which, in theory, should create a label for each object detected within the image. Unfortunately, it doesn’t seem to do that very well, and [8BitsAndAByte] is left with labels that don’t match any of the objects she took pictures of. In some cases it didn’t even get close; for example, the model thought an apple was a person’s head and a rotary dial phone was a cup. Go figure. It didn’t really seem to bother her though, and she got a pretty good laugh out of the whole thing.

It appears the model detects all the objects in the image but only prints the label for the one it is most certain about. So maybe part of her problem is that there were just too many objects in the background? If that’s the case, placing the object against a neutral background would probably confuse the AI a lot less and give better results. Or maybe try a different classifier altogether? Or don’t, and just use it as a fun gag project at your next get-together. That works too.
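For anyone who wants to reproduce the setup, the API side is only a few lines. Here’s a sketch of the call plus the pick-the-most-confident-caption step; the response field names are our assumption based on DenseCap’s usual output, so check the API docs before trusting them:

```python
# Sketch: caption a snapshot with the DeepAI DenseCap endpoint and keep only
# the caption the model is most confident about. Field names ("output",
# "captions", "confidence") are assumptions; verify against the API docs.
import requests

with open("snapshot.jpg", "rb") as f:
    resp = requests.post(
        "https://api.deepai.org/api/densecap",
        files={"image": f},
        headers={"api-key": "YOUR_API_KEY"},
    )

captions = resp.json()["output"]["captions"]
best = max(captions, key=lambda c: c["confidence"])
print(best["caption"])  # this is what would end up on the thermal label
```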

Cool project, [8BitsAndAByte]! Hey, maybe this is a sign the world will still need some human intelligence after all. Who knows?

Continue reading “A Label Maker That Uses AI Really Poorly”

Real Time Object Detection For $59

There was a time when building a machine to identify objects in a camera feed was difficult, even without trying to do it in real time. But now you can do it with a Jetson Nano board for under $60. How well does it work? Watch [Murtaza’s] video below and see what you think.

The first few minutes of the video piqued our interest, and a good thing, too, because those 50 lines of code get a 50-plus-minute video! It is worth watching, though, because there’s a lot of good information about how to apply this technique in your own projects.
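If you’d rather skim than watch, the gist fits in a screenful. This sketch uses NVIDIA’s jetson-inference Python bindings, one common route on the Nano; [Murtaza]’s own 50 lines may well take a different path (OpenCV, for instance):

```python
# Representative real-time detection loop on a Jetson Nano using NVIDIA's
# jetson-inference bindings (the video's own code may differ).
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("csi://0")      # or "/dev/video0" for USB
display = jetson.utils.videoOutput("display://0")

while display.IsStreaming():
    img = camera.Capture()
    detections = net.Detect(img)  # runs inference and draws boxes onto img
    for d in detections:
        print(net.GetClassDesc(d.ClassID), f"{d.Confidence:.2f}")
    display.Render(img)
    display.SetStatus(f"detectNet | {net.GetNetworkFPS():.0f} FPS")
```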

Continue reading “Real Time Object Detection For $59”

Open Source Self-Driving Smartphone Robot

Our smartphones are incredibly powerful computers in their own right, yet we don’t often see them directly integrated into projects. Intel’s Intelligent Systems Lab has done exactly that with the release of OpenBot, an open source smartphone-based self-driving robot.

Most of the magic happens on the smartphone, which runs an app built on TensorFlow Lite and combines the phone’s camera and sensor array with data from ultrasonic sensors and wheel encoders on the robot. The robot itself is relatively simple: four geared DC motors and motor drivers wired to an Arduino Nano, which interfaces with the Android phone over serial.

The app created by the Intel ISL team comes preloaded with three AI models that can do either person following or two different modes of autonomous navigation. By connecting a Bluetooth controller to the smartphone and driving the robot around manually while collecting data, you can train a custom driving policy suited to your specific environment, along the lines of the sketch below.
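The training itself is ordinary behavioral cloning: learn a mapping from camera frames to the gamepad commands you gave, then export it for the phone. Here’s a conceptual Keras sketch; the layer sizes and the two-value wheel-command output are our assumptions, not OpenBot’s exact network:

```python
# Conceptual behavioral-cloning sketch, OpenBot-style: map camera frames to
# logged drive commands, then convert to TensorFlow Lite for the phone.
# Network shape and the two-value (left/right) output are assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(96, 256, 3)),
    tf.keras.layers.Conv2D(32, 5, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="tanh"),  # left/right wheel commands
])
model.compile(optimizer="adam", loss="mse")

# Stand-ins for real logged runs: camera frames and matching gamepad commands.
frames = tf.random.uniform((128, 96, 256, 3))
controls = tf.random.uniform((128, 2), minval=-1.0, maxval=1.0)
model.fit(frames, controls, epochs=5, batch_size=32)

# Convert for on-phone inference.
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
open("driving_policy.tflite", "wb").write(tflite_model)
```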

This looks like an excellent way to get a taste of autonomous robots on a small budget, while still being a viable base for more demanding applications. We’ve seen only a few smartphone-based robots, like DriveMyPhone and SmartiPresense, which don’t have AI capabilities but are intended for telepresence applications. We’ve always wondered why we don’t see more projects built around cellphones, so we welcome the example.

Continue reading “Open Source Self-Driving Smartphone Robot”

OPARP Telepresence Robot

[Erik Knutsson] is stuck inside with a bunch of robot parts, and we know what lies down that path. His Open Personal Assistant Robotic Platform aims to help out around the house with things like filling pet food bowls, but for now, he is taking one step at a time and working out the bugs before adding new features. Wise.

The build started with a narrow base, an underpowered RasPi, and a quiet speaker, but each of those has been upgraded in turn. Right now, it is a personal assistant on wheels. Alexa was the first contender, but Mycroft is in the spotlight because it offers more versatility. At first, mobility was handled by a humble web server with a D-pad, but now it leverages a distance sensor and vision and can even follow you on voice command.

The screen up top gives it a personable look, but it is slated to become a display for everything you’d want to see on your robot assistant, like weather, recipes, or a video chat that can walk around with you. [Erik] would like to make something that assists the elderly with chores and helps connect people who, like him, are stuck inside.

Expressive robots have long since captured our attention, and we’re nuts for privacy-centric personal assistants.

Continue reading “OPARP Telepresence Robot”