Helping Robots Learn By Letting Them Fail

The [MIT Technology Review] has just released its annual list of the top innovators under the age of 35, and there are some interesting people on this list of the annoyingly accomplished at a young age. Like [Lerrel Pinto], an associate professor of computer science at NY University. His work focuses on teaching robots how to do things in the home by failing.

Continue reading “Helping Robots Learn By Letting Them Fail”

Machine Learning Robot Runs Arduino Uno

When we think about machine learning, our minds often jump to datacenters full of sweating, overheating GPUs. However, lighter-weight hardware can also be used to these ends, as demonstrated by [Nikodem Bartnik] and his latest robot.

The robot is charged with autonomously navigating a simple racetrack delineated by cardboard barriers. The robot is based on a two-wheeled design with tank-style steering. Controlled by an Arduino Uno, the robot uses a Slamtec RPLIDAR sensor to help map out its surroundings. The microcontroller is also armed with a Bluetooth link and an SD card for storage.

The robot was first driven around the racetrack multiple times under manual control, all the while collecting LIDAR data. This data was combined with control inputs to help create a data set that could be used to train a machine learning model. Feature selection techniques were used to refine down the data points collected to those most relevant to completing the driving task. [Nikodem] explains how the model was created and then refined to drive the robot by itself in a variety of race track designs.

It’s a great primer on machine learning techniques applied to a small embedded platform.

Continue reading “Machine Learning Robot Runs Arduino Uno”

Here’s Why GPUs Are Deep Learning’s Best Friend

If you have a curiosity about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers] explain why this is so. It’s not a terribly long read, but while it does get technical there are also car analogies, so there’s something for everyone!

He starts off by saying that most people know that GPUs are scarily efficient at matrix multiplication and convolution, but what really makes them most useful is their ability to work with large amounts of memory very efficiently.

Essentially, a CPU is a latency-optimized device while GPUs are bandwidth-optimized devices. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once. Continue reading “Here’s Why GPUs Are Deep Learning’s Best Friend”

High Quality 3D Scene Generation From 2D Source, In Realtime

Here’s some fascinating work presented at SIGGRAPH 2023 of a method for radiance field rendering using a novel technique called Gaussian Splatting. What’s that mean? It means synthesizing a 3D scene from 2D images, in high quality and in real time, as the short animation shown above shows.

Neural Radiance Fields (NeRFs) are a method of leveraging machine learning to, in a way, do what photogrammetry does: synthesize complex scenes and views based on input images. But NeRFs work in a fraction of the time, and require only a fraction of the source material. There are different ways to go about this and unsurprisingly, there tends to be a clear speed vs. quality tradeoff. But as the video accompanying this new work seems to show, clever techniques mean the best of both worlds.

A short video summary is embedded just below the page break. Interested in deeper details? The research PDF is here. The amount of development this field has seen is nothing short of staggering, and certainly higher in quality than what was state-of-the-art for NeRFs only a year ago.

Continue reading “High Quality 3D Scene Generation From 2D Source, In Realtime”

Teaching A Mini-Tesla To Steer Itself

At the risk of stating the obvious, even when you’ve got unlimited resources and access to the best engineering minds, self-driving cars are hard. Building a multi-ton guided missile that can handle the chaotic environment of rush-hour traffic without killing someone is a challenge, to say the least. So if you’re looking to get into the autonomous car game, perhaps it’s best to start small.

If [Austin Blake]’s fun-sized Tesla go-kart looks familiar, it’s probably because we covered the Teskart back when he whipped up this little demon of an EV from a Radio Flyer toy. Adding self-driving to the kart is a natural next step, so [Austin] set off on a journey into machine learning to make it happen. Having settled on behavioral cloning, which trains a model to replicate a behavior by showing it examples of the behavior, he built a bolt-on frame to hold a steering servo made from an electric wheelchair motor, some drive electronics, and a webcam attached to a laptop. Ten or so human-piloted laps around a walking path at a park resulted in a 48,000-image training set, along with the steering wheel angle at each point.

The first go-around wasn’t so great, with the Teskart seemingly bent on going off the track. [Austin] retooled by adding two more webcams, to get a little parallax data and hopefully improve the training data. After a bug fix, the improved model really seemed to do the trick, with the Teskart pretty much keeping in its lane around the track, no matter how fast [Austin] pushed it. Check out the video below to see the Teskart in action.

It’s important to note that this isn’t even close to “Full Self-Driving.” The only thing being controlled is the steering angle; [Austin] is controlling the throttle himself and generally acting as the safety driver should the car veer off course, which it tends to do at one particular junction. But it’s a great first step, and we’re looking forward to further development.

Continue reading “Teaching A Mini-Tesla To Steer Itself”

Re-Creating Pink Floyd In The Name Of Speech

For people who have lost the ability to speak, the future may include brain implants that bring that ability back. But could these brain implants also allow them to sing? Researchers believe that, all in all, it’s just another brick in the wall.

In a new study published in PLOS Biology, twenty-nine people who were already being monitored for epileptic seizures participated via a postage stamp-sized array of electrodes implanted directly on the surface of their brains. As the participants were exposed to Pink Floyd’s Another Brick In the Wall, Part 1, the researchers gathered data from several areas of the brain, each attuned to a different musical element such as harmony, rhythm, and so on. Then the researchers used machine learning to reconstruct the audio heard by the participants using their brainwaves.

First, an AI model looked at the data generated from the brains’ responses to components of the song, like the changes in rhythm, pitch, and tone. Then a second model rejiggered the piecemeal song and estimated the sounds heard by the patients. Of the seven audio samples published in the study results, we think #3 sounds the most like the song. It’s kind of creepy but ultimately very cool. What do you think?

Continue reading “Re-Creating Pink Floyd In The Name Of Speech”

Several video clips of a robot arm manipulating objects in a kitchen environment, demonstrating some of the 12 generalized skills

RoboAgent Gets Its MT-ACT Together

Researchers at Carnegie Mellon University have shared a pre-print paper on generalized robot training within a small “practical data budget.” The team developed a system that breaks movement tasks into 12 “skills” (e.g., pick, place, slide, wipe) that can be combined to create new and complex trajectories within at least somewhat novel scenarios, called MT-ACT: Multi-Task Action Chunking Transformer. The authors write:

Trained merely on 7500 trajectories, we are demonstrating a universal RoboAgent that can exhibit a diverse set of 12 non-trivial manipulation skills (beyond picking/pushing, including articulated object manipulation and object re-orientation) across 38 tasks and can generalize them to 100s of diverse unseen scenarios (involving unseen objects, unseen tasks, and to completely unseen kitchens). RoboAgent can also evolve its capabilities with new experiences.

Continue reading “RoboAgent Gets Its MT-ACT Together”