
Hackaday Links: December 15, 2024

It looks like we won’t have Cruise to kick around in this space anymore with the news that General Motors is pulling the plug on its woe-beset robotaxi project. Cruise, which GM acquired in 2016, fielded autonomous vehicles in various test markets, but the fleet racked up enough high-profile mishaps (first item) for California regulators to shut down test programs in the state last year. The inevitable layoffs ensued, and GM is now killing off its efforts to build robotaxis to concentrate on incorporating the Cruise technology into its “Super Cruise” suite of driver-assistance features for its full line of cars and trucks. We feel like this might be a tacit admission that surmounting the problems of fully autonomous driving is just too hard a nut to crack profitably with current technology, since Super Cruise uses eye-tracking cameras to make sure the driver is paying attention to the road ahead when automation features are engaged. Basically, GM is admitting there still needs to be meat in the seat, at least for now.

Continue reading “Hackaday Links: December 15, 2024”


Robot Rodents: How AI Learned To Squeak And Play

In an astonishing blend of robotics and nature, SMEO, a robot rat designed by researchers in China and Germany, is fooling real rats into treating it like one of their own.

What sets SMEO apart is its rat-like adaptability. Equipped with a flexible spine, realistic forelimbs, and AI-driven behavior patterns, it doesn’t just mimic a rat — it learns and evolves through interaction. Researchers used video data to train SMEO to “think” like a rat, convincing its living counterparts to play, cower, or even engage in social nuzzling. This degree of mimicry could make SMEO a valuable tool for studying animal behavior ethically, minimizing stress on live animals by replacing some real-world interactions.

For builders and robotics enthusiasts, SMEO is a reminder that robotics can push boundaries while fostering a more compassionate future. Many have reservations about keeping intelligent creatures in confined cages or using them in experiments, so imagine applying this tech to non-invasive studies or even wildlife conservation. In a world where robotic dogs, bees, and even schools of fish have come to life, this animatronic rat sounds like an addition worth exploring further. SMEO’s development could, ironically, pave the way for reducing reliance on animal testing.

Continue reading “Robot Rodents: How AI Learned To Squeak And Play”

The Junk Machine Prints Corrupted Advertising On Demand

[ClownVamp]’s art project The Junk Machine is an interactive and eye-catching machine that, on demand, prints out an equally eye-catching and unique yet completely meaningless (one may even say corrupted) AI-generated advertisement for nothing in particular.

The machine is an artistic statement on how powerful software tools that have genuine promise and usefulness to creative types are finding their way into marketers’ hands, resulting in a deluge of, well, junk. This machine simplifies and magnifies that in a physical way.

We can’t help but think that The Junk Machine is in a way highlighting Sturgeon’s Law (paraphrased as ‘ninety percent of everything is crud’) which happens to be particularly applicable to the current AI landscape. In short, the ease of use of these tools means that crud is also being effortlessly generated at an unprecedented scale, swamping any positive elements.

As for the hardware and software, we’re very interested in what’s inside. Unfortunately there are no deep technical details, but the broad strokes are that The Junk Machine uses an embedded NVIDIA Jetson loaded up with Stable Diffusion’s SDXL Turbo, an open source AI image generator that can be installed and run locally. When a user mashes a large red button, the machine generates a piece of AI junk mail in real time without any need for a network connection, and prints it from an embedded printer.
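Since the project's internals aren't published, the following is only a rough sketch of how a button press might turn into a locally generated image. The prompt vocabulary, function names, and output path are all invented for illustration. The pipeline call follows the Hugging Face `diffusers` usage documented for the `stabilityai/sdxl-turbo` checkpoint (one inference step, guidance disabled), which may or may not resemble what [ClownVamp] actually runs. Printing is left out entirely because the printer interface is unknown.

```python
import random

# Invented vocabulary for illustration only; not taken from the project.
ADJECTIVES = ["ULTRA", "FREE", "LIMITED", "NEW"]
PRODUCTS = ["offer", "deal", "savings", "upgrade"]


def build_prompt(rng):
    """Assemble a random, meaningless advertising prompt."""
    adjective = rng.choice(ADJECTIVES)
    product = rng.choice(PRODUCTS)
    return f"glossy junk mail flyer advertising {adjective} {product}, garbled text"


def load_pipeline():
    """Load SDXL Turbo onto the GPU (assumes diffusers and torch are installed
    and the model weights are already cached locally, so no network is needed)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    )
    return pipe.to("cuda")


def on_button_press(pipe, rng, out_path="junk.png"):
    """Hypothetical button handler: generate one image and save it to disk."""
    prompt = build_prompt(rng)
    # SDXL Turbo is intended for single-step sampling with guidance turned off.
    image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
    image.save(out_path)
    return out_path
```

The heavy imports live inside `load_pipeline()` so the prompt logic can be tried on any machine. Calling `on_button_press(load_pipeline(), random.Random())` would need a CUDA-capable device such as the Jetson.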

Watch it in action in the video embedded below, just under the page break. There are a few more photos on [ClownVamp]’s X account.

Continue reading “The Junk Machine Prints Corrupted Advertising On Demand”

An Animated Walkthrough Of How Large Language Models Work

If you wonder how Large Language Models (LLMs) work and aren’t afraid of getting a bit technical, don’t miss [Brendan Bycroft]’s LLM Visualization. It is an interactively-animated step-by-step walk-through of a GPT large language model, complete with an animated and interactive 3D block diagram of everything going on under the hood. Check it out!

nano-gpt has only around 85,000 parameters, but the operating principles are all the same as for larger models.

The demonstration walks through a simple task and shows every step. The task is this: using the nano-gpt model, take a sequence of six letters and put them into alphabetical order.

A GPT model is a highly complex prediction engine, so the whole process begins with tokenizing the input (breaking up words and assigning numerical values to the chunks) and ends with choosing an appropriate output from a list of probabilities. There are of course many more steps in between, and different ways to adjust the model’s behavior. All of these are made quite clear by [Brendan]’s process breakdown.
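To make the first and last steps concrete, here is a minimal sketch of tokenizing an input and picking an output from a list of probabilities. It skips everything in between, which is where the actual model lives. The three-letter vocabulary and the raw scores (logits) are made up for illustration, and greedy selection is just one of several ways to choose the output.

```python
import math

# Toy vocabulary, assumed for illustration: each letter is one token.
VOCAB = ["A", "B", "C"]


def tokenize(text):
    """Map each character to its numeric index in VOCAB."""
    return [VOCAB.index(ch) for ch in text]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits):
    """Greedy choice: return the most probable token and all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return VOCAB[best], probs


tokens = tokenize("CBABBC")          # six letters in, six token indices out
made_up_logits = [2.0, 0.5, -1.0]    # stand-in for what a model would output
choice, probs = pick_token(made_up_logits)
print(tokens, choice, [round(p, 3) for p in probs])
```

A real model would sample from `probs` rather than always taking the top entry when more varied output is wanted, which is one of the ways to adjust a model's behavior.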

We’ve previously covered an explanation of how LLMs work without math, which eschews gritty technical details in favor of focusing on functionality, but it’s also nice to see an approach like this one, which embraces the technical elements of exactly what is going on.

We’ve also seen a much higher-level peek at how a modern AI model like Anthropic’s Claude works when it processes requests, extracting human-understandable concepts that illustrate what’s going on under the hood.

AI Face Anonymizer Masks Human Identity In Images

We’re all pretty familiar with AI’s ability to create realistic-looking images of people that don’t exist, but here’s an unusual use of that technology for a different purpose: masking people’s identity without altering the substance of the image itself. The result is that the content and “purpose” (for lack of a better term) of the image remain unchanged, while it becomes impossible to identify the actual person in it. This invites some interesting privacy-related applications.

Originals on left, anonymized versions on the right. The substance of the images has not changed.

The paper for Face Anonymization Made Simple has all the details, but the method boils down to using diffusion models to take an input image, automatically pick out identity-related features, and alter them in a way that looks more or less natural. For this purpose, identity-related features essentially means key parts of a human face. Other elements of the photo (background, expression, pose, clothing) are left unchanged. As a concept it’s been explored before, but researchers show that this versatile method is both simpler and better-performing than others.

Diffusion models are the essence of AI image generators like Stable Diffusion. The fact that they can be run locally on personal hardware has opened the doors to all kinds of interesting experimentation, like this haunted mirror and other interactive experiments. Forget tweaking dull sliders like “brightness” and “contrast” for an image. How about altering the level of “moss”, “fire”, or “cookie” instead?

Here’s Code For That AI-Generated Minecraft Clone

A little while ago Oasis was showcased on social media, billing itself as the world’s first playable “AI video game” that responds to complex user input in real-time. Code is available on GitHub for a down-scaled local version if you’d like to take a look. There’s a bit more detail and background in the accompanying project write-up, which talks about both the potential as well as the numerous limitations.

We suspect the focus on supporting complex user input (such as mouse look and an item inventory) is what the creators feel distinguishes it meaningfully from AI-generated DOOM. The latter was a concept that demonstrated AI image generators could (kinda) function as real-time game engines.

Image generators are, in a sense, prediction machines. The idea is that by providing a trained model with a short history of what just happened plus the user’s input as context, it can generate a pretty usable prediction of what should happen next, and do it quickly enough to be interactive. Run that in a loop, and you get some pretty impressive clips to put on social media.
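The loop described above can be sketched in a few lines. This is not Oasis's code. The "model" here is a hypothetical stand-in that treats frames as plain integers, and the context length is an arbitrary choice. The point is only the shape of the loop: a short history plus the user's input goes in, a predicted frame comes out and is fed back as context.

```python
from collections import deque


def toy_predict(history, action):
    """Hypothetical stand-in for a trained model: next frame = last frame + action."""
    return history[-1] + action


def run_loop(predict, first_frame, actions, context_len=4):
    """Generate frames one at a time, feeding each prediction back in as context."""
    history = deque([first_frame], maxlen=context_len)  # only a short history is kept
    frames = [first_frame]
    for action in actions:
        next_frame = predict(list(history), action)
        history.append(next_frame)  # oldest frame drops out once the window is full
        frames.append(next_frame)
    return frames


frames = run_loop(toy_predict, 0, [1, 1, -1, 0, 2])
print(frames)  # → [0, 1, 2, 1, 1, 3]
```

Because every output becomes part of the next input and anything older than the context window is forgotten, there is no persistent world state for the model to fall back on.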

It is a neat idea, and we certainly applaud the creativity of bending an image generator to this kind of application, but we can’t help but really notice the limitations. Sit and stare at something, or walk through dark or repetitive areas, and the system loses its grip and things rapidly go in a downward spiral we can only describe as “dreamily broken”.

It may be more a demonstration of a concept than a properly functioning game, but it’s still a very clever way to leverage image generation technology. Although, if you’d prefer AI to keep the game itself untouched, take a look at neural networks trained to use the DOOM level creator tools.

Assessing Developer Productivity When Using AI Coding Assistants

We have all seen the advertisements and glossy flyers for coding assistants like GitHub Copilot, which promised to use ‘AI’ to help you write code and complete programming tasks faster than ever. Yet how much of that has worked out since Copilot’s introduction in 2021? According to a recent report by code analysis firm Uplevel, there are no significant benefits, while GitHub Copilot also introduced 41% more bugs. Commentary from development teams suggests that while the coding assistant makes for faster writing of code, debugging or maintaining that code is often not realistic.

None of this should be a surprise, of course, as this mirrors what we already found when covering this topic back in 2021. With GitHub Copilot and kin being effectively Large Language Models (LLMs) that are trained on codebases, they are best considered to be massive autocomplete systems targeting code. Much like with autocomplete on e.g. a smartphone, the experience is often jarring and full of errors. Perhaps the fairest assessment of GitHub Copilot is that it can be helpful when writing repetitive, braindead code that requires very little understanding of the code to get right, while it’s bound to helpfully carry in a bundle of sticks and a dead rodent like an overly enthusiastic dog when all you wanted was for it to grab that spanner.

Until Copilot and kin develop actual intelligence, it would seem that software developer jobs are still perfectly safe from being taken over by our robotic overlords.