Hackaday Links Column Banner

Hackaday Links: June 2, 2024

So you say you missed the Great Solar Storm of 2024 along with its attendant aurora? We feel you on that; the light pollution here was too much for decent viewing, and it had been too long a day to make a drive into the deep dark of the countryside survivable. But fear not — the sunspot that raised all the ruckus back at the beginning of May has survived the trip across the far side of the sun and will reappear in early June, mostly intact and ready for business. At least sunspot AR3664 seems like it’s still a force to be reckoned with, having cooked off an X-class flare last Tuesday just as it was coming around from the other side of the Sun. Whether 3664 will be able to stir up another G5 geomagnetic storm remains to be seen, but since it fired off an X-12 flare while it was around the backside, you never know. Your best bet to stay informed in these trying times is the indispensable Dr. Tamitha Skov.

Continue reading “Hackaday Links: June 2, 2024”

Tabletop Handybot Is Handy, And Powered By AI

Decently useful AI has been around for a little while now, and robotic arms have been around much longer. Yet somehow, we don’t have little robot helpers on our desks yet! Thankfully, [Yifei] is working towards that reality with Tabletop Handybot.

What [Yifei] has developed is a robotic arm that accepts voice commands. The robot relies on a Realsense D435 RGB-D camera, which provides color vision with depth information as well. Grounding DINO is used for object detection on the RGB images. Segment Anything and Open3D are used for further processing of the visual and depth data to help the robot understand what it’s looking at. Meanwhile, voice commands are interpreted via OpenAI Whisper, which can feed prompts to ChatGPT for further processing.

[Yifei] demonstrates his robot picking up markers on command, which is a pretty cool demo. With so many modern AI tools available, we’re getting closer to the ideal of robots that can understand and execute on general spoken instructions. This is a great example. We may not be all the way there yet, but perhaps soon. Video after the break.

Continue reading “Tabletop Handybot Is Handy, And Powered By AI”

Generative AI Hits The Commodore 64

Image-generating AIs are typically trained on huge arrays of GPUs and require great wads of processing power to run. Meanwhile, [Nick Bild] has managed to get something similar running on a Commodore 64. (via Tom’s Hardware).

A figure generated by [Nick]’s C64. We shall name him… “Sword Guy”!
As you might imagine, [Nick’s] AI image generator isn’t churning out 4K cyberpunk stills dripping in neon. Instead, he aimed at a smaller target, more befitting the Commodore 64 itself. His image generator creates 8×8 game sprites instead.

[Nick’s] model was trained on 100 retro-inspired sprites that he created himself. He did the training phase on a modern computer, so that the Commodore 64 didn’t have to sweat this difficult task on its feeble 6502 CPU. However, it’s more than capable of generating sprites using the model, thanks to some BASIC code that runs off of the training data. Right now, it takes the C64 about 20 minutes to run through 94 iterations to generate a decent sprite.

8×8 sprites are generally simple enough that you don’t need to be an artist to create them. Nonetheless, [Nick] has shown that modern machine learning techniques can be run on slow archaic hardware, even if there is limited utility in doing so. Video after the break.

Continue reading “Generative AI Hits The Commodore 64”

How AI Large Language Models Work, Explained Without Math

Large Language Models (LLMs ) are everywhere, but how exactly do they work under the hood? [Miguel Grinberg] provides a great explanation of the inner workings of LLMs in simple (but not simplistic) terms that eschews the low-level mathematics of how they work in favor of laying bare what it is they do.

At their heart, LLMs are prediction machines that work on tokens (small groups of letters and punctuation) and are as a result capable of great feats of human-seeming communication. Most technical-minded people understand that LLMs have no idea what they are saying, and this peek at their inner workings will make that abundantly clear.

Be sure to also review an illustrated guide to how image-generating AIs work. And if a peek under the hood of LLMs left you hungry for more low-level details, check out our coverage of training a GPT-2 LLM using pure C code.

The Perfect Desktop Kit For Experimenting With Self Driving Cars

When we think about self-driving cars, we normally think about big projects measured in billions of dollars, all funded by major automakers. But you can still dive into this world on a smaller scale, as [jmoreno555] demonstrates.

The build consists of a small RC car—an HSP 94123, in fact. It’s got a simple brushed motor inside, driven by a conventional speed controller, and servo-driven steering. A Raspberry Pi 4 is charged with driving the car, but it’s not alone. It’s outfitted with a Google Coral USB stick, which is a machine learning accelerator card capable of 4 trillion operations per second. The car also has a Wemos D1 onboard, charged with interfacing distance sensors to give the car a sense of its environment. Vision is courtesy of a 1.2-megapixel camera with a 160-degree lens, and a stereoscopic camera with twin 75-degree lenses. Software-wise, it’s early days yet. [jmoreno555] is exploring the use of Python and OpenCV to implement basic lane detection and other self driving routines, while using Blender as a simulator.

The real magic idea, though, is the treadmill. [jmoreno555] realized that one of the frustrations of working in this space is in having to chase a car around a test track. Instead, the use of a desktop treadmill allows the car to be programmed and debugged with less fuss in the early stages of development.

If you’re looking for a platform to experiment with AI and self-driving, this could be an project to dive in to. We’ve covered some other great builds in this space, too. Meanwhile, if you’ve cracked driving autonomy and want to let us know, our tipsline is always standing by!

AI Helps Make Web Scraping Faster And Easier

Web scraping is usually only a first step towards extracting meaningful data. Once you’ve got everything pulled down, you’ve still got to process it into something useful. Here to assist with that is Scrapegraph-ai, a Python tool that promises to automate the process using a selection of large language models (LLMs).

Scrapegraph-ai is able to accept a URL as well as a prompt, which is a plain-English instruction on what to do with the data. Examples include summarizing, describing images, and more. In other words, gathering the data and analyzing or formatting it can now be done as one.

The project is actually pretty flexible in terms of the AI back-end. It’s able to work with locally-installed AI tools (via ollama) or with API keys for services like OpenAI and more. If you have an OpenAI API key, there’s an online demo that will show you the capabilities pretty effectively. Otherwise, local installation is only a few operations away.

This isn’t the first time we have seen the flexibility of AI tools like large language models leveraged to ease the notoriously-fiddly task of web scraping, and it’s great to see the results have only gotten better.

Train A GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older but is perfectly relevant, being the granddaddy of modern LLMs (large language models) with a clear heritage to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on PyTorch deep learning library, itself written in Python. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is a highly elegant process that does the same thing the bigger, clunkier methods accomplish. It can run entirely on a CPU, or it can take advantage of GPU acceleration, where available.

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.