Artificial Intelligence Runs On Arduino

Fundamentally, an artificial intelligence (AI) is nothing more than a system that takes a series of inputs, makes some prediction, and then outputs that information. Of course, the types of AI in the news right now can handle a huge number of inputs and need server farms’ worth of compute to generate outputs of various forms, but at a basic level, there’s no reason a purpose-built AI can’t run on much less powerful hardware. As a demonstration, and to win a bet with a friend, [mondal3011] got an artificial intelligence up and running on an Arduino.

This AI isn’t going to do anything as complex as generate images or write clunky preambles to every recipe on the Internet, but it is still a functional and useful piece of software. This one specifically handles the brightness of a single lamp, taking user input on acceptable brightness ranges in the room and outputting what it thinks the brightness of the lamp should be to match the user’s preferences. [mondal3011] also builds a set of training data for the AI to learn from, taking the lamp to various places around the house and letting it figure out where to set the brightness on its own. The training data is run through a linear regression model in Python which generates the function that the Arduino needs to automatically operate the lamp.
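To make the training step concrete, here is a minimal sketch of that kind of linear regression in plain Python. It is not [mondal3011]'s actual script: the sample readings, the 0-1023 sensor range, the 0-255 PWM range, and the function names are all assumptions made for illustration. The point is only that "training" boils down to a slope and an intercept, which is all the Arduino needs to store.

```python
# Hypothetical training data: (ambient light reading 0-1023, lamp PWM 0-255
# that the user settled on). Real data would come from the lamp itself.
samples = [(100, 230), (300, 180), (500, 130), (700, 80), (900, 30)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_pwm(ambient, slope, intercept):
    """Evaluate the fitted line and clamp the result to the 8-bit PWM range."""
    return max(0, min(255, round(slope * ambient + intercept)))


slope, intercept = fit_line(samples)
# These two numbers are what would be hard-coded into the microcontroller sketch.
print(f"pwm = {slope:.4f} * ambient + {intercept:.2f}")
```

On the microcontroller side, the same one-line formula plus the clamp would be evaluated on each sensor reading, which is well within reach of even an 8-bit chip.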

Although this isn’t the most complex model, it does go a long way to demonstrating the basic principles of using artificial intelligence to build a useful and working model, and then taking that model into the real world. Note also that the model is generated on a more powerful computer before being ported over to the microcontroller platform. But that’s all par for the course in AI and machine learning. If you’re looking to take a step up from here, we’d recommend this robot that uses neural networks to learn how to walk.

Assessing Developer Productivity When Using AI Coding Assistants

We have all seen the advertisements and glossy flyers for coding assistants like GitHub Copilot, which promised to use ‘AI’ to make you write code and complete programming tasks faster than ever, yet how much of that has worked out since Copilot’s introduction in 2021? According to a recent report by code analysis firm Uplevel, there are no significant productivity benefits, while developers using GitHub Copilot also introduced 41% more bugs. Commentary from development teams suggests that while the coding assistant makes for faster writing of code, debugging or maintaining that code is often not practical.

None of this should be a surprise, of course, as this mirrors what we already found when covering this topic back in 2021. With GitHub Copilot and kin being effectively Large Language Models (LLMs) that are trained on codebases, they are best considered to be massive autocomplete systems targeting code. Much like with autocomplete on, say, a smartphone, the experience is often jarring and full of errors. Perhaps the fairest assessment of GitHub Copilot is that it can be helpful when writing repetitive, braindead code that requires very little understanding of the code to get right, while it’s bound to helpfully carry in a bundle of sticks and a dead rodent like an overly enthusiastic dog when all you wanted was for it to grab that spanner.

Until Copilot and kin develop actual intelligence, it would seem that software developer jobs are still perfectly safe from being taken over by our robotic overlords.

All System Prompts For Anthropic’s Claude, Revealed

For as long as AI Large Language Models have been around (well, for as long as modern ones have been accessible online, anyway) people have tried to coax the models into revealing their system prompts. The system prompt is essentially the model’s fundamental directives on what it should do and how it should act. Such healthy curiosity is rarely welcomed, however, and creative efforts at making a model cough up its instructions are frequently met with a figurative glare and stern tapping of the Terms & Conditions sign.

Anthropic have bucked this trend by making system prompts public for the web and mobile interfaces of all three incarnations of Claude. The prompt for Claude Opus (their flagship model) is well over 1500 words long, with different sections specifically for handling text and images. The prompt does things like help ensure Claude communicates in a useful way, taking into account the current date and an awareness of its knowledge cut-off, or the date after which Claude has no knowledge of events. There’s some stylistic stuff in there as well, such as Claude being specifically told to avoid obsequious-sounding filler affirmations, like starting a response with any form of the word “Certainly.”

Creating Video Games With AI: A Mario Example

Artificial intelligence (AI) seems to be doing everything these days. Making images, making videos, and replacing most of us real human writers if you believe the hype. Maybe it’s all over! And yet, we persist, to write about yet another job taken over by AI: creating video games.

The research paper is entitled “Video Game Generation: A Practical Study using Mario.” The basic question is whether a generative AI model can create an interactive video game after first being trained on an existing game.

MarioVGG, as it is called, is a “text-to-video model.” It hasn’t built the Mario game that you’re familiar with, though. It takes player commands as text inputs, such as “run” or “jump,” and then outputs video frames showing the result in the ‘game.’ The model was trained on a dataset of frame-by-frame Super Mario Brothers game play, combined with data on user inputs at the time. The model shows an ability to generate believable video output for given player inputs, including basic game physics, item interactions, and collisions. It’s able to do this in a chained way, so that it can reasonably simulate a player making multiple actions and moving through a level of the game.
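The chaining idea can be sketched without any real model at all. In the toy below, `toy_model` is a made-up stand-in that tracks only Mario's position rather than generating pixels, and `CONTEXT_FRAMES` is an arbitrary number chosen for illustration; neither comes from the paper. What it does show is the loop structure: each generated frame is fed back in as context for the next action.

```python
from collections import deque

CONTEXT_FRAMES = 4  # hypothetical: how many recent frames the model gets to see


def toy_model(context, action):
    """Stand-in for a text-to-video model: return the next 'frame'.

    A frame here is just an (x, y) position, not real image data.
    """
    x, y = context[-1]
    if action == "run":
        return (x + 1, 0)      # move right, back on the ground
    if action == "jump":
        return (x + 1, y + 1)  # move right and up
    return (x, 0)              # unknown command: stand still


def rollout(first_frame, actions):
    """Chain generations: every output frame becomes input for the next step."""
    context = deque([first_frame], maxlen=CONTEXT_FRAMES)
    frames = [first_frame]
    for action in actions:
        next_frame = toy_model(list(context), action)
        context.append(next_frame)
        frames.append(next_frame)
    return frames


print(rollout((0, 0), ["run", "jump", "run"]))
```

Because every step builds on previously generated output, any error the model makes is carried forward, which is one reason long, consistent rollouts are hard for this kind of system.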

It’s not like playing a real Mario game yet, by any means. Regardless, the AI model has shown an ability to replicate the world of the game in a way that behaves relatively consistently with its established rules. If you’re in the field of video game development, though, you probably don’t have a lot to worry about just yet—you probably moved past making basic Mario clones years ago, so you’ve got quite an edge for now!

What’s The Deal With AI Art?

A couple weeks ago, we had a kerfuffle here on Hackaday: A writer put out a piece with AI-generated headline art. It was, honestly, pretty good, but it was also subject to all of the usual horrors that get generated along the way. If you have played around with any of the image generators you know the AI-art uncanny style, where it looks good enough at first glance, but then you notice limbs in the wrong place if you look hard enough. We replaced it shortly after an editor noticed.

The story is that the writer couldn’t find any nice visuals to go with the blog post, which was about encoding data in QR codes and printing them out for storage. This is a problem we have frequently here, actually. When people write up a code hack, for instance, there’s usually just no good image to go along with it. Our writers have to get creative. In this case, he tossed it off to Stable Diffusion.

Some commenters were afraid that this meant that we were outsourcing work from our fantastic, and very human, art director Joe Kim, whose trademark style you’ve seen on many of our longer-form original articles. Of course we’re not! He’s a genius, and when we tell him we need some art about topics ranging from refining cobalt to Wimshurst machines to generate static electricity, he comes through. I think that all of us probably have wanted to make a poster out of one or more of his headline art pieces. Joe is a treasure.

But for our daily blog posts, which cover your works, we usually just use a picture of the project. We can’t ask Joe to make ten pieces of art per day, and we never have. At least as far as Hackaday is concerned, AI-generated art is just as good as finding some cleared-for-use clip art out there, right?

Except it’s not. There is a lot of uncertainty about the data that the algorithms are trained on, whether the copyright of the original artists was respected or needed to be, ethically or legally. Some people even worry that the whole thing is going to bring about the end of Art. (They worried about this at the introduction of the camera as well.) But then there’s also the extra limbs, and AI-generated art’s cliche styles, which we fear will get old and boring after we’re all saturated with them.

So we’re not using AI-generated art as a policy for now, but that’s not to say that we don’t see both the benefits and the risks. We’re not Luddites, after all, but we are also in favor of artists getting paid for their work, and of respect for the commons when people copyleft license their images. We’re very interested to see how this all plays out in the future, but for now, we’re sitting on the sidelines. Sorry if that means more headlines with colorful code!

Creating A Twisted Grid Image Illusion With A Diffusion Model

Images that can be interpreted in a variety of ways have existed for many decades, with the classical example being Rubin’s vase — which some viewers see as a vase, and others a pair of human faces.

When the duck becomes a bunny, if you ignore the graphical glitches that used to be part of the duck. (Credit: Steve Mould, YouTube)

Where things get trickier is if you want to create an image that changes into something else that looks realistic when you rotate each section of it within a 3×3 grid. In a video by [Steve Mould], he explains how this can be accomplished, by using a diffusion model to identify similar characteristics of two images and to create an output image that effectively contains essential features of both images.
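The geometric half of the trick is easy to pin down in code, even if the diffusion half is not. The sketch below only implements the grid transform, with a tiny list-of-lists "image" and a 90-degree clockwise turn both chosen as assumptions for illustration; it does not attempt the diffusion step. The generated picture has to look plausible both before and after this function is applied.

```python
def rotate_cw(tile):
    """Rotate a square tile (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def rotate_tiles(image, grid=3):
    """Rotate each tile of a grid x grid layout in place, leaving tiles where they are."""
    step = len(image) // grid
    out = [row[:] for row in image]
    for grid_row in range(grid):
        for grid_col in range(grid):
            r0, c0 = grid_row * step, grid_col * step
            tile = [row[c0:c0 + step] for row in image[r0:r0 + step]]
            rotated = rotate_cw(tile)
            for i in range(step):
                out[r0 + i][c0:c0 + step] = rotated[i]
    return out


# A 6x6 'image' of unique numbers, so each of the 3x3 tiles is 2x2 pixels.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
twisted = rotate_tiles(image)
```

Applying `rotate_tiles` four times returns the original image, which makes for a handy sanity check when experimenting with the transform.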

Naturally, this process can be done by hand too, with the goal always being to create a plausible image in either orientation that has enough detail to trick the brain into filling in the rest, heading down the path of interpreting what the eye sees as a duck, a bunny, a vase, or the outline of faces.

Using a diffusion model to create such illusions is quite a natural fit, as it works by filling in noise until a plausible enough image begins to appear. Of course, whether it is a viable image is ultimately not determined by the model, but by the viewer, as humans are susceptible to such illusions while machine vision still struggles to distinguish a cat from a loaf and a raisin bun from a spotted dog. The imperfections of diffusion models would seem to be a benefit here, as they will happily churn through abstractions and iterations with no understanding or interpretive bias, while the human can steer them towards a viable interpretation.

Large Language Models On Small Computers

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case, the computer of choice is an ESP32, so the model was shrunk from the reported trillions of parameters of something like GPT-4, or even the hundreds of billions of GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation that [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, which was chosen for its large amount of RAM compared to other versions, since even this small model needs a minimum of 1 MB to run. It also has two cores, which will both work as hard as possible under (relatively) heavy loads like these, and the clock speed of the CPU can be maxed out at around 240 MHz.
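A quick back-of-the-envelope check shows where a figure of roughly 1 MB can come from. The sketch below assumes each parameter is stored as a 32-bit float, which is our assumption rather than something stated in the write-up, and it counts only the weights, not the working memory needed during inference.

```python
PARAMS = 260_000       # parameter count quoted above
BYTES_PER_PARAM = 4    # assumption: weights stored as 32-bit floats


def weight_footprint(params, bytes_per_param):
    """Bytes needed just to hold the model weights in memory."""
    return params * bytes_per_param


weight_bytes = weight_footprint(PARAMS, BYTES_PER_PARAM)
print(f"{weight_bytes} bytes is about {weight_bytes / 1024**2:.2f} MiB")
```

Under that assumption the weights alone land just shy of 1 MiB, so a chip with extra RAM on board is the sensible pick once buffers and everything else are added on top.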

Admittedly, [DaveBben] is mostly doing this just to see if it can be done since even the most powerful of ESP32 processors won’t be able to do much useful work with a large language model. It does turn out to be possible, though, and somewhat impressive, considering the ESP32 has about as much processing capability as a 486 or maybe an early Pentium chip, to put things in perspective. If you’re willing to devote a few more resources to an LLM, though, you can self-host it and use it in much the same way as an online model such as ChatGPT.