Two laptops, side by side, running Llama2 in DOS.

Will It Run Llama 2? Now DOS Can

Will a 486 run Crysis? No, of course not. Will it run a large language model (LLM)? Given the huge buildout of compute power to do just that, many people would scoff at the very notion. But [Yeo Kheng Meng] is not many people.

He has set up various DOS computers to run a stripped-down version of the Llama 2 LLM, originally from Meta. More specifically, [Yeo Kheng Meng] is implementing [Andrej Karpathy]’s Llama2.c library, which we have seen here before, running on Windows 98.

Llama2.c is a wonderful bit of programming that lets one inference a trained Llama2 model in only seven hundred lines of C. It is seven hundred lines of modern C, however, so porting it to DOS 6.22 and the outdated i386 architecture took some doing. [Yeo Kheng Meng] documents that work, and benchmarks a few retrocomputers. As painful as it may be to say: yes, a 486 or a Pentium 1 can now be counted as “retro”.
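
For flavor, here’s a loose Python sketch of the autoregressive loop that llama2.c implements in C. The forward() function below is a stand-in that returns toy logits rather than running a real transformer pass, and all the names are ours, not the library’s; the point is just the token-by-token structure.

```python
# A loose sketch of llama2.c's core generation loop, in Python.
# forward() is a toy stand-in for the transformer forward pass.
import math
import random

VOCAB = ["Once", " upon", " a", " time", ",", " there", " was", "."]

def forward(token_id, pos):
    # Stand-in for the real forward pass: one token in, one vector of
    # logits (a score per vocabulary entry) out.
    random.seed(token_id * 31 + pos)  # deterministic toy logits
    return [random.uniform(-1, 1) for _ in VOCAB]

def sample(logits, temperature=0.9):
    # Temperature-scaled softmax sampling, as llama2.c does it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cdf = 0.0
    for i, p in enumerate(probs):
        cdf += p
        if r < cdf:
            return i
    return len(probs) - 1

token = 0                      # start token
for pos in range(16):          # generate 16 tokens, one at a time
    logits = forward(token, pos)
    token = sample(logits)
    print(VOCAB[token], end="")
print()
```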

The models are not large, of course, with a TinyStories-trained 260 kB model churning out a blistering 2.08 tokens per second on a generic 486 box. Newer machines can run larger models faster, naturally. Ironically, a Pentium M Thinkpad T42 (was that really 21 years ago?) was able to run a larger 110 MB model faster than [Yeo Kheng Meng]’s modern Ryzen 5 desktop. Not because the Pentium M is blazing fast, mind you, but because a memory allocation error prevented that model from running on the modern CPU. Slow and steady finishes the race, it seems.

This port will run on any 32-bit i386 hardware, which leaves the 16-bit regime as the next challenge. If one of you can get Llama 2 running locally on a 286 or a 68000-based machine, then we may have to stop asking “Does it run DOOM?” and start asking “Will it run an LLM?”


DIY AI Butler Is Simpler And More Useful Than Siri

[Geoffrey Litt] shows that an effective digital assistant tailored to one’s own needs takes just a little DIY, and thanks to the kinds of tools that are available today, it doesn’t even have to be particularly complex. Meet Stevens, the AI assistant who provides the family with useful daily briefs. The back end? Little more than one SQLite table and a few cron jobs.

A sample of Stevens’ notebook entries, both events and things to simply remember.

Every day, Stevens sends a daily brief via Telegram that includes calendar events, appointments, weather notes, reminders, and even a fun fact for the day. Stevens isn’t just send-only, either. Users can add new entries or ask questions about items through Telegram.

It’s rudimentary, but [Geoffrey] already finds it far more useful than Siri. This is unsurprising, as it has been astutely observed that big tech’s digital assistants are designed to serve their makers rather than their users. Besides, it’s also fun to have the freedom to give an assistant its own personality, something existing offerings sorely lack.

Architecture-wise, the assistant has a notebook (the single SQLite table) that gets populated with entries. These entries come from things like reading family members’ Google calendars, pulling data from a public weather API, processing delivery notices from the post office, and Telegram conversations. With a notebook of such entries (each tagged with the date on which it is expected to be relevant), generating a daily brief is simple. After all, LLMs (large language models) are amazingly good at handling and formatting natural language; that’s something even a locally-installed LLM can do with ease.
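
As a sketch of how little machinery that takes, something like the following would cover the notebook and the brief. The names here are hypothetical and the LLM call is a stub; [Geoffrey]’s actual implementation differs in the details.

```python
# A minimal sketch of the "one SQLite table plus cron jobs" idea.
# One cron job inserts entries; another builds the daily brief and
# hands it to an LLM for friendly phrasing.
import sqlite3
from datetime import date

db = sqlite3.connect("notebook.db")
db.execute("""CREATE TABLE IF NOT EXISTS notebook (
    id INTEGER PRIMARY KEY,
    relevant_on TEXT,   -- date the entry matters (ISO 8601)
    kind TEXT,          -- 'calendar', 'weather', 'reminder', ...
    content TEXT)""")

# Importers (calendar scraper, weather API, Telegram bot) all reduce
# to the same INSERT:
db.execute(
    "INSERT INTO notebook (relevant_on, kind, content) VALUES (?, ?, ?)",
    (date.today().isoformat(), "weather", "Rain after 3pm, high of 12C"))
db.commit()

def todays_entries():
    rows = db.execute(
        "SELECT kind, content FROM notebook WHERE relevant_on = ?",
        (date.today().isoformat(),))
    return [f"[{kind}] {content}" for kind, content in rows]

def llm_brief(entries):
    # Stand-in for the LLM call that turns raw entries into prose;
    # any chat-completion API, or a local model, would slot in here.
    prompt = ("You are Stevens, the family butler. Write a short, "
              "cheerful morning brief from these notes:\n"
              + "\n".join(entries))
    return prompt  # a real call would return the model's reply

print(llm_brief(todays_entries()))
```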

[Geoffrey] says that even this simple architecture is super useful, and it was not particularly hard to build. He encourages anyone who’s interested to check out his project, and see for themselves how useful even a minimally-informed assistant can be when it’s designed with one’s own needs in mind.

GLaDOS Potato Assistant

This Potato Virtual Assistant Is Fully Baked

There are a number of reasons you might want to build your own smart speaker virtual assistant. Usually, getting your weather forecast from a snarky, malicious AI potato isn’t one of them, unless you’re a huge Portal fan like [Binh Pham].

[Binh Pham] built the potato incarnation of GLaDOS from the Portal 2 video game with the help of a ReSpeaker Lite kit, an ESP32-based board designed for speech recognition and voice control, which serves as an interface to Home Assistant running on a Raspberry Pi.

He resisted the temptation to use a real potato as an enclosure, and wisely opted instead to print one from a 3D model of the original GLaDOS potato he found on Thingiverse. Providing the assistant with the iconic synthetic voice of GLaDOS was a matter of repackaging an existing voice model for use with Home Assistant.

Of course, all of this attention to detail would be for naught if you had to refer to the assistant as “Google” or “Alexa” to get its attention. A bit of custom modelling and on-device wake word detection later, and the cyborg tuber was ready to switch lights on and off with its signature sinister wit.

We’ve seen a number of projects that brought Portal objects to life for fans of the franchise to enjoy, even an assistant based on another version of the GLaDOS character. This one adds a dimension of absurdity to the collection.


The Incomplete JSON Pretty Printer (Brought To You By Vibes)

Incomplete JSON (such as from a log that terminates unexpectedly) doesn’t parse cleanly, which means anything that usually prints JSON nicely won’t. Frustration with this is what led [Simon Willison] to make The Incomplete JSON Pretty Printer, a single-purpose web tool that pretty-prints JSON regardless of whether it’s complete or not.
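
For the curious, here’s one way the trick can work, sketched in Python rather than [Simon]’s vibe coded JavaScript, and entirely our own illustration: scan the text, keep track of what’s still open, and append the missing closers before parsing as usual.

```python
# A rough sketch of pretty-printing truncated JSON: track open
# brackets and strings, append the missing closers, then parse.
import json

def pretty_print_incomplete(text):
    stack = []            # open '{' and '[' characters, in order
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append(ch)
        elif ch in "}]":
            if stack:
                stack.pop()
    repaired = text
    if in_string:
        repaired += '"'   # close a string cut off mid-word
    # Crude: a trailing partial token like 'tru' or a dangling comma
    # would still trip the parser; a real tool needs more cases.
    repaired += "".join("}" if c == "{" else "]" for c in reversed(stack))
    return json.dumps(json.loads(repaired), indent=2)

print(pretty_print_incomplete('{"log": [{"level": "warn", "msg": "disk'))
```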

Making a tool to solve a particular issue is a fantastic application of software, but in this case it is also a good lead-in to some thoughts [Simon] has to share about vibe coding. The Incomplete JSON Pretty Printer is a perfect example of vibe coding, being the product of [Simon] directing an LLM to iteratively create a tool and never once looking at the actual code.

Sometimes, however the machine decides to code something is fine.

[Simon] shares that the term “vibe coding” was first used in a social media post by [Andrej Karpathy], who we’ve seen share a “hello world” of GPT-based LLMs as well as how to train one in pure C, both of which are the product of a deep understanding of the subject (and fantastically educational), so he certainly knows how things work.

Anyway, [Andrej] had a very specific idea in mind when he described vibe coding: engaging with the tool in almost a state of flow for something like a weekend project, just iterating one’s way to the desired result without fussing over the details. Why? Because doing so is new, engaging, and fun.

Since then, vibe coding as a term seems to get used to refer to any and all AI-assisted coding, a subject on which folks have quite a few thoughts (many of which were eagerly shared on a recent Ask Hackaday on the subject).

Of course, human oversight is critical to a solid and reliable development workflow. But not all software is the same. In the case of the Incomplete JSON Pretty Printer, [Simon] really doesn’t care what the code actually looks like. He got it made in a short amount of time, the tool does exactly what he wants, and it’s hard to imagine the stakes being any lower. To [Simon], however the LLM decided to do things is fine, and there’s a place for that.

A flowchart demonstrating the exploit described.

Vibe Check: False Packages A New LLM Security Risk?

Lots of people swear by large-language model (LLM) AIs for writing code. Lots of people swear at them. Still others may be planning to exploit their peculiarities, according to [Joe Spracklen] and other researchers at UTSA. At least, the researchers have found a potential exploit in ‘vibe coding’.

Everyone who has used an LLM knows they have a propensity to “hallucinate”: that is, to go off the rails and create plausible-sounding gibberish. When you’re vibe coding, that gibberish is likely to make it into your program. Normally, that just means errors. If you are working in an environment that uses a package manager, however (like npm in Node.js, PyPI in Python, or CRAN in R), that plausible-sounding nonsense code may end up calling for a fake package.

A clever attacker might be able to determine what sort of false packages the LLM is hallucinating, and register them as a vector for malicious code. It’s more likely than you think: while CodeLlama was the worst offender, even the most accurate model tested (ChatGPT-4) still generated these false packages at a rate of over 5%. The researchers propose a number of mitigation strategies in their full paper, but this is a sobering reminder that an AI cannot take responsibility. Ultimately it is up to us, the programmers, to ensure the integrity and security of our code, and of the libraries we include in it.
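
As a back-of-the-napkin mitigation (our sketch, not one from the paper): before installing whatever an LLM-written script imports, at least confirm each package actually exists on the registry. Existence alone proves little, since squatters can and do register hallucinated names, but it catches code that references nothing real at all; a fuller defense would also weigh a package’s age and download counts.

```python
# Check whether candidate package names exist on PyPI before
# pip-installing anything an LLM suggested. Uses PyPI's JSON API.
import urllib.request
import urllib.error

def exists_on_pypi(package):
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404 means no such package is registered

# Second name is made up for illustration.
for pkg in ["requests", "totally-hallucinated-httpkit"]:
    print(pkg, "exists" if exists_on_pypi(pkg) else "NOT FOUND on PyPI")
```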

We just had a rollicking discussion of vibe coding, which some of you seemed quite taken with. Others agreed that ChatGPT is the worst summer intern ever. Love it or hate it, it’s likely this won’t be the last time we hear of security concerns brought up by this new method of programming.

Special thanks to [Wolfgang Friedrich] for sending this into our tip line.

A humanoid robot packs a lunch bag in the kitchen

Gemini 2.0 + Robotics = Slam Dunk?

Over on the Google blog, [Joel Meares] explains how Google built the new family of Gemini Robotics models.

The bi-arm ALOHA robot equipped with Gemini 2.0 software can take general instructions and then respond dynamically to its environment as it carries out its tasks. This family of robots aims to be highly dexterous, interactive, and general-purpose by taking the sort of non-task-specific training methods that have worked so well with LLMs and applying them to robot tasks.

There are two things we here at Hackaday are wondering. Is there anything a robot will never do? And just how cherry-picked are these examples in the slick video? Let us know what you think in the comments!


Ask Hackaday: Vibe Coding

Vibe coding is the buzzword of the moment. What is it? The practice of writing software by describing the problem to an AI large language model and using the code it generates. It’s not quite as simple as just letting the AI do your work for you because the developer is supposed to spend time honing and testing the result, and its proponents claim it gives a much more interactive and less tedious coding experience. Here at Hackaday, we are pleased to see the rest of the world catch up, because back in 2023, we were the first mainstream hardware hacking news website to embrace it, to deal with a breakfast-related emergency.

Jokes aside, though, the fad for vibe coding is something which should be taken seriously, because it’s seemingly being used in enough places that vibe-coded software will inevitably affect our lives. So here’s the Ask Hackaday: is this a clever and useful tool for making better software more quickly, or a dangerous tool for creating software nobody quite understands, containing bugs which could cause a disaster?

Our approach to writing software has always been one of incrementally building something from the ground up until it satisfies the need. Readers will know that feeling of being in touch with how a project works at all levels, and of having a nose for immediately diagnosing any problems that might occur. If an AI writes the code for us, the feeling is that we might lose that connection, and inevitably this will lead to less experienced coders quickly getting out of their depth. Is this pessimism, or the grizzled voice of experience? We’d love to know your views in the comments. Are our new AI overlords the new senior developers? Or are they the worst summer interns ever?