Two laptops, side by side, running Llama2 in DOS.

Will It Run Llama 2? Now DOS Can

Will a 486 run Crysis? No, of course not. Will it run a large language model (LLM)? Given the huge buildout of compute power to do just that, many people would scoff at the very notion. But [Yeo Kheng Meng] is not many people.

He has set up various DOS computers to run a stripped-down version of the Llama 2 LLM, originally from Meta. More specifically, [Yeo Kheng Meng] is implementing [Andrej Karpathy]’s Llama2.c library, which we have seen here before, running on Windows 98.

Llama2.c is a wonderful bit of programming that lets one run inference on a trained Llama 2 model in only seven hundred lines of C. It is seven hundred lines of modern C, however, so porting it to DOS 6.22 and the outdated i386 architecture took some doing. [Yeo Kheng Meng] documents that work, and benchmarks a few retrocomputers. As painful as it may be to say — yes, a 486 or a Pentium 1 can now be counted as “retro”.

The models are not large, of course, with a TinyStories-trained 260 kB model churning out a blistering 2.08 tokens per second on a generic 486 box. Newer machines can run larger models faster, of course. Ironically, a Pentium M ThinkPad T42 (was that really 21 years ago?) is able to run a larger 110 MB model faster than [Yeo Kheng Meng]’s modern Ryzen 5 desktop. Not because the Pentium M is blazingly fast, mind you, but because a memory allocation error prevented that model from running on the modern CPU. Slow and steady finishes the race, it seems.

This port will run on any 32-bit i386 hardware, which leaves the 16-bit regime as the next challenge. If one of you can get Llama 2 hosted locally on a 286 or a 68000-based machine, then we may have to stop asking “Does it run DOOM?” and start asking “Will it run an LLM?”

A flowchart demonstrating the exploit described.

Vibe Check: False Packages A New LLM Security Risk?

Lots of people swear by large-language model (LLM) AIs for writing code. Lots of people swear at them. Still others may be planning to exploit their peculiarities, according to [Joe Spracklen] and other researchers at UTSA. At least, the researchers have found a potential exploit in ‘vibe coding’.

Everyone who has used an LLM knows they have a propensity to “hallucinate” – that is, to go off the rails and create plausible-sounding gibberish. When you’re vibe coding, that gibberish is likely to make it into your program. Normally, that just means errors. If you are working in an environment that uses a package manager, however (like npm in Node.js, PyPI in Python, or CRAN in R), that plausible-sounding nonsense code may end up calling for a fake package.

A clever attacker might be able to determine what sort of false packages the LLM is hallucinating, and inject them as a vector for malicious code. It’s more likely than you think: while CodeLlama was the worst offender, the most accurate model tested (ChatGPT-4) still generated these false packages at a rate of over 5%. The researchers offer a number of mitigation strategies in their full paper, but this is a sobering reminder that an AI cannot take responsibility. Ultimately it is up to us, the programmers, to ensure the integrity and security of our code, and of the libraries we include in it.
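One low-tech mitigation is simply to check that every dependency an LLM hands you actually exists in the official registry before installing it. Below is a minimal sketch of that idea for Python projects, querying PyPI’s public JSON API for each name in a requirements file. This is an illustration of the general approach rather than anything from the researchers’ paper, and the file name and exit-code behaviour are our own assumptions.

```python
import json
import sys
import urllib.error
import urllib.request

PYPI_URL = "https://pypi.org/pypi/{name}/json"  # public PyPI metadata endpoint


def package_exists(name: str) -> bool:
    """Return True if PyPI has metadata for this package name."""
    try:
        with urllib.request.urlopen(PYPI_URL.format(name=name), timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:          # unknown package -- possibly hallucinated
            return False
        raise                        # other HTTP errors are real failures


def check_requirements(path: str = "requirements.txt") -> int:
    """Print any requirement that does not resolve to a real PyPI package."""
    missing = 0
    with open(path) as fh:
        for line in fh:
            # Strip comments and version pins; good enough for a sanity check.
            name = line.split("#")[0].split("==")[0].split(">=")[0].strip()
            if name and not package_exists(name):
                print(f"WARNING: '{name}' not found on PyPI -- check before installing")
                missing += 1
    return missing


if __name__ == "__main__":
    sys.exit(1 if check_requirements() else 0)
```

Existence alone doesn’t prove a package is safe, of course: an attacker who has already registered a hallucinated name will sail right through this check, so treat it as a first filter, not a substitute for reviewing what you install.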

We just had a rollicking discussion of vibe coding, which some of you seemed quite taken with. Others agreed that ChatGPT is the worst summer intern ever. Love it or hate it, it’s likely this won’t be the last time we hear of security concerns brought up by this new method of programming.

Special thanks to [Wolfgang Friedrich] for sending this into our tip line.

A humanoid robot packs a lunch bag in the kitchen

Gemini 2.0 + Robotics = Slam Dunk?

Over on the Google blog, [Joel Meares] explains how Google built the new family of Gemini Robotics models.

The bi-arm ALOHA robot equipped with Gemini 2.0 software can take general instructions and then respond dynamically to its environment as it carries out its tasks. This family of robots aims to be highly dexterous, interactive, and general-purpose by taking the sort of non-task-specific training methods that have worked so well with LLMs and applying them to robot tasks.

There are two things we here at Hackaday are wondering. Is there anything a robot will never do? And just how cherry-picked are these examples in the slick video? Let us know what you think in the comments!

Ask Hackaday: Vibe Coding

Vibe coding is the buzzword of the moment. What is it? The practice of writing software by describing the problem to an AI large language model and using the code it generates. It’s not quite as simple as just letting the AI do your work for you, because the developer is supposed to spend time honing and testing the result, and its proponents claim it gives a much more interactive and less tedious coding experience. Here at Hackaday, we are pleased to see the rest of the world catch up, because back in 2023, we were the first mainstream hardware hacking news website to embrace it, to deal with a breakfast-related emergency.

Jokes aside, though, the fad for vibe coding is something that should be taken seriously, because it’s seemingly being used in enough places that vibe-coded software will inevitably affect our lives. So here’s the Ask Hackaday: is this a clever and useful tool for making better software more quickly, or a dangerous tool for creating software nobody quite understands, containing bugs which could cause a disaster?

Our approach to writing software has always been one of incrementally building something from the ground up that satisfies the need. Readers will know that feeling of being in touch with how a project works at all levels, with a nose for immediately diagnosing any problems that might occur. If an AI writes the code for us, the feeling is that we might lose that connection, and inevitably this will lead to less experienced coders quickly getting out of their depth. Is this pessimism, or the grizzled voice of experience? We’d love to know your views in the comments. Are our new AI overlords the new senior developers? Or are they the worst summer interns ever?

How To Use LLMs For Programming Tasks

[Simon Willison] has put together a list of how, exactly, one goes about using large language models (LLMs) to help write code. If you have wondered just what the workflow and techniques look like, give it a read. It’s full of examples, strategies, and useful tips for effectively using AI assistants like ChatGPT, Claude, and others to do useful programming work.

It’s a very practical document, with [Simon] emphasizing realistic expectations and the importance of managing context: giving the LLM clear direction, and being mindful of how much it can fit in its ‘head’ at once. It is useful to picture an LLM as a capable and obedient but over-confident programming intern or assistant, albeit one that never gets bored or annoyed. Useful work can be done, but testing is crucial and human oversight simply cannot be automated away.
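As a rough illustration of what managing context can look like in practice (not an example from [Simon]’s write-up; he has his own llm command-line tool for this sort of thing, but any chat-style API shows the point), here’s a minimal sketch using the OpenAI Python client: the prompt bundles just the one file the model needs plus a narrow instruction, rather than dumping a whole repository into the conversation. The model name and file path are placeholders.

```python
from pathlib import Path

from openai import OpenAI  # pip install openai; any chat-completions API works similarly

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_about_file(path: str, instruction: str, model: str = "gpt-4o-mini") -> str:
    """Send one file plus a narrow instruction, instead of the whole repository."""
    source = Path(path).read_text()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are a careful code reviewer. Answer only about the code provided."},
            {"role": "user",
             "content": f"Here is {path}:\n```python\n{source}\n```\n\n{instruction}"},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # Hypothetical usage: keep the request small and specific.
    print(ask_about_file("parser.py", "Suggest a unit test for the edge cases in parse_line()."))
```

Whatever comes back still gets run against real tests before it’s trusted, of course.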

Even if one has no interest in using LLMs to help in writing production code, there’s still a lot of useful work they can do to speed up the process of software development in general, especially when learning. They can help research options, interactively explore unfamiliar codebases, or prototype ideas quickly. [Simon] provides useful strategies for all these, and more.

If you have wondered how exactly glorified chatbots can meaningfully help with software development, [Simon]’s writeup hopefully gives you some new ideas. And if this is all leaving you curious about how exactly LLMs work, in the time it takes to enjoy a warm coffee you can learn how they do what they do, no math required.

A blue-gloved hand holds a glass plate with a small off-white rectangular prism approximately one quarter the area of a fingernail in cross-section.

AI Helps Researchers Discover New Structural Materials

Nanostructured metamaterials have shown a lot of promise in the lab, but fatal stress concentrations often limit their applications. Researchers have now found a strong, lightweight nanostructured carbon. [via BGR]

Using a multi-objective Bayesian optimization (MBO) algorithm trained on finite element analysis (FEA) datasets to identify the best candidate nanostructures, the researchers then brought the theoretical material to life with two-photon polymerization (2PP) photolithography. The resulting “carbon nanolattices achieve the compressive strength of carbon steels (180–360 MPa) with the density of Styrofoam (125–215 kg m⁻³) which exceeds the specific strengths of equivalent low-density materials by over an order of magnitude.”
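For flavour, here’s a heavily simplified sketch of what a surrogate-assisted search of that kind might look like, assuming you already have a table of FEA results (lattice parameters in, strength and density out). It fits Gaussian-process surrogates to the simulated data and ranks untried designs by predicted specific strength; the real work uses proper multi-objective Bayesian optimization with acquisition functions, and everything below, data included, is made up for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Pretend FEA dataset: each row is a lattice design (strut diameter, cell size, strut angle),
# with a simulated compressive strength (MPa) and density (kg/m^3). Entirely synthetic numbers.
X_sim = rng.uniform([0.1, 1.0, 20.0], [1.0, 5.0, 70.0], size=(40, 3))
strength = 50 + 300 * X_sim[:, 0] - 20 * X_sim[:, 1] + rng.normal(0, 5, 40)
density = 100 + 150 * X_sim[:, 0] + rng.normal(0, 3, 40)

# One surrogate model per objective, trained on the "FEA" results.
gp_strength = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_sim, strength)
gp_density = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_sim, density)

# Score a large pool of untried designs by predicted specific strength (strength / density),
# a crude scalarization standing in for a real multi-objective acquisition function.
candidates = rng.uniform([0.1, 1.0, 20.0], [1.0, 5.0, 70.0], size=(5000, 3))
score = gp_strength.predict(candidates) / gp_density.predict(candidates)

# The most promising designs go back to FEA (or the printer) for verification.
best = candidates[np.argsort(score)[-5:]]
print("Top candidate designs:\n", best)
```

In the actual workflow the winners would be re-simulated (or printed via 2PP) and fed back into the training set, closing the optimization loop.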

While you probably shouldn’t start getting investors for your space elevator startup just yet, lighter materials like this are promising for a lot of applications, most notably more conventional aviation, where fuel (or energy) prices are a big constraint on operations. As with any lab results, more work is needed before we see this in the real world, but it is nice to know that superalloys and composites aren’t the end of the road for strong and lightweight materials.

We’ve seen AI help identify battery materials already and this seems to be one avenue where generative AI isn’t just about making embarrassing photos or making us less intelligent.

Hackaday Links: February 23, 2025

Ho-hum — another week, another high-profile bricking. In a move anyone could see coming, Humane has announced that their pricey AI Pin widgets will cease to work in any meaningful way as of noon on February 28. The company made a splash when it launched its wearable assistant in April of 2024, and from an engineering point of view, it was pretty cool. Meant to be worn on one’s shirt, it had a little bit of a Star Trek: The Next Generation comm badge vibe as the primary UI was accessed through tapping the front of the thing. It also had a display that projected information onto your hand, plus the usual array of sensors and cameras which no doubt provided a rich stream of user data. Somehow, though, Humane wasn’t able to make the numbers work out, and as a result they’ll be shutting down their servers at the end of the month, with refunds offered only to users who bought their AI Pins in the last 90 days.
