Why GitHub Copilot Isn’t Your Coding Partner

July 4, 2025 by Maya Posch 65 Comments

These days ‘AI’ is everywhere, including in software development. Coming hot on the heels of approaches like eXtreme Programming and Pair Programming, there’s now a new kind of pair programming in town in the form of an LLM that’s been digesting millions of lines of code. Purportedly designed to help developers program faster and more efficiently, these ‘AI programming assistants’ have primarily led to heated debate and some interesting studies.

In the case of [Jj], their undiluted feelings towards programming assistants like GitHub Copilot burn as brightly as the fire of a thousand Suns, and not a happy kind of fire.

Whether it’s Copilot or ChatGPT or some other chatbot that may or may not be integrated into your IDE, the frustration with what often feels like StackOverflow-powered-autocomplete is something that many of us can likely sympathize with. Although [Jj] lists a few positives of using an LLM trained on codebases and documentation, their overall view is that using Copilot degrades a programmer, mostly because of how it takes critical thinking skills out of the loop.

Regardless of whether you agree with [Jj] or not, the research so far on using LLMs with software development and other tasks strongly suggests that they’re not a net positive for one’s mental faculties. It’s also important to note that at the end of the day it’s still you, the fleshy bag of mostly salty water, who has to justify the code during code review and when something catches on fire in production. Your ‘copilot’ meanwhile gets off easy.

USB Stick Hides Large Language Model

February 17, 2025 by Bryan Cockfield 27 Comments

Large language models (LLMs) are all the rage in the generative AI world these days, with the truly large ones like GPT, LLaMA, and others using tens or even hundreds of billions of parameters to churn out their text-based responses. These typically require glacier-melting amounts of computing hardware, but the “large” in “large language models” doesn’t really need to be that big for there to be a functional, useful model. LLMs designed for limited hardware or consumer-grade PCs are available now as well, but [Binh] wanted something even smaller and more portable, so he put an LLM on a USB stick.

This USB stick isn’t just a jump drive with a bit of memory on it, though. Inside the custom 3D printed case is a Raspberry Pi Zero W running llama.cpp, a lightweight, high-performance version of LLaMA. Getting it on this Pi wasn’t straightforward at all, though, as the latest version of llama.cpp is meant for ARMv8 and this particular Pi was running the ARMv6 instruction set. That meant that [Binh] needed to change the source code to remove the optimizations for the more modern ARM machines, but with a week’s worth of effort spent on it he finally got the model on the older Raspberry Pi.

Getting the model to run was just one part of this project. The rest of the build was ensuring that the LLM could run on any computer without drivers and be relatively simple to use. By setting up the USB device as a composite device which presents a filesystem to the host computer, all a user has to do to interact with the LLM is to create an empty text file with a filename, and the LLM will automatically fill the file with generated text. While it’s not blindingly fast, [Binh] believes this is the first plug-and-play USB-based LLM, and we’d have to agree. It’s not the least powerful computer to ever run an LLM, though. That honor goes to this project which is able to cram one on an ESP32.

Continue reading “USB Stick Hides Large Language Model” →

Examining The Vulnerability Of Large Language Models To Data-Poisoning

February 3, 2025 by Maya Posch 23 Comments

Large language models (LLMs) are wholly dependent on the quality of the input data with which these models are trained. While suggestions that people eat rocks are funny to you and me, in the case of LLMs intended to help out medical professionals, any false claims or statements dripping out of such an LLM can have dire consequences, ranging from incorrect diagnoses to much worse. In a recent study published in Nature Medicine by [Daniel Alexander Alber] et al. the ease with which this data poisoning can occur is demonstrated.

According to their findings, only 0.001% of training tokens have to be replaced with medical misinformation to order to create models that are likely to produce medically erroneous statement. Most concerning is that such a corrupted model isn’t readily discovered using standard medical LLM benchmarks. There are filters for erroneous content, but these tend to be limited in scope due to the overhead. Post-training adjustments can be made, as can the addition of RAG, but none of this helps with the confident bull excrement due to corruption.

The mitigation approach that the researchers developed cross-references LLM output against biomedical knowledge graphs, to reduce the LLM mostly for generating natural language. In this approach LLM outputs are matched against the graphs and if LLM ‘facts’ cannot be verified, it’s marked as potential misinformation. In a test with 1,000 random passages detected issues with a claimed effectiveness of 91.9%.

Naturally, this does not guarantee that misinformation does not make it past these knowledge graphs, and largely leaves the original problem with LLMs in place, namely that their outputs can never be fully trusted. This study also makes it abundantly clear how easy it is to corrupt an LLM via the input training data, as well as underlining the broader problem that AI is making mistakes that we don’t expect.

New Open Source DeepSeek V3 Language Model Making Waves

January 27, 2025 by Maya Posch 82 Comments

In the world of large language models (LLMs) there tend to be relatively few upsets ever since OpenAI barged onto the scene with its transformer-based GPT models a few years ago, yet now it seems that Chinese company DeepSeek has upended the status quo. Its new DeepSeek-V3 model is not only open source, it also claims to have been trained for only a fraction of the effort required by competing models, while performing significantly better.

The full training of DeepSeek-V3’s 671B parameters is claimed to have only taken 2.788 M hours on NVidia H800 (Hopper-based) GPUs, which is almost a factor of ten less than others. Naturally this has the LLM industry somewhat up in a mild panic, but for those who are not investors in LLM companies or NVidia can partake in this new OSS model that has been released under the MIT license, along with the DeepSeek-R1 reasoning model.

Both of these models can be run locally, using both AMD and NVidia GPUs, as well as using the online APIs. If these models do indeed perform as efficiently as claimed, they stand to massively reduce the hardware and power required to not only train but also query LLMs.

Trap Naughty Web Crawlers In Digestive Juices With Nepenthes

January 23, 2025 by Maya Posch 44 Comments

In the olden days of the WWW you could just put a robots.txt file in the root of your website and crawling bots from search engines and kin would (generally) respect the rules in it. These days, however, we have especially web crawlers from large language model (LLM) companies happily ignoring such signs on the lawn before proceeding to hover up every scrap of content on websites. Naturally this makes a lot of people very angry, but what can you do about it? The answer by [Aaron B] is Nepenthes, described on the project page as a ‘tar pit for catching web crawlers’.

More commonly known as ‘pitcher plants’, nepenthes is a genus of carnivorous plants that use a fluid-filled cup to trap insects and small critters unfortunate enough to slip & slide down into it. In the case of this Lua-based project the idea is roughly the same. Configured as a trap behind a web server (e.g. /nepenthes), any web crawler that accesses it will be presented with an endless number of (randomly generated) pages with many URLs to follow. Page generating is deliberately quite slow to not soak up significant CPU time, while still giving the LLM scrapers plenty of random nonsense to chew on.

Considering that these web crawlers deemed adhering to the friendly sign on the lawn beneath them, the least we can do in response, is to hasten model collapse by feeding these LLM scrapers whatever rolls out of a simple (optionally Markov-based) text generator.

A Robot Meant For Humans

November 26, 2024 by Bryan Cockfield 11 Comments

Although humanity was hoping for a more optimistic robotic future in the post-war era, with media reflecting that sentiment like The Jetsons or Lost in Space, we seem to have shifted our collective consciousness (for good reasons) to a more Black Mirror/Terminator future as real-world companies like Boston Dynamics are actually building these styles of machines instead of helpful Rosies. But this future isn’t guaranteed, and a PhD researcher is hoping to claim back a more hopeful outlook with a robot called Blossom which is specifically built to investigate how humans interact with robots.

For a platform this robot is not too complex, consisting of an accessible frame that can be laser-cut from wood with only a few moving parts controlled by servos. The robot is not too large, either, and can be set on a desk to be used as a telepresence robot. But Blossom’s creator [Michael] wanted this to help understand how humans interact with robots so the latest version is outfitted not only with a large language model with text-to-speech capabilities, but also with a compelling backstory, lore, and a voice derived from Animal Crossing that’s neither human nor recognizable synthetic robot, all in an effort to make the device more approachable.

To that end, [Michael] set the robot up at a Maker Faire to see what sorts of interactions Blossom would have with passers by, and while most were interested in the web-based control system for the robot a few others came by and had conversations with it. It’s certainly an interesting project and reminds us a bit of this other piece of research from MIT that looked at how humans and robots can work productively alongside one another.

Using AI To Help With Assembly

November 7, 2024 by Bryan Cockfield 30 Comments

Although generative AI and large language models have been pushed as direct replacements for certain kinds of workers, plenty of businesses actually doing this have found that using this new technology can cause more problems than it solves when it is given free reign over tasks. While this might not be true indefinitely, the real use case for these tools right now is as a kind of assistant to certain kinds of work. For this they can be incredibly powerful as [Ricardo] demonstrates here, using Amazon Q to help with game development on the Commodore 64.

The first step here was to generate code that would show a sprite moving across the screen. The AI first generated code in all caps, as was the style at the time of the C64, but in [Ricardo]’s development environment this caused some major problems, so the code was converted to lowercase. A more impressive conversion was done in the next steps, as the program needed to take advantage of the optimizations found in the Assembly language. With the code converted to 6502 Assembly that can run on the virtual Commodore, [Ricardo] was eventually able to show four sprites moving across the screen after several iterations with the AI, as well as change the style of the sprites to arbitrary designs.

Although the post is a bit over-optimistic on Amazon Q as a tool specifically for developers, it might have some benefits over other generative AIs especially if it’s capable at the chore of programming in Assembly language. We’d love to hear anyone with real-world experience with this and whether it is truly worth the extra cost over something like Copilot or GPT 4. For any of these generative AI models, though, it’s probably worth trying them out while they’re in their early stages. Keep in mind that there’s a lot more than programming that can be done with some of them as well.

Hackaday

large language model

26 Articles

Why GitHub Copilot Isn’t Your Coding Partner

USB Stick Hides Large Language Model

Examining The Vulnerability Of Large Language Models To Data-Poisoning

New Open Source DeepSeek V3 Language Model Making Waves

Trap Naughty Web Crawlers In Digestive Juices With Nepenthes

A Robot Meant For Humans

Using AI To Help With Assembly

Search

Never miss a hack

If you missed it

Back To The Future, 40 Years Old, Looks Like The Past

Why The Latest Linux Kernel Won’t Run On Your 486 And 586 Anymore

One Laptop Manufacturer Had To Stop Janet Jackson Crashing Laptops

The 2025 Iberian Peninsula Blackout: From Solar Wobbles To Cascade Failures

Field Guide To The North American Weigh Station

Our Columns

Hackaday Podcast Episode 327: A Ploopy Knob, Rube-Goldberg Book Scanner, Hard Drives And Power Grids Oscillating Out Of Control

Last Chance: 2025 Hackaday Supercon Still Wants You!

FLOSS Weekly Episode 839: I Want To Get Paid Twice

South Korea Brought High-Rise Fire Escape Solutions To The Masses

C++ Encounters Of The Rusty Zig Kind

Search

Never miss a hack

Subscribe

If you missed it

Our Columns