New Open Source DeepSeek V3 Language Model Making Waves

In the world of large language models (LLMs) there tend to be relatively few upsets ever since OpenAI barged onto the scene with its transformer-based GPT models a few years ago, yet now it seems that Chinese company DeepSeek has upended the status quo. Its new DeepSeek-V3 model is not only open source, it also claims to have been trained for only a fraction of the effort required by competing models, while performing significantly better.

The full training of DeepSeek-V3’s 671B parameters is claimed to have only taken 2.788 M hours on NVidia H800 (Hopper-based) GPUs, which is almost a factor of ten less than others. Naturally this has the LLM industry somewhat up in a mild panic, but for those who are not investors in LLM companies or NVidia can partake in this new OSS model that has been released under the MIT license, along with the DeepSeek-R1 reasoning model.

Both of these models can be run locally, using both AMD and NVidia GPUs, as well as using the online APIs. If these models do indeed perform as efficiently as claimed, they stand to massively reduce the hardware and power required to not only train but also query LLMs.

Prompt Injection Tricks AI Into Downloading And Executing Malware

[wunderwuzzi] demonstrates a proof of concept in which a service that enables an AI to control a virtual computer (in this case, Anthropic’s Claude Computer Use) is made to download and execute a piece of malware that successfully connects to a command and control (C2) server. [wonderwuzzi] makes the reasonable case that such a system has therefore become a “ZombAI”. Here’s how it worked.

Referring to the malware as a “support tool” and embedding instructions into the body of the web page is what got the binary downloaded and executed, compromising the system.

After setting up a web page with a download link to the malicious binary, [wunderwuzzi] attempts to get Claude to download and run the malware. At first, Claude doesn’t bite. But that all changes when the content of the HTML page gets rewritten with instructions to download and execute the “Support Tool”. That new content gets interpreted as orders to follow; being essentially a form of prompt injection.

Claude dutifully downloads the malicious binary, then autonomously (and cleverly) locates the downloaded file and even uses chmod to make it executable before running it. The result? A compromised machine.

Now, just to be clear, Claude Computer Use is experimental and this sort of risk is absolutely and explicitly called out in Anthropic’s documentation. But what’s interesting here is that the methods used to convince Claude to compromise the system it’s using are essentially the same one might take to convince a person. Make something nefarious look innocent, and obfuscate the true source (and intent) of the directions. Watch it in action from beginning to end in a video, embedded just under the page break.

Continue reading “Prompt Injection Tricks AI Into Downloading And Executing Malware”

Preventing AI Plagiarism With .ASS Subtitling

Around two years ago, the world was inundated with news about how generative AI or large language models would revolutionize the world. At the time it was easy to get caught up in the hype, but in the intervening months these tools have done little in the way of productive work outside of a few edge cases, and mostly serve to burn tons of cash while turning the Internet into even more of a desolate wasteland than it was before. They do this largely by regurgitating human creations like text, audio, and video into inferior simulacrums and, if you still want to exist on the Internet, there’s basically nothing you can do to prevent this sort of plagiarism. Except feed the AI models garbage data like this YouTuber has started doing.

At least as far as YouTube is concerned, the worst offenders of AI plagiarism work by downloading the video’s subtitles, passing them through some sort of AI model, and then generating another YouTube video based off of the original creator’s work. Most subtitle files are the fairly straightfoward .srt filetype which only allows for timing and text information. But a more obscure subtitle filetype known as Advanced SubStation Alpha, or .ass, allows for all kinds of subtitle customization like orientation, formatting, font types, colors, shadowing, and many others. YouTuber [f4mi] realized that using this subtitle system, extra garbage text could be placed in the subtitle filetype but set out of view of the video itself, either by placing the text outside the viewable area or increasing its transparency. So now when an AI crawler downloads the subtitle file it can’t distinguish real subtitles from the garbage placed into it.

[f4mi] created a few scripts to do this automatically so that it doesn’t have to be done by hand for each one. It also doesn’t impact the actual subtitles on the screen for people who need them for accessibility reasons. It’s a great way to “poison” AI models and make it at least harder for them to rip off the creations of original artists, and [f4mi]’s tests show that it does work. We’ve actually seen a similar method for poisoning data sets used for emails long ago, back when we were all collectively much more concerned about groups like the NSA using automated snooping tools in our emails than we were that machines were going to steal our creative endeavors.

Thanks to [www2] for the tip!

Continue reading “Preventing AI Plagiarism With .ASS Subtitling”

AI Mistakes Are Different, And That’s A Problem

People have been making mistakes — roughly the same ones — since forever, and we’ve spent about the same amount of time learning to detect and mitigate them. Artificial Intelligence (AI) systems make mistakes too, but [Bruce Schneier] and [Nathan E. Sanders] make the observation that, compared to humans, AI models make entirely different kinds of mistakes. We are perhaps less equipped to handle this unusual problem than we realize.

The basic idea is this: as humans we have tremendous experience making mistakes, and this has also given us a pretty good idea of what to expect our mistakes to look like, and how to deal with them. Humans tend to make mistakes at the edges of our knowledge, our mistakes tend to clump around the same things, we make more of them when bored or tired, and so on. We have as a result developed controls and systems of checks and balances to help reduce the frequency and limit the harm of our mistakes. But these controls don’t carry over to AI systems, because AI mistakes are pretty strange.

The mistakes of AI models (particularly Large Language Models) happen seemingly randomly and aren’t limited to particular topics or areas of knowledge. Models may unpredictably appear to lack common sense. As [Bruce] puts it, “A model might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats.” A slight re-wording of a question might be all it takes for a model to suddenly be confidently and utterly wrong about something it just a moment ago seemed to grasp completely. And speaking of confidence, AI mistakes aren’t accompanied by uncertainty. Of course humans are no strangers to being confidently wrong, but as a whole the sort of mistakes AI systems make aren’t the same kinds of mistakes we’re used to.

There are different ideas on how to deal with this, some of which researchers are (ahem) confidently undertaking. But for best results, we’ll need to invent new ways as well. The essay also appeared in IEEE Spectrum and isn’t terribly long, so take a few minutes to check it out and get some food for thought.

And remember, if preventing mistakes at all costs is the goal, that problem is already solved: GOODY-2 is undeniably the world’s safest AI.

Hacked teddybear on a desk

Turning GLaDOS Into Ted: A Tale Of A Talking Toy

What if your old, neglected toys could come to life — with a bit of sass? That’s exactly what [Binh] achieved when he transformed his sister’s worn-out teddy bear into ‘Ted’, an interactive talking plush with a personality of its own. This project, which combines the GLaDOS Personality Core project from the Portal series with clever microcontroller tinkering, brings a whole new personality to a childhood favorite.

[Binh] started with the basics: a teddy bear already equipped with buttons and speakers, which he overhauled with an ESP32 microcontroller. The bear’s personality originated from GLaDOS, but was rewritten by [Binh] to fit a cheeky, teddy-bear tone. With a few tweaks in the Python-based fork, [Binh] created threads to handle touch-based interaction. For example, the ESP32 detects where the bear is touched and sends this input to a modified neural network, which then generates a response. The bear can, for instance, call you out for holding his paw for too long or sarcastically plead for mercy. I hear you say ‘but that bear Ted could do a lot more!’ Well — maybe, all this is just what an innocent bear with a personality should be capable of.

Instead, let us imagine future iterations featuring capacitive touch sensors or accelerometers to detect movement. The project is simple, but showcases the potential for intelligent plush toys. It might raise some questions, too.

Continue reading “Turning GLaDOS Into Ted: A Tale Of A Talking Toy”

Modern AI On Vintage Hardware: LLama 2 Runs On Windows 98

[EXO Labs] demonstrated something pretty striking: a modified version of Llama 2 (a large language model) that runs on Windows 98. Why? Because when it comes to personal computing, if something can run on Windows 98, it can run on anything. More to the point: if something can run on Windows 98 then it’s something no tech company can control how you use, no matter how large or influential they may be. More on that in a minute.

Ever wanted to run a local LLM on 25 year old hardware? No? Well now you can, and at a respectable speed, too!

What’s it like to run an LLM on Windows 98? Aside from the struggles of things like finding compatible peripherals (back to PS/2 hardware!) and transferring the required files (FTP over Ethernet to the rescue) or even compilation (some porting required), it works maybe better than one might expect.

A Windows 98 machine with Pentium II processor and 128 MB of RAM generates a speedy 39.31 tokens per second with a 260K parameter Llama 2 model. A much larger 15M model generates 1.03 tokens per second. Slow, but it works. Going even larger will also work, just ever slower. There’s a video on X that shows it all in action.

It’s true that modern LLMs have billions of parameters so these models are tiny in comparison. But that doesn’t mean they can’t be useful. Models can be shockingly small and still be perfectly coherent and deliver surprisingly strong performance if their training and “job” is narrow enough, and the tools to do that for oneself are all on GitHub.

This is a good time to mention that this particular project (and its ongoing efforts) are part of a set of twelve projects by EXO Labs focusing on ensuring things like AI models can be run anywhere, by anyone, independent of tech giants aiming to hold all the strings.

And hey, if local AI and the command line is something that’s up your alley, did you know they already exist as single-file, multi-platform, command-line executables?

Running AI Locally Without Spending All Day On Setup

There are many AI models out there that you can play with from companies like OpenAI, Google, and a host of others. But when you use them, you get the experience they want, and you run it on their computer. There are a variety of reasons you might not like this. You may not want your data or ideas sent through someone else’s computer. Maybe you want to tune and tweak in ways they aren’t going to let you.

There are many more or less open models, but setting up to run them can be quite a chore and — unless you are very patient — require a substantial-sized video card to use as a vector processor. There’s very little help for the last problem. You can farm out processing, but then you might as well use a hosted chatbot. But there are some very easy ways to load and run many AI models on Windows, Linux, or a Mac. One of the easiest we’ve found is Msty. The program is free for personal use and claims to be private, although if you are really paranoid, you’ll want to verify that yourself.

What is Msty?

Talkin’ about Hackaday!

Msty is a desktop application that lets you do several things. First, it can let you chat with an AI engine either locally or remotely. It knows about many popular options and can take your keys for paid services. For local options, it can download, install, and run the engines of your choice.

For services or engines that it doesn’t know about, you can do your own setup, which ranges from easy to moderately difficult, depending on what you are trying to do.

Of course, if you have a local model or even most remote ones, you can use Python or some basic interface (e.g., with ollama; there are plenty of examples). However, Msty lets you have a much richer experience. You can attach files, for example. You can export the results and look back at previous chats. If you don’t want them remembered, you can chat in “vapor” mode or delete them later.

Each chat lives in a folder, which can have helpful prompts to kick off the chat. So, a folder might say, “You are an 8th grade math teacher…” or whatever other instructions you want to load before engaging in chat.

Continue reading “Running AI Locally Without Spending All Day On Setup”