A Great Use For AI: Wasting Scammers Time!

February 6, 2025 by Al Williams 62 Comments

We may have found the killer app for AI. Well, actually, British telecom provider O2 has. As The Guardian reports, they have an AI chatbot that acts like a 78-year-old grandmother and receives phone calls. Of course, since the grandmother—Daisy, by name—doesn’t get any real phone calls, anyone calling that number is probably a scammer. Daisy’s specialty? Keeping them tied up on the phone.

While this might just seem like a prank for revenge, it is actually more than that. Scamming people is a numbers game. Most people won’t bite. So, to be successful, scammers have to make lots of calls. Daisy can keep one tied up for around 40 minutes or more.

Continue reading “A Great Use For AI: Wasting Scammers Time!” →

More Details On Why DeepSeek Is A Big Deal

February 3, 2025 by Donald Papp 48 Comments

The DeepSeek large language models (LLM) have been making headlines lately, and for more than one reason. IEEE Spectrum has an article that sums everything up very nicely.

We shared the way DeepSeek made a splash when it came onto the AI scene not long ago, and this is a good opportunity to go into a few more details of why this has been such a big deal.

For one thing, DeepSeek (there’s actually two flavors, -V3 and -R1, more on them in a moment) punches well above its weight. DeepSeek is the product of an innovative development process, and freely available to use or modify. It is also indirectly highlighting the way companies in this space like to label their LLM offerings as “open” or “free”, but stop well short of actually making them open source.

The DeepSeek-V3 LLM was developed in China and reportedly cost less than 6 million USD to train. This was possible thanks to developing DualPipe, a highly optimized and scalable method of training the system despite limitations due to export restrictions on Nvidia hardware. Details are in the technical paper for DeepSeek-V3.

There’s also DeepSeek-R1, a chain-of-thought “reasoning” model which handily provides its thought process enclosed within easily-parsed <think> and </think> pseudo-tags that are included in its responses. A model like this takes an iterative step-by-step approach to formulating responses, and benefits from prompts that provide a clear goal the LLM can aim for. The way DeepSeek-R1 was created was itself novel. Its training started with supervised fine-tuning (SFT) which is a human-led, intensive process as a “cold start” which eventually handed off to a more automated reinforcement learning (RL) process with a rules-based reward system. The result avoided problems that come from relying too much on RL, while minimizing the human effort of SFT. Technical details on the process of training DeepSeek-R1 are here.

DeepSeek-V3 and -R1 are freely available in the sense that one can access the full-powered models online or via an app, or download distilled models for local use on more limited hardware. It is free and open as in accessible, but not open source because not everything needed to replicate the work is actually released. Like with most LLMs, the training data and actual training code used are not available.

What is released and making waves of its own are the technical details of how researchers produced what they did, and that means there are efforts to try to make an actually open source version. Keep an eye out for Open-R1!

New Open Source DeepSeek V3 Language Model Making Waves

January 27, 2025 by Maya Posch 82 Comments

In the world of large language models (LLMs) there tend to be relatively few upsets ever since OpenAI barged onto the scene with its transformer-based GPT models a few years ago, yet now it seems that Chinese company DeepSeek has upended the status quo. Its new DeepSeek-V3 model is not only open source, it also claims to have been trained for only a fraction of the effort required by competing models, while performing significantly better.

The full training of DeepSeek-V3’s 671B parameters is claimed to have only taken 2.788 M hours on NVidia H800 (Hopper-based) GPUs, which is almost a factor of ten less than others. Naturally this has the LLM industry somewhat up in a mild panic, but for those who are not investors in LLM companies or NVidia can partake in this new OSS model that has been released under the MIT license, along with the DeepSeek-R1 reasoning model.

Both of these models can be run locally, using both AMD and NVidia GPUs, as well as using the online APIs. If these models do indeed perform as efficiently as claimed, they stand to massively reduce the hardware and power required to not only train but also query LLMs.

Preventing AI Plagiarism With .ASS Subtitling

January 25, 2025 by Bryan Cockfield 57 Comments

Around two years ago, the world was inundated with news about how generative AI or large language models would revolutionize the world. At the time it was easy to get caught up in the hype, but in the intervening months these tools have done little in the way of productive work outside of a few edge cases, and mostly serve to burn tons of cash while turning the Internet into even more of a desolate wasteland than it was before. They do this largely by regurgitating human creations like text, audio, and video into inferior simulacrums and, if you still want to exist on the Internet, there’s basically nothing you can do to prevent this sort of plagiarism. Except feed the AI models garbage data like this YouTuber has started doing.

At least as far as YouTube is concerned, the worst offenders of AI plagiarism work by downloading the video’s subtitles, passing them through some sort of AI model, and then generating another YouTube video based off of the original creator’s work. Most subtitle files are the fairly straightfoward .srt filetype which only allows for timing and text information. But a more obscure subtitle filetype known as Advanced SubStation Alpha, or .ass, allows for all kinds of subtitle customization like orientation, formatting, font types, colors, shadowing, and many others. YouTuber [f4mi] realized that using this subtitle system, extra garbage text could be placed in the subtitle filetype but set out of view of the video itself, either by placing the text outside the viewable area or increasing its transparency. So now when an AI crawler downloads the subtitle file it can’t distinguish real subtitles from the garbage placed into it.

[f4mi] created a few scripts to do this automatically so that it doesn’t have to be done by hand for each one. It also doesn’t impact the actual subtitles on the screen for people who need them for accessibility reasons. It’s a great way to “poison” AI models and make it at least harder for them to rip off the creations of original artists, and [f4mi]’s tests show that it does work. We’ve actually seen a similar method for poisoning data sets used for emails long ago, back when we were all collectively much more concerned about groups like the NSA using automated snooping tools in our emails than we were that machines were going to steal our creative endeavors.

Thanks to [www2] for the tip!

Continue reading “Preventing AI Plagiarism With .ASS Subtitling” →

AI Mistakes Are Different, And That’s A Problem

January 25, 2025 by Donald Papp 59 Comments

People have been making mistakes — roughly the same ones — since forever, and we’ve spent about the same amount of time learning to detect and mitigate them. Artificial Intelligence (AI) systems make mistakes too, but [Bruce Schneier] and [Nathan E. Sanders] make the observation that, compared to humans, AI models make entirely different kinds of mistakes. We are perhaps less equipped to handle this unusual problem than we realize.

The basic idea is this: as humans we have tremendous experience making mistakes, and this has also given us a pretty good idea of what to expect our mistakes to look like, and how to deal with them. Humans tend to make mistakes at the edges of our knowledge, our mistakes tend to clump around the same things, we make more of them when bored or tired, and so on. We have as a result developed controls and systems of checks and balances to help reduce the frequency and limit the harm of our mistakes. But these controls don’t carry over to AI systems, because AI mistakes are pretty strange.

The mistakes of AI models (particularly Large Language Models) happen seemingly randomly and aren’t limited to particular topics or areas of knowledge. Models may unpredictably appear to lack common sense. As [Bruce] puts it, “A model might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats.” A slight re-wording of a question might be all it takes for a model to suddenly be confidently and utterly wrong about something it just a moment ago seemed to grasp completely. And speaking of confidence, AI mistakes aren’t accompanied by uncertainty. Of course humans are no strangers to being confidently wrong, but as a whole the sort of mistakes AI systems make aren’t the same kinds of mistakes we’re used to.

There are different ideas on how to deal with this, some of which researchers are (ahem) confidently undertaking. But for best results, we’ll need to invent new ways as well. The essay also appeared in IEEE Spectrum and isn’t terribly long, so take a few minutes to check it out and get some food for thought.

And remember, if preventing mistakes at all costs is the goal, that problem is already solved: GOODY-2 is undeniably the world’s safest AI.

Turning GLaDOS Into Ted: A Tale Of A Talking Toy

January 14, 2025 by Heidi Ulrich 7 Comments

What if your old, neglected toys could come to life — with a bit of sass? That’s exactly what [Binh] achieved when he transformed his sister’s worn-out teddy bear into ‘Ted’, an interactive talking plush with a personality of its own. This project, which combines the GLaDOS Personality Core project from the Portal series with clever microcontroller tinkering, brings a whole new personality to a childhood favorite.

[Binh] started with the basics: a teddy bear already equipped with buttons and speakers, which he overhauled with an ESP32 microcontroller. The bear’s personality originated from GLaDOS, but was rewritten by [Binh] to fit a cheeky, teddy-bear tone. With a few tweaks in the Python-based fork, [Binh] created threads to handle touch-based interaction. For example, the ESP32 detects where the bear is touched and sends this input to a modified neural network, which then generates a response. The bear can, for instance, call you out for holding his paw for too long or sarcastically plead for mercy. I hear you say ‘but that bear Ted could do a lot more!’ Well — maybe, all this is just what an innocent bear with a personality should be capable of.

Instead, let us imagine future iterations featuring capacitive touch sensors or accelerometers to detect movement. The project is simple, but showcases the potential for intelligent plush toys. It might raise some questions, too.

Continue reading “Turning GLaDOS Into Ted: A Tale Of A Talking Toy” →

Modern AI On Vintage Hardware: LLama 2 Runs On Windows 98

January 13, 2025 by Donald Papp 27 Comments

[EXO Labs] demonstrated something pretty striking: a modified version of Llama 2 (a large language model) that runs on Windows 98. Why? Because when it comes to personal computing, if something can run on Windows 98, it can run on anything. More to the point: if something can run on Windows 98 then it’s something no tech company can control how you use, no matter how large or influential they may be. More on that in a minute.

Ever wanted to run a local LLM on 25 year old hardware? No? Well now you can, and at a respectable speed, too!

What’s it like to run an LLM on Windows 98? Aside from the struggles of things like finding compatible peripherals (back to PS/2 hardware!) and transferring the required files (FTP over Ethernet to the rescue) or even compilation (some porting required), it works maybe better than one might expect.

A Windows 98 machine with Pentium II processor and 128 MB of RAM generates a speedy 39.31 tokens per second with a 260K parameter Llama 2 model. A much larger 15M model generates 1.03 tokens per second. Slow, but it works. Going even larger will also work, just ever slower. There’s a video on X that shows it all in action.

It’s true that modern LLMs have billions of parameters so these models are tiny in comparison. But that doesn’t mean they can’t be useful. Models can be shockingly small and still be perfectly coherent and deliver surprisingly strong performance if their training and “job” is narrow enough, and the tools to do that for oneself are all on GitHub.

This is a good time to mention that this particular project (and its ongoing efforts) are part of a set of twelve projects by EXO Labs focusing on ensuring things like AI models can be run anywhere, by anyone, independent of tech giants aiming to hold all the strings.

And hey, if local AI and the command line is something that’s up your alley, did you know they already exist as single-file, multi-platform, command-line executables?