Australian Library Uses Chatbot To Imitate Veteran With Predictable Results

The educational sector is usually the first to decry large language models and AI, due to worries about cheating. The State Library of Queensland, however, has embraced the technology in controversial fashion. In the lead-up to Anzac Day, the primarily Australian war memorial holiday, the library released a chatbot intended to imitate a World War One veteran. It went as well as you’d expect.

The highlighted line was apparently added to the chatbot’s instructions later on to help shut down tomfoolery.

Twitter users immediately chimed in with dismay at the very concept. Others showed how easy it was to “jailbreak” the AI, convincing Charlie he was actually supposed to teach Python, imitate Frasier Crane, or explain laws like Elle from Legally Blonde. One person figured out how to get Charlie to spit out his initial instructions; these were patched later in the day to try and stop some of the shenanigans.

From those instructions, it’s clear that this was supposed to be educational, rather than some sort of macabre experiment. However, Charlie didn’t do a great job here, either. As with any Large Language Model, Charlie had no sense of objective truth. He routinely spat out incorrect facts regarding the war, and regularly contradicted himself.

Generally, any plan that includes the words “impersonate a veteran” is a foolhardy one at best. Throwing a machine-generated portrait and a largely uncontrolled AI into the mix didn’t help things. Regardless, the State Library has left the “Virtual Veterans” experience up at the time of writing.

The problem with AI is that it’s not a magic box that gets things right all the time. It never has been. As long as organizations keep putting AI to use in ways like this, the same story will keep playing out.

Dump A Code Repository As A Text File, For Easier Sharing With Chatbots

Some LLMs (Large Language Models) can act as useful programming assistants when provided with a project’s source code, but experimenting with this can get a little tricky if the chatbot has no way to download from the internet. In such cases, the code must be provided by either pasting it into the prompt or uploading a file manually. That’s acceptable for simple things, but for more complex projects, it gets awkward quickly.

To make this easier, [Eric Hartford] created github2file, a Python script that outputs a single text file containing the combined source code of a specified repository. This text file can be uploaded (or its contents pasted into the prompt) making it much easier to share code with chatbots.

Continue reading “Dump A Code Repository As A Text File, For Easier Sharing With Chatbots”

Air Canada’s Chatbot: Why RAG Is Better Than An LLM For Facts

Recently Air Canada was in the news regarding the outcome of Moffatt v. Air Canada, in which Air Canada was forced to pay restitution to Mr. Moffatt after the latter had been disadvantaged by advice given by a chatbot on the Air Canada website regarding the latter’s bereavement fare policy. When Mr. Moffatt inquired whether he could apply for the bereavement fare after returning from the flight, the chatbot said that this was the case, even though the link which it provided to the official bereavement policy page said otherwise.

This latter aspect of the case is by far the most interesting aspect of this case, as it raises many questions about the technical details of this chatbot which Air Canada had deployed on its website. Since the basic idea behind such a chatbot is that it uses a curated source of (company) documentation and policies, the assumption made by many is that this particular chatbot instead used an LLM with more generic information in it, possibly sourced from many other public-facing policy pages.

Whatever the case may be, chatbots are increasingly used by companies, but instead of pure LLMs they use what is called RAG: retrieval augmented generation. This bypasses the language model and instead fetches factual information from a vetted source of documentation.

Continue reading “Air Canada’s Chatbot: Why RAG Is Better Than An LLM For Facts”

Hackaday Links Column Banner

Hackaday Links: September 10, 2023

Most of us probably have a vision of how “The Robots” will eventually rise up and deal humanity out of the game. We’ve all seen that movie, of course, and know exactly what will happen when SkyNet becomes self-aware. But for those of you thinking we’ll get off relatively easy with a quick nuclear armageddon, we’re sorry to bear the news that AI seems to have other plans for us, at least if this report of dodgy AI-generated mushroom foraging manuals is any indication. It seems that Amazon is filled with publications these days that do a pretty good job of looking like they’re written by human subject matter experts, but are actually written by ChatGPT or similar tools. That may not be such a big deal when the subject matter concerns stamp collecting or needlepoint, but when it concerns differentiating edible fungi from toxic ones, that’s a different matter. The classic example is the Death Cap mushroom (Amanita phalloides) which varies quite a bit in identifying characteristics like color and size, enough so that it’s often tough for expert mycologists to tell it apart from its edible cousins. Trouble is, when half a Death Cap contains enough toxin to kill an adult human, the margin for error is much narrower than what AI is likely to include in a foraging manual. So maybe that’s AI’s grand plan for humanity — just give us all really bad advice and let Darwin take care of the rest.

Continue reading “Hackaday Links: September 10, 2023”

Self-Hosted Chatbot Focuses On Privacy

Large language models (LLMs) have been all the rage lately, assisting from all kinds of tasks from programming to devising Excel formulas to shortcutting school work. They’re also relatively easy to access for the most part, but as the old saying goes, if something on the Internet is free the real product is you (and your data). Luckily there are ways of hosting LLMs on your own to avoid your personal data getting harvested, as well as taking advantage of open-source solutions, but building these systems takes a little bit of effort. [Stephen] and a team from Mozilla walk us through this process and show us a number of options currently available.

Working from the ground up, the group first decides on hosting, which (unsurprisingly) involves using Mozilla hosting services. The choice of runtime environment was a little bit more challenging. The project was time constrained, so they looked at two options here: Hugging Face and llama.cpp. Eventually deciding to move forward with llama.cpp largely due to its ability to run on more consumer-oriented hardware (especially Apple silicon) and the fact that it doesn’t need a powerful GPU, the next task was to choose the model. Settling on the LLaMa model that Facebook recently open-sourced, this model works well with the runtime environment and is essentially the only one that does.

From there, the team at Mozilla wanted to make sure their chat bot would be able to provide other Mozilla employees with information more readily pertinent to their jobs, so they trained their model with some internal Mozilla data as well as other more generic information. This doesn’t mean the job is done, though, there are a number of other factors that went in to designing this system before it was finally complete. Even then, since they built this in a week it’s not perfect; there are some issues with non-permissive licensing of some of the components and many of the design choices may not have been ideal. It’s impressive what’s out there if you’re hosting your own system, though, and while this might be a little more advanced for a self-hosted project, take a look at some other more beginner-friendly projects you can try if you’re just starting out on the self-hosted path.

Ask Hackaday: The Turing Test Is Dead: Long Live The Turing Test!

Alan Turing proposed a test for machine intelligence that no longer works. The idea was to have people communicate over a terminal, with another real person and with a computer. If the computer is intelligent, Turing mused, most people will incorrectly identify the computer as a human. Clearly, with the advent of modern chatbots, that test is now broken. Despite the “AI” moniker, chatbots aren’t sentient or even pre-sentient, but they certainly seem that way. An AI CEO, Mustafa Suleyman, is proposing a new test: The AI has to take a $100,000 budget and earn $1,000,000.

We were a little bemused at this. By that measure, most of us aren’t intelligent, either, and it seems like this is a particularly capitalistic idea. We could probably write an Excel script that studied mutual fund performance and pull off the same trick, given enough time for the investment to mature. Is it intelligent? No. Besides, even humans who have demonstrated they can make $1,000,000 often sell their companies and start new ones that fail. How often does the AI have to succeed before we grant it person status?

Continue reading “Ask Hackaday: The Turing Test Is Dead: Long Live The Turing Test!”

What Do You Want In A Programming Assistant?

The Propellerheads released a song in 1998 entitled “History Repeating.” If you don’t know it, the lyrics include: “They say the next big thing is here. That the revolution’s near. But to me, it seems quite clear. That it’s all just a little bit of history repeating.” The next big thing today seems to be the AI chatbots. We’ve heard every opinion from the “revolutionize everything” to “destroy everything” camp. But, really, isn’t it a bit of history repeating itself? We get new tech. Some oversell it. Some fear it. Then, in the end, it becomes part of the ordinary landscape and seems unremarkable in the light of the new next big thing. Dynamite, the steam engine, cars, TV, and the Internet were all predicted to “ruin everything” at some point in the past.

History really does repeat itself. After all, when X-rays were discovered, they were claimed to cure pneumonia and other infections, along with other miracle cures. Those didn’t pan out, but we still use them for things they are good at. Calculators were going to ruin math classes. There are plenty of other examples.

This came to mind because a recent post from ACM has the contrary view that chatbots aren’t able to help real programmers. We’ve also seen that — maybe — it can, in limited ways. We suspect it is like getting a new larger monitor. At first, it seems huge. But in a week, it is just the normal monitor, and your old one — which had been perfectly adequate — seems tiny.

But we think there’s a larger point here. Maybe the chatbots will help programmers. Maybe they won’t. But clearly, programmers want some kind of help. We just aren’t sure what kind of help it is. Do we really want CoPilot to write our code for us? Do we want to ask Bard or ChatGPT/Bing what is the best way to balance a B-tree? Asking AI to do static code analysis seems to work pretty well.

So maybe your path to fame and maybe even riches is to figure out — AI-based or not — what people actually want in an automated programming assistant and build that. The home computer idea languished until someone figured out what people wanted to do with them. Video cassette didn’t make it into the home until companies figured out what people wanted most to watch on them.

How much and what kind of help do you want when you program? Or design a circuit or PCB? Or even a 3D model? Maybe AI isn’t going to take your job; it will just make it easier. We doubt, though, that it can much improve on Dame Shirley Bassey’s history lesson.