Prompt Injection: An AI-Targeted Attack

May 19, 2023 by Bryan Cockfield 12 Comments

For a brief window of time in the mid-2010s, a fairly common joke was to send voice commands to Alexa or other assistant devices over video. Late-night hosts and others would purposefully attempt to activate voice assistants like these en masse and get them to do ridiculous things. This isn’t quite as common a gag anymore, and it was relatively harmless unless the voice assistant was set up to do something like automatically place Amazon orders. But now that much more powerful AI tools are coming online, we’re seeing that joke taken to its logical conclusion: prompt-injection attacks.

The Hello World Of GPT?

April 10, 2023 by Al Williams 73 Comments

Someone wants to learn about Arduino programming. Do you suggest they blink an LED first? Or should they go straight for a 3D laser scanner with galvos, a time-of-flight sensor, and multiple networking options? Most of us need to start with the blinking light and move forward from there. So what if you want to learn about the latest wave of GPT — generative pre-trained transformer — programs? Do you start with a language model that looks at thousands of possible tokens in large contexts? Or should you start with something simple? We think you should start simple, and [Andrej Karpathy] agrees. He has a workbook that makes a tiny GPT that can predict the next bit in a sequence. It isn’t any more practical than a blinking LED, but it is a manageable place to start.

The simple example starts with a vocabulary of two. In other words, characters are 1 or 0. It also uses a context size of 3, so it will look at 3 bits and use that to infer the 4th bit. To further simplify things, the examples assume you will always get a fixed-size sequence of tokens, in this case, eight tokens. Then it builds a little from there.
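To make the vocabulary and context-size idea concrete, here is a minimal sketch. It is not the workbook's code and it is not a transformer: it swaps in a plain lookup table that counts which bit followed each 3-bit context. The training sequence and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

CONTEXT_SIZE = 3  # look at 3 bits to guess the 4th


def build_examples(bits, context_size=CONTEXT_SIZE):
    """Slide a window over the sequence, giving (context, next_bit) pairs."""
    return [
        (tuple(bits[i:i + context_size]), bits[i + context_size])
        for i in range(len(bits) - context_size)
    ]


def fit_counts(examples):
    """Count how often each next bit follows each context."""
    counts = defaultdict(Counter)
    for context, next_bit in examples:
        counts[context][next_bit] += 1
    return counts


def predict(counts, context):
    """Return the most frequently seen next bit, or None for an unseen context."""
    seen = counts.get(tuple(context))
    if not seen:
        return None
    return seen.most_common(1)[0][0]


# Made-up eight-token training sequence (an assumption, not from the workbook).
sequence = [1, 1, 0, 1, 1, 0, 1, 1]
examples = build_examples(sequence)  # 8 tokens, context of 3 -> 5 examples
counts = fit_counts(examples)
```

With two possible tokens and a context of three, there are only 2**3 = 8 contexts to consider, which is what keeps a model of this size easy to inspect by hand. Calling `predict(counts, [1, 1, 0])` looks up what followed that context in the training sequence; a context that never appeared returns `None`.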


Tired Of Web Scraping? Make The AI Do It

April 9, 2023 by Donald Papp 19 Comments

[James Turk] has a novel approach to the problem of scraping web content in a structured way without needing to write the kind of page-specific code web scrapers usually have to deal with. How? Just enlist the help of a natural language AI. Scrapeghost relies on OpenAI’s GPT API to parse a web page’s content, pull out and classify any salient bits, and format it in a useful way.

What makes Scrapeghost different is how data gets organized. When instantiating scrapeghost, one defines the data one wishes to extract. For example:

from scrapeghost import SchemaScraper

scrape_legislators = SchemaScraper(
    schema={
        "name": "string",
        "url": "url",
        "district": "string",
        "party": "string",
        "photo_url": "url",
        "offices": [{"name": "string", "address": "string", "phone": "string"}],
    }
)

The kicker is that this format is entirely up to you! The GPT models are very, very good at processing natural language, and scrapeghost uses GPT to process the scraped data and find (using the example above) whatever looks like a name, district, party, photo, and office address and format it exactly as requested.
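Because the schema is just nested dictionaries and lists, it is easy to sanity-check whatever comes back against it. The sketch below is not part of scrapeghost: `conforms` is a hypothetical helper, the record is invented, and leaf hints such as "string" and "url" are simply treated as "any string" for the sake of the example.

```python
def conforms(value, schema):
    """Return True if `value` has the same nested shape as `schema`.

    Simplification for this sketch: every leaf hint ("string", "url", ...)
    is accepted as long as the value is a str.
    """
    if isinstance(schema, dict):
        return (
            isinstance(value, dict)
            and value.keys() == schema.keys()
            and all(conforms(value[key], schema[key]) for key in schema)
        )
    if isinstance(schema, list):
        # A one-element list means "zero or more items shaped like this".
        return isinstance(value, list) and all(
            conforms(item, schema[0]) for item in value
        )
    return isinstance(value, str)


# Same schema as in the example above.
schema = {
    "name": "string",
    "url": "url",
    "district": "string",
    "party": "string",
    "photo_url": "url",
    "offices": [{"name": "string", "address": "string", "phone": "string"}],
}

# Hypothetical extracted record, invented for illustration.
record = {
    "name": "Jane Example",
    "url": "https://example.com/members/1",
    "district": "12",
    "party": "Independent",
    "photo_url": "https://example.com/members/1.jpg",
    "offices": [
        {"name": "Capitol Office", "address": "1 Main St", "phone": "555-0100"}
    ],
}

record_ok = conforms(record, schema)
```

A check like this only confirms the shape of the result, not that the extracted values are correct, so spot-checking against the source page is still worthwhile.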

It’s an experimental tool and you’ll need an API key from OpenAI to use it, but it has useful features and is certainly a novel approach. There’s a tutorial and even a command-line interface, so check it out.

The Singularity Isn’t Here… Yet

March 17, 2023 by Jenny List 112 Comments

So, GPT-4 is out, and it’s all over for us meatbags. Hype has reached fever pitch: here, in the latest and greatest of AI chatbots, we finally have something that can surpass us. The singularity has happened, and personally I welcome our new AI overlords.

Hang on a minute though, I smell a rat, and it comes in defining just what intelligence is. In my time I’ve hung out with a lot of very bright people, as well as a lot of not-so-bright people who nonetheless think they’re very clever simply because they have a bunch of qualifications and diplomas. Sadly the experience hasn’t bestowed God-like intelligence on me, but it has given me a handle on the difference between intelligence and knowledge.

My premise is that we humans are conditioned by our education system to equate learning with intelligence, mostly because we have flaky CPUs and worse memory, and that makes learning something a bit of an effort. Thus when we see an AI, a machine that can learn everything because it has a decent CPU and memory, we’re conditioned to think of it as intelligent because that’s what our schools train us to do. In fact it seems intelligent to us not because it’s thinking of new stuff, but merely through knowing stuff we don’t because we haven’t had the time or capacity to learn it.

Growing up and making my earlier career around a major university, I’ve seen this in action so many times: people who master one skill, rote-learning the school textbook or the university tutor’s pet views and theories, and barfing them up all over the exam paper to get their amazing qualifications. On paper they’re the cream of the crop, and while it’s true they’re not thick, they’re rarely the special clever people they think they are. People with truly above-average intelligence exist, but in smaller numbers, and their occurrence is not a 1:1 mapping with holders of advanced university degrees.

Even the examples touted of GPT’s brilliance tend to reinforce this. It can do the bar exam or the SAT test, thus we’re told it’s as intelligent as a school-age kid or a lawyer. Both of those qualifications follow our educational system’s flawed premise that education equates to intelligence, so as a machine that’s learned all the facts it follows my point above about learning by rote. The machine has simply barfed up what it has learned the answers are onto the exam paper. Is that intelligence? Is a search engine intelligent?

This is not to say that tools such as GPT-4 are not amazing creations that have a lot of potential to do good things aside from filling up the internet with superficially readable spam. Everyone should have a play with them and investigate their potential, and from that will no doubt come some very interesting things. Just don’t confuse them with real people, because sometimes meatbags can surprise you.

Detecting Machine-Generated Content: An Easier Task For Machine Or Human?

February 1, 2023 by Maya Posch 41 Comments

In today’s world we are surrounded by various sources of written information, information which we generally assume to have been written by other humans. Whether this is in the form of books, blogs, news articles, forum posts, feedback on a product page or the discussions on social media and in comment sections, the assumption is that the text we’re reading has been written by another person. However, over the years this assumption has become ever more likely to be false, most recently due to large language models (LLMs) such as GPT-2 and GPT-3 that can churn out plausible paragraphs on just about any topic when requested.

This raises the question of whether we are about to reach a point where we can no longer be reasonably certain that an online comment, a news article, or even entire books and film scripts weren’t churned out by an algorithm, or perhaps even where an online chat with a new sizzling match turns out to be just you getting it on with an unfeeling collection of code that was trained and tweaked for maximum engagement with customers. (Editor’s note: no, we’re not playing that game here.)

As such machine-generated content and interactions begin to play an ever bigger role, two questions arise: how you can detect such generated content, and whether it matters that the content was generated by an algorithm instead of by a human being.


Hackaday

GPT

12 Articles
