For a brief window of time in the mid-2010s, a fairly common joke was to send voice commands to Alexa or other assistant devices over video. Late-night hosts and others would purposefully attempt to activate voice assistants like these en masse and get them to do ridiculous things. This isn’t quite as common of a gag anymore and was relatively harmless unless the voice assistant was set up to do something like automatically place Amazon orders, but now that much more powerful AI tools are coming online we’re seeing that joke taken to its logical conclusion: prompt-injection attacks. Continue reading “Prompt Injection: An AI-Targeted Attack”
What do SQL injection attacks have in common with the nuances of GPT-3 prompting? More than one might think, it turns out.
Many security exploits hinge on getting user-supplied data incorrectly treated as instruction. With that in mind, read on to see [Simon Willison] explain how GPT-3 — a natural-language AI — can be made to act incorrectly via what he’s calling prompt injection attacks.
This all started with a fascinating tweet from [Riley Goodside] demonstrating the ability to exploit GPT-3 prompts with malicious instructions that order the model to behave differently than one would expect.