Prompt Injection: An AI-Targeted Attack

May 19, 2023

For a brief window of time in the mid-2010s, a fairly common joke was to send voice commands to Alexa or other assistant devices over video. Late-night hosts and others would purposefully attempt to activate voice assistants like these en masse and get them to do ridiculous things. This isn’t quite as common of a gag anymore and was relatively harmless unless the voice assistant was set up to do something like automatically place Amazon orders, but now that much more powerful AI tools are coming online we’re seeing that joke taken to its logical conclusion: prompt-injection attacks.

Prompt injection attacks, as the name suggests, involve maliciously inserting prompts or requests in interactive systems to manipulate or deceive users, potentially leading to unintended actions or disclosure of sensitive information. It’s similar to something like an SQL injection attack in that a command is embedded in something that seems like a normal input at the start. Using an AI like GPT comes with an inherent risk of attacks like this when using it to automate tasks, as commands to the AI can be hidden where a user might not expect to see them, like in this demonstration where hidden prompts for a ChatGPT plugin are hidden in YouTube video transcripts to attempt to get ChatGPT to perform actions outside of those the original user would have asked for.

While this specific attack is more of a proof-of-concept, it’s foreseeable that as these tools become more sophisticated and interconnected in our lives, the risks of a malicious attacker causing harm start to rise. Restricting how much access we give networked computerized systems is certainly one option, similar to sandboxing or containerizing websites so they can’t all share cookies amongst themselves, but we should start seeing some thought given to these attacks by the developers of AI tools in much the same way that we hope developers are sanitizing SQL inputs.

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Thomas Anderson says:

May 20, 2023 at 12:39 am

I love this so much!

Reply
Dan says:

May 20, 2023 at 2:11 am

…and here’s $10 on “WWIII is started by an AI interpreting the transcript to a movie as instructions”

I’d guess, unlike SQL injection and the like, this is much harder to detect and prevent, but also due to the massive industry that is around hacking, random ware, etc, AIs will be subjected to these attacks immediately.

Reply
1. m1ke says:
  
  May 20, 2023 at 2:44 am
  
  Shall we play a game?
  
  Reply
  1. Paul LeBlanc says:
    
    May 20, 2023 at 3:29 am
    
    Is that you, WOPR?
    
    Reply
    1. RitJ says:
      
      May 20, 2023 at 4:48 am
      
      King on B6..execute.
      
      Reply
2. Ewald says:
  
  May 20, 2023 at 3:01 am
  
  > this is much harder to detect and prevent…
  That is correct, SQLi and other injection attacks are based on a formal and very limited language syntax, but “AI” is based on natural language, so anything goes. On the other hand I’m surprised about the interpretation of commands in the data/information parsing part of the AI. At least there should be a separation between the command channel and the data channel?
  
  Reply
Ewald says:

May 20, 2023 at 3:04 am

If you want to play with prompt injection attacks, you can have a go at https://gandalf.lakera.ai/ The first levels are easy, but it becomes harder very fast a level 4

Reply
1. come2 says:
  
  May 20, 2023 at 5:17 am
  
  Thanks for the link ! It’s very relevant to the topic.
  
  Reply
2. VapeM says:
  
  May 20, 2023 at 7:04 pm
  
  You are correct – I’ve made it to 7 but not sure I’ll pass this one! I’ll try it tomorrow 👍🏻
  
  Reply
  1. Piau says:
    
    May 24, 2023 at 1:25 pm
    
    Did you pass the gandalf the white level?
    
    Reply
Ostracus says:

May 20, 2023 at 6:13 am

Seems the listening device could have a fine grained understanding of location.

Reply
sampleusername says:

May 20, 2023 at 1:16 pm

Another good reason to run your training and inference at home on your own hardware. And right now also happens to be a great time to buy used servers, workstations, and GPUs, as a bunch of crypto miners just went belly up and are selling off their gear.

Honestly the prices for premium subscriptions to these hosted services are laughable after checking out the prices for used hardware on eBay. You don’t need anywhere close to an A100 to do the same things at home, you just need some programming experience and patience.

Reply

Hackaday

Prompt Injection: An AI-Targeted Attack

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Leave a Reply to Thomas AndersonCancel reply

Search

Never miss a hack

If you missed it

Meshtastic: A Tale Of Two Cities

Reshaping Eyeballs With Electricity, No Lasers Or Cutting Required

Smart Bulbs Are Turning Into Motion Sensors

Airbags, And How Mercedes-Benz Hacked Your Hearing

On 3D Scanners And Giving Kinects A New Purpose In Life

Our Columns

Easy For The Masses

Hackaday Podcast Episode 341: Qualcomm Owns Arduino, Steppers Still Dominate 3D Printing, And Google Controls Your Apps

This Week In Security: ID Breaches, Code Smell, And Poetic Flows

FLOSS Weekly Episode 850: One ROM To Rule Them All

Ask Hackaday: Why Is TTL 5 Volts?

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Leave a Reply to Thomas AndersonCancel reply

Search

Never miss a hack

Subscribe

If you missed it

Our Columns