Prompt Injection: An AI-Targeted Attack

May 19, 2023

For a brief window of time in the mid-2010s, a fairly common joke was to send voice commands to Alexa or other assistant devices over video. Late-night hosts and others would purposefully attempt to activate voice assistants like these en masse and get them to do ridiculous things. This isn’t quite as common of a gag anymore and was relatively harmless unless the voice assistant was set up to do something like automatically place Amazon orders, but now that much more powerful AI tools are coming online we’re seeing that joke taken to its logical conclusion: prompt-injection attacks.

Prompt injection attacks, as the name suggests, involve maliciously inserting prompts or requests in interactive systems to manipulate or deceive users, potentially leading to unintended actions or disclosure of sensitive information. It’s similar to something like an SQL injection attack in that a command is embedded in something that seems like a normal input at the start. Using an AI like GPT comes with an inherent risk of attacks like this when using it to automate tasks, as commands to the AI can be hidden where a user might not expect to see them, like in this demonstration where hidden prompts for a ChatGPT plugin are hidden in YouTube video transcripts to attempt to get ChatGPT to perform actions outside of those the original user would have asked for.

While this specific attack is more of a proof-of-concept, it’s foreseeable that as these tools become more sophisticated and interconnected in our lives, the risks of a malicious attacker causing harm start to rise. Restricting how much access we give networked computerized systems is certainly one option, similar to sandboxing or containerizing websites so they can’t all share cookies amongst themselves, but we should start seeing some thought given to these attacks by the developers of AI tools in much the same way that we hope developers are sanitizing SQL inputs.

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Thomas Anderson says:

May 20, 2023 at 12:39 am

I love this so much!

Report comment

Reply
Dan says:

May 20, 2023 at 2:11 am

…and here’s $10 on “WWIII is started by an AI interpreting the transcript to a movie as instructions”

I’d guess, unlike SQL injection and the like, this is much harder to detect and prevent, but also due to the massive industry that is around hacking, random ware, etc, AIs will be subjected to these attacks immediately.

Report comment

Reply
1. m1ke says:
  
  May 20, 2023 at 2:44 am
  
  Shall we play a game?
  
  Report comment
  
  Reply
  1. Paul LeBlanc says:
    
    May 20, 2023 at 3:29 am
    
    Is that you, WOPR?
    
    Report comment
    
    Reply
    1. RitJ says:
      
      May 20, 2023 at 4:48 am
      
      King on B6..execute.
      
      Report comment
      
      Reply
2. Ewald says:
  
  May 20, 2023 at 3:01 am
  
  > this is much harder to detect and prevent…
  That is correct, SQLi and other injection attacks are based on a formal and very limited language syntax, but “AI” is based on natural language, so anything goes. On the other hand I’m surprised about the interpretation of commands in the data/information parsing part of the AI. At least there should be a separation between the command channel and the data channel?
  
  Report comment
  
  Reply
Ewald says:

May 20, 2023 at 3:04 am

If you want to play with prompt injection attacks, you can have a go at https://gandalf.lakera.ai/ The first levels are easy, but it becomes harder very fast a level 4

Report comment

Reply
1. come2 says:
  
  May 20, 2023 at 5:17 am
  
  Thanks for the link ! It’s very relevant to the topic.
  
  Report comment
  
  Reply
2. VapeM says:
  
  May 20, 2023 at 7:04 pm
  
  You are correct – I’ve made it to 7 but not sure I’ll pass this one! I’ll try it tomorrow 👍🏻
  
  Report comment
  
  Reply
  1. Piau says:
    
    May 24, 2023 at 1:25 pm
    
    Did you pass the gandalf the white level?
    
    Report comment
    
    Reply
Ostracus says:

May 20, 2023 at 6:13 am

Seems the listening device could have a fine grained understanding of location.

Report comment

Reply
sampleusername says:

May 20, 2023 at 1:16 pm

Another good reason to run your training and inference at home on your own hardware. And right now also happens to be a great time to buy used servers, workstations, and GPUs, as a bunch of crypto miners just went belly up and are selling off their gear.

Honestly the prices for premium subscriptions to these hosted services are laughable after checking out the prices for used hardware on eBay. You don’t need anywhere close to an A100 to do the same things at home, you just need some programming experience and patience.

Report comment

Reply

Hackaday

Prompt Injection: An AI-Targeted Attack

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Leave a ReplyCancel reply

Search

Never miss a hack

If you missed it

Encryption In The 1790s

The Need For Speed: Internet Speed Measurement (or DIY?)

Postal IRCs Are Almost A Thing Of The Past

Launching Rockets Is Hard, Bring Them Back Is Harder

Putting Some Zig In A Linux-Based 3D Printer

Our Columns

Add Sensors To Everything!

Hackaday Podcast Episode 379: Driving E-ink DIY, NES On ESP, And The Other IRC

This Week In Security: AI Is A Mess, Hacking Car Chargers, An OpenSSL DoS, And Factories Under Attack

Hackaday Europe 2026: Half Quad, Half Blimp: Test. Fly. Survive.

FLOSS Weekly Episode 876: There Is No Money Fairy

12 thoughts on “Prompt Injection: An AI-Targeted Attack”

Leave a ReplyCancel reply

Search

Never miss a hack

Subscribe

If you missed it

Our Columns