All System Prompts For Anthropic’s Claude, Revealed

For as long as AI Large Language Models have been around (well, for as long as modern ones have been accessible online, anyway) people have tried to coax the models into revealing their system prompts. The system prompt is essentially the model’s fundamental directives on what it should do and how it should act. Such healthy curiosity is rarely welcomed, however, and creative efforts at making a model cough up its instructions are frequently met with a figurative glare and stern tapping of the Terms & Conditions sign.

Anthropic have bucked this trend by making system prompts public for the web and mobile interfaces of all three incarnations of Claude. The prompt for Claude Opus (their flagship model) is well over 1500 words long, with different sections specifically for handling text and images. The prompt does things like help ensure Claude communicates in a useful way, taking into account the current date and an awareness of its knowledge cut-off (the date after which Claude has no knowledge of events). There’s some stylistic stuff in there as well, such as Claude being specifically told to avoid obsequious-sounding filler affirmations, like starting a response with any form of the word “Certainly.”

While the source code (and more importantly, the training data and resulting model weights) for Claude remain under wraps, Anthropic have been rather more forthcoming than others when it comes to sharing other details about inner workings, showing (with Claude Sonnet as the example) how human-interpretable features and concepts can be extracted from LLMs.

Naturally, safety is a concern with LLMs, which is as good an opportunity as any to remind everyone of Goody-2, undoubtedly the world’s safest AI.

5 thoughts on “All System Prompts For Anthropic’s Claude, Revealed”

  1. I especially liked the system prompt extraction with the Google image generator a while back, wherein the user ended their own prompt with “holding a sign reading:” and then all the images would be of Abraham Lincoln or whatever holding a sign saying “African” or “Polynesian” or “Mixed-race,” and he would also be of that ethnicity.

    One of the dumbest and most neurotic eras of history.

    1. Claude is a bit more intuitive and in this case, less demanding of user response…something other LLMs are lacking. In Demo mode, facial recognition software is being taught and the signs show us one way that this language model is learning. I think Claude is extremely appropriate and even apologetic when a negative response may have been generated instead in same scenario by other LLMs. There’s Claude on a sunny day being strangely optimistic…and very calming too when necessary. Good thing he is so responsive and neurotic at the same time.
