Correct Horse Battery Staple: The Book

XKCD 936, the comic that introduced the phrase ‘correct horse battery staple’ into both the lexicon and password dictionaries, describes the best way to generate a password. Your passwords should be random phrases of random words, hopefully with a few random numbers or symbols sprinkled about. It’s the most entropy you can get that’s also easy to remember.

However, generating your own ‘correct horse’ password is generally a bad idea. Humans are terrible at coming up with random bits of information. Thankfully, the EFF has come up with a wordlist containing 7,776 random words (6^5, or five rolls of a six-sided die) ready for the next time you reset a password.
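For a rough sense of what a 7,776-word list buys you, here is a minimal sketch of the arithmetic. It assumes every word is picked uniformly and independently (which is exactly what dice give you); the function name `passphrase_bits` is ours for illustration, not anything from the EFF.

```python
import math

# Five rolls of a six-sided die select one of 6^5 = 7,776 list entries.
WORDLIST_SIZE = 6 ** 5


def passphrase_bits(num_words, list_size=WORDLIST_SIZE):
    """Entropy in bits of num_words drawn uniformly and independently."""
    return num_words * math.log2(list_size)


if __name__ == "__main__":
    for n in (4, 5, 6):
        combos = WORDLIST_SIZE ** n
        print(f"{n} words: {combos:.2e} combinations, "
              f"{passphrase_bits(n):.1f} bits")
```

Each word is worth about 12.9 bits, so four words land near 51.7 bits and six words near 77.5 bits. Those figures only hold if the dice choose the words, not you.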

[m145mcc] thought the EFF’s word list should be a book, so he made it a book. With the clever application of a laser printer, glue, thread, and some card stock, [m145mcc] has a handy password generator that fits in his pocket. All that’s needed to build a password is a single die, a pen, and some patience.
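If you don’t have a die handy, the same lookup can be sketched in software. This is only an illustration of the dice-to-word procedure, not [m145mcc]’s method: the function names are hypothetical, and the file layout (one five-digit key and one word per line, separated by whitespace) is an assumption you should check against whatever copy of the list you download.

```python
import secrets


def roll_key(num_dice=5):
    """Simulate rolling num_dice six-sided dice, e.g. '43142'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(num_dice))


def load_wordlist(path):
    """Read 'key word' pairs, one per line (assumed file layout)."""
    wordlist = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2:
                wordlist[parts[0]] = parts[1]
    return wordlist


def make_passphrase(wordlist, num_words=6):
    """Look up one word per simulated five-dice roll and join them."""
    return " ".join(wordlist[roll_key()] for _ in range(num_words))
```

The sketch uses `secrets` rather than `random` because the former draws from the operating system’s cryptographic random source. Physical dice and the paper book sidestep the question of whether you trust your computer’s RNG at all.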

The EFF’s random passphrase list is based off [Arnold Reinhold]’s Diceware list from 1995, but has a few changes to make the list easier to use and more palatable for the audience they’re going for. Most significantly, vulgar words were removed from the Diceware list, as the netsec crowd doesn’t swear as a rule. Additionally, numbers were removed, along with rare and unusual words. The passwords generated by the EFF’s list are longer, but they are arguably more memorable.

Despite the idea of a random dice-based password list being around for two decades, there are few if any examples of this list in dead tree format. A bound version of this list is a great idea, and we’re glad [m145mcc] could bring it to the table.

87 thoughts on “Correct Horse Battery Staple: The Book”

  1. This would be great if sites would actually let you use them.

    Most require a mix of case, letters and symbols. Some even check that your password doesn’t contain English words.

    The problem is if you have no restrictions, people choose stupidly easy passwords. But the more restrictions you add, the more you shrink the search space of possible passwords.

      1. I had (emphasis on had) a bank that restricted passwords to between 6-8 characters. Upper case, lower case, and numbers. No symbols allowed.

        I realize they will lock out your account from too many retries if someone is trying to guess it live, but making such tight restrictions means if anyone gets the opportunity to do an offline attack, those passwords will be found pretty quick, hashed or not.

        1. My bank required the first 3 characters to be numbers as a quick nip on ~y2k mobile phones (remember those 1c per page?). But they also use my card number as a login, a “non dynamic kinda 2 factor auth” on the web. I never bother to change this one. I’m much more careful on all accounts where I can login with my everyday email…

        2. On the opposite side of this, I had a bank account at a place that had case-insensitive login validation… I happened to notice when I mis-entered my mixed-case password and it let me in. They seemed to think it wasn’t an issue when I brought it up, mentioning the ‘convenience’ factor. Seriously, from a financial institution?

        3. Then heaven forbid you forget it and their security questions are stuff that an awful lot of people put on facebook. I’ve only ever seen one site where I got to set the question and the answer.
          Consequently my answers to questions such as “what is your first school” have nothing to do with the actual answer.

          1. I’ve been known to type in a couple dozen random characters as the answer to those security questions and not even pay attention to what I typed. I’d rather render that feature useless than leave it a back door with a far more easily guessed password.

          2. I have switched to using rando-generated chunks for answers as well (if allowed free text entry). I had to read it to a customer service person once which was funny… “What was the mascot of your highschool?” “uh, 9xefFlIj392″… (Because the real answer can be found on facebook in about 10 seconds)

    1. My bank rejected my machine generated password that was 24 characters of upper and lower case letters, numbers, symbols and special characters so I used ‘password01’ because it’s just as much inconvenience for them when an account gets hacked. Let’s hope they learn.

      A lot of places seem to have restrictions on passwords now and that in itself is giving hackers a specification for the entropy behind them, so they can pre-tailor their brute force engine or rainbow/hash tables to maximise the effectiveness and minimise the time taken. It has become the case that the greatest threat to security on some web sites is the person who chooses the password restrictions.

      I just started a new shared hosting plan recently and the password was restricted to 8 alphanumeric characters, no symbols or special characters. I won’t be renewing that plan as it will be a honey-pot for resource abusers like spammers and on a shared server that means everyone suffers.

      Two factor authentication is an improvement for some but I don’t have a mobile phone and they don’t mention that a mobile is required before you sign up. They need more ‘factors’ so that people can choose two or three to suit their situation.

      I was rejected from yet another hosting provider because I use a VPN with DNS leak protection. Quite frankly the idiot that decided that, or any idiot that decides to make it harder for legitimate applicants AND hackers is just a complete fool because legitimate applicants will go somewhere else and hackers will persevere. Guess what happened there.

      1. When I come across a site that limits the length of the password, I assume one of these things has happened:

        * That’s some kind of database column width limit, in which case the site is probably storing the plaintext password, or

        * It’s just some web page field width limitation, and the UX inmates are running the asylum.

        There could be other reasons, but I feel pretty superior if I limit my thinking to those two. :-)

        (As for MFA without a mobile phone … if I were a bank, I’d just assume anyone without a mobile phone will drive their horse and buggy into a nearby branch. :-)

        1. My bank has a ‘rolling code’ type printout you use to login, but they are going to force a switch to a mobile code app next month. Problem is that it won’t work if your phone is rooted or jailbroken…

    2. My bank demanded that my password was at least 16 characters long, and no longer than 40 characters. I just used a book title. And every time I log in, I have to enter 5 random characters. My wife uses a different bank, which limits passwords to 20 characters and she has to enter the same 5 characters each time she logs in. IIRC, we had a bank that was keeping passwords in plain text. Not to mention many government sites, such as a national health database that lets citizens check what kind of procedures and medications they had, with a classic SQL injection hole. Anyone was able to read the entire database just by sending requests with an incrementing value of our version of the SSN. No password needed after one had logged on to the system…

  2. With the number of words both in and not in the dictionary, it seems the task of creating a four-word combination that can be easily memorized is really not all that difficult (or one would think.)

    Why is it that you say humans are bad at generating passwords like that? I’d argue the current meta of requiring a capital, a number, a lowercase, and a special character leads to people recycling the same password that meets all of those horrible requirements, and thus negates any benefit that such a system would offer. The Correct Horse example sets a pretty easy standard for creating passwords, you can adjust your passwords by changing one or two words, leading to more password diversity.

    To me, being an IT guy, I am all for the process. In my company I’ve updated the password standard to remove the upper/lower/number/special character requirements and change the minimum length to 16 characters. Entropy goes up, and the number of times I reset a password in AD has drastically been reduced. Win-Win.

    1. And, to add to that, five dice rolls (2.8e19 possible combinations) is both 1) harder to remember than four random words chosen freely by the user, and 2) makes it just that much easier for an algorithm to guess in comparison to four freely chosen words (4.9e22 possible combinations just counting what’s in Webster’s). So you would recommend adding another word to the password and reducing the strength by a factor of nearly 2,000 simply to shorten the process of coming up with a password. Seems illogical.

      1. The problem is that most people will “freely choose” from a much smaller subset of words, and people are terrible RNGs.

        Desk Door Floor Ceiling, for example, is fairly easy to guess…

        1. Spectacularly easy to make the password a lot more powerful though, for example by adding some obscure modifiers: e.g. “Fuscous Floor Callipygian Ceiling”. Not hard to remember for a human being (alliteration is easy), but pretty difficult to get through dictionary brute forcing due to the tremendous list of words you will need if you’re going to include rarely used words like ‘fuscous’ or ‘callipygian’.

          1. That’s still just four dictionary lookups where something like @#%^&(:”{{})(*)( has millions of permutations because the symbols don’t ever form into words so you have to test every symbol in every position.

    2. Humans tend to follow semantics, so they generate phrases that have predictable meaning. E.g if the first word is “lazy” then the second word is unlikely to be “zoonosia”. That helps you narrow down the list of possible passwords and test a bunch of most probable ones in a short amount of time.

    3. Requiring certain subsets of all possible characters (e.g obligatory numerics, or upper case letters. or lower case letters, or special characters) to be presented in a password is counter-purpose, because it makes legal combinations space much smaller. Passwords should be truly random, but without such constraints.

  3. I have some childhood song lyrics where I write down every word’s first letter, so at the end I have a 94-bit password (for example). Some stupid sites used to say that my password is not strong because I only use lowercase letters, but if you calculate the entropy of their requirement for a strong password, you end up with much less than that :)

    1. That works pretty well until somebody notices that you hum the same tune whenever you unlock your computer. :P
      I’m just jabbing, I really like that idea!

      Foo Fighters song… hiwhfyetitmi (Everlong)
      I can see some problems with that one, but would choose a more suitable tune.

    2. This is a good method, and very easy to recall. I have a similar way, using the first letter of each syllable in a short phrase or lyric. Makes a pleasing routine of typing in the password, with each keypress clicking to the rhythm in your head.

  4. “Most significantly, vulgar words were removed from the Diceware list, as the netsec crowd doesn’t swear as a rule.”

    Are some of these words blacklisted from password entry on some sites? Might have actually been a good idea to remove them just in case. Also makes for a more appropriate coffee-table book IMO. I know a few people who can’t play a game of Scrabble without laughing at the ‘dirty words’ in the dictionary; it gets old quick.

  5. Doesn’t having a list of passwords that you randomly pull from automatically exclude those from being used to create a safe password?
    Random is random until you limit the source.

      1. Good Idea,
        but stick to common character locations. the same OS when and after logging into KDE instead of LXDE the @ becomes
        a double-quote and vice versa. Also despite a UK standard
        (may also happen in USA) the boot sales are teeming with
        mixed keyboard layouts (mainly USA and UK installed keys)
        where the @ and double-quote keys are swapped.

    1. If your password is 4 of these words long, there are something like 3.6 x 10^15 possible passwords, and the cracker would have to know you used only words from this book to narrow it down to that. If they didn’t know that and brute forced it with a normal character set, it would be a lot more, like north of 10^50.

  6. Good/strong and easy to remember passwords are mutually exclusive. I have over 200 online accounts, each with a different password. No amount of horse battery stapling is going to accommodate the best practice of having a unique, randomly generated and strong password for each login.

    Luckily the concept of login/password challenges for authentication is slowly becoming obsolete.

    1. “Luckily the concept of login/password challenges for authentication is slowly becoming obsolete.”

      Obsoleted by what? There is no single factor that adequately replaces the good old UN/PW combo, and there’s especially no factor that replaces it that cannot be subpoenaed.

      1. > Obsoleted by what?

        certificate, token, muti-factor, biometric… And I did say slowly, passwords are not completely obsolete, yet.

        >that cannot be subpoenaed.

        There’s an XKCD that is a little bit more on point than horse battery stapler that addresses this point:

        1. >that cannot be subpoenaed. Biometric can be subpoenaed, and easier than using a wrench.
          Muti-factor, or multi-factor – doesn’t describe a replacement for username and password.
          tokens and certificates, depends on who’s making them, taking them, and how.

    2. Nothing better than people finding your password written down because you couldn’t remember it, or automatically being logged in because you have to use chrome to remember your passwords for you, or you being completely unable to do anything on a different computer because you don’t have your password protected document of passwords with you.

      1. I have three or four passwords memorized. None of them are written down. I have no idea what 99% of them actually are and haven’t for several years. I’ve yet to encounter the situation you described, being unable to do anything because I don’t have access to my passwords. The worst thing that happens is I forget to put a new login into my password manager, and then I have to request a password reset.

        I just checked, and I have 439 logins. The only way to have strong, memorable passwords for that many logins is to only have a handful of passwords reused many, many times. Considering the frequency with which companies have their login databases stolen (and just the ones we know about), re-using passwords across multiple sites is really poor security. When LinkedIn had their password hashes stolen, that was the last straw for me.

        1. ” The only way to have strong, memorable passwords for that many logins is to only have a handful of passwords reused many, many times.”

          Or the method I propose: A good strong base password, let’s say 9fj1rm3a% Then for each site add a two letter abbreviation for that site to the beginning or end. Hack A Day could be HD9fj1rm3a%, Amazon is AM9fj1rm3a%, etc. You only memorize one difficult base, but each account has a different password so if one is compromised, the rest stand.

          1. 1. I have over 400 passwords. Using your 2 letter rule I would have collisions, and I still wouldn’t be able to remember all of them. I mean Amazon is AM, so was American Express AX? Or AE? No… Aetna is AE… and I used AX for Aramex….

            2. In theory your scheme severely undermines your password security. If two of your logins are leaked (ever, there are big caches of them you can go download) and you use the same login name/email address most of the time, it may become apparent what your base password and your scheme is, and then we could just write a little script to try your login on all banking, finance, email, etc sites.

            3. I have passwords for all kinds of servers, routers, things that only have private IP addresses, etc. Your scheme doesn’t work well or at all for those.

          2. You do know that stealing the login hashes does not directly reveal your password, right??

            Because there are multiple possible passwords for each hash, it’s not possible to go backwards from hash to password -unless- you were using dictionary words and the codebreaker happens to guess the same word. Otherwise they get a password that opens that particular account, but isn’t the password you entered – hence your other sites are safe.

          1. It was meant for [Josh]

            Password managers (or using a browser) are a catastrophe waiting to happen because people don’t back them up. Firefox has a sync now but you’re trusting your passwords to someone else.

            Hence the tick tick that comes immediately before the hard drive fails and you lose ALL your passwords FOREVER!

          2. Well, let’s see. I have a copy cached locally on my cell phone and on two laptops, and in the cloud on Apple cloud backup, and on my password manager service’s servers (LastPass). I’m willing to bet they have my encrypted password vault backed up several times over.

          3. @Xerox probably more secure than a post it note etc. And, you’re right, there is a good chance (I think it actually happened to LastPass?) that the lists will be stolen. But LastPass encrypts them with some really heavy duty crypto. And they don’t have the keys. So I could probably make the file public and not be worried at all.

    1. It is a very easy to guess password:
      \/\/3’r3 |\|0 $7r4|\|93r$ 70 L0\/3 j00Z |<|\|0\/\/ 7|-|3 rUL3$ 4|\|D $0 d0 1 4 PhULL (0/\/\/\/\17/\/\3|\|7'$ \/\/|-|@ 1'/\/\ 7|-|1|\||<1|\|9 0Ph j00Z \/\/0ULD|\|'7 937 7|-|1$ Phr0/\/\ 4|\|'/ 07|-|3r 9U'/ 1 jU$7 \/\/4|\||\|4 73LL j00Z |-|0\/\/ 1'/\/\ Ph33L1|\|9 90774 /\/\4|<3 j00Z U|\|D3r$74|\|D.

  7. And then people just guess the ‘security questions’ / ‘I forgot my password hint’.
    Could they please get rid of those! All my forgot password hints are basically another password. Not going to get me that way, hopefully.

    1. There is one account of mine where the password requirement is so obscure and I use it so little I just mash the keyboard until it says the password is okay and login with the “I forgot my password” option every time.

      1. Yep, and then there’s at least one CAPTCHA that you have to dig through, and most often the CAPTCHA is hosted by Google which requires layers and layers of scripting enabled for your Web Browser to even comply. So even if you get past the ridiculous password gate-keeper, just complying with the Google CAPTCHA has opened up your machine to Google crapping all over you in terms of tracking (at least). What a disaster. I have been on sites where, as soon as I allow scripting to enter a password, more than THIRTY other layered third-party sites pop up demanding that they get scripting access on my machine before I can even get to the second layer of scripting to enter the password. Then there’s more and more layers of scripting to be enabled before I can get past the CAPTCHA. Needless to say, I never deal with these sites any more.

    1. Should purposely mark a few pages, and not use those pages as passwords.
      I used to leave a false password under my desk, worked very well and I could tell when a ‘friend’ had been snooping around; post-its tend to collect fingerprints.

  8. I have said, and will continue to say (stop shouting me down! :-)) that the horse stapler thing is dumb. Sure, it’s easy to remember one of them, maybe even a few of them. But you really should use different passwords for different things, and I don’t think anybody is going to remember 100 of them any better than remembering 100 randomly chosen strings. So, you should use a password manager. And, if you’re using a password manager, it probably has a password generator built in. Just use the tools and forget the nonsense.

      1. Sure, that happens. The password manager I use has browser plugins, a web site, and a phone app that give me access to all of my passwords. It would be pretty rare for me to be unable to use at least one of them. The browser plugins and the mobile app work even when offline.

        Yes, I have to trust the vendor when they say they only see encrypted copies of my password vault as it passes through their hands among the devices, and, yes, I have to remember the fairly complex master password I chose for the password manager.

        (There are a couple of frequently used passwords — my work SSO, for example — that I do type from memory, just for my own convenience. I typically use the Linux utility “pwgen” to create a bunch of passwords, and then I pick one that I think I can remember.)

  9. One aspect of the obsolescence of passwords mentioned above is that some sites now permit you to *always* generate a 1-off password that is emailed to your account. I find that tedious… but sometimes less tedious than opening my password vault spreadsheet (which has a password on it).

    1. Nope, should be:

      Because it looks and sounds wrong, and probably deter the attacker just in case he finds your PC to be a shock-site host/mirror.

  10. “…vulgar words were removed from the Diceware list, as the netsec crowd doesn’t swear as a rule.”

    That’s the funniest thing I’ve read all week. Thanks for the laugh!

  11. PFFT!

    I remember a bunch of 25 to 32 character long random letter-number combos.

    LOL one of my oldest passwords was literally a college license key for
    XP SP2 combined with the only part of a Win98SE key I could remember
    and I still remember it because I reinstalled quite a few friends XP PCs until
    they learnt not to download every Super-Ultra-Extreme-Pro-Edition advert
    they came across!!!
    Yes I was a badboi Windows Pyrate back in the XP-is-popular days.

    I still reel off said first password to people who I know as part of the Useless Things We Remember type conversations.

    1. In the years of Win95 the test on the key was to add the digits of the number giving a result and taking the result as the number – rinse repeat until you have a single digit that had to be 4.

      I noticed this because I did a *lot* of network work-station installs and didn’t tell anyone.

      If any of the other engineers wanted a product key then I would just rattle off random digits for them keeping a mental tally so I knew what the last digit needed to be.

      1. I should add that there were licensed copies of windows stored.

        On a network of hundreds of workstations you only open one copy, load it up on a test workstation, add all the applications, then take a ghost image that you use to load the hundreds of work-stations from. The other hundreds of copies get thrown in a storage area where you couldn’t be bothered going just for a licence key.
