The Big List Of Naughty Strings Helps Find Those User Input Problems

Any software that accepts user input must take some effort to sanitize incoming data, lest unexpected and unwelcome things happen. Here to make that easier is the Big List of Naughty Strings, an evolving list of edge cases, unusual characters, script-injection fragments, and all-around nonstandard stuff aimed at QA testers, developers, and the curious. It’s a big list that has grown over the years, and every piece of it is still (technically) just a string.

These strings have a high probability of surfacing any problems with handling user input. They won’t necessarily break anything, but they may cause unexpected things to happen and help point out any issues that need fixing. After all, many attacks hinge on being able to send unexpected inputs that don’t get properly sanitized.
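As a rough illustration of how such strings can be put to work, here is a minimal sketch that feeds a handful of awkward inputs to a function and records anything that raises. The sample strings are only in the spirit of the list, and `normalize_username` and `find_failures` are hypothetical names invented for this example; a real test run would load the full list from the project's files instead.

```python
# A few sample inputs in the spirit of the list; the real list is far longer.
SAMPLE_STRINGS = [
    "",                            # empty string
    "undefined",                   # reserved-looking word
    "1/0",                         # looks like arithmetic
    "<script>alert(1)</script>",   # script-injection fragment
    "'; DROP TABLE users;--",      # SQL-injection fragment
    "\u202etest",                  # right-to-left override character
    "\uff54\uff45\uff53\uff54",    # "test" in fullwidth characters
]


def normalize_username(raw: str) -> str:
    """Hypothetical function under test: trims and lowercases a username."""
    return raw.strip().lower()


def find_failures(func, inputs):
    """Call func on every input and collect (input, exception) pairs."""
    failures = []
    for text in inputs:
        try:
            func(text)
        except Exception as exc:  # deliberately broad: any crash is interesting
            failures.append((text, exc))
    return failures


if __name__ == "__main__":
    for text, exc in find_failures(normalize_username, SAMPLE_STRINGS):
        print(f"{text!r} raised {exc!r}")
```

A harness like this only catches outright exceptions. Silent misbehavior, such as a string that is stored or displayed incorrectly, still needs assertions on the output.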

Finding bad inputs is not always entirely straightforward, but at least the Big List of Naughty Strings is available in a variety of formats to make it easy to use. [Max Woolf] has been maintaining the list for years, but if you haven’t heard of it yet and think it might come in useful, now’s the time to give it a look. Now you can help ensure your system can handle things like someone registering a company named ; DROP TABLE "COMPANIES";-- LTD.
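To see why a company name like that need not be a problem, consider this small sketch using Python's built-in `sqlite3` module and an in-memory database. The `companies` table and the `register_company` helper are assumptions made up for the example, and the name is written with plain ASCII quotes and dashes. The point is only that a placeholder keeps the value separate from the SQL text.

```python
import sqlite3


def register_company(conn: sqlite3.Connection, name: str) -> None:
    """Hypothetical helper: store a company name using a bound parameter."""
    # The ? placeholder passes the value to the driver separately from the
    # SQL text, so the name is stored as data and never parsed as SQL.
    conn.execute("INSERT INTO companies (name) VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT)")

naughty = '; DROP TABLE "COMPANIES";-- LTD'
register_company(conn, naughty)

# The table still exists and holds the name exactly as it was entered.
rows = conn.execute("SELECT name FROM companies").fetchall()
print(rows)
```

Building the same statement with string concatenation or formatting is what lets an input like this change the query, which is the kind of slip the list is meant to expose.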

14 thoughts on “The Big List Of Naughty Strings Helps Find Those User Input Problems”

  1. i don’t think the blacklist approach is proper. for strings that are just strings, they should be passed through as binary blobs without inspection (using the correct argument substitution mechanism if they’re going into your database). for strings that are something to the program, an index or key or date or whatever, you should use a whitelist. if it doesn’t match your expected format, it should be rejected.

    the only sticky scenario is if you’ve got to output it to the user in html, then you do need an html escape routine. but those are robust and well-tested by now and you can just trust them.

    1. I don’t think anyone was recommending a blacklist. More of a head start on properly debugging and/or sanitizing your inputs.

      Namely, see if, when, and how your program fails when presented with unexpected inputs.

      I do like your whitelist idea, at that point it’s basically a multiple choice field though.

      1. true. i just feel like the only reason you’d want such a testing data set is if you wanted to use it as a crutch to make up for overly-complicated input processing.

        and “multiple choice field” is a great way to put it!

    2. I don’t think the whitelist approach is proper either.
      You should use a whitelist together with testing and verification of that whitelist to ensure it is working as intended and expected.

  2. Wow – I’m literally working on a work project this is going to be super useful for. My predecessor had very minimal string validation and what there is, you can just comment out in the browser. I’ve already found 2 new vulns this morning!

  3. once i put “firstname” (no quotes) as firstname and “lastname” (no quotes) as lastname…
    system thought i was developer and tried to load files i did not have… with no error message.
    so i had paid and they said it was company policy not to refund even if tech problems were their fault,
    i was going to fail with no recourse for tuition refund (past drop-out refund date).

    after emailing tech-support back and forth for DAYS while INSIDE the help center and bringing in one of the school’s laptops to the school’s in-person tech support to ask the school why their software did not work on their computers and they still could not help me for over a week i finally realised why.

    a “bug” on the website said that i could put ANYTHING into those boxes because my tuition (school) had paid for the digital-textbook in advance (did NOT have to use my real name), it FAILED to mention that it was a bad idea to put Firstname Lastname

    so one day i walked in there and told them off, demanded for them to reset my account to STUDENT, explained WHY i had accidentally set my account to developer mode, and that if it was not fixed soon i would take legal action for failing to deliver, refusing a refund and for discrimination.

    10am the next day it magically started working.

    BTW: “new” is not a version of windows, and customers of software products are ALWAYS entitled to know what version(s) of windows are required for a product to function, BY LAW.
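The whitelist and HTML-escaping ideas raised in the first comment thread can be sketched roughly as follows. This is an illustration under stated assumptions and not a complete validation scheme: the `YYYY-MM-DD` date format, `is_valid_date_field`, and `render_comment` are hypothetical choices made for the example, while `html.escape` and `re.fullmatch` come from the Python standard library.

```python
import html
import re

# Hypothetical expected format for a date field: YYYY-MM-DD, ASCII digits only.
# [0-9] is used instead of \d because \d also matches non-ASCII digits.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_date_field(raw: str) -> bool:
    """Whitelist check: accept only input that matches the format exactly."""
    # fullmatch requires the whole string to match, so a trailing newline
    # or any extra characters cause a rejection.
    return DATE_PATTERN.fullmatch(raw) is not None


def render_comment(raw: str) -> str:
    """Escape free-form text before placing it inside HTML."""
    return "<p>" + html.escape(raw) + "</p>"


if __name__ == "__main__":
    print(is_valid_date_field("2024-01-31"))
    print(is_valid_date_field("2024-01-31; DROP TABLE"))
    print(render_comment("<script>alert(1)</script>"))
```

The pattern checks shape only, so an impossible date such as 2024-99-99 would still pass. As the replies point out, the whitelist itself needs testing, and strings from the list are one way to do that.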
