Unicode, the wonderful extension to to ASCII that gives us gems like “✈”, “⌨”, and “☕”, has had some unexpected security ramifications. The most common problems with Unicode are visual security issues, like character confusion between letters. For example, the English “M” (U+004D) is indistinguishable from the Cyrillic “М” (U+041C). Can you tell the difference between IBM.com and IBМ.com?
This bug, discovered by [John Gracey] turns the common problem on its head. Properly referred to as a case mapping collision, it’s the story of different Unicode characters getting mapped to the same upper or lowercase equivalent.
'ß'.toLowerCase() === 'SS'.toLowerCase() // true
// Note the Turkish dotless i
'John@Gıthub.com'.toUpperCase() === 'John@Github.com'.toUpperCase()
GitHub stores all email addresses in their lowercase form. When a user sends a password reset, GitHub’s logic worked like this: Take the email address that requested a password reset, convert to lower case, and look up the account that uses the converted email address. That by itself wouldn’t be a problem, but the reset is then sent to the email address that was requested, not the one on file. In retrospect, this is an obvious flaw, but without the presence of Unicode and the possibility of a case mapping collision, would be a perfectly safe practice.
This flaw seems to have been fixed quite some time ago, but was only recently disclosed. It’s also a novel problem affecting Unicode that we haven’t covered. Interestingly, my research has turned up an almost identical problem at Spotify, back in 2013.
Continue reading “This Week In Security: Unicode, Truecrypt, And NPM Vulnerabilities” →