Maybe we really were better off with ASCII. Back in my day, we had space for 256 characters, didn’t even use 128 of them, and we took what we got. Unicode opened up computers to the languages of the world, but also opened an invisible backdoor. This is a similar technique to last week’s Trojan Source story. While Trojan Source used right-to-left encoding to manipulate benign-looking code, this hack from Certitude uses Unicode characters that appear to be whitespace, but are recognized as valid variable names.
const { timeout,ㅤ} = req.query;
Is actually:
const { timeout,\u3164} = req.query;
The extra comma might give you a clue that something is up, but unless you’re very familiar with a language, you might dismiss it as a syntax quirk and move on. Using the same trick again allows the hidden malicious code to be included on a list of commands to run, making a hard-to-spot backdoor.
The second trick is to use “confusable” characters like ǃ, U+01C3. It looks like a normal exclamation mark, so you wouldn’t bat an eye at if(environmentǃ=ENV_PROD){
, but in this case, environmentǃ
is a new variable. Anything in this development-only block of code is actually always enabled — imagine the chaos that could cause.
Neither of these are ground-breaking vulnerabilities, but they are definitely techniques to be wary of. The authors suggest that a project could mitigate these Unicode techniques by simply restricting their source code to containing only ASCII characters. It’s not a good solution, but it’s a solution. Continue reading “This Week In Security: Unicode Strikes, NPM Again, And First Steps To PS5 Crack”