Many of us like a keyboard with a positive click noise when we type. You might want to rethink that, though, in light of a new paper from the UK that shows how researchers trained an AI to decode keystrokes from noise on conference calls.
The researchers point out that people don’t expect sound-based exploits. The paper reads, “For example, when typing a password, people will regularly hide their screen but will do little to obfuscate their keyboard’s sound.”
The technique uses the same kind of attention network that makes models like ChatGPT so powerful. It seems to work well, as the paper claims a 97% peak accuracy over either a telephone or Zoom. In addition, where the model was wrong, it tended to be close, identifying an adjacent keystroke instead of the correct one. This would be easy to correct for in software, or even in your head, given how infrequent the errors are. If you see the sentence “Paris im the s[ring,” you can probably figure out what was really typed.
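To make the "easy to correct in software" point concrete, here is a minimal sketch of adjacent-key correction. It is not the paper's method: the `ROWS` layout, the crude column-based notion of "adjacent," the tiny `VOCAB` word list, and the helper names are all assumptions made for illustration.

```python
# Assumed QWERTY rows. Adjacency is approximated by row/column index,
# which ignores the physical stagger of a real keyboard.
ROWS = ["1234567890-=", "qwertyuiop[]", "asdfghjkl;'", "zxcvbnm,./"]


def build_neighbors(rows):
    """Map each key to the set of keys at most one row and one column away."""
    pos = {ch: (r, c) for r, row in enumerate(rows) for c, ch in enumerate(row)}
    neighbors = {}
    for ch, (r, c) in pos.items():
        neighbors[ch] = {
            other
            for other, (r2, c2) in pos.items()
            if other != ch and abs(r - r2) <= 1 and abs(c - c2) <= 1
        }
    return neighbors


NEIGHBORS = build_neighbors(ROWS)

# Hypothetical vocabulary; a real corrector would use a full dictionary
# or a language model.
VOCAB = {"paris", "in", "the", "spring"}


def correct_word(word, vocabulary):
    """Try replacing one character with an adjacent key to reach a known word."""
    lower = word.lower()
    if lower in vocabulary:
        return word
    for i, ch in enumerate(lower):
        for alt in sorted(NEIGHBORS.get(ch, ())):
            candidate = lower[:i] + alt + lower[i + 1:]
            if candidate in vocabulary:
                return candidate
    return word  # give up and keep the decoded text unchanged


def correct_sentence(sentence, vocabulary):
    return " ".join(correct_word(w, vocabulary) for w in sentence.split())


print(correct_sentence("Paris im the s[ring", VOCAB))  # Paris in the spring
```

This only handles a single wrong key per word and says nothing about passwords, which are not dictionary words. For those, an attacker would more likely enumerate the neighboring-key variants and try each one.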
We’ve seen this done before, but this technique raises the bar. As sophisticated as keyboard listening was back in the 1970s, you can only imagine what the three-letter agencies can do these days.
In the meantime, the mitigation for this particular threat seems obvious — just start screaming whenever you type in your password.
Wouldn’t those mics that focus on the speaker and minimize the background cut down on this?
Really depends on the methods they are using and the point in the process they are listening to – if you have the raw feed from the microphone(s in the array) and it’s using software, the keyboard sounds will be there just fine, probably even better than fine in the case of an array, as you can further figure out the keyboard position and keys pressed from the directional data.
But if it’s a physical system like a really directional pickup, or the mic is so close to your mouth that the gain is turned way, way down, or you only get the post-processed data from the software, then in all those cases it’s not picking up the background sounds nearly as well, so the data hopefully won’t be there to process.
Not saying you can’t still pull it off with the ‘noise cancelled’ microphone setups though – it probably hasn’t taken out the keyboard sounds enough to make it impossible, just trickier. And it is already a tricky challenge, with such a wide variety of keyboard switches out there that sound different anyway and wear in different ways, plus the usual audio compression and all the other ‘background’ noise to filter through as well.
If you’ve got raw sound from the array you’re running code on the target’s computer already. It’s all processed to a mono signal before it’s sent to a conference call.
>It’s all processed to a mono signal before it’s sent to a conference call.
Not always – I’ve heard of places that have a surround sound setup and are creating spatial distance for every speaker, so two tables around a conference call actually feel a bit more natural. So you may still not be getting the raw audio; it could well be shipped as a pre-encoded DTS stream or something. Which would make sense, but doesn’t actually remove the spatial audio advantages.
And it doesn’t have to be the microphone for the computer that keyboard is connected to either – your Amazon Echo type devices have microphone arrays in them, the nearby smartphone etc. You may not be running code on their ‘secure’ devices at all.
Instead of screaming :) , just ‘mute’ your end while typing passwords. When we are in meetings, we always mute anyway. Otherwise everyone gets distracted by background noises.
Sadly, this doesn’t mitigate a repressive government turning on your smart phone’s mic and listening in.
once they get their foot that far past my door, they can have my passwords :)
Well the only thing they’ll discover is that all my passwords are ctrl+v.
Same, I can’t think of a single time I’ve typed a password while on a call, I rarely do it at all.
Finally my choice of silent switches is justified.
Isn’t a common obfuscation technique to substitute letters ? Such obfuscation would be strengthened by a spy technique that relies on filling in the blanks.
password123!!!
Considering most all passwords I ever enter while logged in generate one or two mouse clicks, this isn’t that much of a threat. I’m surely not the only person who uses browser, OS or other app-based password management?
And who is connected to a Zoom meeting before they log in?
Ah, missed the ‘conference call’ bit….so I guess it’s plausible. Still, if I’m on a conference call and will be using my computer, again, I’m normally logged on.
You should be aware of a malware category called Stealers (yeah I know, I didn’t name ’em).
Specifically Redline and Racoon (Raccoon) Stealer are extremely popular right now, although there are many others.
They execute incredibly quickly and exfiltrate all of your passwords from every profile of every browser you’ve used, including password extensions, session cookies, password storing applications, etc. Accidentally execute the wrong malware (even via viewing something via email with a zero day) and within hours (at most), all of your credentials are for sale (as a collection) on the Russian dark web for $10 (yes, odd price, but it’s consistent).
> who is connected to a Zoom meeting before they log in?
Defensive infrastructure doesn’t have you log in to the desktop and suddenly all the intranet is available for duration of your desktop session.
My desktop session never times out if I keep using the mouse.
I will have to log in to Azure (Microsoft SSO), LastPass, and GitHub during many of my meetings. If my IP changes during the meeting, Teams quickly reconnects unless security decides that merits another SSO login.
But does it work with any keyboard without extra training? Because if you have to train the net for each specific keyboard, it’s quite useless…
Probably relies on a bit of pretraining using statistical analysis of regular keystrokes, e.g. from typing e-mails and such.
Yeah, I’m not buying this. But if it were real a way around it would be to use passwords in another alphabet or simply buy a clickless keyboard. Seem to recall Macs have those. Speaking of Macs you could go to the alphabet/unicode pane and mouseclick.
good thing I type with a unique cadence from teaching myself to type. i had no idea what “home row” or the little nubs on F and J were until I could type 100+ WPM
FYI because keyloggers, the gold standard for password entry is on screen keyboard with randomized location, size and key layout. Also screengrab prevention.
_Hunt_ and click. Fun!
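For the curious, the randomized on-screen keyboard idea is simple to sketch. The following is only an illustration under assumed names (`random_layout`, `key_at`) and covers just the layout shuffling, not the randomized size and position or the screengrab prevention also mentioned. The password is then defined by which grid cells get clicked, so neither keystroke sounds nor key codes reveal it.

```python
import secrets


def random_layout(keys, columns=10):
    """Return the keys shuffled into rows, using the OS's secure random source."""
    rng = secrets.SystemRandom()
    shuffled = list(keys)
    rng.shuffle(shuffled)
    return [shuffled[i:i + columns] for i in range(0, len(shuffled), columns)]


def key_at(layout, row, col):
    """Translate a clicked grid cell back into the character shown there."""
    return layout[row][col]


# A fresh layout would be generated every time the keyboard is shown.
layout = random_layout("abcdefghijklmnopqrstuvwxyz0123456789", columns=6)
for row in layout:
    print(" ".join(row))
```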
dvorak
This is just FUD from the password manager vendors.
Like they aren’t ‘owned’ by the NSA/Mossad/GCHQ/DGSE etc etc.
Gotta wonder, do the spooks leave each other’s copies of Back Orifice installed? Professional courtesy?
Anybody remember when DC’s cell system fell over because of all the stingrays intercepting each other’s traffic? That was funny.
EncryptPad and entering a password is “copy paste” only.
I **love** Encryptpad because I can organize my password stuff in a “chaotic” text file way.
Just saying.
This gives me an idea to confuse my friends by using an ancient typewriter as my keyboard with only a hidden microphone to connect it to a computer.
I suppose you could also record some typing and play back randomly some random keypresses to confuse any listener.
But as the article says: nobody thinks of doing anything about sound really. Except a very few of course, Snowden and Zuckerberg and such. And probably Bezos learned his lesson.
Now I’m reminded of the Gene Hackman movie The Conversation.
And ‘Enemy of the State’ too I guess.
Appropriate that his name is ‘hackman’ actually.
Huh. Excellent question that was never answered. How?
From what I can tell there’s no difference between a “4” and a “g” in terms of sound on a keyboard.
They’re in different parts of the keyboard (so they resonate different), probably pressed with different fingers, definitely different distances of reach so different angle of impact…
It seems entirely plausible, if somewhat dependent on having training data that is close enough.
Huh, Interesting I sent this as a reply.
Anyway. I can sorta see that.
Wonder if inserting backspace in your passwords would throw the algorithm off
Well, the cleaning crew can leave behind a microphone too.
What kind of psycho types unmuted on a loud mechanical keyboard during a call
Me. I like the sound of mechanical keyboards, it’s relaxing. And you can dose the amount of pressure on the key so that it’s barely audible or loud as a bang, even in call.
More than just passwords! Now we can find out what the heck airline agents at a ticketing counter are writing for so long! Or what kind of internal notes a customer service representative is making about us on a call.
Having listened to a lot of talk radio I cringe when I have to strain to hear call-ins who aren’t using the same mic location as the host. The “reverb” of a cubicle is worst of all. Use a boom or hand held like a telephone, close. Headset mic or lavaliere worn mic in a quiet acoustical treated location acceptable. No desk placed, monitor frame, or built-in appliance mics (speakerphone) please. This would help everyone. Covid has exacerbated this. Should throw a spanner in this decoding work. Pick up the mic and lower the gain otherwise everything else gets picked up. No way to lower the gain? Bad hardware.
This reminds me of a demo I saw a few years ago, where someone from a government (was either UK or US, been a while) hacking team was able to tell when (to the second) and where (rough area) a video was taken, based on background noise in the video caused by fluctuations in the power grid. Those fluctuations are only getting worse due to solar and wind, making this technology even easier as time goes on.
> Anybody remember when DC’s cell system fell over because of all the stingrays intercepting each other’s traffic? That was funny.
I don’t remember hearing about this. Sounds hilarious. Do you have a link?
I don’t remember that specifically, but I do remember reading about someone finding a lot of undocumented cell towers all around DC, presumably sting rays placed by various agencies… maybe some foreign ones even?
Or maybe, instead of screaming or muting your microphone, just use a password manager – even better, with a hardware token (*): you won’t “type” anything, hence there won’t be anything to listen to.
(*) the hardware token eliminates the need for a “master password” to be typed to unlock the password manager, in case it wasn’t unlocked before the start of the conversation.
Note that all the big tech companies now have tons of microphones in (and out of) people’s houses: the earbuds have them, the remotes have them, the ‘assistants’ have them and of course the smartphones do too.
Note also that it was exposed that the spooks were one of the customers that bought user data.
So do you need a conference really? nah
should passwords then be SQL Commands?
like in this classic: https://xkcd.com/327/