The Tens Of Millions Of Faces Training Facial Recognition; You’ll Soon Be Able To Search For Yourself

In a stiflingly hot lecture tent at CCCamp on Friday, Adam Harvey took to the stage to discuss the huge data sets being used by groups around the world to train facial recognition software. These faces come from a variety of sources, and soon Adam and his research collaborator Jules LaPlace will release a tool that makes these datasets searchable, allowing you to figure out if your face is among the horde.

Facial recognition is the new hotness, recently bubbling up to the consciousness of the general public. In fact, when boarding a flight from Detroit to Amsterdam earlier this week I was required to board the plane not by showing a passport or boarding pass, but by pausing in front of a facial recognition camera which subsequently printed out a piece of paper with my name and seat number on it (although it appears I could have opted out, that was not disclosed by Delta Airlines staff at the time). Anecdotally this gives passengers the feeling that facial recognition is robust and mature, but Adam mentions that this is not the case, and that outside of highly controlled environments the accuracy of recognition is closer to an abysmal 2%.

Images are only effective in these datasets when the interocular distance (the distance between the pupils of your eyes) is a minimum of 40 pixels. But over the years this minimum resolution has been moving higher and higher, with the current standard trending toward 300 pixels. The increase is not surprising as it follows a similar curve to the resolution available from digital cameras. The number of faces available in data sets has also increased along a similar curve over the years.

Adam’s talk recounted the availability of face and person recognition datasets and it was a wild ride. Of note are data sets by the names of Brainwash Cafe, Duke MTMC (Multi-Target, Multi-Camera), Microsoft Celeb, Oxford Town Centre, and the Unconstrained College Students data set. Faces in these databases were harvested without consent and that has led to four of them being removed, but of course, they’re still available as what is once on the Internet may never die.

The Microsoft Celeb set is particularly egregious as it used the Bing search engine to harvest faces (oh my!) and has associated names with them. Lest you think you’re not a celeb and therefore safe, in this case celeb means anyone who has an internet presence. That’s about 10 million faces. Adam used two examples of past CCCamp talk videos that were used as a source for adding the speakers’ faces to the dataset. It’s possible that this is in violation of GDPR so we can expect to see legal action in the not too distant future.

Your face might be in a dataset, so what? In their research, Adam and Jules tracked geographic locations and other data to establish who has downloaded and is likely using these sets to train facial recognition AI. It’s no surprise that the National University of Defense Technology in China is among the downloaders. In the case of US intelligence organizations, it’s much easier to know they’re using some of the sets because they funded some of the research through organizations like IARPA. These sets are being used to train up military-grade face recognition.

What are we to do about this? Unfortunately what’s done is done, but we do have options moving forward. Be careful of how you license images you upload: substantial data was harvested through loopholes in licenses on platforms like Flickr, or through the use that people agree to in EULAs on platforms like Facebook. Adam’s advice is to stop populating the internet with faces, which is why I’ve covered his with the Jolly Wrencher above. Alternatively, you can limit image resolution so interocular distance is below the forty-pixel threshold. He also advocates for changes to Creative Commons that let you choose to grant or withhold use of your images in training sets like these.
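The resolution advice comes down to simple arithmetic. Here is a minimal sketch, assuming you have already measured the pupil-to-pupil distance in pixels and that the image is scaled uniformly. The function name and the example figures are hypothetical; only the 40-pixel threshold comes from the talk.

```python
import math

THRESHOLD_PX = 40  # minimum useful interocular distance cited in the talk


def max_width_below_threshold(image_width_px, interocular_px, threshold_px=THRESHOLD_PX):
    """Return the largest image width (px) that keeps the eyes closer than threshold_px.

    Assumes uniform scaling, so the interocular distance shrinks in
    proportion to the image width.
    """
    if interocular_px < threshold_px:
        return image_width_px  # already below the threshold, no resize needed
    # Aim for one pixel under the threshold so the result is strictly below it.
    target_px = threshold_px - 1
    return math.floor(image_width_px * target_px / interocular_px)


# Hypothetical portrait: 1200 px wide with pupils 150 px apart.
print(max_width_below_threshold(1200, 150))  # 312
```

Keep in mind the threshold itself is a moving target, so treat this as a way to reason about the numbers rather than a guarantee.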

Adam’s talk, MegaPixels: Face Recognition Training Datasets, will be available to view online by the time this article is published.

MIDI Controller In A Concertina Looks Sea Shanty-Ready

Did you know that the English concertina, that hand-pumped bellows instrument favored by sailors both legitimate and piratical in the Age of Sail, was invented by none other than [Sir Charles Wheatstone]? We didn’t, but [Dave Ehnebuske] knew that the venerable English gentleman was tickling the keys of his instrument nearly two decades before experimenting with the bridge circuit that would bear his name.

This, however, is not the reason [Dave] built a MIDI controller in the form of an English concertina. That has more to do with the fact that he already knows how to play one, they’re relatively easy to build, and it’s a great form factor for a MIDI controller. A real concertina has a series of reeds that vibrate as air from the hand bellows is directed over them by valves controlled by a forest of keys. [Dave]’s controller apes that form, with two wind boxes made from laser-cut plywood connected by a bellows made from cardboard, Tyvek, and nylon fabric. The keys are non-clicky Cherry MX-types that are scanned by a Bluefeather microcontroller. To provide some control over expression, [Dave] included a pressure sensor, which alters the volume of the notes played depending on how hard he pushes the bellows. The controller talks MIDI over Bluetooth, and you can hear it in action below.

We’ve seen MIDI controllers in just about everything, from a pair of skate shoes to a fidget spinner. But this is the first time we’ve seen one done up like this. Great job, [Dave]!


This Week In Security: Censoring Researchers, The Death Of OpenPGP, Dereferencing Nulls, And Zoom Is Watching You

Last week the schedule for our weekly security column collided with the Independence Day holiday. The upside is that we get a two-for-one deal this week, as we’re covering two weeks’ worth of news, and there is a lot to cover!

[Petko Petrov], a security researcher in Bulgaria, was arrested last week for demonstrating a weakness he discovered in a local government website. In the demonstration video, he stated that he attempted to disclose the vulnerability to both the software vendor and the local government. When his warnings were ignored, he took to Facebook to inform the world of the problem.

From the video, it appears that a validation step was performed on the browser side, easily manipulated by the end user. Once such a flaw is discovered, it becomes trivial to automate the process of scraping data from the vulnerable site. The vulnerability found isn’t particularly interesting, though the amount of data exposed is rather worrying. The bigger story is that as of the latest reports, the local government still intends to prosecute [Petko] for downloading data as part of demonstrating the attack.

YouTube Censorship

We made a video about launching fireworks over Wi-Fi for the 4th of July only to find out @YouTube gave us a strike because we teach about hacking, so we can't upload it. YouTube now bans: "Instructional hacking and phishing: Showing users how to bypass secure computer systems"

In related news, Google has begun cracking down on “Instructional Hacking and Phishing” videos. [Kody] from the Null Byte YouTube channel found himself locked out of his own channel, after receiving a strike for a video discussing a Wi-Fi vulnerability.

The key to getting a video unblocked seems to be generating lots of social media attention. Enough outcry seems to trigger a manual review of the video in question, and usually results in the strike being rescinded.

Improved Zip Bomb

A zip bomb is a small zip file that unzips into a ridiculously large file or collection of files. While there are obvious nefarious uses for such a file, it has also become something of a competition, crafting the most extreme zip bomb. The previous champion was 42.zip, a 42-kilobyte recursive zip file that, when fully extracted, weighs in at 4.5 petabytes. A new contender may have just taken the crown, and without using zip file recursion.

[David Fifield] discovered a pair of ZIP tricks. The first is that multiple files can be constructed from a single “kernel” of compressed data. The second is that file headers can also be part of the files to be decompressed. It’s clever work, and much easier to understand when looking at the graphics he put together. From those two points, the only task left is to optimize. Taking advantage of the zip64 format, the final compression ratio was approximately 98 million to one.
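To see why reusing one compressed kernel across many files matters, it helps to look at what a single DEFLATE stream manages on its own. The snippet below is not [David]’s construction, just a minimal sketch using Python’s standard `zlib` module on a best-case input (all zero bytes). The payload size is an arbitrary choice for illustration.

```python
import zlib

# 10 MB of zero bytes: about the most compressible input DEFLATE can get.
payload = bytes(10_000_000)
compressed = zlib.compress(payload, 9)
ratio = len(payload) / len(compressed)

print(f"{len(payload)} bytes -> {len(compressed)} bytes (about {ratio:.0f}:1)")

# Round trip to confirm nothing was lost.
assert zlib.decompress(compressed) == payload
```

A single stream tops out at roughly a thousand to one, so getting into the millions requires counting the same compressed bytes many times over, which is exactly what overlapping files achieve.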

Breaking OpenPGP Keyservers

OpenPGP as we know it is on the ropes. OpenPGP is the standard that allows encryption and verification of emails through cryptographic signatures. It’s the granddaddy of modern secure communication, and still widely used today. One of the features of OpenPGP is that anyone can upload their public key to keyservers hosted around the world. Because of the political climate in the early ’90s when OpenPGP was first developed, it was decided that a baked-in feature of the keyserver was that uploaded keys could never be deleted.

Another feature of OpenPGP keys is that one user can use their key to sign another user’s key, formally attesting that it is valid. This creates what is known as a “web of trust”. When an OpenPGP instance validates a signature, it also validates all the attestations attached to that signature. Someone has spammed a pair of OpenPGP certificates with tens of thousands of signatures. If your OpenPGP client refreshes those signatures, and attempts to check the validations, it will grind to a halt under the load. Loading the updated certificate permanently poisons the offline key-store. In some cases, just the single certificate can be deleted, but some users have had to delete their entire key store.

It’s now apparent that parts of the OpenPGP infrastructure haven’t been well maintained for quite some time. [Robert J. Hansen] has been spearheading the public response to this attack, not to mention being one of the users directly targeted. In a follow-up post, he alluded to the need to re-write the keyserver component of OpenPGP, and the lack of resources to do so.

It’s unclear what will become of the OpenPGP infrastructure. It’s likely that the old keyserver network will have to be abandoned entirely. An experimental keyserver is available at keys.openpgp.org that has removed the spammed signatures.

Beware the QR Codes

Link shorteners are a useful way to avoid typing out a long URL, but have a downside: you don’t know what URL you’re going to ahead of time. Thankfully there are link unshorteners, like unshorten.it. Paste a shortlink and get the full URL, so you don’t accidentally visit a shady website because you clicked on a shortened link. [Nick Guarino] over at cofense.com raises a new alarm: QR codes can similarly lead to malicious or questionable websites, and are less easily examined before scanning. His focus is primarily how a QR code can be used to bypass security products, in order to launch a phishing attack.

Most QR scanners have an option to automatically navigate to the web page in the code. Turn this option off. Not only could scanning a QR code lead to a malicious web site, but URLs can also launch actions in other apps. This potential problem of QR codes is very similar to the problem of shortened links — the actual payload isn’t human readable prior to interacting with it, when it’s potentially too late.
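With auto-navigation turned off, the decoded text can be inspected before anything acts on it. As a minimal sketch, assuming the QR payload has already been decoded to a string by your scanner, the hypothetical helper below uses Python’s standard `urllib.parse` to report whether the payload is a web link or something that would hand off to another app. The function name and example payloads are made up for illustration.

```python
from urllib.parse import urlsplit

WEB_SCHEMES = {"http", "https"}


def describe_payload(payload):
    """Summarize what a decoded QR payload would do if it were opened."""
    parts = urlsplit(payload.strip())
    scheme = parts.scheme.lower()
    if scheme in WEB_SCHEMES:
        # Show the real destination host so it can be eyeballed first.
        return f"web link to host {parts.hostname}"
    if scheme:
        # tel:, sms:, and similar schemes trigger actions in other apps.
        return f"non-web action using scheme '{scheme}'"
    return "plain text"


for example in ("https://example.com/login", "tel:+15551234567", "just some text"):
    print(example, "->", describe_payload(example))
```

This only makes the payload readable; deciding whether the destination is trustworthy is still up to the human.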

Dereferencing Pointers for Fun and Profit

On the 10th, the ESET blog, [welivesecurity], covered a Windows local privilege escalation 0-day being actively exploited in the wild. The exploit highlights several concepts, one of which we haven’t covered before, namely how to use a null pointer dereference in an exploit.

In C, a pointer is simply a variable that holds a memory location. In that memory location can be a data structure, a string, or even a callable function. By convention, when pointers aren’t referring to anything, they are set to NULL. This is a useful way to quickly check whether a pointer is pointing to live data. The process of interacting with a pointer’s data is known as dereferencing the pointer. A NULL pointer dereference, then, is accessing the data referred to by a pointer that is set to NULL. This puts us in the dangerous territory of undefined behavior.
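The NULL-check convention can be tried safely from Python, whose standard `ctypes` module exposes C-style pointers. This is only a sketch of the idea: unlike C, `ctypes` refuses to dereference a NULL pointer and raises an exception instead of wandering into undefined behavior. The helper name `read_int` is hypothetical.

```python
import ctypes

IntPtr = ctypes.POINTER(ctypes.c_int)


def read_int(ptr):
    """Dereference ptr only if it is not NULL, mirroring the usual C check."""
    if not ptr:  # NULL ctypes pointers are falsy
        return None
    return ptr.contents.value  # the dereference itself


value = ctypes.c_int(42)
live = ctypes.pointer(value)  # points at real data
null = IntPtr()               # a NULL pointer

print(read_int(live))  # 42
print(read_int(null))  # None

# Skipping the check: ctypes raises instead of touching address zero.
try:
    null.contents
except ValueError as exc:
    print("unguarded dereference refused:", exc)
```

In C there is no such safety net, so what happens next depends entirely on the platform.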

Different compilers, architectures, and even operating systems will potentially demonstrate different behavior when doing something undefined. In the case of C code on 32-bit Windows 7, NULL is indistinguishable from zero, and memory location zero is a perfectly valid location. In this case, we’re not talking about the physical location zero, but logical address zero. In modern systems, each process has a dedicated pool of memory, and the OS manages the offset and memory mapping, allowing the process to use the simpler logical memory addressing.

Windows 7 has a function, “NtAllocateVirtualMemory”, that allows a process to request access to arbitrary memory locations. If a NULL, or zero, is passed to this function as the memory location, the OS simply picks a location to allocate that memory. What many consider a bug is that this function will effectively round down small memory locations. It’s quite possible to allocate memory at logical address 0/NULL, but doing so is considered to be bad behavior. The important takeaway here is that in Windows 7, a program can allocate memory at a location referred to by a null pointer.
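The rounding-down behavior is easy to picture with a little bit masking. The sketch below does not call or model the real Windows function; it only illustrates aligning an address down to a page boundary, assuming 4 KiB pages. The function name is hypothetical.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def round_down_to_page(address, page_size=PAGE_SIZE):
    """Clear the low bits so the address lands on a page boundary."""
    return address & ~(page_size - 1)


print(hex(round_down_to_page(0x1)))     # 0x0
print(hex(round_down_to_page(0x1234)))  # 0x1000
```

Under that assumption, a request for a tiny nonzero address such as 1 ends up describing the page that starts at address zero.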

On to the vulnerability! The malicious program sets up a popup menu and submenu as part of its GUI. While this menu is still being initialized, the malicious program cancels the request to set up the menu. By timing the cancellation request precisely, it’s possible for the submenu to still be created, but to be a null pointer instead of the expected object. A second process can then trigger the system process to call a function expected to be part of the object. Because Windows allows the allocation of memory page zero, this effectively hands system level execution to the attacker. The full write-up is worth the time to check out.

Zoom Your Way to Vulnerability

Zoom is a popular web-meeting application, aimed at corporations, with the primary selling point being how easy it is to join a meeting. Apparently they worked a bit too hard on easy meeting joins, as loading a malicious webpage on a Mac causes an automatic meeting join with the mic and webcam enabled, so long as that machine has previously connected to a Zoom meeting. You would think that uninstalling the Zoom client would be enough to stop the madness, but installing Zoom also installs a local webserver. Astonishingly, uninstalling Zoom doesn’t remove the webserver, which was designed to perpetually listen for a new Zoom meeting attempt. If that sounds like a Trojan to you, you’re not wrong.

The outcry over Zoom’s official response was enough to inform them of the error of their ways. They have pushed an update that removes the hidden server and adds a user interaction before joining a meeting. Additionally, Apple has pushed an update that removes the hidden server if present, and prompts before joining a Zoom meeting.

Wireless Keyboards Letting You Down

Have you ever typed your password using a wireless keyboard, and wondered if you just broadcast it in the clear to anyone listening? In theory, wireless keyboards and mice use encryption to keep eavesdroppers out, but at least Logitech devices have a number of problems in their encryption scheme.

Part of the problem seems to be Logitech’s “Unifying” wireless system, and the emphasis on compatibility. One receiver can support multiple devices, which is helpful when eliminating cable clutter, but also weakens the encryption scheme. An attacker only has to be able to monitor the radio signals during pairing, or even monitor signals while also observing keypresses. Either way, after a few moments of processing, an attacker has both read and write access to the wireless gear.

Several even more serious problems have been fixed with firmware updates in the past years, but [Marcus Mengs], the researcher in question, discovered that newly purchased hardware still doesn’t contain the updated firmware. Worse yet, some of the affected devices don’t have an officially supported firmware update tool.

Maybe wired peripherals are the way to go, after all!

Snoopy Come Home: The Search For Apollo 10

When it comes to the quest for artifacts from the Space Race of the 1960s, few items are more sought after than flown hardware. Oh sure, there have been stories of small samples of the 382 kg of moon rocks and dust that were returned at the cost of something like $25 billion making it into the hands of private collectors, and chunks of the moon may be the ultimate collector’s item, but really, at the end of the day it’s just rock and dust. The serious space junkie wants hardware – the actual pieces of human engineering that helped bring an epic adventure to fruition, and the closer to the moon the artifact got, the more desirable it is.

Sadly, of the 3,000,000 kg launch weight of a Saturn V rocket, only the 5,600 kg command module ever returned to Earth intact. The rest was left along the way, mostly either burned up in the atmosphere or left on the surface of the Moon. While some of these artifacts are recoverable (Jeff Bezos himself devoted a portion of his sizable fortune to salvage one of the 65 F1 engines that were deposited into the Atlantic ocean), those left on the Moon are, for now, unrecoverable, and in most cases they are twisted heaps of wreckage that were intentionally crashed into the lunar surface.

But at least one artifact escaped this ignominious fate, silently orbiting the sun for the last 50 years. This lonely outpost of the space program, the ascent stage from the Apollo 10 Lunar Module, appears to have been located by a team of amateur astronomers, and if indeed the spacecraft, dubbed “Snoopy” by its crew, is still out there, it raises the intriguing possibility of scoring the ultimate Apollo artifact by recovering it and bringing it back home.


Retrotechtacular: This Boat Isn’t Sinking… It’s Doing Research

It looks like a ship when it is in port or in transit, and when it is in use you’d think it’s about to sink. The RP FLIP (for “FLoating Instrument Platform”) is an unpowered research buoy with a very special design, intended to provide the most stable and vibration-free platform possible for scientists studying the properties of the sea.

RP FLIP’s interior bathroom design has two sinks mounted at 90 degrees to each other.

Scientific research often places demanding requirements upon existing infrastructure, and sometimes calls for large projects tailored to a single task. From these unusual needs sometimes come the most curious buildings and machinery. RP FLIP is designed to provide the most stable and vibration-free platform possible for scientists studying the properties of the sea. By flooding ballast tanks in its stern, it pivots from floating horizontally on the surface to standing vertical and mostly submerged when it is deployed. With its bow protruding from the water and pointing skywards it has the appearance of a sinking ship. What’s really neat is that its interior is cleverly designed such that its crew can operate it in either horizontal or vertical positions.

The original impetus for FLIP’s building was the US Navy’s requirement to understand the properties of sound waves in the ocean with relation to their submarines and presumably also those of their Soviet adversaries. Research submarines of the 1950s were not stable enough for reliable measurements, and the FLIP, launched in 1962, was built to address this by providing a far more stable method of placing a hydrophone at depth. Since then it has participated in a significant number of other oceanographic studies as diverse as studying the propagation of waves across the Pacific, and the depth to which whales dive.

The videos below should give a good introduction to the craft. The first one is a glossy promotional video from its operator, the Scripps Institution Of Oceanography, on its 50th anniversary, while the lower of the two is a walkaround by a scientist stationed aboard. In this we see some of the features for operating in either orientation, such as toilet facilities mounted at 90 degrees to each other.

It appears that FLIP is in good order and with continuing demand for its services that should see it still operating well into the future. Those of us who live near Atlantic waters may never see it in person but it remains one of the most unusual and technically intriguing vessels afloat.

FLIP is not the only 1960s oceanographic research buoy we’ve covered, should you have an interest in such things.


Radio Piracy On The High Seas: Commercial Demand For Taboo Music

The true story of pirate radio is a complicated fight over the airwaves. Maybe you have a picture in your mind of some kid in his mom’s basement playing records, but the pirate stations we are thinking about — Radio Caroline and Radio Northsea International — were major business operations. They were perfectly ordinary radio stations except they operated from ships at sea to avoid falling under the jurisdiction of a particular government.

Back then many governments were not particularly fond of rock music. People wanted it though, and because people did, advertisers wanted to capitalize on it. When people want to spend money but can’t, entrepreneurs will find a way to deliver what is desired. That’s exactly what happened.

Of course, if that’s all there was to it, this wouldn’t be interesting. But the story is one of intrigue with armed boardings, distress calls interrupting music programs, and fire bombings. Most radio stations don’t have to deal with those events. Surprisingly, at least one of these iconic stations is still around — in a manner of speaking, anyway.


A Game Boy Supercomputer For AI Research

Reinforcement learning has been a hot-button area of research into artificial intelligence. This is a method where software agents make decisions and refine these over time based on analyzing resulting outcomes. [Kamil Rocki] had been exploring this field, but needed some more powerful tools. As it turned out, a cluster of emulated Game Boys running at a billion FPS was just the ticket.
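For a feel of that decide, observe, refine loop, here is a toy example that has nothing to do with Game Boys or with [Kamil]’s actual setup. It is a minimal sketch of an epsilon-greedy agent choosing between two hypothetical slot machines. The payout probabilities, step count, and exploration rate are all made-up values.

```python
import random


def run_bandit(payout_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking arm, sometimes explore."""
    rng = random.Random(seed)
    counts = [0] * len(payout_probs)       # pulls per arm
    estimates = [0.0] * len(payout_probs)  # running average reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payout_probs))  # explore at random
        else:
            arm = max(range(len(payout_probs)), key=lambda a: estimates[a])
        # The environment hands back a reward of 1 or 0.
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        # Refine the estimate using the observed outcome.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated payouts:", [round(e, 2) for e in estimates])
print("pulls per arm:", counts, "best arm:", best_arm)
```

Even this tiny agent needs thousands of trials to settle on an answer, which hints at why raw simulation speed matters so much for anything as complex as a video game.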

The trick to efficient development of reinforcement learning systems is to be able to run things quickly. If it takes an AI one thousand attempts to clear level 1 of Super Mario Bros., you’d better hope you’re not running that in real time. [Kamil] started by coding a Game Boy emulator in C. By then implementing it in Verilog, [Kamil] was able to create a cluster of emulated Game Boys that enabled games to be run at breakneck speed, greatly speeding the training and development process.

[Kamil] goes into detail about how the work came to revolve around the Game Boy platform. After initial work with the Atari 2600, which is somewhat of a de facto standard in RL circles, [Kamil] began to explore further. The aim was to find an environment with a well-documented CPU, a simple display to cut down on the preprocessing required, and a wide selection of games.

The goal of the project is to allow [Kamil] to explore the transfer of knowledge from one game to another in RL systems. The aim is to determine whether for an AI, skills at Metroid can help in Prince of Persia, for example. This is arguably true for human players, but it remains to be seen if this can be carried over for RL systems.

It’s rather advanced work, on both a hardware emulation level and in terms of AI research. Similar work has been done, training a computer to play Super Mario through monitoring score and world values. We can’t wait to see where this research leads in years to come.