Ten years ago, MIT researchers proved that it was possible to look through an envelope and read the text inside using terahertz spectroscopic imaging. This research inspired [Barmak Heshmat] to try the same technique to read a book through its cover. A new crop of MIT researchers led by [Heshmat] have developed a prototype to do exactly that, and he explains the process in the video after the break. At present, the system is capable of correctly deciphering individual letters through nine pages of printed text.
They do this by firing terahertz waves in short bursts at a stack of pages and interpreting the return values and travel time. The microscopic air pockets between the pages provide boundaries for differentiation. [Heshmat] and the team rely on these pockets to reflect the signal back to a sensor in the camera. Once they have the system dialed in to be able to see the letters on the target page and distinguish them from the shadows of the letters on the other pages, they use an algorithm to determine the letters. [Heshmat] says the algorithm is so good that it can get through most CAPTCHAs.
The most immediate application for this technology is reading antique books and other printed materials that are far too fragile to be handled, potentially opening up worlds of knowledge that are hidden within disintegrating documents. For a better look at the outsides of things, there is Reflectance Transformation Imaging.
Continue reading “MIT Researchers Can Read Closed Books (and defeat CAPTCHA)”
Here’s something we’re sure SEO specialists, PR reps, and other marketeers already know: how to write a script to game reddit.
The course of upvotes and downvotes controls which submission makes it to the front page of reddit. These submissions are voted on by users, and new accounts must log in and complete a CAPTCHA to vote. [Ian] discovered that reddit’s CAPTCHA is not really state-of-the-art, and figured out how to get a bot to solve it
The method exploits the 8-bit nature of the distorted grid in the CAPTCHA. Because this grid isn’t pure black or pure white, it’s at a lower intensity than the letters in the CAPTCHA. Putting the CAPTCHA through a threshold filter, deleting any blocks of pixels smaller than 20 pixels, and running it through a classifier (PDF there), a bot can guess what the letters of the CAPTCHA should be.
Out of the 489 CAPTCHAs [Ian] fed into his algorithm, only 28 – or 5.73% – were guessed correctly. However, because he knows which CAPTCHAs had failed segmentation, ignoring those can increase the success rate to 10%. Theoretically, by requesting new CAPTCHAs, [Ian] can get the accuracy of his CAPTCHA bot up to about 30%.
Combine this with a brilliant auto voting script that only requires someone to enter CAPTCHAs, and you’ve got the recipe for getting anything you want directly to the front page of reddit. Of course you could do the same with a few memes and pictures of cats, but you knew that already.
A while back we saw the MintEye CAPTCHA system – an ‘are you human’ test that asks you to move a slider until an image is de-swirled and de-blurred – cracked wide open by exploiting the accessibility option. Later, and in a clever bit of image processing, the MintEye CAPTCHA was broken yet again by coming up with an algorithm to detect if an image is de-swirled and de-blurred.
It appears we’re not done with the MintEye CAPTCHA yet (Russian, translation). Now the MintEye CAPTCHA can be broken without any image processing or text-to-speech libraries. With 31 lines of Java, you too can crack MintEye wide open.
The idea behind the hack comes from the fact that blurred images will be much smaller than their non-blurred counterpart. This makes sense; the less detail in an image, the smaller the file size can be. Well, all the pictures MintEye delivers to your computer – 30 of them, one for each step of swirl and blurring – are the same size, meaning the ‘wrong answer’ images are padded with zeros at the end of the file.
There’s a 31 line program on the build page that shows how to look at thirty MintEye images and find the image with the fewest zeros at the end of the file. This is, by the way, the correct answer for the MintEye CAPTCHA, and has a reproducibility of 100%.
So, does anyone know if MintEye is a publicly traded company? Also, how exactly do you short a stock?
A few days ago we saw a post from [samuirai] at the Shackspace hackerspace in Stuttgart on breaking the minteye captcha system. Like most other captcha cracks, [samuirai] used the voice accessibility option that provides an audio captcha for blind users. Using the accessibility option is a wonderful piece of work, but [Jack] came up with an even more elegant way to defeat the minteye captcha.
For those unfamiliar, the minteye captcha provides a picture tossed through a swirl filter with a slider underneath. Move the slider left or right to eliminate the swirl and you’ve passed the, “are you human” test. Instead of looking for straight lines, [Jack] came up with a solution that easily defeats the minteye captcha in 23 lines of Python: just minimize the length of all the edges found in the pic.
The idea behind the crack is simply the more you swirl an image, the longer the edges in the image become. Edge detection is a well-studied problem, so the only thing the minteye cracking script needed to do was to move the slider for the captcha from the left to the right and measure the lengths of all the edges.
[Jack] included the code for image processing part of his crack, fortunately leaving out the part where he returns an answer to the minteye captcha. For that, and a very elegant way to crack a captcha, we thank him.
We hadn’t heard of minteye CAPTCHA before, but we’ve seen evidence of a script that can break the system. Minteye combines two things which you probably don’t love about the Internet: advertisements and CAPTCHA. The system uses a slider to distort an advertiser’s image. Once the slider is in just the right spot the image becomes clear and you can click on submit to see if you passed the challenge.
Challenges like this are impossible for the visually impaired, so there is usually an audio option as well. In this case the audio button will instruct you to move the slider to the right, left, or that it’s already in the correct place. [Samuirai] used the text2speech API available in Google Chrome to parse these commands. As you can see above, “movies later” is a misinterpretation of “move the slider”, but he was still able to get enough accuracy to solve the challenge. See the script in action in the video after the break.
Audio challenges have been exploited like this in the past. Check out this talk about beating reCAPTCHA through the audio option.
Continue reading “Script defeats minteye CAPTCHA”
This talk from the 2012 LayerOne conference outlines how the team build Stiltwalker, a package that could beat audio reCAPTCHA. We’re all familiar with the obscured images of words that need to be typed in order to confirm that you’re human (in fact, there’s a cat and mouse game to crack that visual version). But you may not have noticed the option to have words read to you. That secondary option is where the toils of Stiltwalker were aimed, and at the time the team achieved 99% accurracy. We’d like to remind readers that audio is important as visual-only confirmations are a bane of visually impaired users.
This is all past-tense. In fact, about an hour before the talk (embedded after the break) Google upgraded the system, making it much more complex and breaking what these guys had accomplished. But it’s still really fun to hear about their exploit. There were only 58 words used in the system. The team found out that there’s a way to exploit the entry of those word, misspelling them just enough so that they would validate as any of up to three different words. Machine learning was used to improve the accuracy when parsing the audio, but it still required tens of thousands of human verifications before it was reliably running on its own.
Continue reading “Stiltwalker beat audio reCAPTCHA”
What do you put on your pancakes? Butter and syrup but not a pair of shoes? This makes sense to us, and it’s the premise of the new CAPTCHA game PlayThru. The space that is normally filled by nearly illegible text is now taken up by a little graphic-based game where you drag the appropriate items to one part of the screen. In addition to being easier than deciphering letters, this new platform shouldn’t require localization. But alas, it seems the system is already broken. [Stephen] sent us a link to a bot that can pass the PlayThru CAPTCHA.
Take a look at the video after the break to see the four test-runs. It looks like the bot is just identifying the movable objects and trying them out. Sometimes this is quick, sometimes not. But it does eventually succeed. For the PlayThru developers this should be pretty easy to fix, just make an error limit for trying the wrong item. At any rate, we can’t think defeating the current system is nearly as hard as defeating reCaptcha was.
Update: [Tyler] over at Are You A Human wrote in to share their side of this story. Apparently we’re seeing the bot play the game, but not necessarily pass it. It isn’t until the game if finished and the playing information is sent to their servers that a decision is made on whether it is successful or not. This way they can change the authentication parameters from the server side at any time.
At the same time, [Stephen] updated his bot and made a video of it playing the game without any shoes on the pancakes.
Continue reading “CAPTCHA bot beats new Are You A Human PlayThru game”