MIT Researchers Can Read Closed Books (and Defeat CAPTCHA)

Ten years ago, MIT researchers proved that it was possible to look through an envelope and read the text inside using terahertz spectroscopic imaging. This research inspired [Barmak Heshmat] to try the same technique to read a book through its cover. A new crop of MIT researchers led by [Heshmat] have developed a prototype to do exactly that, and he explains the process in the video after the break. At present, the system is capable of correctly deciphering individual letters through nine pages of printed text.

They do this by firing terahertz waves in short bursts at a stack of pages and interpreting the return values and travel time. The microscopic air pockets between the pages provide boundaries for differentiation. [Heshmat] and the team rely on these pockets to reflect the signal back to a sensor in the camera. Once they have the system dialed in to be able to see the letters on the target page and distinguish them from the shadows of the letters on the other pages, they use an algorithm to determine the letters. [Heshmat] says the algorithm is so good that it can get through most CAPTCHAs.
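
As a rough illustration of the travel-time idea, the sketch below converts a round-trip echo delay into a depth and keeps only the samples that arrive inside one time window, which is the basic way to isolate the echo from a single layer. It is not the MIT team's algorithm; the refractive index of 1.5 and the function names are assumptions chosen for illustration.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def depth_from_round_trip(delay_s, refractive_index=1.5):
    """Depth (m) of a reflecting boundary, given the round-trip echo delay.

    refractive_index is an assumed value for the stack of pages.
    """
    # The pulse travels down and back, hence the factor of 2.
    return C * delay_s / (2.0 * refractive_index)


def gate_samples(samples, sample_period_s, t_start_s, t_end_s):
    """Keep only samples whose arrival time lies in [t_start_s, t_end_s)."""
    return [
        s for i, s in enumerate(samples)
        if t_start_s <= i * sample_period_s < t_end_s
    ]
```

With the assumed index of 1.5, a 1 ps round-trip delay corresponds to a depth of roughly 0.1 mm, so separating neighbouring pages means resolving echoes that are only about a picosecond apart.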

The most immediate application for this technology is reading antique books and other printed materials that are far too fragile to be handled, potentially opening up worlds of knowledge that are hidden within disintegrating documents. For a better look at the outsides of things, there is Reflectance Transformation Imaging.

59 thoughts on “MIT Researchers Can Read Closed Books (and Defeat CAPTCHA)”

    1. How about packs of Scroll Rack? They changed card packs out to foil to try to prevent this. Looks like it only works through a small number of sheets but packs of cards are only 15 or so cards, right?

  1. That’s something. I am only getting about 90% of captchas right these days… Sometimes so distorted I can’t tell upper from lower case. C V O W P S etc

    Be really cool for marine archeologists, books from the bottom of the sea…

    1. I’ve had to personally email website owners to become a member because I can’t do their captcha. In most cases though I just abandon the web site.

      I have a less common form of dyslexia and this is basically discrimination.

      I use a VPN so google used to send a lot of captchas my way so now I use yahoo.

      Google is the worst offender at this discrimination because they abuse the opportunity by asking the user to decode street numbers and the like so they can use that information to their advantage.

      A lot of the captchas are unnecessarily difficult.

      1. Oh don’t get me started on google’s captchas, fire off two or three searches and use advanced operators and they demand you prove you’re human. Yes I’m very obviously human because I’m trying to defeat your artificial stupidity that is trying to tell me what it thinks I’m searching for, rather than what I am actually searching for, which surprisingly enough is what I typed in the search box the first time and all meaning got inaccurate-synonymed and assumed-typoed away.

        1. Yahoo is just as good for searching for text / documentation and all the same advanced search terms work –
          +site:.gov +url:nsa +link:cnn.com +”freedom foundation”

          Google is better for image search but you can always go to google for that.

          My default search is set to yahoo and if I need google then I just type the google domain name.

        1. So please do tell why you think it’s so funny? Do you laugh at someone in a wheel chair trying to get up a couple of steps when there is no ramp?

          This (assumed) attitude reminds me of a conversation I had with the post office manager.

          I rang to complain that an item was NOT delivered and instead I had to go collect it from the post office.

          He asked my address and then laughed at me and told me that I shouldn’t complain because he could walk that far in 5 minutes.

          I said “Challenge accepted” but I wanted him to walk 5 meters and back when I arrive to prove he could walk as well.

          He accepted.

          Then I said “you will be able to recognize me when I arrive as I will be carrying a 4 foot length of timber with an eight inch nail driven through the end”.

          He said “what’s that for”.

          I said “to make this challenge fair I am going to swing the timber and drive the nail into the base of his spine fracturing a vertebra and then see if he can walk the 10 meters in excruciating pain”.

          When I said I will be there in half an hour he immediately said they would make an extra trip to deliver the package.

          1. I’m thinking more from an engineering standpoint here but, how exactly would you apply a captcha to a person with dyslexia? Or other disabilities that interfere with visual processing?

            I mean, you have to prove someone is human so they have to complete a task a computer couldn’t do. Fair enough, plenty of things there… except they only have a monitor and keyboard for sure. A mouse is optional, as are speakers, biometric scanners and whatnot.

            So with a keyboard and monitor you have to get a human to do something uniquely human. That’s image processing right there although computers seem to be getting better at recognising pictures of dogs for instance so we rely on text and boom, unless there’s a fault in the logic somewhere that’s a problem that is always going to screw over the visually impaired.

            The only other solution I could think of would be a biometric scanner of some kind paired with a way to say “this information was gathered 5 seconds ago” . So Captchas you can’t read or retina scans to be able to use Google, which feels less like discrimination? They’re both pretty bad.

          2. To make a fair captcha there are other options: write the word spoken in an audio clip, click on a randomly moving ball, choose which of two jokes is funnier, decide whether an image is a painting or a photo, start recording and say hello, say what the man in the gif is doing, …

          3. Some of the self made ones on forum sign ups are annoying, and not updated anywhere near often enough. Like 2 years into Obama the answer to “Who is US President?” was George Bush, etc.

          4. @[CampGareth]

            Well this is a bit off-topic but hey this is not the first time we have seen this on HAD and it’s very relevant to the electronics industry to a greater degree than some other industries as we are also needing to do some re-thinking as we need older people to remain in the work force. Well unless the younger are willing to pay ever increasing taxes to support older people that don’t work, and by increased taxes you need to understand that means you will be living in an open plan one bedroom apartment eating pasta and rice for the rest of your life.

            Discrimination has many forms, most of which are not recognized as demonstrated above. We can accept that a person that has some form of disability and needs to use a wheelchair, also needs a little assistance to be able to do some of the things others take for granted and that includes work.

            BUT – at the same time it is perfectly acceptable to laugh at someone that may have some other form of disability because the disability they have has been less popularized by baby-kissing politicians and therefore doesn’t hit people’s empathy buttons.

            So does that mean that younger people want to pay exorbitant taxes for people with less popular disabilities but not for those who have the forms of disability that are more commonly known about.

            Age itself brings a complete raft of disabilities. Some are well known like fading sight, reduced physical ability but most of them are not even known to younger people. Where is the colostomy bag disposal unit at your workplace?

            And the above highlights most of the problem. The people who make decisions about workplace inclusiveness don’t want to ask the actual people with disabilities about how to make the workplace habitable.

            Instead they make their own decisions which still exclude workers. So they fail because they think about what “disability” means to them, i.e. the popularized disabilities.

            The reality is that *anything* that impedes a person’s ability to work *is* a disability so the focus needs to be on inclusiveness.

            Or – start getting used to pasta and rice.

      2. Hi,
        Very interesting, I’d not thought about dyslexia affecting captchas.
        If you don’t mind answering:

        – you say you have a less common form – do most dyslexics handle captchas OK? How common is your form?

        – does the audio option on a traditional captcha work for you?

        – I assume the newer recaptcha presents no difficulties?
        (Interestingly, I find this hard – “select the pickup trucks” isn’t very easy when it shows you the front of the things, particularly for a Brit – we don’t have many pickups here)

        1. OK, first up “visual dyslexia” which is what most people mean when they say “dyslexia”.

          “Dyslexia” without the pre-pended “visual” is a learning disability that is unrelated to visual processing.

          Visual dyslexia is sort of a grey scale in that normal people can sometime make the same mistakes.

          I have a bit of both, but more of “dyslexia”. For me that means I can’t learn left and right. For me left and right are the same and I have to go through a conscious cognitive process if I wish to determine one from the other.

          I can’t read things like small print, not because it’s small (I can see and distinguish the letters) but because it occupies too little a space. I use a 37 inch computer monitor. I can only read smaller monitors if the screen has artifacts like dust or grubby fingerprints.

          If I have fine print on paper then I have to crumple the paper and re-flatten it. This is probably why I *can’t* do some captchas. It’s hard to crumple an LCD screen and then re-flatten it.

          Most normal visual dyslexics (except the extreme cases) should be able to read a captcha but it would be frustrating and they would likely get it wrong quite often.

          It’s like – you could run very fast indeed if I drop a red hot piece of metal down the back of your pants. Is it then OK for me to expect you to run very fast?

          The whole concept of captcha is wrong. Captchas don’t blacklist computers, instead they white list people with normal vision.

          So people with visual dyslexia are punished for the actions of *people* that use scripts to abuse network resources.

          It’s like – your neighbor shot my brother and your neighbor is not home now so I am going to shoot you.

          That’s fair right?

          1. social interaction is unfair since it puts the less socially inclined at a disadvantage…

            ideally we would find a solution that wasn’t built upon just a single thing. many captchas have alternatives, from listening to the word through to questions or equations. but if the general issue lies with any text on a screen, perhaps we should find a solution to that instead of raving over what is, in fairness, a fairly simple method of sorting out the worst bots, which for many who run sites is a very real problem. I can remember trying to keep up with a user-creating forum bot as a moderator once: it took weeks for the site (with over 50k regulars) to get up and running at normal capacity, and we had to disable registration in the end.

          2. People like you *are* the problem.

            You didn’t bother to read what I had written as a person who experiences these difficulties because your completely uneducated uninformed *opinion* is more important.

            And by the way, I spent a decade as a web developer. I only finished about two years ago so I can tell you that if you *need* a captcha to minimize bot activity then you can’t code. Go sell lolly-pops or something.

            And if you haven’t worked it out by now, I *can* read text on a screen.

            Why don’t you go ripping up wheelchair access ramps or something like that and see just what proportion of people completely disagree with you.

          3. Hi ROB,

            Thank-you very much for explaining in detail, that’s really helpful. I (and I imagine most others) was completely unaware there were different types of dyslexia – I’ve only had one close friend with dyslexia, and hers must have been the visual type, and quite mild at that – so I was completely ignorant of the issues.

            Putting blocks in front of users – be it captchas causing you an issue, or accessibility issues for blind users, etc. – is always bad. But we usually just think of ‘accessibility’ as ‘screen readers’. That’s why I found your comment particularly interesting – captchas are supposed to be ‘accessible’, so I would previously have assumed they were fine for everyone, when perhaps they’re accessible to blind people but not some dyslexics :( I’m very sorry! I implemented recaptcha (the image-based one) for a client just last week, completely unaware that it could be a block for you. I’d be very happy to go back to him with a better solution, but he’ll want something to prevent automated signups, as he had his membership DB rendered useless previously by 1000s of bot signups.

            I think the ‘concept’ of captchas (distinguishing humans from bots) is not unreasonable (given that we live in a world where people use bots to do bad things). It’s just that we don’t really have the technology – the most common implementations are somewhat lacking even for the average user, so I can’t begin to imagine how much trouble they’d cause you.
            I have seen some very non-intrusive ones that use e.g. mouse-tracking to identify humans, which I’d guess might work fine for you? But I’d guess they still cause issues for certain people, e.g. with motor control issues.

            I suppose we largely tolerate captchas for now, because without them, many websites would become unusable because of bots, but we need to keep pushing forward the technology – both because bots catch up, and because they’re not fully accessible yet. I think the image-based recaptcha is a significant improvement over the old text captcha, certainly for most people, but it’s still a pain.

            But anything we can do to avoid them is good. I’m not sure how HAD comments stay free of spam without any captchas – unless I completed one ages ago and forgot? Perhaps they’ve got good spam filters, or very active moderators?

            Are there any options which don’t cause you any issues?
            Would I be right in guessing systems like “Login with Google” are better for you, as you only have to prove to Google once that you’re not a bot, and then other sites trust Google’s identification of you?
            Are the image-based ones like reCaptcha better?

            If any HAD writers are listening – I think accessibility is something we often overlook, and particularly in the hacking subculture, where we don’t usually get things to as polished a final state as is required for a consumer product. Perhaps we could have a series of articles on accessibility issues at some point? It’d be very educational for us.

          4. >I can tell you that if you *need* a captcha to minimize bot activity then you can’t code
            I don’t think that’s entirely fair; many developers work on systems which don’t have these issues, so it’s often not a lack of coding skills, but lack of knowledge about how bots can be identified – which is a whole industry; Google and others have invested a lot of research into it, and it’s generally good to offload that type of thing to someone else who specialises in it – just as we use common string libraries etc. when we code. The ‘off the shelf’ solutions are also sold as accessible – though as most of us have just learnt, they’re not.

            You also need to consider that implementing recaptcha can be done in about 3 minutes on most sites.

            And worse still, many clients want to feel protected by seeing a captcha, even if it’s not necessary. (airport security, anyone…?)

            However, I’d be very keen to hear what solutions you’ve used to prevent bot activity? I already hated captchas before reading your comment, and now I hate them even more, and for better reasons.

          5. @[Dan]

            I can get past most captchas. Many of them have an audio option which I never use because I also have Tinnitus and that fact probably highlights a problem.

            You want to design a captcha that will work for my dyslexia only to find I also have Tinnitus … and other people … how complex could that end up being.

            How long is it before you’re trying to code a captcha that is suitable for one armed left handed deaf lesbians who can’t stand pictures of cafelatas. Not that I have anything against lesbians but damn I hate cafelatas.

            Let’s start over.

            Security Rule 1) Never trust anything from the client machine.

            That leaves us with this issue of bots so we break –

            UX Rule 1) Make things as easy as possible for the user.

            You have to ask this question –

            What is more predictable?
            A) The activity of BOTs
            B) All the elements in the breadth and nature of human diversity

            Well let’s see –
            A) a BOT is generated by a Turing complete machine that is running a script that acts as a state machine. It is 100% predictable.
            B) Call me back when you’ve solved B. If this house has the same number then you will probably speak to a distant descendant.

            Then there are the other issues. A lot of the unwanted traffic is coming from people sitting in China and being paid 2 grains of rice to do things like click facebook “Like” buttons and no captcha is going to solve that problem.

            Here’s an alternative.

            I identify the remote computer –
            https://panopticlick.eff.org/static/browser-uniqueness.pdf

            Trace their url path to the resource.

            If the first page they’re hitting is the registration page and they’re doing it 2500 times a second then they’re either a very very fast human or it’s probable that they’re a bot.

            Trace their path – home – forums – forums – forums – forum – site map – registration page … probably human.
            home – registration – probably a bot

            Count seconds per page. Add other metrics. Get really super-freaky and weight the metrics into an algorithm. Try it, it’s far easier than you think to refine and people will think you a God of code and bring you gold plated 1’s and 0’s

            You can simply MD5 the new url with the md5 of all the older urls if you’re worried about server resources but session info is for the most part on the hdd anyway.

            There are a gazillion ways that you can code in a little intelligence to solve 99.9% of the problem and captchas only solve 60% – 80% anyway.

            My apologies to any one armed left handed deaf lesbians who can’t stand pictures of cafelatas.
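
The path-and-timing heuristic described in the comment above could be sketched roughly as follows. The signals, weights, threshold and the `/register` path are all made-up values for illustration, not something taken from the comment or from any real site; a real system would tune them against its own traffic.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    path: str          # URL path that was requested
    timestamp: float   # seconds since the session began


def bot_score(visits, registration_path="/register", min_seconds_per_page=2.0):
    """Return 0-10 points; higher means more bot-like (weights are guesses)."""
    if not visits:
        return 0
    score = 0
    # Signal 1: the very first request is the registration page.
    if visits[0].path == registration_path:
        score += 5
    # Signal 2: pages are requested faster than a person could read them.
    if len(visits) > 1:
        elapsed = visits[-1].timestamp - visits[0].timestamp
        if elapsed / (len(visits) - 1) < min_seconds_per_page:
            score += 3
    # Signal 3: almost no browsing happened before registering.
    if len(visits) < 3:
        score += 2
    return score


def is_probably_bot(visits, threshold=5):
    """Flag a session whose score reaches the (assumed) threshold."""
    return bot_score(visits) >= threshold
```

A session that wanders through the forums for a couple of minutes before registering scores 0 here, while one that jumps from the home page to registration in a tenth of a second reaches the threshold.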

          6. @RÖB Thanks for that rare insight. This has been very enlightening.

            This may give some clues as to why some folks have such a hard time using computers, in general.

            I hate CAPTCHA, too. I just did some searching, and came up with this, from Wikipedia: ‘The web accessibility organization WebAIM reported in May 2012, “Over 90% of respondents [screen reader users] find CAPTCHA to be very or somewhat difficult.”’

            Perhaps this problem is more common than one might suppose.

    2. “Be really cool for marine archeologists, books from the bottom of the sea…”

      Sure they can see through the pages, but will they be able to see through the snail on the tail of the frog on the bump on the ship’s log in the hole in the bottom of the sea.

    1. That is very interesting. Let’s just assume that yes, that is possible with this system. Wouldn’t you need access to the tickets in advance, and then somehow, make sure someone else purchases the duds and you get the good ones? I think part of the security of lottery cards is access to them, and since the house always wins, purchasing full rolls doesn’t do you much good.

      1. No, the sort of lottery cards he’s talking about have n scratchoff bubbles, but pay out based on only m bubbles you choose to scratch off. (Scratching off more than m voids the ticket, naturally.) Knowing _which_ m bubbles to scratch for maximum payout gives you an edge over the intended blind choice. I don’t know whether this edge is big enough to break even on average, but if so, I’m sure they’ll catch on and refuse to pay up after a little while, and if you fight it, only the lawyers win.

        1. Along with that, the other kinds where it’s already decided what the card will win (match the left side to the right and win a prize) you’d be able to toss out all the losers and make sure you buy a winner.

      2. They’re sold over the counter in various tobacco and magazine stores.
        Surely one or two storeowners may be interested if you tell them that you have a way to pick out the good tickets and that you’re willing to split the profits.
        Then the bad ones are sold normally.

        I wonder how detectable this would be to the lottery?
        Would they keep a record of the tickets and check if they are cashed in the correct order? Would a sudden bulk winning raise a flag?

        1. We can see validation attempts as well as know where the winning ticket is sold to. Any attempt to claim a bulk win would get you a visit from the Police ( Who have an office in the Lottery HQ here ).

          1. ah the wonders of using the state police to do dirty work….

            all forms of organized gambling are a con, intended for people who need the rush or don’t understand the math.

  2. To be fair, the MIT researchers say it gets most of the CAPTCHAs, I would say you’re not going to be replaced by this even with 10% failure rate.

    I am also excited by the possibility of salvaging otherwise unreadable texts. This is a really cool method to combat decay from aging. With water damage, I wonder if that eliminates the micro-pockets of air between pages which are integral to this working?

    1. It’s all about impedance mismatch.
      I don’t see why they wouldn’t be able to differentiate from any single or multiple media between sheets. Oil & biomed companies do this with other acoustic or radio waves. All you need are sufficiently different propagation rates between the pages and the interfering media to generate a reflection. Ignore the ones that don’t correspond to the pages of the text and you’re home free.

      1. Or if you want to think about it optically, that would be like differences in the index of refraction. Like you can see through an inch thick piece of glass well enough but if you stack microscope slides to the height of an inch, it’s a lot less clear.
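
The stacked-slides comparison above can be put into numbers with the normal-incidence Fresnel formula. This is a minimal sketch: the indices (1.5 for glass, 1.0 for air) and the slide count are assumed values, and it only tracks the directly transmitted beam, ignoring absorption and light that bounces between layers.

```python
def fresnel_reflectance(n1, n2):
    """Fraction of power reflected at normal incidence between indices n1 and n2."""
    return ((n1 - n2) / (n1 + n2)) ** 2


def direct_transmission(n_layer, n_gap, num_layers):
    """Fraction of power that crosses num_layers slabs without being reflected.

    Each slab has two boundaries with the gap medium; absorption and
    multiple reflections are ignored.
    """
    r = fresnel_reflectance(n_gap, n_layer)
    interfaces = 2 * num_layers
    return (1.0 - r) ** interfaces
```

Under these assumptions each air-glass boundary reflects about 4% of the power, so a single slab passes about 92% directly, while 25 slides (50 boundaries) pass only about 13%. Every mismatch in index produces an echo, which is the same kind of boundary reflection the article's system relies on to tell pages apart.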

  3. Don’t they (TSA in America) already use this at airports to see your naked body under your clothes? Also this tech could be used to see if you have money in your wallet. Read your driver’s license. See what’s in the trunk of your car or in your truck (aka lorry) trailer. Lottery ticket scratchers already have readers at the ticket counter. It reads the bar code or QR code. Here in Connecticut a lot of our criminally minded convenience store owners have been recently arrested and charged with fraud for reading a whole roll of scratchers entrusted to their inventory by the lottery commission. They have a friend/associate process the unpaid-for winners.

    1. Wow, slow down a bit. This is done on a perfectly immobile camera, medium and target. Carefully calibrated in a lab by experts. On a well known material. With a long time needed to get useful data.
      I don’t see this breaking privacy anytime soon, but it will surely be useful in endoscopy. Reasonable applications could be rediscovering mural paintings, reading overwritten text on paper, and non-invasive biopsy.
      For what concerns me lotteries are already a legalized kind of scam, where you buy 99% of times a faulty product.

    2. Body scanners use millimeter waves to ‘see’ through clothing. But it’s already in use for medical and security procedures.
      https://en.wikipedia.org/wiki/Terahertz_radiation#Research
      As for seeing into cars, I suspect it won’t be effective as it’s still in the radio spectrum. You’d need X-rays to see through things like trailers or metal trunks. But they already use backscatter scanners for that at ports, it’s only a matter of time until the sensors become small enough to be man portable.
      Regarding scanning wallets for money:
      Why bother? Thieves and muggers can just steal the whole wallet, cards are worth at least as much as cash in the right hands. Or just pay wait staff to skim cards the old-fashioned way. Chip & pin was broken long before it made its way to the US so that’s not a problem either. Not that every retailer has upgraded to it yet despite being on the hook for any fraud linked to their points of sale.
      The same paranoia surrounded the switch to foil security strips in money when they were introduced. There are far more likely scenarios to worry about than Big Brother tracking your daily cash holdings as you commute to work. If that’s your biggest concern, buy a ‘faraday wallet’.

  4. Preserving the artifact and losing the contents is a crime. They need to cut the spine off and photograph the pages. Honestly having a single 1000 year old book that nobody can ever read is worthless except as an artifact for a rich man to possess. Destroy it to duplicate and spread the knowledge inside is far far more valuable to the whole human race.

    1. Well the issue is destroying or failing to preserve the artifact AND losing the contents when they get so delicate, so even having the choice of one or the other is often considered a luxury. When it’s too delicate for conventional methods, it’s too delicate period, even spine removal may remove the last vestige of structural integrity of some old and high acid papers and you’ll crumble the whole lot with a sneeze.

    2. @Timothy Gray
      Technological advances allow people to recover things previously thought unrecoverable. Thinking like yours leads to people recovering part of the text, but in the process completely eliminating the chance to recover the rest.

    1. I’m not entirely sure it would be non-destructive in that use case. Bound to be features on the chip that act as antennas for that frequency and then you have stray currents running amok.

    2. This is already somewhat done. Acoustic and X-ray analysis of chips are both extremely common non-destructive inspection methods. Radar microscopy is less common because of difficulty pertaining to the chips themselves. Decapping the chip, if done right, does not impact its electrical performance and allows testing of the chip while active, which can yield huge amounts of information that non-destructive methods do not. Also keep in mind that the feature size you are looking at on die is muuuuuuch much smaller than even the finest text on a page.
