Bookworm Playing Bot Tests Programmer’s OCR Skills


Check out this brainy bot which [Jari] whipped up to dominate the Bookworm Deluxe scoreboard. The bot runs on a win32 machine, pulling screenshots to see the game board and simulating mouse clicks to play. The video after the jump shows that it plays like a champ, but it took some doing to get this far and [Jari] took the time to share all of the development details.

The hardest part of writing these types of bots is recognizing the game pieces. Check out all of the animation that’s going on in the still shot above… a lot of the tiles are obscured, there are different colors, and the tiles themselves shift as the bot spells and submits each word.

After some trial and error [Jari] settled on an image pre-processor which multiplies pixel values by themselves four times, then looks at each pixel with a 1/6 threshold to produce a black and white face for each tile. From there a bit of Optical Character Recognition compares each tile to a set of known examples. This works remarkably well, leading into the logic and dictionary part of the programming challenge.
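For the curious, here is a rough Python sketch of that kind of pipeline, not [Jari]'s actual code. It assumes "multiplies pixel values by themselves four times" means raising a normalized 0 to 1 intensity to the fourth power, and it stands in a simple count of differing pixels for the comparison step. The function names and the tiny 3x3 templates are made up for illustration.

```python
def binarize(gray, power=4, threshold=1 / 6):
    """Turn a grayscale tile (rows of 0-255 ints) into a 0/1 bitmap.

    Assumption: intensities are normalized to 0..1, raised to `power`,
    then compared against the 1/6 threshold mentioned in the post.
    """
    return [
        [1 if (p / 255.0) ** power > threshold else 0 for p in row]
        for row in gray
    ]


def hamming(a, b):
    """Count the pixels where two equally sized bitmaps differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def classify(tile, templates):
    """Return the label of the known example closest to the binarized tile."""
    bitmap = binarize(tile)
    return min(templates, key=lambda label: hamming(bitmap, templates[label]))


# Hypothetical 3x3 examples; real tile faces would be far larger.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# A noisy, unevenly lit "L" tile.
noisy_l = [[250, 90, 20], [240, 60, 10], [245, 230, 150]]
print(classify(noisy_l, TEMPLATES))
```

Raising the intensity to a power pushes mid-tones toward black, which is why a fairly low threshold can still separate the bright parts of a tile from everything else.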

Do you think this was easier or harder than the Bejeweled Blitz bot? That one was looking for specific pixel regions; this one is basically a focused roll-your-own OCR script.


[Read more...]

OCR automatically reads a power meter


[Chris] tried his hand at using Optical Character Recognition in his server power monitoring rig. The image above is what the IP camera used in the setup sees. He’s included a bright light to ensure that the contrast is as great as possible. After applying a threshold filter to the captured still, he is able to process the image to test all seven segments of every digit.

He uses Mathematica for the processing. We’re not familiar with the particulars of the language, but it’s easy enough to see the main parts of the program. Line six of his source code applies the image filters and then the program loops through the assigned location of each digit, testing segment combinations to ascertain what number is shown. Things get hairy when it comes to the decimal point. We gather that the meter can show varying degrees of precision based on the total number of digits needed (like a digital multimeter). But [Chris'] setup has a difficult time reliably detecting that decimal point because of its size. He uses a shortcut to get around this: knowing that his server never pulls less than 300W, he corrects the output (by multiplying it by ten) if the reading is below that benchmark.
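The approach described above can be sketched in Python (this is not [Chris'] Mathematica code). The sketch assumes the on/off state of each segment has already been sampled from the thresholded image, and that the reading is parsed with a fixed number of decimal places before the 300W sanity check is applied. The `assumed_decimals` parameter and the function names are hypothetical.

```python
# Segment order: a (top), b (top right), c (bottom right), d (bottom),
# e (bottom left), f (top left), g (middle). 1 means the segment is lit.
SEGMENTS_TO_DIGIT = {
    (1, 1, 1, 1, 1, 1, 0): 0,
    (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4,
    (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6,
    (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}


def decode_digits(segment_states):
    """Map each digit's seven segment states to the digit it shows."""
    return [SEGMENTS_TO_DIGIT[tuple(state)] for state in segment_states]


def read_watts(segment_states, assumed_decimals=2, floor=300.0):
    """Decode a reading, then apply the 'never below 300W' correction.

    Assumption: the decimal point is not detected, so the value is parsed
    with a fixed number of decimals and multiplied by ten if it lands
    below the known minimum draw.
    """
    digits = decode_digits(segment_states)
    value = int("".join(str(d) for d in digits)) / 10 ** assumed_decimals
    if value < floor:
        value *= 10
    return value


# Build example input by inverting the lookup table.
DIGIT_TO_SEGMENTS = {d: s for s, d in SEGMENTS_TO_DIGIT.items()}
states = [DIGIT_TO_SEGMENTS[d] for d in (3, 1, 2, 5)]
print(read_watts(states))  # parsed as 31.25, corrected to 312.5
```

The correction only works because the load has a known lower bound; a reading that was legitimately under 300W would be scaled up by mistake.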

Of course it would be easier to crack open the monitor and glean data electronically (that’s how the Tweet-A-Watt does it) but then [Chris] wouldn’t have had the fun of playing with OCR.

Google your home with a roomba

Meet GåågleBot. GåågleBot is a modified Roomba that will not only vacuum your home, but also collect data as it goes. While it is carrying out its normal duties as a floor cleaner, it will take pictures, collecting and analyzing all the data for later searches. With built-in OCR, you can actually search for things using text strings.

Aside from just carrying out its normal job, you can also remote control it via the web. You can even control theirs!

[via Boing Boing]

DS based reader for the blind

[Epokh] has released some homebrew software that uses a Nintendo DS as a voice reader for documents. This is extremely useful for blind and visually impaired folks who normally use screen readers but can utilize this technology for reading books, documents, and email on the go. Future versions look to add an email client and implement OCR via the camera for reading printed documents.

The flite package is utilized to provide the text to speech functionality. We’re familiar with this package and judging by the video after the break, it lost nothing in the port to the DS hardware. [Epokh] pointed out that similar readers can cost $1500 when a DS sells for around $130. We can’t wait to see the final version fleshed out!

[Read more...]

Are you human? Then type out this book


Google has acquired reCAPTCHA and plans to use the system for digitizing books. Wait… what? CAPTCHA is the method of requiring a user to type in a visually obscured word to prove they are human. How can this digitize books? The answer is a bit obscure and takes some time to discover, but you’ll have fun along the way. [Read more...]

High speed book scanner from trash


[Daniel] sent us his entry to the Epilog laser cutter challenge on Instructables. He made a book scanner, mainly out of found parts. The bulk of the project was salvaged from dumpsters, though if you’re not comfortable with that, the free section of Craigslist might be able to do the job. The cameras run CHDK with StereoData Maker, and custom software compiles the images into PDFs. He did a fantastic job of documenting every step of the construction, including helpful tips for some of the more complicated parts. There are several videos in the Instructable, so be sure to check them out. We’re particularly amused by the extra step of making the photo captions visually interesting. At 79 steps, it’s a long read, but well worth it.

MegaUpload captcha cracking in JavaScript


This was certainly the last thing we expected to see today. [ShaunF] has created a Greasemonkey script to bypass the captcha on file hosting site Megaupload. It uses a neural network in JavaScript to do all of the OCR work. It will auto-submit and start downloading too. It’s quite a clever hack and is certainly helped by the simple three-character captcha the site employs. Attempting to do the same thing with reCAPTCHA has proven much more difficult.

UPDATE: [John Resig] explained how it works.

[via Waxy]
