Every once in a while a project comes along with that magical power to consume your time and attention for many months. When you finally complete it, you feel sorry that you don’t have to do anything more.
What is so special about this Bingo ball reader? It may seem like an ordinary OCR project at first glance; a camera captures the image and OCR software recognizes the number. Simple as that. And it works without problems, like every simple gadget should.
But then again, maybe it’s not that simple. Numbers are scattered all over the ball, so they have to be located first, and the best candidate for reading must be selected. Then, numbers are painted onto a sphere rather than a flat surface, sometimes making them deformed to the point where their shape has to be recovered first. Also, the angle of reading is not fixed but somewhere on a 360° scale. And then we have the glare problem to boot, as Bingo balls are so shiny that every light source reflects as a saturated bright spot.
So, is that all of it? Well, almost. The task is supposed to be performed by an embedded microcontroller with limited speed and memory, yet the recognition process for one ball has to be fast: 500 ms at worst. But that’s just one part of the process. The project includes a pipelined mechanism which accepts the ball, transports it to be scanned by the OCR, and then has it filmed by the public broadcast camera before it gets dumped. And finally, if the reading was not reliable enough, the ball has to be subtly rotated so that the numbers are repositioned for another reading attempt.
Despite these challenges I did manage to build this system. It’s fast and reliable, and I discovered some very interesting tricks along the way. Take a look at the quick demo video below to get a feel for the speed, and what the system “sees”. Then join me after the break to dive into the details of this interesting embedded build.
Continue reading “Reading Bingo Balls With Microcontrollers”
Ten years ago, MIT researchers proved that it was possible to look through an envelope and read the text inside using terahertz spectroscopic imaging. This research inspired [Barmak Heshmat] to try the same technique to read a book through its cover. A new crop of MIT researchers led by [Heshmat] have developed a prototype to do exactly that, and he explains the process in the video after the break. At present, the system is capable of correctly deciphering individual letters through nine pages of printed text.
They do this by firing terahertz waves in short bursts at a stack of pages and interpreting the return values and travel time. The microscopic air pockets between the pages provide boundaries for differentiation. [Heshmat] and the team rely on these pockets to reflect the signal back to a sensor in the camera. Once they have the system dialed in to be able to see the letters on the target page and distinguish them from the shadows of the letters on the other pages, they use an algorithm to determine the letters. [Heshmat] says the algorithm is so good that it can get through most CAPTCHAs.
The most immediate application for this technology is reading antique books and other printed materials that are far too fragile to be handled, potentially opening up worlds of knowledge that are hidden within disintegrating documents. For a better look at the outsides of things, there is Reflectance Transformation Imaging.
Continue reading “MIT Researchers Can Read Closed Books (and Defeat CAPTCHA)”
Over at [Truthlabs], a 30-year-old pinball machine was diagnosed with a major flaw in its game design: it could only entertain one person at a time. [Dan] and his colleagues set out to change this, transforming the ol’ pinball legend “Firepower” into a spectacular, immersive gaming experience worthy of the 21st century.
A major limitation they wanted to overcome was screen size. A projector mounted to the ceiling should turn the entire wall behind the machine into a massive 15-foot playfield for anyone in the room to enjoy.
With so much space to fill, the team assembled a visual concept tailored to blend seamlessly with the original storyline of the arcade classic, studying the machine’s artwork and digging deep into the sci-fi archives. They then translated their ideas into 3D graphics utilizing Cinema4D and WebGL along with the usual designer’s toolbox. Lasers and explosions were added, ready to be triggered by game interactions on the machine.
To hook the augmentation into the pinball machine’s own game progress, they devised an elegant solution, incorporating OpenCV and OCR, to read all five of the machine’s seven-segment displays from a single webcam. An Arduino inside the machine taps into the numerous mechanical switches and indicator lamps, keeping a Node.js server updated about pressed buttons, hits, the “Lane Change” and plunged balls.
The result is the impressive demonstration of both passion and skill you can see in the video below. We really like the custom shader effects. How could we ever play pinball without them?
Continue reading “The Most Immersive Pinball Machine: Project Supernova”
There are devices out there that will magnify text using fancy cameras and displays, devices that will convert text to Braille, and text-to-speech software has been around for thirty years. For his entry into our Raspberry Pi Zero contest, [Markus] decided to combine all these ideas into a simple device that will turn the printed word into speech.
The impetus for [Markus]’ project came to him in the form of a group of blind computer science students. These students were enrolled in a specialized program that relied on dedicated hardware and software such as mobile Braille terminals and OCR, along with oral exams, so that they could study the same thing as everyone else. [Markus] wanted to produce something similar, using simple text-to-speech software instead of a complicated Braille display.
The physical design of [Markus]’ project is uniquely functional – a hand-held device with a camera up front, a Pi in the middle, and a speaker and headphone jack on the back. The hand grip includes a large battery and a trigger for telling the Pi to read a few words aloud.
The software is built around the SnapPicam and includes a lot of the functionality already needed. OCR is largely a solved problem with Tesseract, and text-to-speech is easy with Festival.
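To give a feel for how little glue such a pipeline needs, here is a minimal sketch in Python. It is not [Markus]’ code: it assumes the `tesseract` and `festival` command-line tools are installed, and the function names and the image path are made up for illustration. Tesseract is asked to print the recognized text to standard output, and that text is then piped into Festival’s text-to-speech mode.

```python
import subprocess


def ocr_command(image_path):
    # "stdout" as the output base tells the tesseract CLI to print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout"]


def speak_command():
    # "festival --tts" speaks whatever text it receives on standard input.
    return ["festival", "--tts"]


def read_aloud(image_path):
    """Recognize the text in an image and speak it (hypothetical helper)."""
    result = subprocess.run(
        ocr_command(image_path), capture_output=True, text=True, check=True
    )
    text = result.stdout
    if text.strip():  # only speak when something was actually recognized
        subprocess.run(speak_command(), input=text, text=True, check=True)
    return text
```

On a device like this one, a call such as `read_aloud("capture.png")` would be triggered by the button in the hand grip after the camera saves a frame.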
Although [Markus] is just plugging a few existing software modules together, he’s come up with a device that is certainly unique and could be exceptionally useful to anyone with a vision impairment.
The Raspberry Pi Zero contest is presented by Hackaday and Adafruit. Prizes include Raspberry Pi Zeros from Adafruit and gift cards to The Hackaday Store!
Check out this brainy bot which [Jari] whipped up to dominate the Bookworm Deluxe scoreboard. The bot runs on top of a win32 machine, pulling screenshots to see the game board and simulating mouse clicks to play. The video after the jump shows that it plays like a champ, but it took some doing to get this far and [Jari] took the time to share all of the development details.
The hardest part of writing these types of bots is recognizing the game pieces. Check out all of the animation that’s going on in the still shot above… a lot of the tiles are obscured, there are different colors, and the tiles themselves shift as the bot spells and submits each word.
After some trial and error [Jari] settled on an image pre-processor which multiplies pixel values by themselves four times, then looks at each pixel with a 1/6 threshold to produce a black and white face for each tile. From there a bit of Optical Character Recognition compares each tile to a set of known examples. This works remarkably well, leading into the logic and dictionary part of the programming challenge.
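The pre-processing and matching steps can be sketched roughly like this. This is an illustration rather than [Jari]’s actual code: it assumes grayscale pixels in the 0 to 255 range, reads “multiplies pixel values by themselves four times” as raising the normalized value to the fourth power, and uses a simple count of differing pixels as the comparison. The function names and the tiny templates are made up.

```python
def binarize_tile(gray, power=4, threshold=1 / 6):
    """Turn a grayscale tile (2D list of 0..255 values) into a 0/1 face."""
    face = []
    for row in gray:
        face_row = []
        for value in row:
            boosted = (value / 255.0) ** power  # pushes mid-tones toward black
            face_row.append(1 if boosted > threshold else 0)
        face.append(face_row)
    return face


def match_tile(face, templates):
    """Return the label of the known example that differs in the fewest pixels."""
    def distance(a, b):
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

    return min(templates, key=lambda label: distance(face, templates[label]))


# Toy 2x2 "tiles"; real templates would be full-size letter faces.
templates = {"A": [[1, 1], [0, 0]], "B": [[0, 0], [1, 1]]}
tile = [[255, 200], [100, 0]]
letter = match_tile(binarize_tile(tile), templates)
```

Raising the values to a power before thresholding stretches the gap between bright and mid-gray pixels, which is one plausible reason such a step helps with the colored, animated tiles.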
Do you think this was easier or harder than the Bejeweled Blitz bot? That one was looking for specific pixel regions; this one is basically a focused roll-your-own OCR script.
Continue reading “Bookworm Playing Bot Tests Programmer’s OCR Skills”
[Chris] tried his hand at using Optical Character Recognition in his server power monitoring rig. The image above is what the IP camera used in the setup sees. He’s included a bright light to ensure that the contrast is as great as possible. After applying a threshold filter to the captured still, he is able to process the image to test all seven segments of every digit.
He uses Mathematica for the processing. We’re not familiar with the particulars of the language, but it’s easy enough to see the main parts of the program. Line six of his source code applies the image filters, and then the program loops through the assigned location of each digit, testing segment combinations to ascertain what number is shown. Things get hairy when it comes to the decimal point. We gather that the meter can show varying degrees of precision based on the total number of digits needed (like a digital multimeter). But [Chris’] setup has a difficult time reliably detecting that decimal point because of its size. He uses a shortcut to get around this: knowing that his server never pulls less than 300W, he corrects the output (by multiplying it by ten) if the reading is below that benchmark.
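For readers who don’t speak Mathematica, the same idea looks roughly like this in Python. This is a sketch under assumptions, not a port of [Chris’] program: it takes an already thresholded image as a 2D list of 0/1 values, samples one hypothetical pixel per segment (in the conventional a to g order), and applies the 300W shortcut at the end. All names and coordinates are made up.

```python
# Segment states in the order (a, b, c, d, e, f, g) mapped to the digit shown.
SEGMENT_PATTERNS = {
    (1, 1, 1, 1, 1, 1, 0): 0,
    (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4,
    (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6,
    (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}


def sample_segments(image, points):
    """Read the on/off state of each segment at its (row, col) sample point."""
    return tuple(image[row][col] for row, col in points)


def decode_digit(segments):
    """Return the digit for a segment tuple, or None if it matches nothing."""
    return SEGMENT_PATTERNS.get(tuple(segments))


def correct_reading(watts, minimum_watts=300):
    """Apply the shortcut: a reading below the known floor is scaled by ten."""
    return watts * 10 if watts < minimum_watts else watts


# Hypothetical sample points for one digit on a tiny 5x3 test image.
POINTS = [(0, 1), (1, 2), (3, 2), (4, 1), (3, 0), (1, 0), (2, 1)]
THREE = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
digit = decode_digit(sample_segments(THREE, POINTS))
```

In a real setup each digit would have its own list of sample points, and a `None` result from `decode_digit` would be a hint that the frame was captured mid-update or the threshold needs adjusting.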
Of course it would be easier to crack open the monitor and glean data electronically (that’s how the Tweet-A-Watt does it) but then [Chris] wouldn’t have had the fun of playing with OCR.
Meet GåågleBot. GåågleBot is a modified Roomba that will not only vacuum your home, but collect data while it does so. While it is carrying out its normal duties as a floor cleaner, it will take pictures, collecting and analyzing all the data for later searches. With built-in OCR, you can actually search for things using text strings.
Besides carrying out its normal job, it can also be remote controlled via the web. You can even control theirs!
[via Boing Boing]