EasyOCR Makes OCR, Well, Easy

Working on embedded systems used to be easier. You had a microcontroller and maybe a few pieces of analog or digital I/O, and perhaps communications might be a serial port. Today, you have systems with networks and cameras and a host of I/O. Cameras are strange because sometimes you just want an image and sometimes you want to understand the image in some way. If understanding the image involves reading text in the picture, you will want to check out EasyOCR.

The Python library leverages other open source libraries and supports 42 different languages. As the name implies, using it is pretty easy. Here’s the setup:


import easyocr
reader = easyocr.Reader(['th','en'])
reader.readtext('test.jpg')

The results include four points that define the bounding box of each piece of text, the text, and a confidence level. The code takes advantage of the GPU, but you can run it in a CPU-only mode if you prefer. There are a few other options, including setting the algorithm’s scanning behavior, how it handles multiple processors, and how it converts the image to grayscale. The results look impressive.

According to the project’s repository, they incorporated several existing neural network algorithms and conventional algorithms, so if you want to dig into details, there are links provided to both code and white papers. If you need some inspiration for what to do with OCR, maybe this past project will give you some ideas. Or you could cheat at games.

Hackaday Podcast 035: LED Cubes Taking Over, Ada Vanquishes C Bugs, Rad Monitoring Is Hot, And 3D Printing Goes Full 3D

Hackaday Editors Mike Szczys and Elliot Williams get caught up on the most interesting hacks of the past week. On this episode we take a deep dive into radiation-monitor projects, both Geiger tube and scintillator based, as well as LED cube projects that pack pixels onto six PCBs with parts counts reaching into the tens of thousands. In the 3D printing world we want non-planar printing to be the next big thing. Padauk microcontrollers are small, cheap, and do things in really interesting ways if you don’t mind embracing the ecosystem. And what’s the best way to read a water meter with a microcontroller?

Take a look at the links below if you want to follow along, and as always tell us what you think about this episode in the comments!

Take a look at the links below if you want to follow along, and as always, tell us what you think about this episode in the comments!

Direct download (60 MB or so.)

Continue reading “Hackaday Podcast 035: LED Cubes Taking Over, Ada Vanquishes C Bugs, Rad Monitoring Is Hot, And 3D Printing Goes Full 3D”

Rock Out To The Written Word With BookSound

With his latest project, [Roni Bandini] has simultaneously given the world a new type of audiobook and music. Traditional audiobooks are basically the adult equivalent of having somebody read you a bedtime story, but BookSound actually turns the written word into electronic music. You won’t be able to boast to your friends that as a matter of fact, you have read that popular new novel, but at least you might be able to dance to it.

[Roni] says he’s still working on perfecting the word to music mapping, so the results shown in the video after the break are still a bit rough. But even in these early stages there’s no denying this is an exceptionally unique project, and we’re excited to see where it goes from here.

Inside the classy looking 3D printed enclosure is a Raspberry Pi, an OLED display, and the button and switch which make up the extent of the device’s controls. At the end of the arm is a standard Raspberry Pi Camera module, which gives the BookSound a bird’s eye view of the book to be songified.

To turn your favorite book into electronic beats, simply open it up, put it under the gaze of BookSound, and press the button on the front. Because the Raspberry Pi isn’t exactly a powerhouse, it takes about two minutes for it to scan the page, perform optical character recognition (OCR), and compose the track before you start to hear anything.

If you’re wondering what the secret sauce is to turn words into music, [Roni] isn’t ready to share his source code just yet. But he was able to give us a few high-level explanations of what’s going on inside BookSound. For example, to generate the song’s BPM, the software will count how many words per paragraph are on the page: so a book with shorter paragraphs will consequently have a faster tempo to match the speed at which the author is moving through ideas. Similarly, drum kicks are generated based on the number of syllables in each paragraph. In the future, he’s looking at adding “lyrics” by running commonly used words on the page through a text to speech engine and inserting them into the beat.

We’ve seen practical applications of OCR on the Raspberry Pi in the past and even similar looking book scanning arrangements. But nothing quite like BookSound before, which at this point, is really saying something.

Continue reading “Rock Out To The Written Word With BookSound”

A Whole Other Kind Of Graphical Programming

Java isn’t everyone’s cup of tea. With all its boilerplate and overhead, you’re almost always better off with a proper IDE that handles everything under the hood for you. However, if you learn a new language, you don’t really want to be bothered setting up a clunky and complex IDE. If only you could use a simple, standard Windows program that you are most likely already familiar with. This wish led [RubbaBoy] to create the MSPaintIDE, a Java development environment that let’s you write your code in — yes — MS Paint.

If you’re thinking now that you will end up writing your program with MS Paint’s text tool and create a regular image file from it — then you are right. Once set up, MSPaintIDE will compile all your PNG source files into a regular Java JAR file. And yes, it has syntax highlighting and a dark theme. [RubbaBoy] uses a custom-made OCR to transform the image content into text files and wraps it all into few-button-click environment — including git integration. You can see a demonstration of it in the video after the break, and find the source code on GitHub.

One has to truly admire how far [RubbaBoy] went, considering the tongue-in-cheek nature of this project. And all joking aside, if you’re interested in OCR, this might just be simple enough to begin with. Or you could expand it with some text to speech functionality.
Continue reading “A Whole Other Kind Of Graphical Programming”

Computer Vision For PCB Layout

One of the big problems with doing PCB layout is finding a suitable footprint for the components you want to use. Most tools have some library although — of course — some are better than others. You can often get by with using some generic footprint, too. That’s not handy for schematic layout, though, because you’ll have to remember what pin goes where. But if you can’t find what you are looking for SnapEDA is an interesting source of components available for many different layout tools. What really caught our eye though was a relatively new service they have that uses computer vision and OCR to generate schematic symbols directly from a data sheet. You can see it work in the video below.

The service seems to be tied to parts the database already knows about. and has a known footprint available. As you’ll see in the video, it will dig up the datasheet and let you select the pin table inside. The system does OCR on that part of the datasheet, lets you modify the result, and add anything that it missed.

Continue reading “Computer Vision For PCB Layout”

People With Dementia Can DRESS Smarter

People with dementia have trouble with some of the things we take for granted, including dressing themselves. It can be a remarkably difficult task involving skills like balance, pattern recognition inside of other patterns, ordering, gross motor skill, and dexterity to name a few. Just because something is common, doesn’t mean it is easy. The good folks at NYU Rory Meyers College of Nursing, Arizona State University, and MGH Institute of Health Professions talked with a caregiver focus group to find a way for patients to regain their privacy and replace frustration with independence.

Although this is in the context of medical assistance, this represents one of the ways we can offload cognition or judgment to computers. The system works by detecting movement when someone approaches the dresser with five drawers. Vocal directions and green lights on the top drawer light up when it is time to open the drawer and don the clothing inside. Once the system detects the article is being worn appropriately, the next drawer’s light comes one. A camera seeks a matrix code on each piece of clothing, and if it times out, a caregiver is notified. There is no need for an internet connection, nor should one be given.

Currently, the system has a good track record with identifying the clothing, but it is not proficient at detecting when it is worn correctly, which could lead to frustrating false alarms. Matrix codes seemed like a logical choice since they could adhere to any article of clothing and get washed repeatedly but there has to be a more reliable way. Perhaps IR reflective threads could be sewn into clothing with varying stitch lengths, so the inside and outside patterns are inverted to detect when clothing is inside-out. Perhaps a combination of IR reflective and absorbing material could make large codes without being visible to the human eye. How would you make a machine-washable, machine-readable visual code?

Helping people with dementia is not easy but we are not afraid to start, like this music player. If matrix codes and barcodes get you moving, check out this hacked scrap-store barcode scanner.

Thank you, [Qes] for the tip.

DIY Text-to-Speech With Raspberry Pi

We can almost count on our eyesight to fail with age, maybe even past the point of correction. It’s a pretty big flaw if you ask us. So, how can a person with aging eyes hope to continue reading the printed word?

There are plenty of commercial document readers available that convert text to speech, but they’re expensive. Most require a smart phone and/or an internet connection. That might not be as big of an issue for future generations of failing eyes, but we’re not there yet. In the meantime, we have small, cheap computers and plenty of open source software to turn them into document readers.

[rgrokett] built a RaspPi text reader to help an aging parent maintain their independence. In the process, he made a good soup-to-nuts guide to building one. It couldn’t be easier to use—just place the document under the camera and push the button. A Python script makes the Pi take a picture of the text. Then it uses Tesseract OCR to convert the image to plain text, and runs the text through a speech synthesis engine which reads it aloud. The reader is on as long as it’s plugged in, so it’s ready to work at the push of a button. We can probably all appreciate such a low-hassle design. Be sure to check out the demo after the break.

If you wanted to use this to read books, you’d still have to turn the pages yourself. Here’s a BrickPi reader that solves that one.

Continue reading “DIY Text-to-Speech With Raspberry Pi”