Working on embedded systems used to be easier. You had a microcontroller and maybe a few pieces of analog or digital I/O, and perhaps communications might be a serial port. Today, you have systems with networks and cameras and a host of I/O. Cameras are strange because sometimes you just want an image and sometimes you want to understand the image in some way. If understanding the image involves reading text in the picture, you will want to check out EasyOCR.
The Python library leverages other open source libraries and supports 42 different languages. As the name implies, using it is pretty easy. Here’s the setup:
import easyocr reader = easyocr.Reader(['th','en']) reader.readtext('test.jpg')
The results include four points that define the bounding box of each piece of text, the text, and a confidence level. The code takes advantage of the GPU, but you can run it in a CPU-only mode if you prefer. There are a few other options, including setting the algorithm’s scanning behavior, how it handles multiple processors, and how it converts the image to grayscale. The results look impressive.
According to the project’s repository, they incorporated several existing neural network algorithms and conventional algorithms, so if you want to dig into details, there are links provided to both code and white papers. If you need some inspiration for what to do with OCR, maybe this past project will give you some ideas. Or you could cheat at games.