Desktop Digitizer Makes Note Capture A Breeze

While it might seem quaint these days, we’ve met many makers and hackers who reach for a pen and a pad when learning something new or working their way through some technical problem. But even if you’re the type of person who thinks best when writing something out on paper, there’s still a good chance that you’ll eventually want to bring those notes and sketches into the digital realm. That’s where things can get a little tricky.

[Spencer Adams-Rand] recently wrote in with his clever solution for capturing written notes and pushing them into Notion, but the hardware design and digitization workflow are flexible enough that they could be adapted to your specific needs, especially since he was good enough to release all the files required to build your own version.

Whether they are hand-written notes, old photographs, or legal documents, digitization boils down to taking a high-resolution digital photo of the object and running it through the appropriate software. But getting good, consistent photos is the key, especially when you’re working your way through a lot of pages. [Spencer] started out just snapping pictures with his phone, but quickly found the process was less than ideal.

His custom scanning station addresses that first part of the problem: getting consistent shots. The images are captured using a Raspberry Pi 5 with attached Camera Module 3, while the 3D printed structure of the device makes sure that the camera and integrated lighting system are always in the same position. All he needs to do is place his notepad inside the cavity, hit the button, and it produces a perfect shot of the page.

Using a dedicated digitizing station like this would already provide better results than trying to freehand it with your phone or camera, but [Spencer] took things quite a bit further. The software side of the project puts a handy user interface on the 5-inch touchscreen built into the top of the scanner, while also providing niceties like a REST API and integration with the OpenAI Vision API for optical character recognition (OCR).

Those with an aversion to AI could certainly swap this out for something open source like Tesseract, but [Spencer] notes that not only is OpenAI’s OCR better at reading his handwriting, it spits out structured markdown-like data that’s easier to parse. From there it goes into the Notion API, but again, this could be replaced with whatever you use to collect your digital thoughts.
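
To make that last hop a little more concrete, here is a minimal sketch of how markdown-like OCR output might be turned into the block objects the Notion API accepts as page content. This is not [Spencer]’s code: the helper name `markdown_to_notion_blocks` and the tiny subset of markdown it understands are assumptions made purely for illustration.

```python
def _rich_text(content):
    """Wrap plain text in the rich_text list used by Notion block objects."""
    return [{"type": "text", "text": {"content": content}}]


def markdown_to_notion_blocks(markdown):
    """Convert a small markdown subset into Notion block dictionaries.

    Hypothetical helper: it only understands '# ' and '## ' headings and
    '- ' bullets, and treats every other non-blank line as a paragraph.
    """
    # Longer prefixes are listed first so '## ' is not mistaken for '# '.
    prefixes = [
        ("## ", "heading_2"),
        ("# ", "heading_1"),
        ("- ", "bulleted_list_item"),
    ]
    blocks = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no content
        block_type, text = "paragraph", line
        for prefix, candidate in prefixes:
            if line.startswith(prefix):
                block_type, text = candidate, line[len(prefix):].strip()
                break
        blocks.append({
            "object": "block",
            "type": block_type,
            block_type: {"rich_text": _rich_text(text)},
        })
    return blocks
```

The resulting list could then be sent as the `children` of a page-creation or append-children request to Notion, or serialized for whatever other note system you prefer.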

A device like this would go a long way towards answering a question we posed to the community back in January about the best way to digitize your documents.

A black and white device sits on a beige table. A white rotary knob projects out near the base of its rectangular shape nearest the camera. Near it is a black rectangular section of the enclosure with six white dots protruding through holes to form a braille display. A ribbon cable snakes out of the top of the enclosure and over the furthest edge of the device, presumably connecting to a camera on the other side of the device.

This Polaroid-esque OCR Machine Turns Text To Braille In The Wild

One of the practical upsides of improved computer vision systems and machine learning has been the ability of computers to translate text from one language or format to another. [Jchen] used this to develop Braille Vision, which can turn inaccessible text into braille on the go.

Using a headless Raspberry Pi 4 or 5 running Tesseract OCR, the device has a microswitch shutter to take a picture of a poster or other object. The device processes any text it finds and gives the user an audible cue when it is finished. A rotary knob on the back of the device then moves the braille display pad through each character. When the end of the message is reached, it then cycles back to the beginning.
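
The knob-and-cell behavior described above can be sketched in a few lines. This is an illustration rather than [Jchen]’s firmware: the names `BrailleReader`, `solenoid_states`, and `cell_to_unicode` are made up for this example, only lowercase letters and spaces are mapped using the standard six-dot letter patterns, and anything else is shown as a blank cell.

```python
# Raised dots (numbered 1-6) for the letters a-j; the rest derive from these.
_A_TO_J = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4},
    "j": {2, 4, 5},
}


def _build_table():
    """Build the letter-to-dots table for the 26 letters plus space."""
    table = {letter: frozenset(dots) for letter, dots in _A_TO_J.items()}
    base = "abcdefghij"
    for offset, letter in enumerate("klmnopqrst"):  # a-j plus dot 3
        table[letter] = frozenset(_A_TO_J[base[offset]] | {3})
    for offset, letter in enumerate("uvxyz"):  # a-e plus dots 3 and 6
        table[letter] = frozenset(_A_TO_J[base[offset]] | {3, 6})
    table["w"] = frozenset({2, 4, 5, 6})  # w does not follow the pattern
    table[" "] = frozenset()
    return table


BRAILLE = _build_table()


def solenoid_states(dots):
    """Return six booleans, one per dot, suitable for driving six outputs."""
    return tuple(dot in dots for dot in range(1, 7))


def cell_to_unicode(dots):
    """Map a set of dots to the matching Unicode braille pattern character."""
    return chr(0x2800 + sum(1 << (dot - 1) for dot in dots))


class BrailleReader:
    """Step through a message one cell at a time, wrapping at either end."""

    def __init__(self, text):
        if not text:
            raise ValueError("nothing to display")
        self.text = text.lower()
        self.position = 0

    def current_dots(self):
        # Unmapped characters fall back to a blank cell (an assumption).
        return BRAILLE.get(self.text[self.position], frozenset())

    def turn(self, clicks=1):
        """Move by `clicks` detents; negative values step backwards."""
        self.position = (self.position + clicks) % len(self.text)
        return self.current_dots()
```

Each detent of the knob would call `turn()`, and the modulo arithmetic gives the wrap-around at the end of the message for free. On real hardware the tuple from `solenoid_states()` would be written out to whatever pins drive the six solenoids.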

Development involved breadboarding an Arduino hooked up to some MOSFETs to drive the solenoids for the braille display until the system worked well enough to solder together with wires and perfboard. Everything is housed in a 3D printed shell that appears similar in size to an old Polaroid instant camera.

We’ve seen a vibrating braille output prototype for smartphones and how blind makers are using 3D printing, and we’re still wondering whatever happened with “tixel” displays. If you’re new to braille, try 3D printing your own trainer out of TPU.


Convert Any Book To A DIY Audiobook?

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. That said, any Pi that can connect to a camera should also work with just a little configuration.

On the software side, [Nick] pulls in the CV2 library (which is the interface to OpenCV) to handle the camera interfacing, setting it to full HD resolution. Google’s GenAI is used to interface with the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.

Finally, the script hands that text over to Piper, which turns that into a speech file in WAV format. This can then be played to an audio device with a call out to the console aplay tool. It’s all very simple at this level of abstraction.
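
For a feel of how those three stages fit together, here is a rough sketch of the capture, transcribe, and speak steps. It is not [Nick]’s script. It assumes the `opencv-python` and `google-genai` packages are installed, that `piper` and `aplay` are on the PATH, and that an API key lives in a `GEMINI_API_KEY` environment variable. The prompt, the voice model filename, and the function names are placeholders chosen for this example.

```python
import os
import subprocess


def capture_jpeg(device_index=0, width=1920, height=1080):
    """Grab one frame from a webcam and return it as JPEG bytes."""
    import cv2  # imported lazily so the other helpers work without OpenCV

    camera = cv2.VideoCapture(device_index)
    try:
        # Request full HD; the driver may silently pick something else.
        camera.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        camera.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        ok, frame = camera.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        ok, encoded = cv2.imencode(".jpg", frame)
        if not ok:
            raise RuntimeError("could not encode the frame as JPEG")
        return encoded.tobytes()
    finally:
        camera.release()


def transcribe(jpeg_bytes, model="gemini-2.5-flash"):
    """Send the image plus a short prompt to Gemini and return the text."""
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"),
            "Transcribe all of the text on this page.",  # placeholder prompt
        ],
    )
    return response.text


def piper_command(voice_model, wav_path):
    """Build the Piper command line; the text itself is piped in on stdin."""
    return ["piper", "--model", voice_model, "--output_file", wav_path]


def speak(text, voice_model="en_US-lessac-medium.onnx", wav_path="page.wav"):
    """Synthesize `text` to a WAV file with Piper, then play it with aplay."""
    subprocess.run(
        piper_command(voice_model, wav_path),
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(["aplay", wav_path], check=True)
```

Chaining the pieces is then a one-liner, `speak(transcribe(capture_jpeg()))`, which is roughly the level of glue the write-up describes.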


Smart Glasses Read Text

You normally think of smart glasses as something you wear as either an accessory or, if you need a little assistance, with corrective lenses. But [akhilnagori] has a different kind of smart eyewear. These glasses scan text and read it aloud into the user’s ear.

This project was inspired by a blind child who enjoyed listening to stories but could not read beyond a few braille books. The glasses perform the reading using a Raspberry Pi Zero 2 W and a machine learning algorithm.


Make Your Bookshelf Clickable

We’ll confess that we have a fondness for real books and plenty of them. So does [James], and he decided he needed a way to take a picture of his bookshelves and make each book clickable to find more information. This is one of those things that sounds fairly simple until you decide to do it. You can try an example of the results and then go back and read about the journey it took to get there.

There are several subtasks involved. First, you want to identify each book’s envelope. It wouldn’t do to click on the Joy of Cooking and get information about Remembrance of Things Past.

The next challenge is reading the title of the book. This can be tricky. Fonts differ. The book could be upside down. Some titles run across the spine, but most go vertically. The remainder of the task is fairly easy. If you know the region and the title, you can easily find a link (for Google Books, in this case) and build an SVG overlay that maps the areas for each book to the right link.
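
That final step is simple enough to sketch. Assuming the earlier stages have already produced a title and an outline polygon for each spine, something like the following would generate the overlay. This is not [James]’s implementation: the function names are invented here, and the search-style Google Books URL is an assumption standing in for however the real project looks up its links.

```python
from urllib.parse import quote_plus
from xml.sax.saxutils import escape, quoteattr


def book_link(title):
    """Hypothetical link builder: a Google Books search for the title."""
    return "https://www.google.com/search?tbm=bks&q=" + quote_plus(title)


def build_overlay(width, height, books):
    """Return an SVG string with one clickable polygon per book.

    `books` is a list of (title, points) pairs, where `points` is a list
    of (x, y) pixel coordinates outlining that spine in the photo.
    """
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">'
    ]
    for title, points in books:
        coords = " ".join(f"{x},{y}" for x, y in points)
        parts.append(
            # quoteattr() adds the quotes and escapes '&' in the URL.
            f"<a href={quoteattr(book_link(title))}>"
            f"<title>{escape(title)}</title>"
            # A transparent fill keeps the photo visible but still clickable.
            f'<polygon points="{coords}" fill="transparent"/>'
            "</a>"
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

If the `viewBox` matches the pixel dimensions of the photo, the SVG can be stacked on top of the image and scaled with it, so each polygon stays lined up with its book.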


You’ve Got Mail: Reading Addresses With OCR

Last time I delivered on this column, I told you about the USPS’ attempts to fully automate a post office. Of course, that’s a bit of a misnomer, since it took 1,500 employees to actually operate the place on a daily basis. Although Project Turnkey in Rhode Island and Project Gateway in California were proving grounds for all kinds of mail sorting and processing equipment, the act of actually reading addresses and routing mail to its final destination still required human intervention and hand coding.

Today, the post office processes hundreds of millions of mail pieces each day using various pieces of equipment. One of those important pieces of equipment is the OCR address reader, which manages to make sense of all kinds of chicken scratch.


Immersive Cursive: Growing Up Loopy

Growing up, ours was a family of handwritten notes for every occasion. The majority were left on the kitchen counter next to the sink, or in a particular spot on the all-purpose table in the breakfast nook. Whether one was professing their familial love and devotion on the back of a Valpak coupon, or simply communicating an intent to be home before dinnertime, the words were generally immortalized in BiC on whatever paper was available, and timestamped for the reader’s information. You may have learned cursive in school, but I was born in it — molded by it. The ascenders and descenders betray you because they belong to me.

Both of my parents always seemed to be incapable of printing in anything other than all caps, so I actually preferred to see their cursive most of the time. As a result, I could copy, er, read it quite easily from an early age. Well, I don’t think I ever had any hope of imitating Dad’s signature. But Mom’s, on the other hand: like I said in the first installment, it was important for my signature to be distinct from hers, given that we have the same first, middle, and last name. But I could probably still bust out her signature if it came down to something going on my permanent record.

While my handwriting was sort of naturally headed towards Mom’s, I was more interested in Dad’s style and that of my older brother. He had small caps handwriting down to an art, and my attempts to copy it have always looked angry and stilted by comparison. In addition, my brother’s cursive is lovely and quick, while still being legible.
