MNIST Clock Uses Famous Training Database

When training neural networks to recognise things, what you need is a big pile of training data. You then need a second, separate pile of testing data to verify that the network is working as you’d expect. In the field of handwriting recognition, the MNIST database is commonly used to train networks on handwritten numerals. After [Evan Pu] mentioned it would be fun to use this data to create a clock, [Dheera Venkatraman] got down to work.

The original sketch which inspired the build.

The MNIST database contains 60,000 training images and 10,000 test images. [Dheera] selected an ESP32 to run the project, which packs 4MB of flash storage – more than enough to store the testing database, which comes to a little under 2MB at 196 bytes per numeral. The ESP32 also gives the project network connectivity, allowing the clock to use Network Time Protocol to stay synchronised – thus eliminating the need for an external RTC. Digits are displayed on four separate e-ink displays, which fits well with the hand-drawn aesthetic. It also means the clock doesn’t unduly light up the room at night.
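The storage figures are easy to sanity-check. The sketch below is only an illustration, not [Dheera]'s firmware: it assumes the standard 28×28 MNIST image size, infers from the 196-byte figure that each pixel is stored in 2 bits (four pixels per byte), and treats 4MB as 4 × 1024 × 1024 bytes. The `pack_2bpp` helper is a hypothetical name and one of several possible bit layouts.

```python
# Assumptions (not confirmed by the project): 28x28 images, 2 bits per pixel,
# and "4MB" meaning 4 * 1024 * 1024 bytes of flash.
WIDTH = HEIGHT = 28
BYTES_PER_DIGIT = 196
TEST_IMAGES = 10_000
FLASH_BYTES = 4 * 1024 * 1024

pixels_per_digit = WIDTH * HEIGHT                        # 784 pixels
bits_per_pixel = BYTES_PER_DIGIT * 8 // pixels_per_digit  # 1568 // 784
test_set_bytes = TEST_IMAGES * BYTES_PER_DIGIT
flash_fraction = test_set_bytes / FLASH_BYTES


def pack_2bpp(pixels):
    """Pack 8-bit greyscale pixels into 2 bits each, four pixels per byte.

    Hypothetical layout: the first pixel of each group lands in the
    most significant bits of the byte.
    """
    if len(pixels) % 4:
        raise ValueError("pixel count must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(pixels), 4):
        byte = 0
        for p in pixels[i:i + 4]:
            byte = (byte << 2) | (p >> 6)  # keep the top 2 bits of each pixel
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    print(f"{bits_per_pixel} bits per pixel")
    print(f"test set: {test_set_bytes:,} bytes "
          f"({flash_fraction:.0%} of flash)")
    print(f"one packed digit: {len(pack_2bpp([0] * pixels_per_digit))} bytes")
```

Under those assumptions the whole test set takes a little under half of the flash, leaving room for the firmware itself.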

It’s a fun project that will likely draw a knowing chuckle from those working in the field of handwriting recognition. We’d love to have one on our desk, too. If you’re thinking of attempting a build yourself, check out our recent contest for more inspiration!

6 thoughts on “MNIST Clock Uses Famous Training Database”

  1. It’d be nice if it pulled the numbers sequentially rather than at random.

    I believe it’d take well under a week to cycle through the set (provided the 2 shown in, say, hour 20 isn’t the same image as the 2 shown at minute 2).

      1. If my calculations and code are correct, and assuming no repeats, it comes out to a little over 40 days. I didn’t check the database and just assumed it had an equal number of images (7000, training and testing combined) for every digit. I also assumed the image would only change when the digit actually changed. With those assumptions I got the maximum number of images for any digit to be 172 per day (for both 1 and 2), and 7000/172 is around 40.7.

        Here’s the code if you want to check it:
        https://gist.github.com/FPiorski/867d428aa2f2e0776ee03d712a99fbbe
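For readers who would rather not follow the link, the "only change the image when the digit changes" model can be reproduced in a few lines. This is an independent sketch, not the commenter's gist: it assumes a 24-hour HH:MM display and counts one new image each time a position shows a different digit from the previous minute. The function name `changes_per_day` is made up for illustration.

```python
from collections import Counter


def changes_per_day():
    """Count how many new images each digit needs over one 24-hour day,
    drawing a new image only when the digit at a position changes."""
    counts = Counter()
    prev = "2359"  # what the display showed just before midnight
    for minute in range(24 * 60):
        cur = f"{minute // 60:02d}{minute % 60:02d}"
        for pos in range(4):
            if cur[pos] != prev[pos]:
                counts[cur[pos]] += 1
        prev = cur
    return counts


if __name__ == "__main__":
    counts = changes_per_day()
    for digit in sorted(counts):
        print(digit, counts[digit])
    busiest = max(counts.values())
    # 7000 images per digit is the commenter's assumption, not a checked figure.
    print(f"busiest digit needs {busiest} images/day, "
          f"so 7000 last about {7000 / busiest:.1f} days")
```

With these conventions the busiest digits need 172 images a day, matching the figure above, although in this sketch 0 reaches that count as well as 1 and 2; a different display convention could explain the discrepancy.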

        1. Thanks, I ended up calculating it yet another way, with a totally different answer.

          I imagine the least used digit would be 9, i.e. once all the 9’s are consumed so has every other number. Each hour it appears six times in the minutes (09, 19 … 59), and twice a day in the hours (09 and 19). Assuming the image changes with every minute, that’s 60 x 2 for hours and 24 x 6 for minutes, so 264.

          If there are 60k images then there are 6k “9”s, at 264 per day I got 23 days.

          I think the difference is that in my calculation 19:00 and 19:01 use different 9’s, so each would look uniquely handwritten.

          If 19:00 and 19:01 use the same 9 then there are 144 9’s consumed per day and I get 42 days (41.666).

          Interesting anyways!
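The "fresh image every minute" model in this last comment can be checked the same way. The sketch below is illustrative only and assumes a 24-hour HH:MM display where every position draws a new image each minute; `uses_per_day` is a hypothetical helper, and the 6,000 images per digit is the commenter's figure rather than a verified count.

```python
def uses_per_day(digit):
    """Count how often `digit` is on screen across all four positions,
    sampled once per minute over a 24-hour day."""
    d = str(digit)
    return sum(
        f"{minute // 60:02d}{minute % 60:02d}".count(d)
        for minute in range(24 * 60)
    )


if __name__ == "__main__":
    IMAGES_PER_DIGIT = 6000  # commenter's assumption
    for digit in range(10):
        used = uses_per_day(digit)
        print(f"{digit}: {used} per day, "
              f"{IMAGES_PER_DIGIT / used:.1f} days of images")
```

For 9 this gives 264 uses a day, in line with the comment. Under the same assumptions 0 and 1 are used far more often, so they would run out first, which fits the original "well under a week" guess.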
