Inside Project Silica, Now On Bakeware

You see it all the time in science fiction: the heroes find old data, read it, and learn how to save the day. But how realistic is that? Forget aliens. Could you read a stack of punch cards or a 9-track tape right now? Probably not, and those are just a handful of decades in the past. Fast forward a few centuries, and punch cards will decay, and tapes will lose their coating. More modern storage is just as bad. It simply isn’t made to last for thousands of years. Microsoft has Project Silica, which aims to store data in quartz glass with a potential lifetime of many thousands of years.

As you might expect, this is a write-once technology. Lasers write the data, and polarization-sensitive microscopes read it back. Electromagnetic fields don’t matter. You can’t accidentally change the data while reading. A square glass platter the size of a DVD can hold about 7 TB of data.

While the program is not new, the team has recently published results using ordinary borosilicate glass (the material your Pyrex baking dish is made from) as a storage medium. They say writing is also more efficient, and reading now requires only one camera instead of the three in the original system. The paper describes birefringent voxel writing, phase voxels, and more.

Obviously, this isn’t for the casual project. But we have to wonder if hackers could do something similar with lower densities, for example. Unlike other methods we’ve seen, no DNA is involved.

28 thoughts on “Inside Project Silica, Now On Bakeware”

    1. Yeah, a trained operator could probably hand-read a card that’s too damaged to read by machine. I’ll bet that’s how they would preserve antique Jacquard loom cards now.

      1. Correct to: old trained operator. Many teenagers today have not experienced punch cards other than some side aspect. My son the computer geek did not know the holes in his paycheck meant something to an IBM 407.

      2. I think most of us on Hackaday could rig up a punch card reader. There was a paper tape reader in the ’70s that was just a line of 9 photodiodes on a PCB, with a bent wire for a tape guide. Cards would require a few more photodiodes and some indexing method [since cards don’t have a clock track (the sprocket holes on a tape)], but it shouldn’t be hard.

        1. Different reader models had different synchronisation methods which basically did the same thing. Register the left and/or top edges, the one with the sliced off corner, and high precision feed timing.

          IBM punch cards had a column number track along the bottom which might be usable for a DIY solution.

        2. Assuming a punch card or ten last for three thousand years (we have found a few documents written in clay that old, so a big maybe), reading the cards is not going to be the main problem. Figuring out what the encoding means will be the challenge. We know (or can look up) what those lights glowing through the holes in the cards mean today, but who will remember c. 5026 that “01001000 01100101 01101100 01101100 01101111” is a binary equivalent of “Hello”? Hm… is that big-endian or little-endian, ASCII or EBCDIC or something weird? Oh well. I’m sure those future archeologists will understand that my 1975 pay stub isn’t an ancient oracle of the forthcoming darkness and destruction.
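The encoding question in the comment above is easy to demonstrate. The sketch below is only an illustration, not anything from the Project Silica paper: it takes the quoted bit string, assumes each 8-bit group is written most-significant bit first, and decodes the same bytes once as ASCII and once as EBCDIC. Python's `cp037` codec stands in for EBCDIC here, and it is just one of several EBCDIC code pages, so that choice is an assumption too.

```python
# The bit string quoted in the comment, one 8-bit group per character.
BITS = "01001000 01100101 01101100 01101100 01101111"


def bits_to_bytes(bit_string: str) -> bytes:
    """Convert space-separated 8-bit groups to bytes.

    Assumption: each group is written most-significant bit first.
    """
    return bytes(int(group, 2) for group in bit_string.split())


raw = bits_to_bytes(BITS)

# Same bytes, two interpretations.
as_ascii = raw.decode("ascii")
# Assumption: cp037 is used as a representative EBCDIC code page.
as_ebcdic = raw.decode("cp037")

print("ASCII :", as_ascii)
# ascii() escapes non-ASCII characters so the output prints on any terminal.
print("EBCDIC:", ascii(as_ebcdic))
```

Without a note saying which table applies, a future reader has no way to tell which of the two results was intended, which is the commenter's point.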

  1. Call me when I don’t have to pay Microsoft or (insert vendor name) to read/write/store my own dang data. Ownership of myself, and my identity, and my data is more important to me than longevity at the moment. I’m looking at you AI training vampires.

  2. Is this better than the 21TB on 186 reels of film ?

    “GitHub Arctic Code Vault preserves open-source software by storing snapshots of all public repositories on specialized, 1,000-year archival film (piqlFilm)”

    I suspect that in 1000 years one will still be fully readable, even if we as a species head the way of Idiocracy. While the Microsoft proprietary solution will probably work just as well as trying to restore a Microsoft DOS 3.3 backup under Microsoft DOS 6.22. Or attempting to open a Microsoft Word 7.0 (Office 95) document in Microsoft Word 2003 (Microsoft Office 2003).
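For a sense of scale, the two capacity figures in this discussion can be put side by side. This is back-of-the-envelope arithmetic only, using the article's roughly 7 TB per platter and the commenter's 21 TB on 186 reels. It assumes both numbers describe comparable usable capacity, and it says nothing about durability or about how readable either format will be.

```python
import math

PLATTER_TB = 7.0       # per glass platter, from the article
FILM_TOTAL_TB = 21.0   # whole film archive, from the comment
FILM_REELS = 186       # number of reels, from the comment

# Average capacity of one reel of film.
tb_per_reel = FILM_TOTAL_TB / FILM_REELS

# Platters needed to hold the same archive, rounded up to whole platters.
platters_needed = math.ceil(FILM_TOTAL_TB / PLATTER_TB)

# How many reels' worth of data one platter would hold.
reels_per_platter = PLATTER_TB / tb_per_reel

print(f"About {tb_per_reel * 1000:.0f} GB per reel")
print(f"Platters for the same archive: {platters_needed}")
print(f"Reels per platter: about {reels_per_platter:.0f}")
```

On these figures the whole film archive would fit on three platters, so the real argument is about format longevity rather than density.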

  3. Minor nit (but those are the ones I love to pick): pyrex (lower case), sadly, is tempered window glass that isn’t worth spit. PYREX is borosilicate that hasn’t been made for quite some years. Double-sadly, I used to tell students that a PYREX beaker containing boiling water could be plunged into an ice bath with no danger of fracture. Except that beakers aren’t always made of borosilicate anymore. Perhaps that will be changed when enough students are injured by shattered window-glass beakers.

  4. Doesn’t anyone here remember the Long Now Project?

    In the last 30-odd years, I’ve lived through the shift from “if you want to preserve digital information, print it out” to “if you want to preserve analog information, digitize it and store it online”. I really hope I don’t live to see a collapse that reverses that.

  5. It’d be a nice challenge to make anything last a millennium. Where you may need stone tablets to explain the value of the data stored in glass, and explanations on how to build the machine to read the data. Maybe it needs an in between density, as well. Where each step introduces more complex concepts about light, mathematics, electronics, and optics. 7TB per piece of glass is quite remarkable.

    I’m wondering, though. What data is worth storing for a thousand years? Most data out there is not worth much, other data is constantly changing. Code is updated, wikis are modified. So not that useful for Microsoft’s ‘cloud’. Of course there are books and other immutable reference material, but it feels quite niche. An immutable dataset of cat pictures, or the weights of the largest LLM we have built as humans.

    1. “other data is constantly changing. Code is updated, wikis are modified.”

      And just how often do folks today wish we had more of this historic data, in a snapshot from the POV of that moment in history? Or get excited to see the lower-level problem solving that we have largely forgotten how to do without our (insert applicable higher-level/abstracted tools here), as seen in the miracles those old games pulled off on hardware that would now be considered a microprocessor, not a real computer!

      Being able to see the steps taken in the past could also be the saviour of open source projects in the future – assume something as complex as the GNU tools + Linux kernel + Gnome (etc) lasts 300 years. At some point the old original code and structures have been eroded to the point that the remnants are just not obvious, quite possibly in a programming language only the compilers still understand, as nobody left alive is likely to have that arcane knowledge. So when something breaks around them…

      Being able to view the ArchWiki and kernel sources etc of today, maybe even the ‘teach yourself C’ book etc, will probably point you in the right direction, and it’s still going to be way easier to build on the well-developed framework than to really try to start again.

      Also worth remembering just how many skills have been largely if not entirely lost. Examples that still survive for now, but perhaps not for long, might be blowing your own lab glass, coopering, and probably just about all the carpentry that doesn’t rely upon glues, ply/particle boards’ inherent stability, and CNC cutters’ insane precision. At some point in future that old method might just be worth reviving or reinventing to solve a current problem.

    2. Sadly, I bet the vast bulk of stuffs (aka “data”) generated today worldwide falls somewhere between propaganda and lolcats, with statistically insignificant splashes of things like quantum physics or history : – [
