Inside Project Silica, Now On Bakeware

You see it all the time in science fiction: the heroes find old data, read it, and learn how to save the day. But how realistic is that? Forget aliens. Could you read a stack of punch cards or a 9-track tape right now? Probably not, and those are just a handful of decades in the past. Fast forward a few centuries, and punch cards will decay, and tapes will lose their coating. More modern storage is just as bad. It simply isn’t made to last for thousands of years. Microsoft has Project Silica, which aims to store data in quartz glass with a potential lifetime of many thousands of years.

As you might expect, this is a write-once technology. Lasers write the data, and polarization-sensitive microscopes read it back. Electromagnetic fields don’t matter. You can’t accidentally change the data while reading. A square glass platter the size of a DVD can hold about 7 TB of data.

While the program is not new, the team has recently published results using ordinary borosilicate glass (the kind your Pyrex baking dish is made from) as a storage medium. They say writing is also more efficient, and reading now requires only one camera instead of the three in the original system. The paper describes birefringent voxel writing, phase voxels, and more.

Obviously, this isn’t for the casual project. But we have to wonder if hackers could do something similar with lower densities, for example. Unlike other methods we’ve seen, no DNA is involved.

35 thoughts on “Inside Project Silica, Now On Bakeware”

    1. Yeah, a trained operator could probably hand-read a card that’s too damaged to read by machine. I’ll bet that’s how they would preserve antique Jacquard loom cards now.

      1. Correct to: old trained operator. Many teenagers today have not experienced punch cards other than some side aspect. My son the computer geek did not know the holes in his paycheck meant something to an IBM 407.

      2. I think most of us on Hackaday could rig up a punch card reader. There was a paper tape reader in the ’70s that was just a line of 9 photodiodes on a PCB, with a bent wire for a tape guide. Cards would require a few more photodiodes and some indexing method [since cards don’t have a clock track (the sprocket holes on a tape)], but it shouldn’t be hard.

        1. Different reader models had different synchronisation methods which basically did the same thing. Register the left and/or top edges, the one with the sliced off corner, and high precision feed timing.

          IBM punch cards had a column number track along the bottom which might be usable for a DIY solution.
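For anyone tempted by the DIY reader idea above, here is a minimal sketch of the decoding step, assuming 12 photodiodes (one per card row) and the common IBM zone/digit layout for letters and digits only. The function name `decode_column` and the row-set input format are hypothetical, and punctuation codes are deliberately left out.

```python
# Hypothetical sketch: turn the set of rows where light passed through
# a hole into one character. Rows are named 12, 11, 0 (zone rows) and
# 1-9 (digit rows). Only blanks, digits, and A-Z are handled here.

def decode_column(punched_rows):
    rows = set(punched_rows)
    if not rows:
        return " "  # no holes: blank column
    if len(rows) == 1:
        row = next(iter(rows))
        # A single punch in rows 0-9 is that digit.
        return str(row) if 0 <= row <= 9 else "?"
    if len(rows) == 2:
        # Zone 12 + 1..9 -> A..I, zone 11 + 1..9 -> J..R, zone 0 + 2..9 -> S..Z
        for zone, first_letter, first_digit in ((12, "A", 1), (11, "J", 1), (0, "S", 2)):
            if zone in rows:
                (digit,) = rows - {zone}
                if first_digit <= digit <= 9:
                    return chr(ord(first_letter) + digit - first_digit)
                return "?"
    return "?"  # anything else (punctuation, multi-punch) is not covered


def decode_card(columns):
    """Decode a list of per-column row sets into a string."""
    return "".join(decode_column(col) for col in columns)
```

Feeding it the columns `[{12, 8}, {12, 5}, {11, 3}, {11, 3}, {11, 6}]` spells a five-letter greeting in upper case. Registration and feed timing, the hard part mentioned above, are not modeled at all.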

        2. Assuming a punch card or ten lasts for three thousand years (we have found a few documents written in clay that old, so a big maybe), reading the cards is not going to be the main problem. Figuring out what the encoding means will be the challenge. We know (or can look up) what those lights glowing through the holes in the cards mean today, but who will remember c. 5026 that “01001000 01100101 01101100 01101100 01101111” is a binary equivalent of “Hello”? Hm… is that big-endian or little-endian, ASCII or EBCDIC or something weird? Oh well. I’m sure those future archeologists will understand that my 1975 pay stub isn’t an ancient oracle of the forthcoming darkness and destruction.
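The encoding problem in the comment above is easy to demonstrate. The sketch below takes the same bit string and decodes it under a few guesses: ASCII, one EBCDIC variant (Python's built-in `cp037` codec, chosen here only as a representative), and ASCII with the bit order in each byte reversed. The helper names are made up for illustration.

```python
# The bit string quoted in the comment above.
BITS = "01001000 01100101 01101100 01101100 01101111"


def bits_to_bytes(bit_string, reverse_bit_order=False):
    """Parse space-separated 8-bit groups into bytes."""
    groups = bit_string.split()
    if reverse_bit_order:
        # Guess that the least significant bit was written first.
        groups = [group[::-1] for group in groups]
    return bytes(int(group, 2) for group in groups)


def decode_bits(bit_string, encoding, reverse_bit_order=False):
    """Decode the bits under a guessed character encoding."""
    raw = bits_to_bytes(bit_string, reverse_bit_order)
    # Undecodable bytes become U+FFFD instead of raising an error.
    return raw.decode(encoding, errors="replace")


print(decode_bits(BITS, "ascii"))                                # Hello
print(repr(decode_bits(BITS, "cp037")))                          # not "Hello"
print(repr(decode_bits(BITS, "ascii", reverse_bit_order=True)))  # not "Hello"
```

Only one of the three guesses produces readable text, and nothing in the bits themselves says which guess is right. For comparison, `"Hello".encode("cp037")` gives the bytes C8 85 93 93 96, a completely different hole pattern for the same word.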

  1. Call me when I don’t have to pay Microsoft or (insert vendor name) to read/write/store my own dang data. Ownership of myself, and my identity, and my data is more important to me than longevity at the moment. I’m looking at you AI training vampires.

      1. Thank you for sharing.

        RE: Optar – two decades ago I kind of went the same route, except the intention was to keep it as open-source as possible without needing to write anything new.

        My solution was “zip-compress everything, uuencode it into an ASCII-only version, print in the smallest font possible with minimum margins – so that the cheapest OCR could read it back reliably and the slowest computer could restore it”. I’ve never run stats, as it wasn’t meant to be the permanent solution, only a proof of concept. I am pretty sure zip-compress-and-store-as-ASCII and the reverse could now be done more simply than that.
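A compress-then-print-as-ASCII round trip along those lines can indeed be short today. This is only a sketch using the Python standard library, not the commenter's original tooling: it swaps in zlib for zip and Base32 for the ASCII step, on the assumption that Base32's alphabet (A-Z and 2-7, no 0 or 1) is kinder to cheap OCR. The function names and the 64-character line width are arbitrary choices.

```python
import base64
import zlib


def to_printable(data: bytes, width: int = 64) -> str:
    """Compress data and render it as short lines of Base32 text."""
    compressed = zlib.compress(data, 9)
    encoded = base64.b32encode(compressed).decode("ascii")
    # Break into fixed-width lines so it can be printed on paper.
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\n".join(lines)


def from_printable(text: str) -> bytes:
    """Reverse to_printable: strip whitespace, decode, decompress."""
    encoded = "".join(text.split())
    return zlib.decompress(base64.b32decode(encoded))


if __name__ == "__main__":
    original = b"Hello, archaeologists of 5026! " * 20
    page = to_printable(original)
    print(page)
    assert from_printable(page) == original
```

There is no error correction here, so a single misread character breaks the restore. A real paper archive would want checksums per line or a proper forward error correction layer on top.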

  2. Is this better than the 21 TB on 186 reels of film?

    “GitHub Arctic Code Vault preserves open-source software by storing snapshots of all public repositories on specialized, 1,000-year archival film (piqlFilm)”

    I suspect that in 1000 years one will still be fully readable, even if we as a species head the way of Idiocracy. While the Microsoft proprietary solution will probably work just as well as trying to restore a Microsoft DOS 3.3 backup under Microsoft DOS 6.22. Or attempting to open a Microsoft Word 7.0 (Office 95) document in Microsoft Word 2003 (Microsoft Office 2003).

  3. Minor nit (but those are the ones I love to pick): pyrex (lower case), sadly, is tempered window glass that isn’t worth spit. PYREX is borosilicate that hasn’t been made for quite some years. Double-sadly, I used to tell students that a PYREX beaker containing boiling water could be plunged into an ice bath with no danger of fracture. Except that beakers aren’t always made of borosilicate anymore. Perhaps that will be changed when enough students are injured by shattered window-glass beakers.

  4. Doesn’t anyone here remember the Long Now Project?

    In the last 30-odd years, I’ve lived through the shift from “if you want to preserve digital information, print it out” to “if you want to preserve analog information, digitize it and store it online”. I really hope I don’t live to see a collapse that reverses that.

  5. It’d be a nice challenge to make anything last a millennium. Where you may need stone tablets to explain the value of the data stored in glass, and explanations on how to build the machine to read the data. Maybe it needs an in between density, as well. Where each step introduces more complex concepts about light, mathematics, electronics, and optics. 7TB per piece of glass is quite remarkable.

    I’m wondering, though. What data is worth storing for a thousand years? Most data out there is not worth much, other data is constantly changing. Code is updated, wikis are modified. So not that useful for Microsoft’s ‘cloud’. Of course there are books and other immutable reference material, but it feels quite niche. An immutable dataset of cat pictures, or the weights of the largest LLM we have built as humans.

    1. “other data is constantly changing. Code is updated, wikis are modified.”

      And just how often are folks today wishing we had more of this historic data, in a snapshot from the POV of that moment in history? Or excited to see the lower level problem solving that we have largely forgotten how to do without our [insert applicable higher level/abstracted tools here], as seen in the miracles those old games pulled off on hardware that now would be considered a microprocessor, not a real computer!

      Being able to see the steps taken in the past could also be the saviour of open source projects in the future. Assume something as complex as the GNU tools + Linux kernel + Gnome (etc) lasts 300 years: at some point the old original code and structures have been eroded to the point the remnants are just not obvious, quite possibly in a programming language only the compilers still understand, as nobody left alive is likely to have that arcane knowledge. So when something breaks around them…

      Being able to view the ArchWiki and kernel sources etc of today, maybe even the ‘teach yourself C’ book etc, will probably point you in the right direction, and it’s still going to be way easier to build on the well developed framework than to really try to start again.

      Also worth remembering just how many skills have been largely if not entirely lost. Examples still surviving for now, but perhaps not for long, might be blowing your own lab glass, coopering, and probably just about all the carpentry that doesn’t rely upon glues, ply/particle boards’ inherent stability, and CNC cutters’ insane precision. At some point in future that old method might just be worth reviving or reinventing to solve a current problem.

    2. Sadly, I bet the vast bulk of stuffs (aka “data”) generated today worldwide falls somewhere between propaganda and lolcats, with statistically insignificant splashes of things like quantum physics or history : – [

      1. In fairness to the modern world, I doubt it was any different in antiquity, other than that the only people who got to hear about the lolcats or see that propaganda were relatively small targeted audiences compared to the basically global distribution of the internet. And in the past the only folks ‘archiving’ in the process of having their conversations in a way that might just survive are the ones with something more real to say. You don’t spend a small fortune sending a cat ‘video’ flip book to every corner of the globe, but you just might send your latest theoretical math/physics proof or weird bio/chemistry discovery to a fairly distributed and not insignificant number of your fellow thinkers.

  6. Casual project?

    And absolutely brilliant? ,,, Someone else did it.

    fis is there a video of one Nanocomputer
    communicating with another using Bluetooth?

    AI Overview

    Yes, there are several videos and tutorials demonstrating one Arduino
    Nano (a popular “nanocomputer” or microcontroller board) communicating
    with another using Bluetooth. The most common demonstrations involve
    using Bluetooth Low Energy (BLE) on models like the Arduino Nano 33 BLE
    or using external modules (like HC-05) with classic Nano boards.

    Video/Demonstration Examples:

    Two-Way Communication: A common demonstration shows two Arduino Nano
    boards (e.g., Nano 33 BLE and Nano 33 BLE Sense) exchanging data, such as
    gesture data or temperature readings, where one acts as the “central” device
    and the other as “peripheral”.

    Sensor Data Transmission: Videos show one Arduino Nano collecting sensor
    data (temperature/humidity) and transmitting it via Bluetooth to another Nano
    to be displayed.

    Bluetooth Controlling a Robot: Another common video demonstration
    shows a robot powered by an Arduino Nano and an HC-05 Bluetooth
    module receiving commands from another device.

    How to find these videos:

    You can search YouTube for: “Arduino Nano Two-Way Bluetooth
    Communication”

    “Arduino Nano 33 BLE central peripheral tutorial”

    These projects typically involve writing code that sets up one board
    to scan for and connect to the other, enabling the transfer of data
    via Bluetooth.

  7. Hello Ace,

    Project:
    1 Connect nanocomputer [esp32, ARM M33, )+, ATmega, …]
    running single task simple OS to microcomputer such as
    Raspberry Pi, Banana Pi, Orange Pi, Le Potato, … using Bluetooth.

    2 The micros will act as slaves to the nanocomputer.

    3 The micros will

    A print messages from the nano to screen.

    B send text messages to the nano.

    C send, on nano command, 1024 byte file segments to the nano.

    4 The micros will receive 1024 byte segments to be written to a file.

    Also under consideration is communication with Android
    cell phones.

    Perhaps you or some of your buddies may be interested
    in implementing parts of this project?

    Please send me your phone number so that we can arrange
    a time to talk.

    bill 505-464-7123

    The HP Stream 13 you guys gave me has been running 24/7 using both
    Windows 10 and Ubuntu.
