YouTube As Infinite File Storage

Anyone who was lucky enough to secure a Gmail invite back in early 2004 would have gasped in wonder at the storage on offer: a whole gigabyte! Nearly two decades later there’s more storage to be had for free from Google and its competitors, but it’s still relatively easy to hit the paid tier. Consider this, though: how about YouTube as an infinite cloud storage medium?

The proof of concept code from [DvorakDwarf] works by encoding binary files into video files which can then be uploaded to the video sharing service. It’s hardly a new idea as there were clever boxes back in the 16-bit era that would do the same with a VHS video recorder, but it seems that for the moment it does what it says, and turns YouTube into an infinite cloud file store.

The README goes into a bit of detail about how the code tries to avoid the effects of YouTube’s compression algorithm. It eschews RGB colour for black and white pixels, and each displayed pixel in the video is made of a block of the real pixels. The final video comes in at around four times the size of the original file, and looks like noise on the screen. There’s an example video, which we’ve placed below the break.
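To make the block-pixel idea concrete, here is a minimal Python sketch. It is not the project's actual code: the function names, the tiny 64×36 frame size, and the 4×4 block size are all assumptions chosen for illustration. Each bit becomes a square block of black or white pixels, and decoding averages each block and thresholds it, so moderate per-pixel distortion does not flip the bit.

```python
def encode_frames(data: bytes, width=64, height=36, block=4):
    """Turn bytes into greyscale frames (lists of rows holding 0 or 255).

    Frame and block sizes are illustrative assumptions, not the project's.
    """
    cols, rows = width // block, height // block
    bits_per_frame = cols * rows
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    frames = []
    for start in range(0, len(bits), bits_per_frame):
        chunk = bits[start:start + bits_per_frame]
        chunk += [0] * (bits_per_frame - len(chunk))  # pad the last frame
        frame = [[0] * width for _ in range(height)]
        for index, bit in enumerate(chunk):
            top, left = (index // cols) * block, (index % cols) * block
            for y in range(top, top + block):
                for x in range(left, left + block):
                    frame[y][x] = 255 * bit  # one data bit fills a whole block
        frames.append(frame)
    return frames


def decode_frames(frames, length, block=4):
    """Recover `length` bytes by thresholding the brightness sum of each block."""
    bits = []
    for frame in frames:
        height, width = len(frame), len(frame[0])
        cols, rows = width // block, height // block
        for index in range(cols * rows):
            top, left = (index // cols) * block, (index % cols) * block
            total = sum(frame[y][x]
                        for y in range(top, top + block)
                        for x in range(left, left + block))
            bits.append(1 if total > 127 * block * block else 0)
    out = bytearray()
    for i in range(0, length * 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


payload = b"Hello, YouTube storage!"
frames = encode_frames(payload)
assert decode_frames(frames, len(payload)) == payload
```

The redundancy is the point: with a 4×4 block, sixteen real pixels carry a single bit, which is one reason the resulting video ends up larger than the file it stores.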

Whether this is against YouTube’s TOS is probably open for interpretation, but we’re guessing that the video site could spot these uploads with relative ease and apply a stronger compression algorithm which would corrupt them. As an alternate approach, we recommend hiding all your important data in podcast episodes.

131 thoughts on “YouTube As Infinite File Storage”

          1. Instead of encrypting all your files, ransomware would be more effective if they installed a cracked copy of Oracle software in the background and threatened to report you to Oracle unless you paid up

      1. Since this looks so much like TV static.
        It makes me wonder.
        What if there are clandestine analogue TV stations broadcasting “static” that actually turns out to be data.
        If you’ve tried turning an analogue TV after the switch to digital, you’ll have noticed that there are still odd signals trying to get through in places.
        It would be interesting to try to decode them, just in case, if anyone has the hardware lying around to try.

        1. What you are seeing is probably two things: the ATSC HD TV broadcasts, which use the same frequencies that analog TV did, and cellular traffic, since a large portion of what was the upper UHF channels has now been converted for cellular usage. In both cases what you are seeing on the analog TV is interference from these digital transmissions.

        2. I think it should be relatively easy to discriminate between “random background noise” and “well formatted NTSC signal containing random noise”

          Even if you’re just broadcasting raw noise with no framing, you’d be able to pick out a clock or carrier wave.

          At the very least, the signal would be significantly stronger than background.

        3. CRJEEA, yes, this actually happened in the early 1960s, when a TV repeating satellite was launched from the US. A group of engineers at a 3 letter organization realized that they could piggyback on the TV transponder. They would beam up messages to the satellite using spread spectrum (frequency hopping), and the signal would get retransmitted on the other side of the globe. It was noteworthy because there was no visible distortion of the NTSC TV signal.

          1. FHMUX/ERSRS-Eieio and other parlance were peppered among the conversations overheard at Xetron (a subsidiary of Westinghouse, then Northrop Grumman) between a friend and coworkers developing things related to giant frisbee-carrying airplanes given freely to our NATO Allies and for some reason also to Saudi Arabia. Apparently Motorola DSPs require uncapping and being probed to determine binary in the encryption key data bits to learn the secrets programmed on them. This assumes one can obtain boards located who knows where on an AWACS and live to tell the tale. It wouldn’t matter, the firmware encryption and frequency hopping schedule is passé compared to AES Ultra7. The unknown and undocumented number of rounds used by AES Ultra7 extends the 8bit/256 entry S-boxes to 48bits requiring 1.5 petabytes of custom ROM generated by unknown Galois polynomials stored on low production quantity devices. 32 bit extended AES Ultra 4 is feasible for PCs if you can determine a safe generating polynomial. AES Ultra levels use larger S-boxes and extended rounds of encryption where the table size is 16 bits for Extended AES, 24 for Extended2 AES, and Ultra Levels use the formula Ultra Level=(Number of bits / 8). This allows for intermediate levels such as AES Ultra 4.5 using 36 bits, and Ultra 7 using 48 bits.

        4. The US used to broadcast white noise to USSR listening posts in a successful attempt to get them to waste tons of resources decoding white noise. I feel like you’ll probably just end up in the same boat lol 😅 it is a really neat idea though, it’d be a great premise for a hacker movie/book!

        5. Interesting that you seem to think it’s only television. When we switched to optic drives what makes you think there aren’t messages encoded in light. We are all just light and life.

        6. It has definitely happened, but generally one needed to borrow an older satellite that would relay without authentication to get the signal where it needed to go, as the now gone VHF frequencies didn’t carry far enough. Now in the days of broadband internet I can’t imagine still needing to utilize that method.

      2. How about someone putting highly illegal content of explicit nature in an encrypted zip file, and then using this technique to post it online for other people to obtain, download and decrypt. If I, a law abiding citizen, could figure this out, then criminally minded monsters would, too.

        Now imagine, how would typical government respond? A lawsuit? A complete ban for YT and similar services? A new law that gives police power to officially trace and observe online activities of everyone?

        1. Imagine people start putting highly illegal content of explicit nature under removable insoles in shoes in department stores. If I, a law abiding citizen, could figure this out, then criminally minded monsters would, too.

          Now imagine, how would typical government respond? Banning shoes? Surgically removing feet? Genetically modifying embryos to have tougher soles so shoe companies go out of business?

          This and more invented problems to be afraid of, coming up on Coyote News!

          1. You may laugh now, but this sort of thing happened before. The Tor network was created as a tool for reporters, dissidents and oppressed people as a way to communicate in totalitarian countries. Now it’s used mostly for sale and exchange of drugs, weapons, stolen data, botnets and “highly illegal content of explicit nature”. And this other thing created with good intentions, cryptocurrency, is widely used as the preferred method of payment.

            They can’t shut down Tor, because it was created to be unstoppable. But they can shut down YouTube, to prevent crimes. And how many advertisers would want to advertise on a platform that hides “highly illegal content of explicit nature”? Only crooks and politicians.

          2. I do laugh now. I laugh heartily.

            Because it’s not the same as tor, it’s completely preposterous. You really think Google would be willing to shut down youtube over something like that? All they have to do is ban videos with graphically encoded content. They already have the ability to do it, they just don’t because it’s not a significant problem – yet.

            If it ever became an actual thing people did, it would get stopped as soon as people learned it was happening. Long before the big evil government ever had to get involved.

          3. “They can’t shut down Tor, because it was created to be unstoppable.”

            “They” don’t shut down Tor because they have no reason to, as it serves its intended purpose of censorship-resistant communication. “They” in this case being the US Naval Research Laboratory, the developers of Tor.

            If you claim “this sort of thing happened before”, you may want to try citing a case where it has actually happened, rather than invented fantasies.

        2. How about just make it so that all combinations of 1’s and 0’s are legal, no exceptions. Arresting someone for data is a massive waste of resources and is a slippery slope to tyranny.

        3. What is actually crazy is that this technique could be used to host malware binaries. As the data is stored on the YouTube server, a request to download the binary to the victim’s computer could be made very legitimate-looking, as the request originates from a very trustworthy domain (in this case, YouTube) with SSL enabled. Of course, this could be done using any site where you can save and read content. Files can be saved to text as well, and encrypted asynchronously to make it unreadable (and thus unidentifiable as malware).

      3. Why slow down? Speed up the process and get that out there, because the faster it’s mainstream, the faster YouTube becomes the unlimited free storage comp. Plus, adding in, per se, my hexadecimal binary code algorithm that I based Bitcoin on could be flawless as a storage compiler.

      1. Stuff like that you can do with just Windows command lines, I do it every year for my nephew’s Christmas present.

        Amusingly, this last Christmas, instead of hiding files in pictures, images in audio, etc, I created a video with hidden layers of information in distortion and uploaded it to YouTube, so this article is slightly ironic.

      2. Somewhere on this site is an article about the magazine POC OR GTFO. POC in this case means “proof of concept” rather than “person of color”. It’s a Travis Goodspeed project. IIRC one of the proofs of concept was an executable that could also be viewed as a PDF file. This was apparently different from the embedded executable security hole that wasn’t technically an exploit that was found in 2010.

    1. IIRC totally optimal encryption schemes will produce something statistically indistinguishable from noise if you don’t have the key. But the snow on your TV is various naturally occurring radio noise, cosmic microwave background, stars, et cetera… Wouldn’t that be funny, though? If some pre-big-bang civilization hid an encrypted message in the background noise? Extremely dubious but would make a good sci-fi plotline.

      1. In the Novel _Contact_, the civilization that sent the message ends up telling the MC that, alongside their own SETI programs, they have an additional form of SETI-like program that monitors the mathematics derived from the constants of the universe and the irrational numbers for messages left by intelligences that either designed or altered the universe.

        The MC still comes back to the same scandal she faced in the movie – but she has enough pull to get “waste” time on some lesser big computers. As the book ends, she gets her first hit off of a machine computing Pi in base-11, a binary representation of a circle 300K or so digits in.

    2. Made me think back. Maybe back then ‘They’ were trying to contact and talk with us, but we haven’t caught up to that technology just yet. But now … I wonder if any of that footage of ‘Snow’ has been archived and looked into. Hmmm?

  1. IANAL, but I can’t see anything in the TOS to forbid the upload (assuming you own the copyright). But restoring from the backup is problematic, since restriction 1 in the TOS forbids downloading “except … as expressly authorized by the Service” (or with permission of YouTube), and I assume the only downloading that’s expressly authorized is by streaming without saving a copy.

    And of course it’s a terrible waste of space.

  2. I’ve done a lot of encoding of data on video in the 80’s and 90’s. So, I enjoy the working innards of such an idea, but I can tell you that my experience with YouTube is that it is a very unreliable place to consider keeping your only backup of videos. YouTube accounts are hacked regularly and videos deleted by hackers who don’t even know you. Also, YouTube is notorious for taking away your access privileges for various reasons. Sometimes, when you lose a video and remember that it’s been uploaded to Youtube, it can be a lifesaver, but to rely on it as a backup is just plain sketchy.

  3. qr codes, 1 per channel R,G,B , use the sound channel to encode other information. 1920×1080 can store a lot of smaller qr codes, and the “alpha channel” can also encode some information. Better error correcting and checking and looks less “suspicious” and probably won’t be killed by the algo.

    80×80 px @ 1920×1080 = ~324 codes per channel [ r,g,b ]

    then again, i’d not rely on youtube on as a backup.

    1. QR codes have a bunch of robustness against perspective and other factors that are relevant when capturing images in the real world, but not quite here, and a light compression can screw it easily.
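As a quick check on the 80×80 px estimate in the comment above, the short Python sketch below (the function name is hypothetical) computes the tile count two ways. The ~324 figure matches dividing the frame area by the tile area; counting only whole tiles that fit on a grid gives slightly fewer, because 1080 is not a multiple of 80.

```python
def codes_per_channel(width=1920, height=1080, tile=80):
    """Return (area-based estimate, whole tiles that actually fit)."""
    by_area = (width * height) / (tile * tile)      # 2,073,600 / 6,400
    whole = (width // tile) * (height // tile)      # 24 columns x 13 rows
    return by_area, whole


by_area, whole = codes_per_channel()
print(by_area, whole)  # prints: 324.0 312
```

How many bytes each code could carry depends on the QR version and error-correction level, which the comment does not specify.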

  4. Set the video to private and no one will be any wiser, maybe place some licence free video stuff at the first 30 seconds to thwart the bots.
    And yes, i am waiting for the first pirates to use this method of file distribution. ^^’

    1. I’m thinking this was actually pretty common.

      I have a handful of videos from almost exactly two years ago (Feb 16-22, 2021) where I packed the original Doom game into zip files and encoded it.

      I started with black and white and slowly shrunk the pixels and introduced color to see how much I could pack into the stream before the encoding killed it

      https://www.youtube.com/watch?v=FjNFOTUVtMk

      Unfortunately I got bored when I realized that for optimal packing I would need to reverse engineer the actual settings used to encode the video, and I didn’t have the focus for that.

      Still, fun project

  5. Yes, it works fine. Thanks, I think you have ruined it!

    I used to “archive” very high resolution and higher bit depth images on MiniDV tape as strings of video frames. It was not bad especially on a Mac with the native Firewire. I think it holds up better than writable DVD.

    I wonder if YouTube rejects “video” like that?

        1. Why do you think that? Adblock Plus works fine on firefox and chrome on windows, and on firefox on android. The mobile site is clunky but it’s worth the uninterrupted experience.

      1. It’s funny that we are all talking about this as if it was 1985 and 4TB hard drives weren’t available for $100. It’s an interesting discussion, but those 4TB drives will sure look good to you after trying to store and retrieve data on YouTube.

  6. I’ve thought about doing this, many times. And I’m sure YouTube has thought about smart people trying to do this too.

    Encoding data as black and white pixels seems like a bad idea, though: Downloading a video is already a pain because you have to figure out a way to get direct access to the video stream or file, which is basically a game of whack-a-mole. But if YT decides to re-encode your video, or if you can’t get the full resolution because your connection is slow, you’re up a creek, too.

    I think it would be better to start by encoding the data as JPEG macroblocks. Those are blocks of 8×8 pixels in the YUV color space, where usually the brightness data (Y) has 2x or 4x the resolution of the color data (U and V). Macroblocks are encoded as discrete cosine transformations, and the lossy part of the compression algorithm is based on reducing the accuracy at which the data is stored.

    I don’t know enough about video compression to dive deeper into this but I figure if you encode the data into something that resembles an MPEG (or H.263 or whatever) video stream closely enough, you might avoid getting the data re-encoded by YouTube and becoming useless.

    ===Jac
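The point about lossy compression "reducing the accuracy at which the data is stored" can be illustrated with a small pure-Python sketch. It applies an orthonormal 8×8 DCT, rounds every coefficient to a multiple of a step size, and inverts the transform. The uniform step of 40 is an assumption for illustration; real JPEG uses per-frequency quantization tables, and video codecs add prediction and other stages on top.

```python
import math

N = 8  # block edge length used by JPEG-style transforms


def _alpha(k):
    """Normalisation factor that makes the DCT orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))


def dct2(block):
    """2-D DCT-II of an N x N block (list of rows)."""
    return [[_alpha(u) * _alpha(v) * sum(block[y][x] * _basis(y, u) * _basis(x, v)
                                         for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(y, u) * _basis(x, v)
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]


def quantize(coeffs, step):
    """Round every coefficient to the nearest multiple of `step` (the lossy part)."""
    return [[round(c / step) * step for c in row] for row in coeffs]


# A faint checkerboard (values 126 and 130) around mid-grey: low-amplitude detail.
fine = [[128 + 2 * (-1) ** (x + y) for x in range(N)] for y in range(N)]
restored = idct2(quantize(dct2(fine), step=40))
spread = max(map(max, restored)) - min(map(min, restored))
print(round(spread, 6))  # the pattern is flattened, so the spread is ~0
```

In this sketch the checkerboard's coefficients are all smaller than half the step, so they round to zero and the detail disappears. Data meant to survive re-encoding would have to live in coefficients large enough to outlast whatever step the encoder picks.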

    1. If you can live with low bandwidth, then encoding it as a videotext stream on some VHS recording, replacing the original stream, could do the trick of flying under the radar.
      I bet some folks already do that without telling anyone as the idea is so simple if you hear it.

  7. Back in the Amstrad 1512 days I had one of those ISA cards which turned a VHS into a tape backup.

    Didn’t make the pixel blocks quite so small as in the YouTube video though.

  8. There is a hack for unlimited video storage on Youtube as uncompressed. Upload the videos you wish to Youtube. And when you need it, go to Google Takeout and select Youtube videos for archive download (it will select the whole channel). It will take a few hours for Google to make it ready for download and voila, you can download the videos you uploaded as original files back.

    You can create multiple brand channels with the same e-mail, and categorise your videos in this way.

  9. Nice to see someone else thought of this concept. I used nearly the same concept to transfer data in a school project a few years back. I used an 8×8 pixel screen at the rate of 32fps (transferring a total of 2048 bits per second) by picking up the black/white off a screen with an array of photodiodes. It was just a concept and got full marks on it. Problem was that the school wanted to take the credit but managed to destroy the Atmega chip before they even had the chance to get their hands on it…

    1. I made the same concept with a bunch of lasers instead of a screen. I clocked it to a lot of data per second… The only thing you needed was a direct line of sight from the transmitter to the receiver… and patience aligning 8 laser diodes with 8 photodiodes. I did a 100% data transfer over 100m, 97% @ 150m and only failed drastically when I went over 200m. Conditions were not optimal and laser diodes were off aliexpress (very cheap). I have the project in an electronics waste bin somewhere in the garage haha

  10. “we’re guessing that the video site could spot these uploads with relative ease and apply a stronger compression algorithm which would corrupt them”

    More like detect them, delete them, and ban the account if they don’t stop uploading them.

  11. For better data stability under lossy compression, or to hide the data from YouTube, he could use steganography to mix his data with a static image. But don’t use the LSB; use a bit above the threshold of the lossy compression. If the static image is not too calm, his data won’t be visible in the animation. He could also encode his data in a big QR code, with error correction by Reed-Solomon code, or build his own 2D code system with error correction and more than 3 kB of data per 177×177 QR code.

  12. What good is file storage that isn’t safe or reliable?

    This is a clear violation of the intended use of YouTube and I’m not trying to be the morality police saying this, it has significance.

    Google essentially can scrub anything you upload to YouTube which means anything you upload this way they can use however they want.

    Also, you have no protection. Why would I want to upload terabytes of my data that can be deleted on any whim that serves YouTube? If this practice became popular, do you think Google would just never find out and do anything about it? Just as easily as you can embed data in a video, that data can be detected, extracted, and altered.

    Do you think there won’t come a day when YouTube just deletes these videos or even just makes changes to the formatting that removes or ruins the data?

    One person doing this for personal use is clever. Making a business out of it is unstable, subject to fraud claims and the legal wrath of Google, and produces no stable or reliable service to anyone. This is a cease and desist waiting to happen at best. I hope this company can afford really good attorneys, because Google will eventually either crush them and/or buy them to use this for their own needs.

    1. Agree that legal battles are ahead potentially. The technique in the article is not a new one. A startup years ago developed this technology and patented it. The YouTube service policy may be a moot point.

      The patent at https://patents.google.com/patent/US11557015B2 mentions “YouTube may be used as a file server” example in the body of the patent as one use. There’s quite a bit more than just data added to video.

    1. Unfortunately for the author of that patent, the GitHub commit log shows that the code was published in 2018, *before* the patent application, and wayback confirms the code was indeed publicly available in 2018: http://web.archive.org/web/20180627155112/https://github.com/robertkeizer/youtubefs. This means the patent cannot cover any technology from that repo as there was prior art. The patent author clearly failed to catch this in his research.

      Any patent claim against the code in that repo could easily be dismissed in court.

    2. The patent is completely BOGUS! Do a simple Google search. The technology has been around long before January 2023. Whoever did the research for this patent prior to submitting the application needs to be fired. The patent will be thrown out in the first motion in court.

      1. AFAIK what counts is the filing date which I believe was 2019. Fortunately it’s easy to prove that this technique existed before that as the code has been in GitHub before 2019.

  13. Have you considered Fountain codes?

    https://divan.dev/posts/fountaincodes/

    This could help maxing out the compression while allowing a certain error rate. You could compensate for those errors by adding additional video length, and the best part is you only need to download as much as you need to reconstitute all blocks; if you don’t encounter errors you can just skip the extra frames.

    I also think you should check what type of color encoding is being used and leverage that to decompose the video into separate data channels. For example, assuming YUV420, you’re currently only leveraging the luma channel; you have two additional color channels, each with a quarter of the bandwidth (2×2 instead of 4×4), which could give you a 50% gain in bandwidth.
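For readers unfamiliar with fountain codes, the following is a minimal Python sketch of the idea from the comment above: the sender emits an open-ended stream of "droplets", each the XOR of a random subset of source blocks, and the receiver keeps collecting until it can reconstruct everything. All function names are hypothetical, and the uniform degree distribution is a simplifying assumption; practical LT codes use a tuned (robust soliton) distribution to need fewer droplets.

```python
import random


def split_blocks(data, size):
    """Split data into equal-size blocks, zero-padding the last one."""
    padded = data + bytes(-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def droplet_indices(seed, k):
    """Which source blocks a droplet covers; derived only from its seed."""
    rng = random.Random(seed)
    degree = rng.randint(1, k)  # uniform degree: a simplification
    return set(rng.sample(range(k), degree))


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def make_droplet(blocks, seed):
    """One droplet: (seed, XOR of the seed-selected source blocks)."""
    payload = bytes(len(blocks[0]))
    for i in droplet_indices(seed, len(blocks)):
        payload = xor(payload, blocks[i])
    return seed, payload


def decode(droplets, k):
    """Peeling decoder: returns the k blocks, or None if more droplets are needed."""
    known = {}
    pending = [(droplet_indices(seed, k), payload) for seed, payload in droplets]
    progress = True
    while progress and len(known) < k:
        progress = False
        for indices, payload in pending:
            unknown = indices - set(known)
            if len(unknown) == 1:
                # XOR out every block we already know; what remains is the new one.
                for i in indices & set(known):
                    payload = xor(payload, known[i])
                known[unknown.pop()] = payload
                progress = True
    return [known[i] for i in range(k)] if len(known) == k else None


data = b"fountain codes tolerate lost frames!"
blocks = split_blocks(data, 5)
received, result, seed = [], None, 0
while result is None and seed < 5000:
    if seed % 3 != 0:  # pretend every third droplet was corrupted and dropped
        received.append(make_droplet(blocks, seed))
        result = decode(received, len(blocks))
    seed += 1
assert result is not None and b"".join(result)[:len(data)] == data
```

The receiver never needs any particular droplet, only enough of them, which is why lost or corrupted frames can be compensated for simply by making the video longer.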

  14. I also tried to create something like this, which converts any kind of text file into video and can also convert the video back into the original file with its extension preserved.

    It splits any kind of text file into multiple text chunks, then converts each chunk into a QR code image, and every image is merged into one .avi video file.

    For the reverse procedure, it extracts the video frames as images, then for each frame it scans the QR code back to text and appends the lines to the final file, preserving the file extension.

    GitHub Page Website:
    https://imvickykumar999.github.io/YOUTUBE-AS-INFINITE-FILE-STORAGE/

    YouTube Playlist:
    https://imvickykumar999.github.io/YOUTUBE-AS-INFINITE-FILE-STORAGE/

  15. When Gmail first came out, I remember reading on Slashdot about a hack to use it as a (very very slow) file system. This was long before Dropbox, google drive etc existed, so having a gigabyte of storage that could be used from anywhere with a web connection was actually a pretty big deal.

  16. Adblock on YouTube is possible. They break it sometimes, but the adblock devs usually fix it again in a few hours or days. I have seen ads on YouTube during just one month, approx. 8 years ago, and I haven’t been able to buy from one shop in my country since then, because their ads were toooooooooooo annoying.
