Server racks branded with Internet Archive

Internet Archive Hits One Trillion Web Pages

In case you didn’t hear, on October 22, 2025, the Internet Archive, which hosts the Wayback Machine at archive.org, celebrated a milestone: one trillion web pages archived for posterity.

Founded in 1996 by Brewster Kahle, the organization and its facilities grew through the late nineties; in 2001, access to the archive was greatly improved by the introduction of the Wayback Machine. On their own website, as captured on October 21, 2009, they explained their mission and purpose:

Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars.

We were curious about the Internet Archive technology. Storing a copy (in fact two copies!) of the internet is no mean feat, so we did some digging to find out how it’s done. The best information available is in this article from 2016: 20,000 Hard Drives on a Mission. They keep two copies of every “item”, and items are stored in Linux directories. In 2016 they had over 30 petabytes of content and were ingesting at a rate of 13 to 15 terabytes per day, with web and television content being the most voluminous.

In 2016 they had around 20,000 individual disk drives, housed in specialized computers called “datanodes”. Each datanode has 36 data drives plus two operating system drives. Datanodes are organized into racks of 10 machines, giving 360 data drives per rack. These racks are interconnected via high-speed Ethernet to form a storage cluster.

Even though content storage tripled from 2012 to 2016, the count of disk drives stayed about the same, thanks to improvements in disk drive technology. Datanodes that were once populated with 36 individual 2 terabyte drives were, as of the 2016 article, filled with 8 terabyte drives, moving single node capacity from 72 terabytes (64.8 T formatted) to 288 terabytes (259.2 T formatted) in the same physical space. The evolution of disk density did not happen in a single step, so there are populations of 2, 3, 4, and 8 T drives in the storage clusters.
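The capacity figures above are easy to sanity-check. Here is a small sketch of the arithmetic in Python; the 90% formatted-to-raw ratio is our own inference from the quoted numbers (64.8 / 72), not something the Internet Archive states, and the function name is ours.

```python
# Back-of-the-envelope check of the datanode figures quoted above.
DRIVES_PER_NODE = 36   # data drives per datanode (from the 2016 article)
NODES_PER_RACK = 10    # datanodes per rack (from the 2016 article)

# Assumption: formatted capacity is 9/10 of raw, inferred from 64.8 / 72.
FORMATTED_NUM, FORMATTED_DEN = 9, 10


def node_capacity_tb(drive_tb):
    """Return (raw, formatted) capacity in terabytes for one datanode."""
    raw = DRIVES_PER_NODE * drive_tb
    formatted = raw * FORMATTED_NUM / FORMATTED_DEN
    return raw, formatted


for drive_tb in (2, 8):
    raw, formatted = node_capacity_tb(drive_tb)
    print(f"{drive_tb} TB drives: {raw} TB raw, {formatted} TB formatted")

print(DRIVES_PER_NODE * NODES_PER_RACK, "data drives per rack")
```

Running it reproduces the 72/64.8 and 288/259.2 terabyte pairs and the 360 drives per rack.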

We will leave you with the visual styling of Hackaday Beta in 2004, and what an early google.com or amazon.com looked like back in the day. Super big shout out to the Internet Archive: thanks for providing such an invaluable service to our community, and congratulations on this excellent achievement.

Access The Information Superhighway With A Mac Plus

For some time now, Apple has had a reputation for manufacturing computers and phones that are not particularly repairable or upgradable. While this reputation is somewhat deserved, especially in recent years, it seems less true for their older machines. Their second and perhaps most influential computer, the Apple II, was so upgradable that the machine had a production run of nearly two decades. Similarly, the Macintosh Plus of 1986 was surprisingly upgradable and repairable, and [Hunter] demonstrates its capabilities by bringing one onto the modern Internet, albeit with a few tricks to adapt the old hardware and software to the modern era.

The Mac Plus was salvaged from a thrift store, and the first issue to solve was that it had some rotten capacitors that had to be replaced before the computer could be reliably powered on at all. [Hunter] then got to work bringing this computer online, with the only major hardware modification being a BlueSCSI hard drive emulator which allows using an SD card instead of an original hard disk. It can also emulate an original Macintosh Ethernet card, allowing it to fairly easily get online.

The original operating system and browser don’t support modern protocols such as HTTPS, or web technologies like JavaScript and CSS, so a tool called MacProxy was used to bridge this gap. It serves simplified HTML from the Internet to the Mac Plus. [Hunter] wanted it to work even better, though, and added modular domain-specific handling to allow the computer to more easily access sites like Reddit, YouTube, and even Hackaday, although he does call us out a bit for not maintaining our retro page as well as we ought to.
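To give a feel for what "simplified HTML" can mean, here is a minimal sketch using only Python's standard library. It is not MacProxy's actual code; the `Simplifier` class and `simplify_html` function are hypothetical names, and the rules (drop scripts and stylesheets, strip every attribute except link targets) are just one plausible choice.

```python
from html import escape
from html.parser import HTMLParser

# Elements whose entire content an old browser cannot use.
SKIPPED_TAGS = {"script", "style"}


class Simplifier(HTMLParser):
    """Re-emit HTML without scripts, stylesheets, or most attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            # Keep only the href of links; drop style, onclick, and so on.
            href = dict(attrs).get("href") if tag == "a" else None
            if href:
                self.out.append(f'<{tag} href="{escape(href, quote=True)}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)


def simplify_html(html):
    """Return a stripped-down copy of the given HTML string."""
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

A real proxy would also have to fetch the page over HTTPS on the Mac's behalf and rewrite links to route back through itself, which this sketch leaves out.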

[Hunter] has also built an extension to use the Wayback Machine to serve websites to the Mac from a specific date in the past, which really enhances the retro feel of using a computer like this to access the Internet. Of course, if you don’t have original Macintosh hardware but still want to have the same experience of the early Internet or retro hardware, this replica Mac will get you there too.
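For the curious, asking the Wayback Machine for a page as it looked on a given date mostly comes down to building the right URL. The sketch below is our own illustration rather than [Hunter]'s extension; `wayback_url` is a hypothetical helper, and it relies on the public `web.archive.org/web/<timestamp>/<url>` address format, where the archive redirects to the capture nearest the requested timestamp.

```python
from datetime import date

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, when):
    """Build a Wayback Machine address for `url` near the date `when`."""
    timestamp = when.strftime("%Y%m%d")  # YYYYMMDD; the archive picks the nearest capture
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


# Example: Hackaday as it looked around October 2004.
print(wayback_url("http://hackaday.com/", date(2004, 10, 1)))
```

A proxy could apply this rewrite to every outgoing request so that the whole browsing session stays pinned to one point in time.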
