The Weird World Of Liquid Cooling For Datacenters

When it comes to high-performance desktop PCs, particularly in the world of gaming, water cooling is popular and effective. However, in the world of datacenters, servers rely on traditional air cooling more often than not, in combination with huge AC systems that keep server rooms at the appropriate temperature.

However, datacenters can use water cooling, too! It just doesn’t always look quite how you’d expect.

Staying Cool

Cooling is of crucial importance to datacenters. Letting hardware get too hot increases failure rates and can even impact service availability. Cooling also uses a huge amount of energy, accounting for up to 40% of energy use in the average datacenter. This flows into running costs as well, as energy doesn’t come cheap.

Thus, any efficiency gains in cooling a datacenter can have a multitude of benefits. Beyond improving reliability and cutting emissions through lower energy use, there are benefits to density, too. The more effective the cooling available, the more servers and processing power can be stuffed into a given footprint without running into overheating issues.

Water and liquid cooling techniques can potentially offer a step change in performance relative to traditional air cooling. This is because air has a far lower heat capacity than water or other special liquid coolants, so a given volume of liquid can carry away much more heat than the same volume of air. In some jurisdictions, there is even talk of using the waste heat from datacenters to provide district heating, which is much easier with a source of hot liquid carrying waste heat vs. hot air.
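
To put rough numbers on the heat capacity argument, here is a minimal sketch using the basic relation Q = m_dot * cp * dT. The fluid properties are approximate room-temperature textbook values, and the 10 kW rack and 10 K coolant temperature rise are hypothetical figures chosen purely for illustration, not numbers from any particular datacenter.

```python
# Approximate room-temperature properties (textbook values, assumed for illustration).
WATER = {"density": 998.0, "cp": 4186.0}  # kg/m^3, J/(kg*K)
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg*K)


def volumetric_flow(heat_w, delta_t_k, fluid):
    """Volume flow in m^3/s needed to carry heat_w watts with a delta_t_k rise."""
    # Q = m_dot * cp * dT, so m_dot = Q / (cp * dT)
    mass_flow = heat_w / (fluid["cp"] * delta_t_k)
    return mass_flow / fluid["density"]


if __name__ == "__main__":
    heat = 10_000.0  # hypothetical 10 kW rack
    rise = 10.0      # hypothetical 10 K coolant temperature rise
    air_flow = volumetric_flow(heat, rise, AIR)
    water_flow = volumetric_flow(heat, rise, WATER)
    print(f"air:   {air_flow:.3f} m^3/s")
    print(f"water: {water_flow * 1000:.3f} L/s")
    print(f"air needs about {air_flow / water_flow:.0f}x the volume flow")
```

Under these assumptions, air needs a few thousand times the volume flow of water to move the same heat, which is why liquid loops can get away with comparatively small pipes and pumps.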

However, liquid cooling comes with drawbacks, too. Leaks can damage electronics if not properly managed, and such systems typically come with added complexity versus running simple fans and air conditioning systems. Naturally, that improved cooling performance comes with trade-offs, or else it would be the norm already.

Various Approaches

Danish company Asetek has experimented with direct water cooling of server hardware, while also exploring using the waste heat for district heating purposes. Credit: Asetek press release

The most obvious water-cooling approach for a datacenter would be to swap out fan coolers in servers for water blocks, and link up racks to water cooling circuits. This is achievable, with some companies offering direct-to-chip cooling blocks that can then be hooked into a broader liquid cooling loop in a supporting server rack. It’s the same theory as water cooling a desktop PC, replacing fans and heatsinks with water blocks instead. This method of directly water-cooling servers has the benefit that it can extract a lot of heat, with some claims as high as 80 kW per rack.

However, this approach comes with several drawbacks. It requires opening up and modifying servers prior to installation in the rack. This is undesirable for many operators, and any mistakes during installation can introduce defects that are costly to rectify in both time and equipment. Service and maintenance are also complicated by the need to break water cooling connections when removing servers, though this is eased somewhat by special “dripless” quick-connect fittings.

A less invasive method involves the use of regular air-cooled servers that are placed in special water-cooled racks. This method removes any need to modify server hardware. Instead, air-to-water heat exchangers mounted at the back of the server rack pick up the heat from the hot server exhaust and dump it into the liquid coolant. The exhaust air is thus chilled and returns to the room, while the coolant carries the waste heat away. Rooftop cooling towers, like the ones pictured at the top of this article, can then be used to extract the heat from the coolant before it’s returned. It’s not as effective as directly capturing the heat from an on-chip waterblock, but claims are that such systems can extract up to 45 kW of heat per rack.

In addition to using unmodified hardware, the system cuts down on the danger of leaks significantly. Any leaks that happen will be in the back of the server rack, rather than directly on the server’s circuit boards. Additionally, systems typically run at negative pressure so air is sucked in from any holes or damaged tubes, rather than liquid being allowed to leak out.

Microsoft famously ran an underwater datacenter in a sealed tube back in 2018. The experiment had several benefits over traditional land-based datacenters. Credit: Microsoft

More extreme methods exist, too. Microsoft made waves by running a fully-submerged datacenter off the coast of Scotland back in 2018. With a cluster of conventional servers installed in a watertight tube, heat was rejected to the surrounding waters, which kept temperatures very stable. The project ran for two years, and found that the sealed atmosphere and low temperatures were likely responsible for an eight-fold increase in reliability. Project Natick, as it was known, also promised other benefits, such as reduced land costs from locating the hardware offshore.

Microsoft isn’t resting on its laurels, though, and has investigated even wilder concepts of late. The company has developed a two-phase immersion cooling tank for datacenter use. In this design, conventional servers are submerged in a proprietary liquid developed by 3M, which boils at a low temperature of just 50 C (122 F). As the server hardware heats up, so does the liquid. It sucks up huge amounts of energy through what is called the latent heat of vaporization, the energy required for the liquid to boil. The gaseous coolant then reaches the condenser on the tank lid, turning back to liquid and raining back down on the servers below.
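
As a rough illustration of why boiling absorbs so much energy, the sketch below compares the latent heat taken up per kilogram of vaporized coolant with the temperature rise the same kilogram would need to absorb that energy without boiling. The latent heat and specific heat figures are placeholder values of a plausible order of magnitude for an engineered dielectric fluid. They are assumptions for illustration, not datasheet figures for the 3M liquid Microsoft used, and the 1 kW server is likewise hypothetical.

```python
# Placeholder fluid properties (assumed for illustration, not datasheet values).
LATENT_HEAT = 100_000.0  # J/kg absorbed when the liquid boils
SPECIFIC_HEAT = 1_100.0  # J/(kg*K) absorbed when the liquid merely warms up


def boil_off_rate(heat_w, latent_heat_j_per_kg):
    """Mass of liquid (kg/s) that must vaporize to absorb heat_w watts."""
    return heat_w / latent_heat_j_per_kg


def sensible_equivalent_rise(latent_heat_j_per_kg, cp_j_per_kg_k):
    """Temperature rise (K) that would absorb the same energy per kg without boiling."""
    return latent_heat_j_per_kg / cp_j_per_kg_k


if __name__ == "__main__":
    server_heat = 1_000.0  # hypothetical 1 kW server
    rate = boil_off_rate(server_heat, LATENT_HEAT)
    rise = sensible_equivalent_rise(LATENT_HEAT, SPECIFIC_HEAT)
    print(f"vapor generated: {rate * 1000:.1f} g/s")
    print(f"equivalent single-phase temperature rise: {rise:.0f} K")
```

With these placeholder numbers, each kilogram that boils absorbs as much heat as warming that kilogram by roughly 90 K would, and it does so while the liquid stays at its boiling point.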

Microsoft has been experimenting with dumping servers in a non-conductive liquid which cools the immersed hardware via a phase change to gas. Note the bubbling liquid warmed by the heat of the servers. Credit: Microsoft

The immersion method makes for excellent heat transfer between the server hardware and the coolant. As a bonus, it doesn’t just cool down a small section of the CPU via a heatsink. Instead, the entire server is free to dump heat into the liquid. The hope is that this would allow an increase in hardware density in datacenters, as well as an increase in performance, as the high cooling capacity of the immersion method allows for better heat removal in a much smaller space.

Of course, it’s a complex and high-end solution that will take some time before it’s ready for the mainstream. Datacenter operators simply aren’t used to dunking their hardware in liquid, nor used to running them in sealed containers to allow such a system to work. It’s likely that there would also be some maintenance headaches, where immersion tanks would have to be switched off prior to opening them for physical service of the hardware inside.

As humanity continues to crave more computing power, and we strive to cut energy use and emissions, expect further developments in this space. Sheer competition itself is a big driver, too. Any company that can cut running costs and land use while finding more performance will have an advantage over its rivals in the marketplace. Expect watercooling systems to become more mainstream over time, and some of the wackier ideas to find purchase if their major benefits are worth all the hassle. It’s an exciting time to work in datacenter engineering, that much is for sure.

42 thoughts on “The Weird World Of Liquid Cooling For Datacenters”

    1. Mineral oil is fine for personal projects but I doubt anyone has ever done that on a commercial scale as there are actually a lot of downsides. Mineral oil is very good at wicking or climbing up porous materials which makes it surprisingly difficult to contain. And it’s surprisingly difficult to clean off once it’s gotten on something. Not all plastics play nice with all oils so you run the risk of accidentally weakening or even dissolving things like wire insulation, mounting brackets, or even capacitors. And of course oil is flammable.

      So you’ve got hundreds of gallons of flammable liquid that keeps escaping its containers and is coating everything and it’s being kept hot by all the servers. All you need is a spark from a wire somewhere whose insulation cracked and you’ve got a nightmare of a fire. What’s not to like?

      1. Actually, there’s a webhosting company in the Czech Republic that has been experimenting with oil submersion cooling for a couple of years now (and it’s certainly not a personal project, they’re one of the big players in CZ), so commercial-scale usage is happening already. How much of a success it’s going to be remains to be seen, but they seem pretty invested at this point. Their blog has some updates on that topic from time to time:

        1. Yes, they have been trying it for a couple of years with two blade servers from HP and basically got nowhere. I would really like to know how they solved the problem that once something has been in the oil bath, it is hard to clean and the oil just gets everywhere.

      1. It’s indeed combustible, but not “highly flammable” (it won’t spontaneously combust at room temperature and its self-ignition point is high enough to use it). Otherwise you’d see warning labels on your cooking oil bottles…

    1. It is so much more than just lol cats* vs the survival of the human species.
      It is not an easy choice.

      sad cat
      angry cat
      grumpy cat
      lime cat
      cold cat
      warm cat
      smart cat
      And do not get me started on kittens they are just krazy.

    1. Back in the ‘80s I was on an NSFnet committee that met at the Minnesota Supercomputer center. Seeing the liquid cooled Crays along with the ETA and IBM machines always amazed me.

    2. Just like everything being in the “cloud” – mainframe, and everyone using “terminals” to access it.
      Because you can charge for the time used on the mainframe.

      It really is crazy how the cycle of computer costs (expensive, cheap, too cheap) has driven us back down that path.

      1. The book “Kill It With Fire” (don’t mind the funny title) has a chapter on this. A really crucial book for those working in IT and dealing with legacy systems (either software or hardware-based).

  1. Another big change happening in the datacenter world is the “cold” side temps going higher.

    Back in the day a datacenter could operate at 15-22 C at the cold side of the servers.
    Today, 30+ C isn’t uncommon, and some apparently look at 45 C. (with the requirement that the hardware in the datacenter is built for this higher ambient temperature.)

    Having a higher cold side temp means that one often doesn’t need an AC to keep cool, even on a warmer day.
    Now, there are times when outdoor temps can reach past the desired “cold” temp. Here we would need an AC again, but the temperature difference we try to pump heat over is much smaller, and therefore our coefficient of performance can be a fair bit higher.

    Downsides with higher ambient temps are numerous. Everything from shorter component life to sweaty technicians fainting in the warm aisle.

    Personally, I think a good server shouldn’t complain if it is 35 C ambient. 45+ I can however agree is a bit warm.
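
The coefficient of performance point in the comment above can be sketched with the ideal (Carnot) limit for a refrigerator, COP = T_cold / (T_hot - T_cold), with temperatures in kelvin. This is a theoretical upper bound rather than what a real chiller achieves, and the 40 C outdoor temperature and the two cold-side setpoints are hypothetical values chosen for illustration.

```python
def carnot_cop(cold_c, hot_c):
    """Ideal cooling COP when pumping heat from cold_c to hot_c (degrees Celsius)."""
    if hot_c <= cold_c:
        # No heat pumping is needed: outside air is already colder than the setpoint.
        raise ValueError("hot side must be warmer than the cold side")
    cold_k = cold_c + 273.15
    hot_k = hot_c + 273.15
    return cold_k / (hot_k - cold_k)


if __name__ == "__main__":
    outdoor = 40.0  # hypothetical hot day, degrees Celsius
    for setpoint in (20.0, 35.0):
        print(f"cold side {setpoint:.0f} C: ideal COP {carnot_cop(setpoint, outdoor):.1f}")
```

Raising the cold-side setpoint from 20 C to 35 C shrinks the temperature lift from 20 K to 5 K, so the ideal COP rises roughly fourfold. Real systems fall well short of the Carnot figure, but the trend is the same.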

  2. I’ve always been interested in the whole “drop it in a tank of coolant” method, but it seems like the right liquid would be hard to find.

    You’d want something non-polar, because that minimizes the opportunity for some contamination turning it into a conductor, but those tend to be oily and it seems like a mess if you ever need to fix anything, because it’s going to take forever to get things clean enough to work on.

    So the ideal coolant would evaporate and leave clean boards… oh, but wait… that means it would just evaporate out of the tanks…

    I suppose there’s some kind of liquid that has a very heavy vapor, like the stuff they use for vapor-phase soldering, but it seems like maybe it’s not a great idea to be in a room with open tanks full of that stuff all day long.

    I used to work for a defense contractor and we had some version of Coolanol coolant for airborne systems, but it was really unpleasant to work with and we were *really* motivated to keep it inside the pipes.

    Anybody have any actual experience with anything that isn’t mineral oil that wasn’t uncomfortably ‘exotic’ ?

    1. Well, back before CFCs (and HCFCs) became the baddie, dunking anything in a CFC to cool it was all the rage. Pick pretty much any temperature you want. Non-conductive, non-flammable…. only problem was people started to use it for *everything*, and then it started to leak, and then… if it had all been kept inside the pipes, we’d still be using it.

      I wonder what formulation that 3M 50 degree stuff is?

        1. Mmmm. I suppose coming from 3M is almost as bad as coming from Dupont. A reformulated CFC that carries a different name but has (mostly) the same properties and gets past the EPA. Good lawyers.

  3. There are quite a few rear door heat exchangers out there. My day job consists of designing IT enclosures and we’ve been through a half dozen manufacturers of such technology. There are much cheaper and lower-maintenance means of cooling IT equipment, though. As stated above, the requirements for inlet temperature (finally) have been allowed to creep up so “free” air cooling is getting more prevalent. ASHRAE says anything up to 94°F is allowable but you walk into a datacenter and the cold side is frequently set mid to high 60s because airflow problems cause hot spots. They’re frequently haphazardly designed and poorly maintained.

    Any time you don’t need to run a chiller pump your PUE goes down. With good containment design, good cabinet design, and use of heat exchangers rather than refrigeration units the up front costs are quickly paid for in energy savings.
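
For readers unfamiliar with the metric in the comment above, PUE (power usage effectiveness) is total facility power divided by the power delivered to IT equipment, so 1.0 would mean zero overhead. The sketch below shows how trimming cooling power moves the number. All load figures are hypothetical and chosen purely for illustration.

```python
def pue(it_kw, cooling_kw, other_kw=0.0):
    """Power usage effectiveness: total facility power over IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return (it_kw + cooling_kw + other_kw) / it_kw


if __name__ == "__main__":
    it_load = 1_000.0  # hypothetical IT load in kW
    overhead = 100.0   # hypothetical lighting, power conversion losses, etc. in kW
    with_chillers = pue(it_load, cooling_kw=600.0, other_kw=overhead)
    free_cooling = pue(it_load, cooling_kw=200.0, other_kw=overhead)
    print(f"chillers running: PUE {with_chillers:.2f}")
    print(f"mostly free cooling: PUE {free_cooling:.2f}")
```

In this made-up scenario, cutting cooling power from 600 kW to 200 kW drops the PUE from 1.7 to 1.3 without touching the IT load at all.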

  4. Using liquid cooling for beer making I can relate to the problems. First, anything “cold” will condense water which will then drip onto anything you don’t want to get wet. Also, every single “dripless” connector I’ve used for beer and other industrial applications isn’t perfect and still drips a little. Making beer it doesn’t matter much but pulling out a big expensive switch or something I’d be a lot more worried about water.

  5. Just a mention because I don’t think it was touched on.
    Commercial A/C at this scale uses a large cold water loop to transfer heat out of the building. So some of these chillers could be using a heat exchanger that provides a secondary loop into the machines inside the building.

    The power density becomes a problem too as you stack hundreds of AI GPUs or CPUs, each pulling hundreds of watts, into a single rack.

    This is bleeding edge, and it does bleed. Check out Hetzner datacenter tour from Der8auer and see how to do it the cheap way xD

  6. Wrangling liquids at datacenter-scale sounds like a pain in the butt! IMO direct connections to the machine only makes sense if you have fewer beefier servers.

    I think the water cooling in the racks is my favorite solution because it’s totally server agnostic and has the least chance of drowning a server. The place I worked had many 1U “pizza box” servers, some co-located servers (customer owned), some blade servers, etc. A real mashup.

    1. Oh yes, I recall a 3083 that early one morning refused to power up. You could hear the pumps in one of the service boxes start and then it would trip out with no indication of why. Tried a few more times, then put in a service call to IBM and went home. Later that morning I came in to see a few dozen empty distilled water bottles lined up in the hallway outside the data center. Turns out the plumbing had sprung a leak.

  7. The extent these companies are going to, and the money spent trying to solve these issues, gives insight into how wasteful these buildings are. Wondering where the environmentalists are on this.

    1. They are actually quite efficient at what they do, amirite? From a purely ecological standpoint, it checks out I think.

      It’s just, do we need to do that much internet/data/cryptomining?

  8. IBM model 7302 core memory array was immersed in a tank of temperature controlled oil. This 128K memory was first shipped in 1959 and was prized for its 2.18 microsecond access time.

  9. As mentioned previously, chiller doors are a thing ( for example) which is a very good solution where you have sufficient power infrastructure but insufficient HVAC.

    However, modern high density servers can have multiple kW of fans in a rack due to having to use high back pressure double-stacked screamer fans, so that’s certainly not efficient and there are some savings to be had there. The other thing when you get to waterblock-type liquid cooling is you don’t have to force air through an enormous heatsink, so your fans can be less powerful.

    Immersion cooling is a neat science project but maintenance is a PITA, there are things you can’t submerge (like spinning hard drives and fiber optics, depending on the fluid) and the working fluids can be silly expensive if you want to get away from some variety of mineral oil. In addition, you’ll never get the same compute density per unit floor space as a 9’ tall rack with more conventional liquid cooling.

  10. Wonder why they dump the energy straight into the environment. When used to heat (public) pools, this ‘low quality energy’ is very useful. It most likely saves the entire energy consumption of such pools.
    Or move servers to homes, as this page describes:
    Living in a well-insulated home myself, we can heat the entire office in the loft with a single NAS. In summertime it has to be moved to the basement though, to be able to dump its heat, or the office becomes uncomfortably warm.

  11. Underwater data centers not only could allow for reduced energy costs, but also better security, less real estate, as well as other “perks”. China is building a large one right now, South Korea has one planned. Companies are starting to look at this.

  12. Been running a mineral oil submersion cooling system for over 4 years now. It’s a Ryzen 3 3200g on an MSI B450M in a 15 gallon Fluval fish tank. I left the stock AMD cooler on the cpu and it still functions but the fan turns slowly. It has both a 512g Adata M.2 and a 960g Kingston SSD. The PSU is an old 500 watt I had lying around and it’s in there as well. The board is mounted to the plastic plate at the back of the tank that separates the main tank from the built-in filtration system. Using the submersible pump that came with the Fluval it does a great job of circulating the oil and filtering it. One other addition is a standard dual line aquarium air pump connected to 2 large air stones.

    I’ve run Prime95 in full burn mode for hours and the CPU temp never gets above 51 C. My suspicion is that the addition of the air stones made a huge difference in the oil temperature as room temp air is constantly being pumped into the oil.

    But, as one poster mentioned, wicking is a huge issue. I’ve already had 2 monitors ‘infected’ with oil as it wicks through the video cables. Though the monitors still work, the oil has gotten between the plastic sheets covering the LCD and now gives the appearance of a 1960’s acid trip backdrop. The oil has also wicked through the cat5 and is now pooling in the bottom of the router. Over the 4 years it’s been running I suspect less than a pint of mineral oil has wicked out.

    As far as plastic degradation, so far the only issue I’ve had is one of the plastic covers over the GSkill Triton memory fell off. I’ve had no problems with the machine running otherwise. There’s even a cheap external TP Link usb wifi in there and it still works well.

    While it’s a bit messy at times, cleanup is easy. The benefit? It serves one huge purpose in my computer repair shop. It gets the customers’ attention like nothing else I’ve tried. If they’ve seen it before, they think, “Oh, cool, you did that.” If they’re unfamiliar with submersion cooling, they think I’ve lost my mind, which is a good thing too.

    Here’s a short video of it running:
