Dodging A 60-Year-Old Design Flaw In Your RAM

A stick of DDR4 in DIMM format held by some alligator clips

Modern computers use dynamic RAM, a technology that allows very compact bits in return for having to refresh for about 400 nanoseconds every 3-4 microseconds. But what if you couldn’t afford even such a tiny holdup? [LaurieWired] goes into excruciating detail about how to avoid this delay.
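To put those figures in perspective, here is a back-of-the-envelope sketch in Python. It uses only the numbers quoted above (about 400 ns of refresh every 3 to 4 µs). The function names are invented for illustration, real refresh timings depend on the DRAM generation and density, and the model assumes a read arrives at a uniformly random moment and simply waits out whatever is left of the refresh window.

```python
# Figures quoted in the article; treat them as rough, not as datasheet values.
REFRESH_BUSY_NS = 400.0
REFRESH_INTERVALS_NS = (3000.0, 4000.0)


def blocked_fraction(busy_ns, interval_ns):
    """Fraction of wall-clock time the memory is busy refreshing."""
    return busy_ns / interval_ns


def expected_extra_wait_ns(busy_ns, interval_ns):
    """Mean added latency for a read arriving at a uniformly random time.

    The read lands inside the refresh window with probability
    busy/interval and then waits, on average, half the window.
    """
    return blocked_fraction(busy_ns, interval_ns) * busy_ns / 2.0


for interval in REFRESH_INTERVALS_NS:
    frac = blocked_fraction(REFRESH_BUSY_NS, interval)
    wait = expected_extra_wait_ns(REFRESH_BUSY_NS, interval)
    print(f"interval {interval:.0f} ns: blocked {frac:.1%}, mean extra wait {wait:.1f} ns")
```

Under these assumptions the memory is unavailable roughly 10 to 13 percent of the time. The average penalty is small, but the worst case is the full 400 ns, which is what matters when every read is a race.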

But first, why do we care? It once again comes down to high-frequency trading, where a couple of nanoseconds of latency can be the difference between winning and losing a buy order. You likely miss all the caches and need to fetch data from the remote land of main memory. And if you get unlucky, you’ll be waiting on that price for a precious 400+ nanoseconds! [Laurie] explains the problems faced in trying to avoid this penalty. The idea is to keep copies of the data in memory governed by two independent refresh timers, so that at least one copy is likely to be readable at any moment. That’s easier said than done; not only does the operating system hide the physical addresses from you, but the memory controllers themselves also scramble the addresses to the underlying RAM!
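The benefit of duplicating data can be illustrated with a toy simulation. This is not [Laurie]'s code, only a minimal model under stated assumptions: each copy is blocked for 400 ns at the start of every refresh period, the period is 3900 ns (picked from the 3 to 4 µs range above), an unobstructed read takes a made-up 80 ns, and the two refresh timers happen to be offset by half a period. Every name and constant below is hypothetical.

```python
import random

BUSY_NS = 400.0       # refresh stall quoted in the article
INTERVAL_NS = 3900.0  # assumed refresh period within the quoted 3-4 us range
BASE_NS = 80.0        # assumed latency of an unobstructed read (illustrative)


def read_latency(t, phase):
    """Latency of a read issued at time t to a copy whose refresh window
    opens at `phase` and repeats every INTERVAL_NS."""
    offset = (t - phase) % INTERVAL_NS
    stall = BUSY_NS - offset if offset < BUSY_NS else 0.0
    return BASE_NS + stall


def simulate(n_reads, phases, seed=0):
    """Issue each read to every copy and keep the fastest answer.

    Returns (worst-case latency, mean latency) in nanoseconds.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_reads):
        t = rng.uniform(0.0, 1000 * INTERVAL_NS)
        latencies.append(min(read_latency(t, p) for p in phases))
    return max(latencies), sum(latencies) / len(latencies)


single = simulate(20_000, phases=[0.0])
hedged = simulate(20_000, phases=[0.0, INTERVAL_NS / 2])
print(f"one copy:   worst {single[0]:.0f} ns, mean {single[1]:.1f} ns")
print(f"two copies: worst {hedged[0]:.0f} ns, mean {hedged[1]:.1f} ns")
```

In this model the two refresh windows never overlap, so the faster of the two reads is never stalled. On real hardware the software cannot choose the offset; it can only search for addresses whose refreshes turn out to be uncorrelated, which is where the hidden physical addresses and scrambling become the obstacle.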

For the real computer architecture nerds, there’s a lot more to it, and [Laurie] goes over it in meticulous detail in the video after the break.

Thanks to [Keith Olson] for the tip!

41 thoughts on “Dodging A 60-Year-Old Design Flaw In Your RAM”

  1. There is a reason RAM addresses are obfuscated. The number of attacks in recent years is staggering.

    As an illustration: imagine you could tell what line was busy on a small town phone exchange. Not what was said on it, just that it was in use. Even given that amount of information you could build a model and start to make some guesses about what information is shared.

    The more granular control we allow over the RAM the easier it is for a program to take secrets that don’t belong to it.

    I might check out this video.

    1. The makers of ram did not want to own the row hammer problem, I think. Maybe they had documented the “limitation” already, so it fell to the chip makers. Scrambling the address lines does not remove the problem, just makes it less likely… again.

      It leaves consumer hardware with statistically low chances of bit flips, same as before row hammer showed the problem.

  2. Summary: to get rid of the RAM refresh lockout, you write a parallel program that runs on separate cores and allocates some memory, then hammers it with parallel reads to find out whether the delays between different addresses are correlated or not.

    It maps which areas of the OS-given address space exist in different physical memory channels, and uses that information to duplicate the data you’re processing to exist in parallel copies across different physical devices with uncorrelated refresh events.

    The actual application sends the command to read the data and whichever core finishes first gets the job done – the loser is ignored.

    This works as long as the application doesn’t need subsequent access to ram based on the result, because that would add the overhead of synchronizing the state across the worker cores and issuing new read tasks to everyone. It works for high frequency trading where you’re just comparing against a threshold value and firing off a transaction order as soon as possible.

  3. If the loss of a few hundred nanoseconds is that important to your application, couldn’t you just hire some specialty company to create a bespoke RAM module out of static chips? Yes, this would not be cheap, efficient, power friendly or maybe even wise, but it should be possible.

    Especially since you’d only need enough RAM to hold your critical application

    1. Twice the dram and a controller that runs the refresh out of sync. Still cheaper than sram. Even get redundancy for free.

      1. Redundancy really wouldn’t help since you wouldn’t know which copy was correct or even if there was an error in one of the copies. IF the memory had ECC, uncorrectable errors could be detected, but then her solution would get even more complicated in order to handle those errors.

  4. The day some researchers announced that they’ve been able to recover in-memory data from laptop SODIMMs many minutes after power off by freezing the RAM, was the day I realized that we’ll never be able to have secure nice things.

      1. Massive reduction of bandwidth and extremely high added cost of certification (even if the technology is “free to implement”, certification is all about “how do I recover my losses if someone lied to me”; cost is risk-driven, not complexity-driven)

        Also, whole new problems in EDAC, suspend/resume, UPS integration (if the brown-out detection is too sensitive, even normal usage can trigger it; if not, those nasty researchers will figure out how to make use of it), etc.

        It’s the kind of problem that people love to offer a solution for… at milspec prices.

    1. Oh, stellar. And now here’s a new report out of the NDSS conference a couple of months ago: Some researchers in Hong Kong came up with a way to install a tiny device (small enough to be hidden inside a network termination wall box) on the end of your telco fiber drop, that transforms it into a passive listening device. Coupled with a detector on the other end of the drop, it measures the distortions in the optical data signal caused by microscopic deformations sound waves make in the walls of the fiber, converting them back into audio with a > 80% transcription accuracy.

      Totally undetectable by standard electronic bug-sniffers (because there’s no RF signal to detect). Completely immune to the interference produced to deafen those same RF-transmitting, magnetic-coil microphones when a signal jammer is employed.

      We seriously can not have secure nice things.

  5. If the loss of a few hundred nanoseconds is that important to your application, you should redefine your application. I look at HFT bros (in a gender-neutral sense) and their endless nanosecond-shaving the same way I look at the crypto miners buying up all the GPUs: Not only are they using all that high-end hardware for less-than-admirable pursuits, but they’re making things worse for all the rest of us in the process.

    1. I don’t even like calling them “high frequency traders”. It suggests they are making investments, which is untrue. People who care about nanoseconds are just front-runners. I.e. when you reach for something on the grocery shelf, they snatch it first so you have to buy it from them at a markup. There’s no social value at all. It could be outlawed without any downsides for society.

      But the problem is too invisible and abstract to interest the voting population so there will be no political solution.

      1. BTW you can vote with your dollars though. Choose a broker that lets you manually route your order to Investors Exchange (IEX). Fidelity has this feature. IEX is engineered to prevent most front-running.

      2. it’s so abstract that you two don’t even know what you’re talking about, just repeating stuff you found on reddit.

        if your broker is actually front running you, that’s securities fraud.

        but they’re not, because they don’t all want to go to jail.

        they’re selling order flow, which doesn’t sound as scary, so you call it something else that’s a crime to confuse people.

        1. What moo said – neither of the original posters has any idea what front running is.

          Citadel Securities will fill your Robinhood trades in meme-stock-of-the-day for a better price on average than trying to sit on any exchange, including IEX.

        2. These people don’t need nor deserve your defense.

          The fact a system, originally meant for helping businesses get investors to help them grow and allow investors to earn money from dividends, has been corrupted into such a disgusting method for the rich to hoard money is a travesty.

        3. I absolutely do not know what front running is, and never claimed I did. Nor do I know what “selling order flow” entails, because holy shit is that a collection of words completely devoid of any substance.

          What I DO know (a conclusion arrived at completely on my own, because I have even less charitable things to say about reddit, trust me) is that if your application hinges on performing some operation nanoseconds faster than someone else, it’s fucking shady and you should get a real job. Defending this crap is like defending cat burglars in their pursuit of stealth clothing technology. I’m sure they ARE super invested in making technological advances on that front, however tiny and incremental. And I’m sure their motivations are completely beyond reproach.

  6. Yeah, I designed something to do this using a custom memory interface on an FPGA. It’s actually easier to do on the LPDDR variants because they have per-bank refresh: so even though it happens more often you know exactly which ones to delay.

    It’s still awkward, though.

  7. “…High Frequency Trading…”
    Burn them.
    Make them build the pyre themselves.
    Televise it.

    There is little more vile than parasites draining value from other people’s work, to accrue personal power, then using that power to steal more money and power with an even bigger scheme.

    HFT and micro trades are perfect examples of how human greed has twisted the very idea of “investment” to the point where it cannot even fulfill its original purpose.

    I would call them cancer, but I wouldn’t want to associate a mere medical condition with such evil.

    1. “There is little more vile than parasites draining value from other people’s work, to accrue personal power, then using that power to steal more money and power with an even bigger scheme.”

      All forms of power, economic or otherwise, follow this same route. The nature of existence in this universe is to exploit, consume, and grow. There is no reward mechanism for exercising restraint or mercy. Don’t get me wrong, I agree with you completely, it’s just that the problem is much, much bigger than high-frequency trading alone. Whoever designed this place clearly intended for us to torment each other in this exact way.

      1. That’s why we need to restructure our government and laws using something like Game Theory (the study not the YT channel) to make it so the optimal path is the one that benefits the citizens overall.

        As we learn and develop we should be restructuring to prevent the deep corruption we see.

        1. But “we” doing any “restructuring” requires power, which is gained via exploitation. See the problem? The fundamental mechanisms of reality are at fault here, not our particular implementation of governance. Evil has a competitive advantage on a fundamental/material level. All things drift toward power accumulation and energy consumption. The only way out for us, is to either rewrite the laws of the universe, or escape it completely.

          https://en.wikipedia.org/wiki/Maximum_power_principle

          1. The maximum power principle applies to an unthinking (no memory, no planning) mechanism subject to natural selection. Those maximizers win out in competition when the supply is limitless. When the resources are variable, the power maximizers tend to lose because their victory is based on growth which doesn’t work in periods when the resources are dwindling. Rapid growth and uncertain resource availability turn into boom and bust cycles which threaten the survival of the species. Here the natural selection favors moderation for long term survival.

            For example, plants don’t produce the maximal amount of energy from sunlight because sunlight is very variable and they have to survive long periods without it. They grow slowly instead, and don’t mind being inefficient with the conversion of sunlight to energy. If they were more efficient, they’d end up with more energy (sugars) than they need when the sun does shine, and nowhere to put it because they’re not growing fast enough to use it. All they would be doing is making themselves more edible with extra stores of sugar – something which we artificially select for, against what plants do in nature.

            Hence the notion of useful energy transformation. In the short view, simple power maximizers have the advantage. In the long view, the meaning of what is “useful” changes. Once the ability to think and predict forwards is considered, the game changes entirely, which is why the simple power maximizers tend to be anti-intellectual and nihilistic. They win as long as we are not thinking ahead.

          2. IMO, Evil has a short term gain in exchange for a long term penalty. Consider those many countries who would commit genocide to further their state. Even if all goes to plan, which it usually doesn’t, progress requires co-operation, so they must inevitably cannibalize their own state when they run out of resources to plunder.
            In a way it’s a similar problem to that of locusts. When food is plentiful they are harmless grasshoppers; when it’s scarce they swarm.

          3. @bob It’s a nice thought. Still waiting on the signs of those long-term penalties, though.

            I mean, Donald Trump is going to die an un-incarcerated, financially comfortable political and economic power player. And the mere fact of his demise, alone, does not seem like sufficient recompense for… you know… an entire lifetime of those short-term benefits. Just to take one pertinent example.

      2. “Whoever designed this place clearly intended for us to torment each other in this exact way.”

        Eleanor Shellstrop: “It took me a while to figure it out, but… they’re never gonna call a train to take us to the Bad Place. They can’t… because we’re already here. This is the Bad Place!”

    2. Leeches is the word you are looking for. They are leeches. Sucking money out of society and providing no value to it. Lots of landlords fall in the same category (not all of them, but lots of them)

  8. Workarounds:
    1. Rewrite the software so that speed-critical data is kept in cache. This is faster than any dram access.
    2. Unless the tech has changed in the 35 years since I last looked, refresh is only needed if a row is not accessed in the maximum allowed refresh time. Disable refresh in hardware; rewrite software to access each row as often as needed.

    For those in the stock market: Place limit orders and plan to hold your stocks for years. A fraction of a penny isn’t going to hurt your results.

    1. “Disable refresh in hardware; rewrite software to access each row as often as needed.”

      No, that doesn’t work. Refreshes are different than accesses anyway, because you activate all the columns in the row to refresh all the cells. For sync DRAM you basically only get auto refresh (where you just say ‘refresh now’) or self-refresh.

      If you have multiple chips, you can stagger their refresh intervals and duplicate the data, and that’ll get you deterministic read latency. For LPDDR you don’t need multiple chips, they have per-bank refresh and you can stagger those with data duplication as well. If you’re just looking for deterministic latency and don’t care about slightly increased overall latency and lower bandwidth, you could also do RAID-like tricks and then it won’t cost you half the data.

  9. Since there is not a single positive comment here yet (despite the undeniable genius of the approach), here I go:

    This is great. Software hacking hardware limitations is just chef’s kiss.

  10. “…comes down to high-frequency trading; a couple nanoseconds of latency can be the difference between winning or losing a buy order.”

    What a shitshow.

  11. I spent most of this past week having to care way too much about what a DRAM controller was doing. What a weirdly thematically similar video to come across! It’s as if Technology Connections wasn’t nerdy enough for you.

  12. Calling the need for refresh in DRAM a design flaw seems sensational. But the headline caught my eye. Seems like they work as designed! What am I missing?
