Everything You Wanted To Know About SDRAM Timing But Were Afraid To Ask

One of the problems with being engaged in our hobby or profession is that people assume if you can build a computer out of chips, you must know all the details of their latest laptop computer. Most of the memory we deal with is pretty simple compared to DDR4 memory and if you’ve ever tried tweaking your memory, you know a good BIOS has dozens of settings for memory. [Actually Hardcore Overclocking] has a great description of a typical DDR4 datasheet and you can watch it in the video below.

Of course, he points out that knowing all this really doesn’t help you much with memory overclocking because you can’t really predict the complex effects without trial and error. However, most of us like to understand the knobs we are randomly twisting. On top of that, one theme of the video is that DRAM is dumb and simple. If you’ve ever thought about using it in a project, this might be a good place to start.

Continue reading “Everything You Wanted To Know About SDRAM Timing But Were Afraid To Ask”

Data Alignment Across Architectures: The Good, The Bad And The Ugly

Even though a computer’s memory map looks pretty smooth and very much byte-addressable at first glance, the same memory on a hardware level is a lot more bumpy. An essential term a developer may come across in this context is data alignment, which refers to how the hardware accesses the system’s random access memory (RAM). This and others are properties of the RAM and memory bus implementation of the system, with a variety of implications for software developers.

For a 32-bit memory bus, the optimal access type for some data would be a four bytes, aligned exactly on a four-byte border within memory. What happens when unaligned access is attempted – such as reading said four-byte value aligned halfway into a word – is implementation defined. Some hardware platforms have hardware support for unaligned access, others throw an exception that the operating system (OS) can catch and fallback to an unaligned routine in software. Other platforms will generally throw a bus error (SIGBUS in POSIX) if you attempt unaligned access.

Yet even if unaligned memory access is allowed, what is the true performance impact? Continue reading “Data Alignment Across Architectures: The Good, The Bad And The Ugly”

DDR-5? DDR-4, We Hardly Knew Ye

This month’s CES saw the introduction of max speed DDR5 memory from SK Hynix. Micron and other vendors are also reportedly sampling similar devices. You can’t get them through normal channels yet, but since you also can’t get motherboards that take them, that’s not a big problem. We hear Intel’s Xeon Sapphire Rapids will be among the first boards to take advantage of the new technology. But that begs the question: what is it?

SDRAM Basics

Broadly speaking, there are two primary contenders for a system that needs RAM memory: static and dynamic. There are newer technologies like FeRAM and MRAM, but the classic choice is between static and dynamic. Static RAM is really just a bunch of flip flops, one for each bit. That’s easy because you set it and forget it. Then later you read it. It can also be very fast. The problem is a flip flop usually takes at least four transistors, and often as many as six, so there’s only so many of them you can pack into a certain area. Power consumption is often high, too, although modern devices can do pretty well.

Continue reading “DDR-5? DDR-4, We Hardly Knew Ye”

Peering Into A Running Brain: SDRAM Refresh Analyzed From Userspace

Over on the Cloudflare blog, [Marek] found himself wondering about computer memory, as we all sometimes do. Specifically, he pondered if he could detect the refresh of his SDRAM from within a running program. We’re probably not ruining the surprise by telling you that the answer is yes — with a little more than 100 lines of C and help from our old friend the Fast Fourier Transform (FFT), [Marek] was able to detect SDRAM refresh cycles every 7818.6 ns, lining right up with the expected result.

The “D” in SDRAM stands for dynamic, meaning that unless periodically refreshed by reading and writing, data in the memory will decay. In this kind of memory, each bit is stored as a charge on a tiny capacitor. Given enough time (which varies with ambient temperature), this charge can leak away to neighboring silicon, turning all the 1s to 0s, and destroying the data. To combat this process, the memory controller periodically issues a refresh command which reads the data before it decays, then writes the data back to fully charge the capacitors again. Done often enough, this will preserve the memory contents indefinitely. SDRAM is relatively inexpensive and available in large capacity compared to the alternatives, but the drawback is that the CPU can’t access the portion of memory being refreshed, so execution gets delayed a little whenever a memory access and refresh cycle collide.

Chasing the Correct Hiccup

[Marek] figured that he could detect this “hiccup,” as he calls it, by running some memory accesses and recording the current time in a tight loop. Of course, the cache on modern CPUs would mean that for a small amount of data, the SDRAM would never be accessed, so he flushes the cache each time. The source code, which is available on GitHub, outputs the time taken by each iteration of the inner loop. In his case, the loop typically takes around 140 ns.

Hurray! The first frequency spike is indeed what we were looking for, and indeed does correlate with the refresh times.

The other spikes at 256kHz, 384kHz, 512kHz and so on, are multiplies of our base frequency of 128kHz called harmonics. These are a side effect of performing FFT on something like a square wave and totally expected.

As [Marek] notes, the raw data doesn’t reveal too much. After all, there are a lot of things that can cause little delays in a modern multitasking operating system, resulting in very noisy data. Even thresholding and resampling the data doesn’t bring refresh hiccups to the fore. To detect the SDRAM refresh cycles, he turned to the FFT, an efficient algorithm for computing the discrete Fourier transform, which excels at revealing periodicity. A few lines of python produced the desired result: a plot of the frequency spectrum of the lengthened loop iterations. Zooming in, he found the first frequency spike at 127.9 kHz, corresponding to the SDRAMs refresh period of 7.81 us, along with a number of other spikes representing harmonics of this fundamental frequency. To facilitate others’ experiments, [Marek] has created a command line version of the tool you can run on your own machine.

If this technique seems familiar, it may be because it’s similar the the Rowhammer attack we covered back in 2015, which can actually change data in SDRAM on vulnerable machines by rapidly accessing adjacent rows. As [Marek] points out, the fact that you can make these kinds of measurements from a userspace program can have profound security implications, as we saw with the meltdown and spectre attacks. We have to wonder what other vulnerabilities are lying inside our machines waiting to be discovered.

Thanks to [anfractuosity] for the tip!

Hackaday Prize Entry: The Cheapest Logic Analyzer

There are piles of old 128MB and 256MB sticks of RAM sitting around in supply closets and in parts bins. For his Hackaday Prize project, [esot.eric] is turning these obsolete sticks of RAM into something useful – a big, fast logic analyzer. It’s cheap, and simple enough that it can be built on a breadboard.

If using old SDRAM in strange configurations seems familiar, you’re correct. This project is based on [esot.eric]’s earlier AVR logic analyzer project that used a slow AVR to measure 32 channels of logic at 30 megasamples per second. The only way this build was possible was by hacking an old stick of RAM to ‘free run’, automatically logging data to the RAM and reading it out with an AVR later.

This project expands on the earlier projects by using bigger sticks of RAM faster, with the ultimate goal being a 32-bit, 133MS/s logic analyzer that is more of a peripheral than a single, monolithic project. With a Raspberry Pi Zero, a stick of RAM, and a few miscellaneous logic chips, this project can become anything from a logic analyzer to a data logger to an oscilloscope. It’s weird, yes, but the parts to make this very handy tool can be found in any hackerspace or workshop, making it a great trick for the enterprising hardware hacker.

SDRAM Logic Analyzer Uses An AVR And A Dirty Trick

We often see “logic analyzer” projects which are little more than microcontrollers reading data as fast as they can, sending it to a PC, and then plotting the results. Depending on how fast the microcontroller is, these projects range from adequate to not very useful.

At first glance, [esot.eric’s] logic analyzer project has an AVR in it, so it ought to be on the low end of the scale. Then you look at the specs: 32 channels at 30 megasamples per second. How does that work with an AVR in it?

The answer lies in the selection of components. The analyzer uses a 128MB SDRAM DIMM (like an older PC might use for main memory). That makes sense; the Arduino can’t store much data internally. However, it isn’t the storage capacity that makes this choice critical. It seems [esot.eric] has a way to make the RAM “free run”.

The idea is to use the Arduino (or other host microcontroller) to set up the memory. Some of the memory’s output bits feedback to the address and data lines. Then the microcontroller steps aside and the SDRAM clocks samples into its memory by itself at the prevailing clock rate for the memory.

Continue reading “SDRAM Logic Analyzer Uses An AVR And A Dirty Trick”