While the whole industry is scrambling to deal with Spectre, Meltdown has focused most of the spotlight on Intel, and there is no shortage of outrage in Internet comments. Like many great discoveries, this one is obvious with the power of hindsight, so much so that reactions have spanned an extreme range: from “It’s so obvious, Intel engineers must be idiots” to “It’s so obvious, Intel engineers must have known! They kept it from us in a conspiracy with the NSA!”
We won’t try to sway those who choose to believe in a conspiracy that’s simultaneously secret and obvious to everyone. However, as evidence of non-obviousness, some very smart people got remarkably close to the Meltdown effect last summer without getting all the way there. [Trammell Hudson] did some digging and found a paper from the early 1990s (PDF) that warns of the dangers of fetching info into the cache that might cross privilege boundaries, but the idea wasn’t weaponized until recently. In short, these are old vulnerabilities, but exploiting them was hard enough that it took twenty years to do it.
Building a new CPU is the work of a large team over several years. But they weren’t all working on the same thing for all that time. Any single feature would have been the work of a small team of engineers over a period of months. During development they fixed many problems we’ll never see. But at the end of the day, they are only human. They can be 99.9% perfect and that won’t be good enough, because once hardware is released into the world, it is open season on the 0.1% the team missed.
The odds are stacked in the attacker’s favor. The team on defense has a handful of people working a few months to protect against all known and yet-to-be-discovered attacks. It is a tough match against the attackers who come afterwards: there are a lot more of them, they’re continually refining the state of the art, they have twenty years to work on a problem if they need to, and they only need to find a single flaw to win. In that light, exploits like Spectre and Meltdown will probably always be with us.
Let’s look at some factors that paved the way to Intel’s current embarrassing situation.
In Intel’s x86 lineage of processors, the Pentium Pro in 1995 was the first to perform speculative execution. It was the high-end offering for demanding roles like multi-user servers, so it had to keep low-privilege users’ applications from running wild. But the design only accounted for direct methods of access. The general concept of side-channel attacks was well established by that time in the analog world, but it hadn’t yet been proven applicable to the digital world. For instance, one of the groundbreaking papers in side-channel attacks, which showed how to pull encryption keys out of certain cryptography algorithm implementations, was not published until a year after the Pentium Pro came to market.
Computer security was a very different and far smaller field in the 1990s. For one, Internet Explorer 6, the subject of many hard lessons in security, was not released until 2001. The growth of our global interconnected network would expand opportunities and fuel tremendous growth in security research on both offense and defense, but that was still years away. And in the early 1990s, software security was in such a horrible state that only a few researchers were looking into hardware.
The Need for Speed
During this time, when more people were looking harder at more things, Intel’s never-ending quest for speed inadvertently made the vulnerability easier to exploit. Historically, CPU performance advancements have outpaced those for memory, and the growing disparity was a drag on overall system performance. CPU memory caches were designed to help climb this “memory wall”, a term coined in a 1994 ACM paper. One Pentium Pro performance boost came from moving its L2 cache from the motherboard to its chip package. Later processors added a third level of cache, and eventually Intel integrated everything into a single piece of silicon. Each of these advances made cache access faster, but that also increased the time difference between reading cached and uncached data. On modern processors, this difference stands out clearly against the background noise, as illustrated in this paper on Meltdown.
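To get a feel for why a wider gap between cached and uncached reads helps an attacker, here is a toy simulation, not a measurement on real hardware. It guesses whether each access was a cache hit or a miss purely from a noisy timing value. Every number in it (the latencies, the noise level, the function name `classify_accuracy`) is a made-up assumption chosen for illustration, and real timing noise is not necessarily Gaussian.

```python
import random


def classify_accuracy(hit_ns, miss_ns, noise_ns, trials=10_000, seed=0):
    """Fraction of accesses correctly labelled hit or miss from timing alone.

    hit_ns, miss_ns and noise_ns are hypothetical values, not real
    measurements. The attacker's rule is a simple midpoint threshold.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    threshold = (hit_ns + miss_ns) / 2
    correct = 0
    for _ in range(trials):
        cached = rng.random() < 0.5                 # ground truth for this access
        true_latency = hit_ns if cached else miss_ns
        measured = true_latency + rng.gauss(0, noise_ns)  # add timing noise
        guessed_cached = measured < threshold       # attacker's guess
        correct += guessed_cached == cached
    return correct / trials


# Assumed scenario 1: a slow cache, so hits are not much faster than misses.
narrow = classify_accuracy(hit_ns=60, miss_ns=100, noise_ns=40)

# Assumed scenario 2: a fast cache, same memory latency and same noise.
wide = classify_accuracy(hit_ns=5, miss_ns=100, noise_ns=40)

print(f"narrow gap: {narrow:.1%} of guesses correct")
print(f"wide gap:   {wide:.1%} of guesses correct")
```

With the noise held constant, speeding up only the cache pushes the two timing distributions apart, so the same crude threshold guesses correctly more often. That is the sense in which faster caches made the timing signal easier to read.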
Flash-forward to today: timing attacks against cache memory have become very popular. Last year all the stars aligned as multiple teams independently examined how to employ the techniques against speculative execution. The acknowledgements credited Jann Horn of Google Project Zero as the first to notify Intel of Meltdown in June 2017, triggering investigation into how to handle a problem whose seeds were planted over twenty years ago.
This episode will be remembered as a milestone in computer security. It is a painful lesson with repercussions that will continue reverberating for some time. We have every right to hold industry-dominant Intel to high standards and put them under the spotlight. We expect mitigation and fixes. The fundamental mismatch of fast processors that use slow memory will persist, so CPU design will evolve in response to these findings, and the state of the art will move forward, both in how to find problems and in how to respond to them, because there are certainly more flaws awaiting discovery.
So we can’t stop you if you want to keep calling Intel engineers idiots. But we think that the moral of this story is that there will always be exploits like these because attack is much easier than defense. The Intel engineers probably made what they thought was a reasonable security-versus-speed tradeoff back in the day. And given the state of play in 1995 and the fact that it took twenty years and some very clever hacking to weaponize this design flaw, we’d say they were probably right. Of course, now that the cat is out of the bag, it’s going to take even more cleverness to fix it up.