Extracting A Gate From AMD And Intel

The competition between Intel and AMD has been heating up in the last few years as Intel has released chips fabbed with their 14nm++ process and AMD has been using TMSC’s 7nm process. In the wake of the two semiconductor titans clashing, a debate between the merits of 14nm++ and 7nm has sprung up with some confusion about what those numbers actually measure. Not taking either number at their face value, [der8auer] decided to extract a transistor from both Intel’s and AMD’s latest offerings to try and shed some light.

Much of the confusion comes from the switch to the FinFET process. While older planar transistors could be thought of as largely 2d structures, FinFET’s are three dimensional. This means that the whole vertical fin can act as a gate, greatly reducing leakage. It is this fin or gate that [der8auer] is after. On each chip, a thin sliver from the L1 cache was chosen as caches tend to be fairly homogenous sections with transistors that are fairly indicative of the rest of the chip. Starting with a platinum gas intersecting with a focused ion beam on the surface of the chip, [der8auer] built a small deposit of platinum over several hours. This deposit protects the chip when he later cut it at an angle, forming a small lamella 100 micrometers long. In order for the lamella to be properly imaged by the scanning electron microscope, it needed to be even thinner (about 200 to 300nm).

Eventually, [der8auer] was ultimately able to measure the gate height, width, spacing, and other aspects of these two chips. The sheer amount of engineering and analysis that went into this project is remarkable and we love the deep dive into the actual gates that make up the processors we use. If you’re looking for a deep dive into the guts of a processor but perhaps at a more macro scale, why not learn about a forgotten Intel chip from the 1970s?

Continue reading “Extracting A Gate From AMD And Intel”

AMD Acquires Xilinx For $35 Billion

News this morning that AMD has reached an agreement to acquire Xilinx for $35 Billion in stock. The move to gobble up the leading company in the FPGA industry should come as no surprise for many reasons. First, the silicon business is thick in the age of mergers and acquisitions, but more importantly because AMD’s main competitor, Intel, purchased the other FPGA giant Altera back in 2015.

Primarily a maker of computer processors, AMD expands into the reconfigurable computing market as Field-Programmable Gate Arrays (FPGA) can be adapted to different tasks based on what bitstream (programming information written to the chips) has been sent to them. This allows the gates inside the chip to be reorganized to perform different functions at the hardware level even after being put into products already in the hands of customers.

Xilinx invented the FPGA back in the mid-1980s, and since then the falling costs of silicon fabrication and the acceleration of technological advancement have made them evermore highly desirable solutions. Depending on volume, they can be a more economical alternative to ASICs. They also help with future-proofing as technology not in existence at time of manufacture — such as compression algorithms and communications protocols — may be added to hardware in the field by reflashing the bitstream. Xilinx also makes the Zynq line of hybrid chips that contain both ARM and FPGA cores in the same device.

The deal awaits approval from both shareholders and regulators but is expected to be complete by the end of 2021.

This Week In Security: SMBv3, AMD And Intel, And Huawei Backdoors

Ready for more speculative execution news? Hope so, because both Intel and AMD are in the news this week.

LVI Logo

The first story is Load Value Injection, a different approach to reading arbitrary memory. Rather than try to read protected memory, LVI turns that on its head by injecting data into a target’s data. The processor speculatively executes based on that bad data, eventually discovers the fault, and unwinds the execution. As per other similar attacks, the execution still changes the under-the-hood state of the processor in ways that an attacker can detect.

What’s the actual attack vector where LVI could be a problem? Imagine a scenario where a single server hosts multiple virtual machines, and uses Intel’s Secure Guard eXentensions enclave to keep the VMs secure. The low-level nature of the attack means that not even SGX is safe.

The upside here is that the attack is quite difficult to pull off, and isn’t considered much of a threat to home users. On the other hand, the performance penalty of the suggested fixes can be pretty severe. It’s still early in the lifetime of this particular vulnerability, so keep an eye out for further updates.

AMD’s Takeaway Bug

AMD also found itself on the receiving end of a speculative execution attack (PDF original paper here). Collide+Probe and Load+Reload are the two specific attacks discovered by an international team of academics. The attacks are based around the reverse-engineering of a hash function used to speed up cache access. While this doesn’t leak protected data quite like Spectre and Meltdown, it still reveals internal data from the CPU. Time will tell where exactly this technique will lead in the future.

To really understand what’s going on here, we have to start with the concept of a hash table. This idea is a useful code paradigm that shows up all over the place. Python dictionaries? Hash tables under the hood.

Hash table image from Wikipedia by Jorge Stolfi

Imagine you have a set of a thousand values, and need to check whether a specific value is part of that set. Iterating over that entire set of values is a computationally expensive proposition. The alternative is to build a hash table. Create an array of a fixed length, let’s say 256. The trick is to use a hash function to sort the values into this array, using the first eight bits of the hash output to determine which array location each value is stored in.

When you need to check whether a value is present in your set, simply run that value through the hash function, and then check the array cell that corresponds to the hash output. You may be ahead of me on the math — yes, that works out to about four different values per array cell. These hash collisions are entirely normal for a hash table. The lookup function simply checks all the values held in the appropriate cell. It’s still far faster than searching the whole table.

AMD processors use a hash table function to check whether memory requests are present in L1 cache. The Takeaway researchers figured out that hash function, and can use hash collisions to leak information. When the hash values collide, the L1 cache has two separate chunks of memory that need to occupy the same cache line. It handles this by simply discarding the older data when loading the colliding memory. An attacker can abuse this by measuring the latency of memory lookups.checking

If an attacker knows the memory location of the target data, he can allocate memory in a different location that will be stored in the same cache line. Then by repeatedly loading his allocated memory, he knows whether the target location has been accessed since his last check. What real world attack does that enable? One of the interesting ones is mapping out the memory layout of ASLR/KASLR memory. It was also suggested that Takeaway could be combined with the Spectre attack.

There are two interesting wrinkles to this story. First, some have pointed out the presence of a thank-you to Intel in the paper’s acknowledgements. “Additional funding was provided by generous gifts from Intel.” This makes it sound like Intel has been funding security research into AMD processors, though it’s not clear what exactly this refers to.

Lastly, AMD’s response has been underwhelming. At the time of writing, their official statement is that “AMD believes these are not new speculation-based attacks.” Now that the paper has been publicly released, that statement will quickly be proven to be either accurate or misinformed.

Closed Source Privacy?

The Google play store and iOS app store is full of apps that offer privacy, whether it be a VPN, adblocker, or some other amazing sounding application. The vast majority of those apps, however, are closed source, meaning that you have little more than trust in the app publisher to ensure that your privacy is really being helped. In the case of Sensor Tower, it seems that faith is woefully misplaced.

A typical shell game is played, with paper companies appearing to provide apps like Luna VPN and Adblock Focus. While technically providing the services they claim to provide, the real aim of both apps is to send data back to Sensor Tower. When it’s possible, open source is the way to go, but even an open source app can’t protect you against a malicious VPN provider.

Huawei Back Doors

We haven’t talked much about it, but there has been a feud of sorts bubbling between the US government and Huawei. An article was published a few weeks back in the Wall Street Journal accusing Huawei of intentionally embedding backdoors in their network equipment. Huawei posted a response on Twitter, claiming that the backdoors in their equipment are actually for lawful access only. This official denial reminds me a bit of a certain Swiss company…

[Robert Graham] thought the whole story was fishy, and decided to write about it. He makes two important points. First, the Wall Street Journal article cites anonymous US officials. In his opinion, this is a huge red flag, and means that the information is either entirely false, or an intentional spin, and is being fed to journalists in order to shape the news. His second point is that Huawei’s redefinition of government-mandated backdoors as “front doors” takes the line of the FBI, and the Chinese Communist Party, that governments should be able to listen in on your communications at their discretion.

Graham shares a story from a few years back, when his company was working on Huawei brand mobile telephony equipment in a given country. While they were working, there was an unspecified international incident, and Graham watched the logs as a Huawei service tech remoted into the cell tower nearest the site of the incident. After the information was gathered, the logs were scrubbed, and the tech logged out as if nothing had happened.

Did this tech also work for the Chinese government? The NSA? The world will never know, but the fact is that a government-mandated “front door” is still a back door from the users’ perspective: they are potentially being snooped on without their knowledge or consent. The capability for abuse is built-in, whether it’s mandated by law or done in secret. “Front doors” are back doors. Huawei’s gear may not be dirtier than anyone else’s in this respect, but that’s different from saying it’s clean.

Abusing Regex to Fool Google

[xdavidhu] was poking at Google’s Gmail API, and found a widget that caught him by surprise. A button embedded on the page automatically generated an API key. Diving into the Javascript running on that page, as well as an iframe that gets loaded, he arrived at an ugly regex string that was key to keeping the entire process secure. He gives us a tip, www.debuggex.com, a regex visualizer, which he uses to find a bug in Google’s JS code. The essence of the bug is that part of the URL location is interpreted as being the domain name. “www.example.com\.corp.google.com” is considered to be a valid URL, pointing at example.com, but Google’s JS code sees the whole string as a domain, and thinks it must be a Google domain.

For his work, [xdavidhu] was awarded $6,000 because this bit of ugly regex is actually used in quite a few places throughout Google’s infrastructure.

SMBv3 Wormable Flaw

Microsoft’s SMBv3 implementation in Windows 10 and Server 2019 has a vulnerability in how it handles on-the-fly compression, CVE-2020-0796. A malicious packet using compression is enough to trigger a buffer overflow and remote code execution. It’s important to note that this vulnerability doesn’t required an authenticated user. Any unpatched, Internet-accessible server can be compromised. The flaw exists in both server and client code, so an unpatched Windows 10 client can be compromised by connecting to a malicious server.

There seems to have been a planned coordinated announcement of this bug, corresponding with Microsoft’s normal Patch Tuesday, as both Fortinet and Cisco briefly had pages discussing it on their sites. Apparently the patch was planned for that day, and was pulled from the release at the last moment. Two days later, on Thursday the 12th, a fix was pushed via Windows update. If you have Windows 10 machines or a Server 2019 install you’re responsible for, go make sure it has this update, as proof-of-concept code is already being developed.

AMD Introduces New Ryzen Mini PCs To Challenge Intel

For the majority of hacker and maker projects, the miniature computer of choice these last few years has been the Raspberry Pi. While the availability issues that seem to plague each new iteration of these extremely popular Single Board Computers (SBCs) can be annoying, they’ve otherwise proven to be an easy and economical way to perform relatively lightweight computational tasks. Depending on who you ask, the Pi 4 is even powerful enough for day-to-day desktop computing. Not bad for a device that consistently comes in under a $50 USD price point.

Intel NUC compared to the Raspberry Pi

But we all know there are things that the Pi isn’t particularly well suited to. If your project needs a lot of computing power, or you’ve got some software that needs to run on an x86 processor, then you’re going to want to look elsewhere. One of the best options for such Raspberry Pi graduates has been the Intel Next Unit of Computing (NUC).

NUCs have the advantage of being “real” computers, with upgradable components and desktop-class processors. Naturally this means they’re a bit larger than the Raspberry Pi, but not so much as to be impractical. If you’re working on a large rover for example, the size and weight difference between the two will be negligible. The same could be said for small form-factor cluster projects; ten NUCs won’t take a whole lot more space than the same number of Pis.

Unfortunately, where the Intel NUCs have absolutely nothing on the Raspberry Pi is price: these miniature computers start around $250, and depending on options, can sail past the $1,000 mark. Part of this sharp increase in price is naturally the vastly improved hardware, but we also can’t ignore that the lack of any strong competition in this segment hasn’t given Intel much incentive to cut costs, either. When you’re the only game in town, you can charge what you want.

But that’s about to change. In a recent press release, AMD announced an “open ecosystem” that would enable manufacturers to build small form-factor computers using an embedded version of the company’s Ryzen processor. According to Rajneesh Gaur, General Manager of AMD’s Embedded Solutions division, the company felt the time was right to make a bigger push outside of their traditional server and desktop markets:

The demand for high performance computing isn’t limited to servers or desktop PCs. Embedded customers want access to small form factor PCs that can support open software standards, demanding workloads at the edge, and even display 4K content, all with embedded processors that have a planned availability of 10 years.

Continue reading “AMD Introduces New Ryzen Mini PCs To Challenge Intel”

Your Arduino SAMD21 ADC Is Lying To You

One of the great things about the Arduino environment is that it covers a wide variety of hardware with a common interface. Importantly, this isn’t just about language, but also about abstracting away the gory details of the underlying silicon. The problem is, of course, that someone has to decode often cryptic datasheets to write that interface layer in the first place. In a recent blog post on omzlo.com, [Alain] explains how they found a bug in the Arduino SAMD21 analogRead() code which causes the output to be offset by between 25 mV and 57 mV. For a 12-bit ADC operating with a reference of 3.3 V, this represents a whopping error of up to 70 least-significant-bits!

Excerpt from the SAMD wiring_analog.c file in the Arduino Core repo.

While developing a shield that interfaces to 24 V systems, the development team noticed that the ADC readings on a SAMD21-based board were off by a consistent 35 mV; expanding their tests to a number of different analog pins and SAMD21 boards, they saw offsets between 25 mV and 57 mV. It seems like this offset was a known issue; Arduino actually provides code to calibrate the ADC on SAMD boards, which will “fix” the problem with software gain and offset factors, although this can reduce the range of the ADC slightly. Still, having to correct for this level of error on a microcontroller ADC in 2019 — or even 2015 when the code was written — seems really wrong.

After writing their own ADC read routine that produced errors of only between 1 mV and 5 mV (1 to 6 LSB), the team turned their attention to the Arduino code. That code disables the ADC between measurements, and when it is re-enabled for each measurement, the first result needs to be discarded. It turns out that the Arduino code doesn’t wait for the first, garbage, result to finish before starting the next one. That is enough to cause the observed offset issue.

It seems odd to us that such a bug would go unnoticed for so long, but we’ve all seen stranger things happen. There are instructions on the blog page on how to quickly test this bug. We didn’t have a SAMD21-based Arduino available for testing before press time, but if you’ve got one handy and can replicate these experiments to verify the results, definitely let us know in the comments section below.

If you don’t have an Arduino board with a SAMD21 uC, you can find out more about them here.

Home Server Has AMD CPU And IKEA Case

Readers who took part in the glory days of custom PC building will no doubt remember the stress of having to pick a case for their carefully-curated build. You may have wanted to lower the total cost a bit by getting a cheap case, but then you’d be stuck looking at some econo-box day in and day out. Plus, how do you post pictures online to boast about your latest build if there are no transparent windows and a lighting kit?

While some may have spent more time choosing their lighted case fans than their optical drive, [Miroslav Prašil] was surely not one of them. When he decided to build a new NAS for his home network, [Miroslav] decided he wanted to put all his money into the device’s internals, and house his build in a wooden storage crate from IKEA. While the low cost was certainly a major factor in the decision, it turns out the crate actually offers a decent amount of room for hardware components. As an added bonus, it doesn’t look completely terrible sitting out in the living room.

In a detailed series of posts on his blog, [Miroslav] walks us through the entire process of building what he has come to call the “NAScrate”. Wanting gigabit Ethernet and a real SATA controller, [Miroslav] went for the ASRock C70M1, a Mini-ITX board with integrated dual-core AMD processor. While not exactly a powerhouse, it will certainly wipe the floor with the fruit-inspired single board computers that so often dominate these types of builds.

To get his clearances worked out, [Miroslav] rendered the entire build in OnShape, which gave him enough confidence in his design to move on to actual construction. The build involves several 3D printed parts, most notably some clever hard drive mounting brackets which allow the drives to be stacked into a space-saving arrangement while still leaving room for airflow between them.

[Miroslav] deftly avoids any religious debates by leaving off his particular choice for software and operating system on his newly constructed NAS, but he does mention that something like FreeNAS would be a logical choice.

While this may be the first wooden one we’ve covered so far, home servers in general are a favorite project for hackers, from budget-friendly scratch builds all the way up to re-purposed enterprise hardware.

Under The Hood Of AMD’s Threadripper

Although AMD has been losing market share to Intel over the past decade, they’ve recently started to pick up steam again in the great battle for desktop processor superiority. A large part of this surge comes in the high-end, multi-core processor arena, where it seems like AMD’s threadripper is clearly superior to Intel’s competition. Thanks to overclocking expert [der8auer] we can finally see what’s going on inside of this huge chunk of silicon.

The elephant in the room is the number of dies on this chip. It has a massive footprint to accommodate all four dies, each with eight cores. However, it seems as though two of the cores are deactivated due to a combination of manufacturing processes and thermal issues. This isn’t necessarily a bad thing, either, or a reason not to use this processor if you need to utilize a huge number of cores, though; it seems as though AMD found it could use existing manufacturing techniques to save on the cost of production, while still making a competitive product.

Additionally, a larger die size than required opens the door for potentially activating the two currently disabled chips in the future. This could be the thing that brings AMD back into competition with Intel, although both companies still maintain the horrible practice of crippling their chips’ security from the start.