This Week In Security: Bad Signs From Microsoft, An Epyc VM Escape

Code signing is the silver bullet that will save us from malware, right? Not so much, particularly when vendors can be convinced to sign malicious code. Researchers at G DATA got a hit on a Windows kernel driver, indicating it might be malicious. That seemed strange, since the driver was properly signed by Microsoft. Upon further investigation, it became clear that this really was malware. The file was reported to Microsoft, the signature revoked, and the malware added to the Windows Defender definitions.

The official response from Microsoft is odd. They start off by assuring everyone that their driver signing process wasn’t actually compromised, like you would. The next part is weird. Talking about the people behind the malware: “The actor’s goal is to use the driver to spoof their geo-location to cheat the system and play from anywhere. The malware enables them to gain an advantage in games and possibly exploit other players by compromising their accounts through common tools like keyloggers.” This doesn’t seem to really match the observed behavior of the malware — it seemed to be decoding SSL connections and sending the data to the C&C server. We’ll update you if we hear anything more on this one.

Escaping the KVM

Let’s talk virtualization, specifically a flaw in the KVM code for AMD hardware. There are a few distinctions to cover that make this more understandable. First, virtualization in Linux is split into two distinct parts. The Kernel-based Virtual Machine (KVM) is the driver that runs in the kernel, and handles the heavy lifting, like memory management, scheduling, and sending control instructions to the CPU. The other half is the userspace part, the most widely used project here being QEMU. This vulnerability is notable because it’s in the KVM code itself, meaning that it runs in kernel space.

Our bug revolves around how the VMRUN instruction is handled in a nested virtualization environment. This instruction takes a block of data and initializes a new running VM. When it’s called from within an already running VM, that data is sanity-checked, and then copied before being passed on to the underlying KVM. This is the potential problem, because the check-then-copy sequence isn’t atomic. In other words, it’s possible to modify the nested VM initialization data after the checks are performed, but before the data is actually sent down the virtualization stack. It’s a Time Of Check, Time Of Use (TOCTOU) vulnerability.

There’s one more important concept. The KVM module on the bare metal handles the bring-up of all VMs, even nested ones. All VMRUN calls have to go through the hypervisor kernel, to get hardware virtualization acceleration. One bit of the VMRUN data indicates whether KVM is supposed to intercept this instruction. Setting that bit to 0 isn’t supported, and just cancels the process. The problem arises when a nested VM executes this instruction, but a process in the outer VM manages to change the bit to 0 after the checks. This results in code being run in an unintended way, overwriting the outer VM’s configuration with the inner VM data.
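To make the race concrete, here is a toy Python model of the check-then-use pattern. It is only a sketch under loud assumptions: this is not KVM code, the field name `intercept_vmrun`, the `GuestMemory` class and both `*_vmrun` functions are invented for illustration, and a callback stands in for a second guest vCPU rewriting memory between two reads. The second function shows the usual remedy for this class of bug, which is to copy the data once into private state and then check and use only that copy.

```python
# Toy model only: a dict stands in for the guest-controlled VMRUN data block,
# and a callback stands in for a second guest vCPU racing the hypervisor.

INTERCEPT_VMRUN = "intercept_vmrun"  # hypothetical field name


class GuestMemory:
    """Guest-owned memory that another vCPU may rewrite between reads."""

    def __init__(self, fields, racer=None):
        self._fields = dict(fields)
        self._racer = racer  # called after every read to simulate the race

    def read(self, name):
        value = self._fields[name]
        if self._racer is not None:
            self._racer(self._fields)
        return value


def vulnerable_vmrun(mem):
    # Time of check: validate the field directly in guest memory.
    if mem.read(INTERCEPT_VMRUN) != 1:
        return "rejected"
    # Time of use: read the same field again; it may have changed by now.
    return f"running with intercept={mem.read(INTERCEPT_VMRUN)}"


def fixed_vmrun(mem):
    # Read once into hypervisor-private state, then check and use the copy.
    snapshot = mem.read(INTERCEPT_VMRUN)
    if snapshot != 1:
        return "rejected"
    return f"running with intercept={snapshot}"


def flip_bit(fields):
    # The "attacker": clears the bit as soon as the hypervisor has read it.
    fields[INTERCEPT_VMRUN] = 0
```

With `flip_bit` as the racer, `vulnerable_vmrun` passes its check and then goes on to use a value of 0 that it would have rejected, while `fixed_vmrun` only ever acts on the value it validated. The real exploit needs precise timing across vCPUs; the callback just makes the window deterministic.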

To actually exploit this TOCTOU bug, the outer VM permissions get overwritten, giving the VM greater access to the underlying hardware. One of those permissions allows the VM to overwrite the saved context address for a VMEXIT call. So with a few other tricks, the VM can use the TOCTOU flaw to give itself this permission, construct a malicious context, and trick the bare metal KVM process into switching into that malicious context, giving the attacker control over the system. I’ve glossed over a bunch of details here, so if you want the whole story, go check out the full write-up, expertly put together by [Felix Wilhelm] of Project Zero.

LinkedIn Data

A database of 700 million LinkedIn users has shown up for sale on a forum, with one million samples released as evidence of good data. Certain sites are calling this a breach, which isn’t entirely correct, as the data seems to be scraped from the LinkedIn API and it doesn’t include password hashes or private messages. This seems to be essentially the same data set as was reported back in April, possibly updated with fresh entries to make up the difference in numbers.

The My Book Story Continues

Last week we told you about the My Books that were being wiped remotely, and I speculated that it could be a ransomware campaign gone wrong. It seems like it wasn’t ransomware at all, but someone covering their tracks after a remote exploit. There are actually two vulnerabilities at play here. The previously known CVE-2018-18472 seems to have been used to install a malicious binary on internet-accessible devices. It’s not yet known what exactly that binary did, but probably something resembling botnet activity. Regardless, a second 0-day vulnerability, CVE-2021-35941, was used to trigger a remote factory reset. An early theory was that the binary was deployed by one attacker, and someone else triggered the reset, but WD’s analysis found that in some cases, both attacks were launched from the same IP. Hopefully more of the story will come to light as the binary is investigated.

Zyxel 0-day

Zyxel has published a response to a recent spate of device compromises. Their response is very short on details so far, most notably lacking a CVE, the details of a vulnerability being exploited, or firmware that actually fixes the vulnerability.

“The threat actor attempts to access a device through WAN; if successful, they then bypass authentication and establish SSL VPN tunnels with unknown user accounts.”

The post is very heavy on how to prevent attackers from accessing an exposed web interface from the Internet, but it seems to me that the big question is how an attacker could trivially “bypass authentication”. It’s possible that attackers are simply running through a password list, and there isn’t sufficient rate-limiting in the Zyxel firmware. I suspect, though, that this is a 0-day vulnerability being exploited in the wild.
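For the password-list scenario, the missing defense would look something like the following Python sketch. To be clear, this is not Zyxel’s firmware and says nothing about how their devices actually behave; the `LoginThrottle` class, its parameters and the default thresholds are all made up for illustration. It simply refuses further attempts from a source once it has failed too many times inside a sliding time window.

```python
class LoginThrottle:
    """Block a source after too many failed logins inside a time window."""

    def __init__(self, max_failures=5, window_seconds=300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = {}  # source -> list of failure timestamps (seconds)

    def _recent(self, source, now):
        # Drop failures that have aged out of the window, keep the rest.
        cutoff = now - self.window_seconds
        recent = [t for t in self._failures.get(source, []) if t > cutoff]
        self._failures[source] = recent
        return recent

    def allowed(self, source, now):
        # A source may try again only while it is under the failure limit.
        return len(self._recent(source, now)) < self.max_failures

    def record_failure(self, source, now):
        self._recent(source, now).append(now)
```

The caller passes the current time in explicitly (for example from `time.monotonic()`), which keeps the sketch easy to test. A check like this makes a password list slow to run through, but it does nothing against a genuine authentication bypass.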

As far as I can tell, it has been over a week since this notice was first published, and Zyxel still hasn’t revealed whether they have a 0-day at play. That’s irresponsible. Then again, Zyxel doesn’t exactly have the best record for product security.

RPM’s Problem

Ah, the Red Hat Package Manager. In some ways, it defined what a Linux distribution should look like, with decent software management and hard-to-break updates. Seriously, if I could change only one thing about the non-free operating systems out there, it would be to move the whole OS to something like RPM or dpkg. Instantly more usable, but I digress.

One of the benefits of the CentOS forks is that more people are looking at some of the under-the-hood code behind RPM-based systems. As a result, problems are found, like the fact that RPM doesn’t check for certificate revocation or expiration. That sounds like a terrible vulnerability, but keep in mind that it was simply never part of the plan to use certificate revocation. That feature was never implemented, because it hasn’t ever been needed or used. On the other hand, the lack of verification means that if a distro loses control of one of their signing keys, they will have a harder time containing the problem. Either way, patches are being worked on to add the checks to RPM’s OpenPGP implementation.
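The missing checks are conceptually simple. Here is a minimal Python sketch of the decision, assuming the caller has already pulled the relevant facts out of the key; it is not RPM’s code or the pending patches, and the function name and parameters are invented for illustration. The one detail taken from the OpenPGP format is that a key’s expiration is stored as a duration after its creation time rather than as an absolute date.

```python
from datetime import datetime, timedelta, timezone


def key_is_usable(created, valid_for, revoked, now):
    """Return True only if the key is neither revoked nor expired at `now`.

    created   -- datetime the key was created
    valid_for -- timedelta after creation at which the key expires,
                 or None for a key that never expires
    revoked   -- True if a revocation for this key is known
    now       -- datetime at which the signature is being verified
    """
    if revoked:
        return False
    if valid_for is not None and now >= created + valid_for:
        return False
    return True
```

A verifier would run a check like this in addition to validating the signature itself, so that a package signed with a lost key stops installing once the distro publishes a revocation.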

Disable Print Spooler to Avoid PrintNightmares

There was a Windows vulnerability patched in June of this year, CVE-2021-1675, that allowed RCE using the print spooler. It appears that Microsoft’s patch was a poor one, preventing one particular exploit, rather than fixing the real problem. After the patch was pushed as part of Patch Tuesday, multiple PoCs were disclosed, and surprisingly some of them still work! The still-working exploit is being tracked as CVE-2021-34527. A quick glance at the PoCs seems to indicate that it’s a way to push an unsigned printer driver into a machine that offers remote printing.

This vulnerability is easy to exploit, and working exploits are available, so expect attackers to add this to their bag of tricks very soon. It’s serious enough that Microsoft and CISA are suggesting that we all turn off the print spooler altogether on domain controllers, as well as on any system that doesn’t need to print.

14 thoughts on “This Week In Security: Bad Signs From Microsoft, An Epyc VM Escape”

  1. > it seemed to be decoding SSL connections and sending the data to the C&C server
    Plug “SSLKEYLOGFILE” into your search engine of choice. The logged keys could then be used for debugging TLS traffic.

    SSL is only as secure as the two endpoints and the few hundred valid certificate authorities (CAs).

  2. Ah the wonderful world of VM security, all those extra layers that may be compromised…

    That said, on the whole it’s still far, far better to use a VM for anything that might have security concerns, as it’s much easier to have it restored back to the known good state after every “play” than bare metal. When it’s working right it really isolates environments, and even if it does have a hole it’s still somewhat isolated and protected, which will make the scum work harder to get to your important stuff…

    1. Disagree. I can restore bare metal to a known good state by DDing the hard drive. With a VM, I can do a VM reset trivially, but if something escaped the sandbox I’ve got a compromised machine. So I’ve not really got a “known good” state any more.

      1. Depending on how paranoid we want to be, compromised bare metal has the possibility of malicious firmware installs, on top of the regular malware concerns. Though if your VM has been escaped, same issue.

      2. As Jonathan says it depends on how paranoid you want to be, and also on time – a VM can be reset to known good (making the assumptions that the host isn’t compromised) in basically no time. DDing a HDD is definitely effective (assuming the HDD’s onboard controller and all the hidden bits in the Mobo etc are not compromised), but far from as quick or convenient.

  3. > Then again, Zyxel doesn’t exactly have the best record for product security.

    None of the most common router and switch manufacturers do, really. If both the device and the firmware running on it are made by a single company, in all likelihood there will be holes and vulnerabilities. There’s no incentive to keep the firmware up-to-date and bug-free in the long run, since you’re expected to keep buying new hardware quite frequently. In fact, there’s an incentive not to do so: “Oh, oops! That bug just slipped in there, but it’s too late to fix the firmware anymore. Please, buy our newer device with firmware that doesn’t have that bug!”

  4. >The term “zero-day” originally referred to the number of days since a new piece of software was released to the public, so “zero-day software” was obtained by hacking into a developer’s computer before release.

    Seems like we’ve stretched that zero-day to include end-of-life products these days. Maybe we should just say vulnerability instead? Buzzwords lose their meaning when used as clickbait.

    >Earlier this week, Zyxel published an advisory on the vulnerability, revealing that it impacted over a dozen NAS devices, including ten that were no longer supported.

    1. Zero day is a reference to the types of attacks available before the “developer has an opportunity to create a patch to fix the vulnerability”, hence “zero-day.”

    2. That’s an old usage, and I think related more to cracks than vulnerabilities. On that usage, we had a bit of software cracked at something like -30 days once. That was fantastic, as it proved that the issue wasn’t as much the software as the supply chain leaking.
