This Week In Security: The AI Hacker, FortMajeure, And Project Zero

August 15, 2025 by Jonathan Bennett 5 Comments

One of the hot topics currently is using LLMs for security research. Poor quality reports written by LLMs have become the bane of vulnerability disclosure programs. But there is an equally interesting effort going on to put LLMs to work doing actually useful research. One such story is [Romy Haik] at ULTRARED, trying to build an AI Hacker. This isn’t an over-eager newbie naively asking an AI to find vulnerabilities, [Romy] knows what he’s doing. We know this because he tells us plainly that the LLM-driven hacker failed spectacularly.

The plan was to build a multi-LLM orchestra, with a single AI sitting at the top that maintains state through the entire process. Multiple LLMs sit below that one, deciding what to do next, exactly how to approach the problem, and actually generating commands for those tools. Then yet another AI takes the output and figures out if the attack was successful. The tooling was assembled, and [Romy] set it loose on a few intentionally vulnerable VMs.

As we hinted at up above, the results were fascinating but dismal. This LLM successfully found one Remote Code Execution (RCE), one SQL injection, and three Cross-Site Scripting (XSS) flaws. This whole post is sort of sneakily an advertisement for ULTRARED’s actual automated scanner, that uses more conventional methods for scanning for vulnerabilities. But it’s a useful comparison, and it found nearly 100 vulnerabilities among the collection of targets.

The AI did what you’d expect, finding plenty of false positives. Ask an AI to describe a vulnerability, and it will glad do so — no real vulnerability required. But the real problem was the multitude of times that the AI stack did demonstrate a problem, and failed to realize it. [Romy] has thoughts on why this attempt failed, and two points stand out. The first is that while the LLM can be creative in making attacks, it’s really terrible at accurately analyzing the results. The second observation is one of the most important observations to keep in mind regarding today’s AIs. It doesn’t actually want to find a vulnerability. One of the marks of security researchers is the near obsession they have with finding a great score. Continue reading “This Week In Security: The AI Hacker, FortMajeure, And Project Zero” →

This Week In Security: NAME:WRECK, Signal Hacks Back, Updates, And More

April 23, 2021 by Jonathan Bennett 9 Comments

NAME:WRECK is a collection of vulnerabilities in DNS implementations, discovered by Forescout and JSOF Research. This body of research can be seen as a continuation of Ripple20 and AMNESIA:33, as it builds on a class of vulnerability discovered in other network stacks, problems with DNS message compression.

Their PDF Whitepaper contains a brief primer on the DNS message format, which is useful for understanding the class of problem. In such a message, a DNS name is encoded with a length-value scheme, with each full name ending in a null byte. So in a DNS Request, Hackaday.com would get represented as [0x08]Hackaday[0x03]com[0x00]. The dots get replaced by these length values, and it makes for an easily parsable format.

Very early on, it was decided that continually repeating the same host names in a DNS message was wasteful of space, so a compression scheme was devised. DNS compression takes advantage of the maximum host/domain length of 63 characters. This max size means that the binary representation of that length value will never contain “1”s in the first two digits. Since it can never be used, length values starting with a binary “11” are used to point to a previously occurring domain name. The 14 bits that follow this two bit flag are known as a compression pointer, and represent a byte offset from the beginning of the message. The DNS message parser pulls the intended value from that location, and then continues parsing.

The problems found were generally based around improper validation. For example, the NetX stack doesn’t check whether the compression pointer points at itself. This scenario leads to a tight infinite loop, a classic DoS attack. Other systems don’t properly validate the location being referenced, leading to data copy past the allocated buffer, leading to remote code execution (RCE). FreeBSD has this issue, but because it’s tied to DHCP packets, the vulnerability can only be exploited by a device on the local network. While looking for message compression issues, they also found a handful of vulnerabilities in DNS response parsing that aren’t directly related to compression. The most notable here being an RCE in Seimens’ Nucleus Net stack. Continue reading “This Week In Security: NAME:WRECK, Signal Hacks Back, Updates, And More” →

Google Creates Debuggable IPhone

November 3, 2019 by Al Williams 8 Comments

Apple is known for a lot of things, but opening up their platforms to the world isn’t one of those things. According to a recent Google post by [Brandon Azad], there do exist special iPhones that are made for development with JTAG ports and other magic capabilities. The port is in all iPhones (though unpopulated), but is locked down by default. We don’t know what it takes to get a magic iPhone, but we are guessing Google can’t send in the box tops to three Macbook Pros to get on the waiting list. But what is locked can be unlocked, and [Brandon] set out to build a debuggable iPhone.

Exploiting some debug registers, it is possible to debug the A11 CPU at any point in its execution. [Brandon’s] tool single steps the system reset and makes some modifications to the CPU after key instructions to prevent the lockdown of kernel memory. After that, the world’s your oyster. KTRW is a tool built using this technique that can debug an iPhone with a standard cable.

Continue reading “Google Creates Debuggable IPhone” →

This Week In Security: Use Emacs, Crash A Windows Server, And A Cryptocurrency Heist

June 14, 2019 by Jonathan Bennett 5 Comments

It looks like Al was right, we should all be using Emacs. On the 4th of June, [Armin Razmjou] announced a flaw in Vim that allowed a malicious text file to trigger arbitrary code execution. It’s not every day we come across a malicious text file, and the proof of concept makes use of a clever technique — escape sequences hide the actual payload. Printing the file with cat returns “Nothing here.” Cat has a “-v” flag, and that flag spills the secrets of our malicious text file. For simplicity, we’ll look at the PoC that doesn’t include the control characters. The vulnerability is Vim’s modeline function. This is the ability to include editor options in a text file. If a text file only works with 80 character columns, a modeline might set “textwidth=80”. Modeline already makes use of a sandbox to prevent the most obvious exploits, but [Armin] realized that the “:source!” command could run the contents of a file outside that sandbox. “:source! %” runs the contents of the current file — the malicious text file.

:!uname -a||" vi:fen:fdm=expr:fde=assert_fails("source\!\ \%"):fdl=0:fdt="

Taking this apart one element at a time, the “:!” is the normal mode command to run something in the shell, so the rest of the line is what gets run. “uname -a” is the arbitrary command, benign in this case. Up next is the OR operator, “||” which fully evaluates the first term first, and only evaluates what comes after the operator if the first term returns false. In this case, it’s a simple way to get the payload to run even though the rest of the line is garbage, as far as bash is concerned. “vi:” informs Vim that we have a modeline string. “:fen” enables folding, and “:fdm=expr” sets the folding method to use an expression. This feature is usually used to automatically hide lines matching a regular expression. “:fde=” is the command to set the folding expression. Here’s the exploit, the folding expression can be a function like “execute()” or “assert_fails()”, which allows calling the :source! command. This pops execution out of the sandbox, and begins executing the text file inside vim, just as if a user were typing it in from the keyboard. Continue reading “This Week In Security: Use Emacs, Crash A Windows Server, And A Cryptocurrency Heist” →

Project Zero Finds A Graphic Zero Day

February 20, 2017 by Michael Uttmark 46 Comments

After finding the infamous Heartbleed vulnerability along with a variety of other zero days, Google decided to form a full-time team dedicated to finding similar vulnerabilities. That team, dubbed Project Zero, just released a new vulnerability, and this one’s particularly graphic, consisting of a group of flaws in the Windows Nvidia Driver.

Most of the vulnerabilities found were due to poor programming techniques. From writing to user provided pointers blindly, to incorrect bounds checking, most vulnerabilities were due to simple mistakes that were quickly fixed by Nvidia. As the author put it, Nvidia’s “drivers contained a lot of code which probably shouldn’t be in the kernel, and most of the bugs discovered were very basic mistakes.”

When even our mice aren’t safe it may seem that a secure system is unattainable. However, there is light at the end of the tunnel. While the bugs found showed that Nvidia has a lot of work to do, their response to Google was “quick and positive.” Most bugs were fixed well under the deadline, and google reports that Nvidia has been finding some bugs on their own. It also appears that Nvidia is working on re-architecturing their kernel drivers for security. This isn’t the first time we’ve heard from Google’s Project Zero, and in all honesty, it probably won’t be last.

Creative DRAM Abuse With Rowhammer

March 13, 2015 by Adam Fabio 70 Comments

Project Zero, Google’s security analyst unit, has proved that rowhammer can be used as an exploit to gain superuser privileges on some computers. Row Hammer, or rowhammer is a method of flipping bits in DRAM by hammering rows with fast read accesses. [Mark Seaborn] and the rest of the Project Zero team learned of rowhammer by reading [Yoongu Kim’s] 2014 paper “Flipping Bits in Memory Without Accessing Them:
An Experimental Study of DRAM Disturbance Errors” (PDF link). According to [Kim], the memory industry has known about the issue since at least 2012, when Intel began filing patents for mitigation techniques.

“Row hammer” by Dsimic – Own work. Licensed under CC BY-SA 4.0 via Wikimedia Commons.

The technique is deceptively simple. Dynamic RAM is organized into a matrix of rows and columns. By performing fast reads on addresses in the same row, bits in adjacent rows can be flipped. In the example image to the left, fast reads on the purple row can cause bit flips in either of the yellow rows. The Project Zero team discovered an even more aggressive technique they call “double-sided hammering”. In this case, fast reads are performed on both yellow rows. The team found that double-sided hammering can cause more than 25 bits to flip in a single row on a particularly vulnerable computer.

Why does this happen? The answer lies within the internal structure of DRAM, and a bit of semiconductor physics. A DRAM memory bit is essentially a transistor and a capacitor. Data is stored by charging up the capacitor, which immediately begins to leak. DRAM must be refreshed before all the charge leaks away. Typically this refresh happens every 64ms. Higher density RAM chips have forced these capacitors to be closer together than ever before. So close in fact, that they can interact. Repeated reads of one row will cause the capacitors in adjacent rows to leak charge faster than normal. If enough charge leaks away before a refresh, the bit stored by that capacitor will flip.

Cache is not the answer

If you’re thinking that memory subsystems shouldn’t work this way due to cache, you’re right. Under normal circumstances, repeated data reads would be stored in the processor’s data cache and never touch RAM. Cache can be flushed though, which is exactly what the Project Zero team is doing. The X86 CLFLUSH opcode ensures that each read will go out to physical RAM.

Wanton bit flipping is all fine and good, but the Project Zero team’s goal was to use the technique as an exploit. To pull that off, they had to figure out which bits they were flipping, and flip them in such a way as to give elevated access to a user level process. The Project Zero team eventually came up with two working exploits. One works to escape Google’s Native Client (NaCL) sandbox. The other exploit works as a userspace program on x86-64 Linux boxes.

Native Client sandbox escape exploit

Google defines Native Client (NaCL) as ” a sandbox for running compiled C and C++ code in the browser efficiently and securely, independent of the user’s operating system.” It was designed specifically as a way to run code in the browser, without the risk of it escaping to the host system. Let that sink in for a moment. Now consider the fact that rowhammer is able to escape the walled garden and access physical memory. The exploit works by allocating 250MB of memory, and rowhammering on random addresses, and checking for bit flips. Once bit flips are detected, the real fun starts. The exploit hides unsafe instructions inside immediate arguments of “safe” institutions. In an example from the paper:

20EA0: 48 b8 0f 05 EB 0C F4 F4 F4 F4 movabs $0xF4F4F4F40CEB050F,%rax

Viewed from memory address 0x20EA0, this is an absolute move of a 64 bit value to register rax. However, if we move off alignment and read the instruction from address 0x20EA02, now it’s a SYSCALL – (0F 05). The NaCL escape exploit does exactly this, running shell commands which were hidden inside instructions that appeared to be safe.

Linux kernel privilege escalation exploit

The Project Zero team used rowhammer to give a Linux process access to all of physical memory. The process is more complex than the NaCL exploit, but the basic idea revolves around page table entries (PTE). Since the underlying structure of Linux’s page table is well known, rowhammer can be used to modify the bits which are used to translate virtual to physical addresses. By carefully controlling which bits are flipped, the attacking process can relocate its own pages anywhere in RAM. The team used this technique to redirect /bin/ping to their own shell code. Since Ping normally runs with superuser privileges, the shell code can do anything it wants.

The TL;DR

Rowhammer is a nasty vulnerability, but the sky isn’t falling just yet. Google has already patched NaCL by removing access to the CLFLUSH opcode, so NaCL is safe from any currently known rowhammer attacks. Project Zero didn’t run an exhaustive test to find out which computer and RAM manufacturers are vulnerable to rowhammer. In fact, they were only able to flip bits on laptops. The desktop machines they tried used ECC RAM, which may have corrected the bit flips as they happened. ECC RAM will help, but doesn’t guarantee protection from rowhammer – especially when multiple bit flips occur. The best protection is a new machine – New RAM technologies include mitigation techniques. The LPDDR4 standard includes “Targeted Row Refresh” (TRR) and “Maximum Activate Count” (MAC), both methods to avoid rowhammer vulnerability. That’s a good excuse to buy a new laptop if we ever heard one!

If you want to play along at home, the Project Zero team have a rowhammer test up on GitHub.

Hackaday

project zero

6 Articles

This Week In Security: The AI Hacker, FortMajeure, And Project Zero

This Week In Security: NAME:WRECK, Signal Hacks Back, Updates, And More

Google Creates Debuggable IPhone

This Week In Security: Use Emacs, Crash A Windows Server, And A Cryptocurrency Heist

Project Zero Finds A Graphic Zero Day