On November 14, Google released a bulletin reporting a serious vulnerability in a number of Intel processors — starting from the Ice Lake generation released in 2019. Potentially this vulnerability can lead to denial of service, privilege escalation, or disclosure of sensitive information. At the time of writing, microcode updates addressing the issue have been released for the 12th and 13th generation Intel processors (Alder Lake and Raptor Lake, respectively). Patches for 10th and 11th generation processors (Ice Lake and Tiger Lake) are in progress. The full list of affected processors is available on the Intel website in the form of an extensive spreadsheet.
According to Intel representatives, the company’s engineers were aware of the processors’ abnormal behavior, but the issue was considered non-critical, and plans to resolve it were postponed to the first half of 2024. However, the situation changed when Google researchers discovered the problem independently. In fact, all of the details about the vulnerability actually come from Google specialists, specifically from this article by Tavis Ormandy.
Tavis Ormandy has discovered numerous major vulnerabilities in various programs and devices. Recently, we wrote about his previous research that found the Zenbleed vulnerability in AMD processors. On that occasion, Tavis talked about adopting fuzzing to find hardware vulnerabilities.
Fuzzing is a testing method that involves feeding random information into the input of the information system being tested. Usually, it’s used to automate the search for software vulnerabilities: a special fuzzing tool is created to interact with the program and monitor its state. Subsequently, tens or hundreds of thousands of tests are conducted to identify unusual behavior in the tested code.
When it comes to testing processors, things are a bit more complicated. We have to generate random programs that operate with no failures of their own and run them on the processor. How can we differentiate normal processor behavior from abnormal behavior in such a case? After all, not every error during software execution leads to a crash. Ormandy proposed a technique in which the same “random” code is simultaneously executed on different processors. Theoretically, the output of an identical program should also be identical; if it isn’t, it could indicate a problem. It was this approach that revealed the vulnerability in the Intel processors.
Useless but dangerous code
To understand how the Reptar vulnerability works, we need to go down to the lowest level of programming — the machine code that processors execute directly. Assembly language is used to represent such basic instructions in a more convenient way. A snippet of assembly language code looks something like this:
The last line features the movsb instruction, which tells the processor to move data from one memory area to another. It’s preceded by the rep modifier, which indicates that the movsb command should be executed several times in a row. Such prefixes are not relevant for all instructions. Intel processors know how to skip meaningless prefixes. Tavis Ormandy gives an example:
Let’s add another prefix, the so-called rex.rxb. It was introduced alongside the x86-64 architecture to handle eight additional processor registers. Although what exactly it does is not that important — all we need to know is that this prefix doesn’t make sense when used with the movsb command:
In fact, this prefix changes the behavior of Intel processors (starting from Ice Lake), although it shouldn’t. In this generation of processors, a technology called “Fast Short Repeat Move” was added. It’s designed to accelerate operations involving data movement in RAM. Among other things, this technology can optimize the execution of the rep movsb instruction. Along with the “Fast Short Repeat Move” feature, a flaw crept into the processor’s logic, first discovered by Intel engineers and later by Google experts.
What could executing this instruction, which disrupts the normal behavior of the processor, lead to? According to Ormandy, the results are unpredictable. The researchers observed execution of random code, parts of the program being ignored, and various failures in the processor, all the way up to complete failure. For the latter, one needs to somehow exploit the vulnerability on a pair of processor cores simultaneously. To check their own systems for this vulnerability, a team of Google researchers prepared a test program.
Unpredictable behavior is bad enough. The most important difference between this “processor bug” and all the others is that it directly threatens providers of virtual private server hosting services, or cloud solution providers in general. This industry is built on the ability to share a single powerful server among dozens or hundreds of clients — each managing their own virtual operating system. It’s crucial that under no circumstances should one client see another client’s data or the data of the host — the operating system managing the virtual containers.
Now imagine that a client can execute a program in their virtual OS that causes the host to crash. At the very least, this could enable a DoS attack on the provider. In fact, Ormandy didn’t present any other exploitation scenarios, citing the fact that it’s very difficult to predict the behavior of a processor operating in black-box mode; although it’s theoretically possible for an attacker to execute specific malicious code instead of relying on random failures. Intel representatives themselves acknowledge that “code execution” and “information disclosure” are possible. Therefore, it’s extremely important to install microcode updates prepared by Intel (for virtual hosting service providers at least).