SPEAR attacks - transient bypass of Go bounds checks

Authors: Alexandra, Andrea, Alessandro, Anil

Topics: security, transient-execution

Bypassing bounds checks in Go through speculative control flow hijack #

In a previous post we discussed how a speculative adversary might bypass stack smashing protector (SSP) and hijack the speculative control flow, possibly causing information leaks. In this post we analyse the applicability of SPEAR-style attacks against the Go programming language. We show how compiler-inserted bounds checks for array accesses may be speculatively bypassed, allowing the transient execution of out-of-bounds load and store operations. We show under which conditions this leads to a speculative control flow hijack, which may turn into data exfiltration attacks. This work extends and generalises existing works (Spectre and Speculative buffer overflows) showing new vulnerable code patterns and their applicability to memory-safe languages. In particular, we:

Extended work

Background #

SPEAR vulnerabilities: a brief refresher #

The SPEAR class includes the set of vulnerabilities wherein control-flow metadata (for both forward and backward edges) is first overwritten either speculatively or architecturally – and then it is used to steer the control flow as part of speculative execution. The attacker can cause the overwritten edge to point to the location of a suitable speculative gadget, which will be speculatively executed if the attack succeeds. If the gadget implements a side-channel send operation, the attacker may later retrieve the data using the corresponding receive, thus successfully exfiltrating data from the victim memory space.

Bound checks for array accesses in Go #

Arrays in Go are represented in memory as the following struct

1
2
3
4
5
type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

where the address of the contiguous chunk of memory backing the array is stored in array. cap stores the number of elements that array can hold (and implicitly the size of the memory chunk since Go is strongly typed and the size of the elements is always known). len stores the number of elements that have been stored so far.

Whenever an array access is performed, the compiler will add appropriate bounds checks. This is achieved in the course of the compiler pass to translate the abstract syntax tree into the static single assignment intermediate representation by adding an IsInBounds metaoperation before every array load or store. IsInBounds takes two arguments, index of the current access and length of the array, and drives a conditional jump either to the basic block that performs the array access if the index is between zero and length minus one, or a jump to a function that raises a panic otherwise.

IsInBounds is translated by later passes in a sequence of instructions similar to the following

1
2
3
4
mov    0xd0121(%rip),%rcx        # <main.array>
cmp    %rax,0xd0122(%rip)        # <main.array+0x8>
jbe    486a66 <main.main+0xa6>
mov    (%rcx,%rax,8),%rax

The snippet shows a load from an array of integers: at first rcx is loaded with the address of the memory array, a compare instruction is issued between the index in rax and the array length at main.array+8. If the index is not strictly less than the length, the code jumps to a call to the runtime.panicindex function. Otherwise the array access is performed.

Speculatively bypassing bounds checks #

The conditional jump generated by the IsInBounds metaoperation may speculatively execute the wrong jump target and perform a transient load or store operation out of bounds. We show two distinct code patterns, one leveraging a load and one a store, that may lead to speculative control flow hijack.

Threat model #

For both attacks, we assume the same threat model as Spectre v1; that is, a local attacker, who knows the code of the target program and may interact with the program through the external APIs it exposes. In particular, the attacker should be able to supply input to the program leading to a bounds check, which the attacker will attempt to bypass speculatively. In one of the two variants, a preparatory phase is included, during which the attacker supplies input to the program and the program in turn stores it in its memory space. We assume that the attacker knows (or can predict) the memory layout of the program, for instance the location of the variables used to store its input: this may be achieved given that the go memory allocator is deterministic. An important simplifying factor is that ASLR is disabled for most Go programs.

For both attacks, the attacker is required to supply an index to an array access such that the index lies outside of the bounds of the array. The Go runtime will handle this occurrence by calling an appropriate panic function which – in the absence of recovery – will terminate the program; recovery offers a way to continue the execution of the program. If speculative execution does not take the branch that leads to the panic and the taken branch contains an indirect call whose target may be influenced by the array operation, speculative control flow hijack is possible. The attacker may thus transiently execute a side channel send operation using as payload any secret stored in the memory space of the target process (passwords, cryptographic keys, confidential application data), and later recover the payload through an applicable side channel receive operation. If the application recovers from the panic, the attacker may get multiple chances to perform exfiltration attacks over the same secret or different secrets. This holds also if the application is restarted after termination and the secret is not ephemeral to a single run of the application.

Load-based speculative control flow hijack #

The first vulnerable pattern is shown in the following listing

1
array[index].function()

It consists of an interface function call, where the interface is stored into an array of interfaces array, dereferenced at position index. Note that the array must be an array of interfaces so that calling the function is achieved by an indirect call. For the attack to be successful, we require the following conditions to be met: i) index must be attacker-controlled; ii) the attacker must be able to store the value of two pointers in the memory space of the target process at a known location. The first condition is met whenever a process accesses an array using an index that is received as an external input. The second condition is very commonly met since programs store user-provided input for processing. Knowledge of the location of the stored pointers depends on the memory area being used, and is aided by the deterministic nature of Go’s allocator and the absence of address space layout randomisation in Go’s processes.

Without loss of generality, we describe the exploitation of the above dereference/call pattern when function is the first function defined by the interface. Exploitation proceeds as follows: at first the attacker prepares the memory structures that are used when an interface call is performed. The structures are shown below

1
2
3
4
5
6
7
8
9
10
11
12
type iface struct {
    tab  *itab
    data unsafe.Pointer
}

type itab struct {
    inter *interfacetype
    _type *_type
    hash  uint32
    _     [4]byte
    fun   [1]uintptr
}

and are used by dereferencing the tab pointer from the iface struct and then calling into the fun array.

In preparation for the exploitation, the attacker ensures that the memory layout of the target program contains a pattern similar to the one below

1
2
3
4
5
6
7
8
9
0x561000:            0x562000
0x561008:  0x0000000000000000
             ...
0x562000:  0x0000000000000000
0x562008:  0x0000000000000000
0x562010:  0x0000000000000000
0x562018:            0x563000
             ...
0x563000:   <CFH target here>

In particular, assuming that the attacker wants to speculatively redirect the control flow to address 0x563000, the attacker creates a fake itab structure (in the example at 0x562000) such that the first entry in the fun pointer array points to the desired CFH target. Then the attacker further creates a fake iface structure (in the example at 0x561000 such that the tab pointer points to the aforementioned itab structure. With the memory thus prepared, the attacker supplies the index into the array such that the resulting address (which means base address plus index multiplied by the size of a iface structure) equals the fake iface structure (0x561000 in our example). With the index thus set the program will call the runtime.panicindex function; however if the bounds check is mispredicted, the dereference and subsequent indirect call will take place transiently.

We prototype the attack to evaluate its effectiveness in a proof of concept. The PoC contains the vulnerable pattern called in a for loop that is off by one with respect to the length of the array. The loop ensures that the branch predictor considers the bounds check conditional jump as strongly non-taken, thus ensuring the speculative execution of the final out-of-bounds load. The index used to access the array in the loop goes from zero to len minus one to call the function on each of the elements in the array, and is finally set to the target index computed as described above in the last iteration.

In order to verify whether speculative control flow hijack takes place we follow the Speculator approach and instrumented the program to read performance monitor counters (PMC) during the execution of the loop, and set the speculative control flow hijack target to contain a speculation marker. The runtime.panicindex function is modified to read and persist PMC values for each execution. This instrumentation permits us to verify that speculative control flow hijack indeed takes place. The success rate is influenced by several factors that we review here. The most relevant factor is the size of the speculation window, which is influenced by how quickly the correct jump target is determined. The speculation window is maximised if the variables used in the compare instruction that drives the jump – especially the array length – are not present in any of the levels of the cache. In order to get empirical evidence of this fact we instrument the program with a clflush instruction right before the array dereference to ensure that the array length is not cached. In practice, an attacker may achieve the same result by performing cache eviction code sequences. However flushing the cache alone does not ensure a high success rate. This is because the array length is stored right after the base address of the array, whose address is loaded into memory as the first instruction of the dereference sequence. We verify that if the two memory locations belong to different cache lines, the speculation window is maximised. Another factor that influences the success rate is whether the target of the speculative control flow hijack is already in the instruction cache and iTLB. We make sure that this be the case by inserting a call to the marker function in the warmup phase before the loop. We report success rates exceeding 80% when the array length is flushed and is in a separate cache line as the base address.

Store-based speculative control flow hijack #

The second code pattern is shown below

1
2
3
array[index] = value
...
interface.function()

The pattern consists of a store operation of an attacker-controlled value at an attacker-controlled location into an array, followed by an interface call. The interface call does not need to be related to the array in any way. It only needs to be in close proximity of the store operation so that it may still be speculatively executed. This pattern does not require any ability to perform preparatory store operations in the memory space of the target program. The pattern makes use of store-to-load forwarding, since the store in the array is used to (speculatively) overwrite a function pointer which is later (speculatively) loaded and called. The elements stored in the array must permit storage of a pointer. Smaller sizes would permit partial control over the speculative control flow hijack target.

The store part of the pattern consists of a speculative version of a ``write-what-where’’ primitive. It may be exploited in several ways to hijack the interface call: the most basic one would be to overwrite the tab pointer in the iface struct. However this would either require the attacker to perform a set of preparatory stores identical to those discussed in the previous section, or it would restrict the freedom of the attacker to choose a target out of the existing interface pointers. Another strategy would be for the attacker to overwrite the fun pointer in the itab structure directly. These structures are stored in a non-writeable virtual memory region. However, given that the store takes place speculatively, the attacker is able to bypass the write restrictions and overwrite the pointer. We therefore chose to prototype this simpler and more effective variant.

Exploitation proceeds as follows: the attacker computes the index into the array such that the sum of the base array pointer and the index multiplied by the size of the array elements equals the address of the fun pointer in the itab structure for the interface whose function is called in the second part of the vulnerable pattern. The value is set to the address of the desired speculative control flow hijack target. As in the previous section, with the index thus set the program will panic; however if the bounds check is mispredicted, the store-to-load and subsequent indirect call will take place, achieving speculative control flow hijack.

We prototype the attack to evaluate its effectiveness. We employ a similar instrumentation as the previous section, with PMCs and speculation markers employed to identify successful runs, and an off-by-one for loop to set the predictor state. The success rate is similarly influenced by ensuring that the variables driving the conditional branch are not cached, and that the speculative control flow hijack target is in cache. Under these conditions, we report success rates exceeding 80%.

Mitigations #

We investigate possible mitigations for the both attack scenarios. The mitigations consist of two different compiler passes that ensure that the vulnerability is no longer exploitable.

The first mitigation consists of adding an lfence instruction after the cmp instruction in the sequence that implements the IsInBounds metaoperation: the lfence instruction is inserted after the cmp on line 2. The insertion ensures that all prior instructions have completed, which means that there will be no misprediction of the branch target and any out-of-bound access will result in a panic with no transient execution. The instruction is added explicitly in the pass that translates the AST into SSA by defining a new Lfence metaoperation and adding it after each IsInBounds operation. We ensure that the operation is neither reordered nor eliminated.

The second mitigation we investigate entails the addition of an appropriate masking sequence that ensures that the index is set to a “safe” value in case of out-of-bounds accesses. The masking sequence amounts to a no-op in case the access is in bounds by performing an and operation on the index with a sign extended -1 mask; if the access is not in bounds, in our implementation, the masking operation forces an access of the element at index 0 in the array by performing an and operation on the index with a 0 mask. We can see the masking sequence in the following listing:

1
2
3
4
5
6
7
8
cmp    %rcx,%rdx
jae    <raise-panic-code>
mov    %rdx,%rbx
sub    %rcx,%rdx
sbb    %rcx,%rcx
and    %rbx,%rcx
shl    $0x4,%rcx
mov    (%rax,%rcx,1),%rax

after the usual cmp and jmp instructions, length and index and subtracted in order to set the carry flag. Then the sbb instruction is used to set a register to be -1 in case of in bounds access or 0 otherwise. The array is subsequently accessed after performing an and operation on the index with the mask thus obtained. The pattern might be further optimised by using the cmp instruction of the bounds check to set the carry flag. This however is not always possible since the compiler will use a compare instruction with an immediate whenever possible, and the immediate can only be the second source operand, forcing the direction of the comparison instruction. For the sake of simplicity with thus rely on an extra subtraction operation. The masking instruction sequence is added in a similar way employed for the lfence, namely, by defining three new metaoperations – OpMaskStep1, OpMaskStep2 and OpMaskStep3 – which are later lowered into a sub, sbb and and, respectively.

We measure the overhead of both mitigations by building the Go runtime version 1.12.0 and running the full benchmark suite. The empirical cumulative distribution function of the overhead of each of the two mitigation strategies can be seen in the plot below

We can see how the lfence-based approach incurs a high overhead (143% mean and 84% median) due to the fact that lfence will terminate any speculative execution and thus severely curtail the instruction throughput. On the other hand, the masking approach shows a much lighter overhead (12% mean and 6% median) since the instructions involved are simple and do not cause any memory-related operation.

The situation in Go now #

After we disclosed the issue to the Go team back in 2020, mitigations were developed as part of the Go compiler. The mitigations are described here in the form of a compiler switch (-spectre) with two, non-mutually-exclusive options, index and ret. The index option protects against the attacks we describe in this blog post, whereas the ret protects against Spectre v2 by replacing indirect calls with retpolines.

Endnotes #

Cite this article