SPEAR attacks - transient bypass of Go bounds checks
Bypassing bounds checks in Go through speculative control flow hijack #
In a previous post we
discussed how a speculative adversary might bypass stack smashing protector
(SSP) and hijack the speculative control flow, possibly causing information
leaks. In this post we analyse the applicability of SPEAR-style attacks
against the Go
programming language. We show how compiler-inserted bounds
checks for array accesses may be speculatively bypassed, allowing the
transient execution of out-of-bounds load and store operations. We show
under which conditions this leads to a speculative control flow hijack,
which may turn into data exfiltration attacks. This work extends and
generalises existing works (Spectre and
Speculative buffer overflows) showing
new vulnerable code patterns and their applicability to memory-safe
languages. In particular, we:
- discover, analyse and prototype two vulnerable code patterns; we evaluate the success rate of speculative control-flow hijack attacks over the vulnerable code patterns and discuss the influencing factors;
- we implement countermeasures against the discovered attack and measure their overhead to establish their viability;
- provide thoughts on how the search for vulnerable code patterns might be automated;
Background #
SPEAR vulnerabilities: a brief refresher #
The SPEAR class includes the set of vulnerabilities wherein control-flow metadata (for both forward and backward edges) is first overwritten either speculatively or architecturally – and then it is used to steer the control flow as part of speculative execution. The attacker can cause the overwritten edge to point to the location of a suitable speculative gadget, which will be speculatively executed if the attack succeeds. If the gadget implements a side-channel send operation, the attacker may later retrieve the data using the corresponding receive, thus successfully exfiltrating data from the victim memory space.
Bound checks for array accesses in Go #
Arrays in Go
are represented in memory as the following struct
1
2
3
4
5
type slice struct {
array unsafe.Pointer
len int
cap int
}
where the address of the contiguous chunk of memory backing the array is
stored in array
. cap
stores the number of elements that array
can
hold (and implicitly the size of the memory chunk since Go
is strongly
typed and the size of the elements is always known). len
stores the
number of elements that have been stored so far.
Whenever an array access is performed, the compiler will add appropriate
bounds checks. This is achieved in the course of the compiler pass to
translate the abstract syntax tree into the static single assignment
intermediate representation by adding an IsInBounds
metaoperation before
every array load or store. IsInBounds
takes two arguments, index of the
current access and length of the array, and drives a conditional jump
either to the basic block that performs the array access if the index is
between zero and length minus one, or a jump to a function that raises a
panic
otherwise.
IsInBounds
is translated by later passes in a sequence of instructions
similar to the following
1
2
3
4
mov 0xd0121(%rip),%rcx # <main.array>
cmp %rax,0xd0122(%rip) # <main.array+0x8>
jbe 486a66 <main.main+0xa6>
mov (%rcx,%rax,8),%rax
The snippet shows a load from an array of integers: at first rcx
is
loaded with the address of the memory array, a compare instruction is
issued between the index in rax
and the array length at main.array+8
.
If the index is not strictly less than the length, the code jumps to a call
to the runtime.panicindex
function. Otherwise the array access is
performed.
Speculatively bypassing bounds checks #
The conditional jump generated by the IsInBounds
metaoperation may
speculatively execute the wrong jump target and perform a transient load or
store operation out of bounds. We show two distinct code patterns, one
leveraging a load and one a store, that may lead to speculative control
flow hijack.
Threat model #
For both attacks, we assume the same threat model as Spectre v1; that is, a
local attacker, who knows the code of the target program and may interact
with the program through the external APIs it exposes. In particular, the
attacker should be able to supply input to the program leading to a bounds
check, which the attacker will attempt to bypass speculatively. In one of
the two variants, a preparatory phase is included, during which the
attacker supplies input to the program and the program in turn stores it in
its memory space. We assume that the attacker knows (or can predict) the
memory layout of the program, for instance the location of the variables
used to store its input: this may be achieved given that the go memory
allocator is deterministic. An important simplifying factor is that ASLR is
disabled for most Go
programs.
For both attacks, the attacker is required to supply an index to an array
access such that the index lies outside of the bounds of the array. The
Go
runtime will handle this occurrence by calling an appropriate panic
function which – in the absence of recovery – will terminate the program;
recovery offers a way to continue the execution of the program. If
speculative execution does not take the branch that leads to the panic and
the taken branch contains an indirect call whose target may be influenced
by the array operation, speculative control flow hijack is possible. The
attacker may thus transiently execute a side channel send operation using
as payload any secret stored in the memory space of the target process
(passwords, cryptographic keys, confidential application data), and later
recover the payload through an applicable side channel receive operation.
If the application recovers from the panic, the attacker may get multiple
chances to perform exfiltration attacks over the same secret or different
secrets. This holds also if the application is restarted after termination
and the secret is not ephemeral to a single run of the application.
Load-based speculative control flow hijack #
The first vulnerable pattern is shown in the following listing
1
array[index].function()
It consists of an interface function call, where the interface is stored
into an array of interfaces array
, dereferenced at position index
. Note
that the array must be an array of interfaces so that calling the function
is achieved by an indirect call. For the attack to be successful, we
require the following conditions to be met: i) index
must be
attacker-controlled; ii) the attacker must be able to store the value of
two pointers in the memory space of the target process at a known location.
The first condition is met whenever a process accesses an array using an
index that is received as an external input. The second condition is very
commonly met since programs store user-provided input for processing.
Knowledge of the location of the stored pointers depends on the memory area
being used, and is aided by the deterministic nature of Go
’s allocator
and the absence of address space layout randomisation in Go
’s processes.
Without loss of generality, we describe the exploitation of the above
dereference/call pattern when function
is the first function defined by
the interface. Exploitation proceeds as follows: at first the attacker
prepares the memory structures that are used when an interface call is
performed. The structures are shown below
1
2
3
4
5
6
7
8
9
10
11
12
type iface struct {
tab *itab
data unsafe.Pointer
}
type itab struct {
inter *interfacetype
_type *_type
hash uint32
_ [4]byte
fun [1]uintptr
}
and are used by dereferencing the tab
pointer from the iface
struct and
then calling into the fun
array.
In preparation for the exploitation, the attacker ensures that the memory layout of the target program contains a pattern similar to the one below
1
2
3
4
5
6
7
8
9
0x561000: 0x562000
0x561008: 0x0000000000000000
...
0x562000: 0x0000000000000000
0x562008: 0x0000000000000000
0x562010: 0x0000000000000000
0x562018: 0x563000
...
0x563000: <CFH target here>
In particular, assuming that the attacker wants to speculatively redirect
the control flow to address 0x563000
, the attacker creates a fake itab
structure (in the example at 0x562000
) such that the first entry in the
fun
pointer array points to the desired CFH target. Then the attacker
further creates a fake iface
structure (in the example at 0x561000
such
that the tab
pointer points to the aforementioned itab
structure. With
the memory thus prepared, the attacker supplies the index into the array
such that the resulting address (which means base address plus index
multiplied by the size of a iface
structure) equals the fake iface
structure (0x561000
in our example). With the index thus set the program
will call the runtime.panicindex
function; however if the bounds check is
mispredicted, the dereference and subsequent indirect call will take place
transiently.
We prototype the attack to evaluate its effectiveness in a proof of concept. The
PoC contains the vulnerable pattern called in a for
loop that is off by
one with respect to the length of the array. The loop ensures that the branch
predictor considers the bounds check conditional jump as strongly non-taken,
thus ensuring the speculative execution of the final out-of-bounds load. The
index used to access the array in the loop goes from zero to len
minus
one to call the function on each of the elements in the array, and is finally
set to the target index computed as described above in the last iteration.
In order to verify whether speculative control flow hijack takes place we
follow the Speculator
approach and instrumented the program to read performance monitor counters
(PMC) during the execution of the loop, and set the speculative control
flow hijack target to contain a speculation marker. The
runtime.panicindex
function is modified to read and persist PMC values
for each execution. This instrumentation permits us to verify that
speculative control flow hijack indeed takes place. The success rate is
influenced by several factors that we review here. The most relevant factor
is the size of the speculation window, which is influenced by how quickly
the correct jump target is determined. The speculation window is maximised
if the variables used in the compare instruction that drives the jump –
especially the array length – are not present in any of the levels of the
cache. In order to get empirical evidence of this fact we instrument the
program with a clflush
instruction right before the array dereference to
ensure that the array length is not cached. In practice, an attacker may
achieve the same result by performing cache eviction code sequences.
However flushing the cache alone does not ensure a high success rate. This
is because the array length is stored right after the base address of the
array, whose address is loaded into memory as the first instruction of the
dereference sequence. We verify that if the two memory locations belong to
different cache lines, the speculation window is maximised. Another factor
that influences the success rate is whether the target of the speculative
control flow hijack is already in the instruction cache and iTLB. We make
sure that this be the case by inserting a call to the marker function in
the warmup phase before the loop. We report success rates exceeding 80%
when the array length is flushed and is in a separate cache line as the
base address.
Store-based speculative control flow hijack #
The second code pattern is shown below
1
2
3
array[index] = value
...
interface.function()
The pattern consists of a store operation of an attacker-controlled value at an attacker-controlled location into an array, followed by an interface call. The interface call does not need to be related to the array in any way. It only needs to be in close proximity of the store operation so that it may still be speculatively executed. This pattern does not require any ability to perform preparatory store operations in the memory space of the target program. The pattern makes use of store-to-load forwarding, since the store in the array is used to (speculatively) overwrite a function pointer which is later (speculatively) loaded and called. The elements stored in the array must permit storage of a pointer. Smaller sizes would permit partial control over the speculative control flow hijack target.
The store part of the pattern consists of a speculative version of a
``write-what-where’’ primitive. It may be exploited in several ways to
hijack the interface call: the most basic one would be to overwrite the
tab
pointer in the iface
struct. However this would either require the
attacker to perform a set of preparatory stores identical to those
discussed in the previous section, or it would restrict the freedom of the
attacker to choose a target out of the existing interface pointers. Another
strategy would be for the attacker to overwrite the fun
pointer in the
itab
structure directly. These structures are stored in a non-writeable
virtual memory region. However, given that the store takes place
speculatively, the attacker is able to bypass the write restrictions and
overwrite the pointer. We therefore chose to prototype this simpler and
more effective variant.
Exploitation proceeds as follows: the attacker computes the index into the
array such that the sum of the base array pointer and the index multiplied
by the size of the array elements equals the address of the fun
pointer
in the itab
structure for the interface whose function is called in the
second part of the vulnerable pattern. The value is set to the address of
the desired speculative control flow hijack target. As in the previous
section, with the index thus set the program will panic; however if the
bounds check is mispredicted, the store-to-load and subsequent indirect
call will take place, achieving speculative control flow hijack.
We prototype the attack to evaluate its effectiveness. We employ a similar
instrumentation as the previous section, with PMCs and speculation markers
employed to identify successful runs, and an off-by-one for
loop to set
the predictor state. The success rate is similarly influenced by ensuring
that the variables driving the conditional branch are not cached, and that
the speculative control flow hijack target is in cache. Under these
conditions, we report success rates exceeding 80%.
Mitigations #
We investigate possible mitigations for the both attack scenarios. The mitigations consist of two different compiler passes that ensure that the vulnerability is no longer exploitable.
The first mitigation consists of adding an lfence
instruction after the
cmp
instruction in the sequence that implements the IsInBounds
metaoperation: the lfence
instruction is inserted after the cmp
on line 2.
The insertion ensures that all prior instructions have completed, which
means that there will be no misprediction of the branch target and any
out-of-bound access will result in a panic with no transient execution. The
instruction is added explicitly in the pass that translates the AST into
SSA by defining a new Lfence
metaoperation and adding it after each
IsInBounds
operation. We ensure that the operation is neither reordered
nor eliminated.
The second mitigation we investigate entails the addition of an appropriate
masking sequence that ensures that the index is set to a “safe” value in
case of out-of-bounds accesses. The masking sequence amounts to a no-op in
case the access is in bounds by performing an and
operation on the index
with a sign extended -1 mask; if the access is not in bounds, in our
implementation, the masking operation forces an access of the element at
index 0 in the array by performing an and
operation on the index with a 0
mask. We can see the masking sequence in the following listing:
1
2
3
4
5
6
7
8
cmp %rcx,%rdx
jae <raise-panic-code>
mov %rdx,%rbx
sub %rcx,%rdx
sbb %rcx,%rcx
and %rbx,%rcx
shl $0x4,%rcx
mov (%rax,%rcx,1),%rax
after the usual cmp
and jmp
instructions, length and index and
subtracted in order to set the carry flag. Then the sbb
instruction is
used to set a register to be -1 in case of in bounds access or 0 otherwise.
The array is subsequently accessed after performing an and
operation on
the index with the mask thus obtained. The pattern might be further
optimised by using the cmp
instruction of the bounds check to set the
carry flag. This however is not always possible since the compiler will use
a compare instruction with an immediate whenever possible, and the
immediate can only be the second source operand, forcing the direction of
the comparison instruction. For the sake of simplicity with thus rely on an
extra subtraction operation. The masking instruction sequence is added in a
similar way employed for the lfence, namely, by defining three new
metaoperations – OpMaskStep1
, OpMaskStep2
and OpMaskStep3
– which
are later lowered into a sub
, sbb
and and
, respectively.
We measure the overhead of both mitigations by building the Go
runtime
version 1.12.0 and running the full benchmark suite. The empirical
cumulative distribution function of the overhead of each of the two
mitigation strategies can be seen in the plot below
We can see how the lfence
-based approach incurs a high overhead (143%
mean and 84% median) due to the fact that lfence
will terminate any
speculative execution and thus severely curtail the instruction throughput.
On the other hand, the masking approach shows a much lighter overhead (12%
mean and 6% median) since the instructions involved are simple and do not
cause any memory-related operation.
The situation in Go now #
After we disclosed the issue to the Go team back in 2020, mitigations were
developed as part of the Go compiler. The mitigations are described
here in the form of a compiler
switch (-spectre
) with two, non-mutually-exclusive options, index
and
ret
. The index
option protects against the attacks we describe in this
blog post, whereas the ret
protects against Spectre v2 by replacing
indirect calls with retpolines.