
Commit 9416121

[x86/SLH] Teach the x86 speculative load hardening pass to harden
against v1.2 BCBS attacks directly.

Attacks using spectre v1.2 (a subset of BCBS) are described in the paper here:
https://people.csail.mit.edu/vlk/spectre11.pdf

The core idea is to speculatively store over the address in a vtable, jumptable,
or other target of indirect control flow that will be subsequently loaded.
Speculative execution after such a store can forward the stored value to
subsequent loads, and if called or jumped to, the speculative execution will be
steered to this potentially attacker controlled address.

Up until now, this could be mitigated by enabling retpolines. However, that is a
relatively expensive technique to mitigate this particular flavor, especially
because in most cases SLH will have already mitigated this. To fully mitigate
this with SLH, we need to do two core things:
1) Unfold loads from calls and jumps, allowing the loads to be post-load
   hardened.
2) Force hardening of incoming registers even if we didn't end up needing to
   harden the load itself.

The reason we need to do these two things is because hardening calls and jumps
from this particular variant is importantly different from hardening against
leak of secret data. Because the "bad" data here isn't a secret, but in fact
speculatively stored by the attacker, it may be loaded from any address,
regardless of whether it is read-only memory, mapped memory, or a "hardened"
address. The only 100% effective way to harden these instructions is to harden
their operand itself. But to the extent possible, we'd like to take advantage of
all the other hardening going on; we just need a fallback in case none of that
happened to cover the particular input to the control transfer instruction.

For users of SLH, currently they are paying 2% to 6% performance overhead for
retpolines, but this mechanism is expected to be substantially cheaper. However,
it is worth reminding folks that this does not mitigate all of the things
retpolines do -- most notably, variant #2 is not in *any way* mitigated by this
technique. So users of SLH may still want to enable retpolines, and the
implementation is carefully designed to gracefully leverage retpolines to avoid
the need for further hardening here when they are enabled.

Differential Revision: https://reviews.llvm.org/D49663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337878 91177308-0d34-0410-b5e6-96231b3b80d8
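To make the attack shape concrete, here is a minimal C++ sketch (illustrative only, not part of this commit; the Handler type and dispatch function are hypothetical names) of the kind of indirect control flow this change hardens:

    // Illustrative only: the vtable slot backing run() is loaded immediately
    // before the indirect call. A speculative store by the attacker over that
    // slot can be forwarded to the load, steering speculative execution to an
    // attacker-chosen address (Spectre v1.2 / BCBS).
    struct Handler {
      virtual void run() = 0;
    };

    void dispatch(Handler *h) {
      h->run(); // indirect call through a loaded, potentially clobbered pointer
    }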
1 parent 0bca0d9 commit 9416121

2 files changed, +379 -16 lines changed

lib/Target/X86/X86SpeculativeLoadHardening.cpp (+200)
@@ -70,6 +70,8 @@ STATISTIC(NumAddrRegsHardened,
           "Number of address mode used registers hardaned");
 STATISTIC(NumPostLoadRegsHardened,
           "Number of post-load register values hardened");
+STATISTIC(NumCallsOrJumpsHardened,
+          "Number of calls or jumps requiring extra hardening");
 STATISTIC(NumInstsInserted, "Number of instructions inserted");
 STATISTIC(NumLFENCEsInserted, "Number of lfence instructions inserted");

@@ -105,6 +107,13 @@ static cl::opt<bool>
              "significant security is provided."),
     cl::init(true), cl::Hidden);

+static cl::opt<bool> HardenIndirectCallsAndJumps(
+    PASS_KEY "-indirect",
+    cl::desc("Harden indirect calls and jumps against using speculatively "
+             "stored attacker controlled addresses. This is designed to "
+             "mitigate Spectre v1.2 style attacks."),
+    cl::init(true), cl::Hidden);
+
 namespace llvm {

 void initializeX86SpeculativeLoadHardeningPassPass(PassRegistry &);

@@ -168,6 +177,8 @@ class X86SpeculativeLoadHardeningPass : public MachineFunctionPass {
   SmallVector<MachineInstr *, 16>
   tracePredStateThroughCFG(MachineFunction &MF, ArrayRef<BlockCondInfo> Infos);

+  void unfoldCallAndJumpLoads(MachineFunction &MF);
+
   void hardenAllLoads(MachineFunction &MF);

   unsigned saveEFLAGS(MachineBasicBlock &MBB,

@@ -196,6 +207,9 @@ class X86SpeculativeLoadHardeningPass : public MachineFunctionPass {
                       DebugLoc Loc);
   unsigned hardenPostLoad(MachineInstr &MI);
   void hardenReturnInstr(MachineInstr &MI);
+  void hardenIndirectCallOrJumpInstr(
+      MachineInstr &MI,
+      SmallDenseMap<unsigned, unsigned, 32> &AddrRegToHardenedReg);
 };

 } // end anonymous namespace

@@ -507,6 +521,11 @@ bool X86SpeculativeLoadHardeningPass::runOnMachineFunction(
     }
   }

+  // If we are going to harden calls and jumps we need to unfold their memory
+  // operands.
+  if (HardenIndirectCallsAndJumps)
+    unfoldCallAndJumpLoads(MF);
+
   // Now harden all of the loads in the function using the predicate state.
   hardenAllLoads(MF);

@@ -817,6 +836,110 @@ X86SpeculativeLoadHardeningPass::tracePredStateThroughCFG(
   return CMovs;
 }

+/// Compute the register class for the unfolded load.
+///
+/// FIXME: This should probably live in X86InstrInfo, potentially by adding
+/// a way to unfold into a newly created vreg rather than requiring a register
+/// input.
+static const TargetRegisterClass *
+getRegClassForUnfoldedLoad(MachineFunction &MF, const X86InstrInfo &TII,
+                           unsigned Opcode) {
+  unsigned Index;
+  unsigned UnfoldedOpc = TII.getOpcodeAfterMemoryUnfold(
+      Opcode, /*UnfoldLoad*/ true, /*UnfoldStore*/ false, &Index);
+  const MCInstrDesc &MCID = TII.get(UnfoldedOpc);
+  return TII.getRegClass(MCID, Index, &TII.getRegisterInfo(), MF);
+}
+
+void X86SpeculativeLoadHardeningPass::unfoldCallAndJumpLoads(
+    MachineFunction &MF) {
+  for (MachineBasicBlock &MBB : MF)
+    for (auto MII = MBB.instr_begin(), MIE = MBB.instr_end(); MII != MIE;) {
+      // Grab a reference and increment the iterator so we can remove this
+      // instruction if needed without disturbing the iteration.
+      MachineInstr &MI = *MII++;
+
+      // Must either be a call or a branch.
+      if (!MI.isCall() && !MI.isBranch())
+        continue;
+      // We only care about loading variants of these instructions.
+      if (!MI.mayLoad())
+        continue;
+
+      switch (MI.getOpcode()) {
+      default: {
+        LLVM_DEBUG(
+            dbgs() << "ERROR: Found an unexpected loading branch or call "
+                      "instruction:\n";
+            MI.dump(); dbgs() << "\n");
+        report_fatal_error("Unexpected loading branch or call!");
+      }
+
+      case X86::FARCALL16m:
+      case X86::FARCALL32m:
+      case X86::FARCALL64:
+      case X86::FARJMP16m:
+      case X86::FARJMP32m:
+      case X86::FARJMP64:
+        // We cannot mitigate far jumps or calls, but we also don't expect them
+        // to be vulnerable to Spectre v1.2 style attacks.
+        continue;
+
+      case X86::CALL16m:
+      case X86::CALL16m_NT:
+      case X86::CALL32m:
+      case X86::CALL32m_NT:
+      case X86::CALL64m:
+      case X86::CALL64m_NT:
+      case X86::JMP16m:
+      case X86::JMP16m_NT:
+      case X86::JMP32m:
+      case X86::JMP32m_NT:
+      case X86::JMP64m:
+      case X86::JMP64m_NT:
+      case X86::TAILJMPm64:
+      case X86::TAILJMPm64_REX:
+      case X86::TAILJMPm:
+      case X86::TCRETURNmi64:
+      case X86::TCRETURNmi: {
+        // Use the generic unfold logic now that we know we're dealing with
+        // expected instructions.
+        // FIXME: We don't have test coverage for all of these!
+        auto *UnfoldedRC = getRegClassForUnfoldedLoad(MF, *TII, MI.getOpcode());
+        if (!UnfoldedRC) {
+          LLVM_DEBUG(dbgs()
+                         << "ERROR: Unable to unfold load from instruction:\n";
+                     MI.dump(); dbgs() << "\n");
+          report_fatal_error("Unable to unfold load!");
+        }
+        unsigned Reg = MRI->createVirtualRegister(UnfoldedRC);
+        SmallVector<MachineInstr *, 2> NewMIs;
+        // If we were able to compute an unfolded reg class, any failure here
+        // is just a programming error so just assert.
+        bool Unfolded =
+            TII->unfoldMemoryOperand(MF, MI, Reg, /*UnfoldLoad*/ true,
+                                     /*UnfoldStore*/ false, NewMIs);
+        (void)Unfolded;
+        assert(Unfolded &&
+               "Computed unfolded register class but failed to unfold");
+        // Now stitch the new instructions into place and erase the old one.
+        for (auto *NewMI : NewMIs)
+          MBB.insert(MI.getIterator(), NewMI);
+        MI.eraseFromParent();
+        LLVM_DEBUG({
+          dbgs() << "Unfolded load successfully into:\n";
+          for (auto *NewMI : NewMIs) {
+            NewMI->dump();
+            dbgs() << "\n";
+          }
+        });
+        continue;
+      }
+      }
+      llvm_unreachable("Escaped switch with default!");
+    }
+}
+
 /// Returns true if the instruction has no behavior (specified or otherwise)
 /// that is based on the value of any of its register operands
 ///

@@ -1441,6 +1564,14 @@ void X86SpeculativeLoadHardeningPass::hardenAllLoads(MachineFunction &MF) {
         continue;
       }

+      // Check for an indirect call or branch that may need its input hardened
+      // even if we couldn't find the specific load used, or were able to avoid
+      // hardening it for some reason. Note that here we cannot break out
+      // afterward as we may still need to handle any call aspect of this
+      // instruction.
+      if ((MI.isCall() || MI.isBranch()) && HardenIndirectCallsAndJumps)
+        hardenIndirectCallOrJumpInstr(MI, AddrRegToHardenedReg);
+
       // After we finish processing the instruction and doing any hardening
       // necessary for it, we need to handle transferring the predicate state
       // into a call and recovering it after the call returns (if it returns).

@@ -2010,6 +2141,75 @@ void X86SpeculativeLoadHardeningPass::hardenReturnInstr(MachineInstr &MI) {
   mergePredStateIntoSP(MBB, InsertPt, Loc, PS->SSA.GetValueAtEndOfBlock(&MBB));
 }

+/// An attacker may speculatively store over a value that is then speculatively
+/// loaded and used as the target of an indirect call or jump instruction. This
+/// is called Spectre v1.2 or Bounds Check Bypass Store (BCBS) and is described
+/// in this paper:
+/// https://people.csail.mit.edu/vlk/spectre11.pdf
+///
+/// When this happens, the speculative execution of the call or jump will end up
+/// being steered to this attacker controlled address. While most such loads
+/// will be adequately hardened already, we want to ensure that they are
+/// definitively treated as needing post-load hardening. While address hardening
+/// is sufficient to prevent secret data from leaking to the attacker, it may
+/// not be sufficient to prevent an attacker from steering speculative
+/// execution. We forcibly unfolded all relevant loads above and so will always
+/// have an opportunity to post-load harden here, we just need to scan for cases
+/// not already flagged and add them.
+void X86SpeculativeLoadHardeningPass::hardenIndirectCallOrJumpInstr(
+    MachineInstr &MI,
+    SmallDenseMap<unsigned, unsigned, 32> &AddrRegToHardenedReg) {
+  switch (MI.getOpcode()) {
+  case X86::FARCALL16m:
+  case X86::FARCALL32m:
+  case X86::FARCALL64:
+  case X86::FARJMP16m:
+  case X86::FARJMP32m:
+  case X86::FARJMP64:
+    // We don't need to harden either far calls or far jumps as they are
+    // safe from Spectre.
+    return;
+
+  default:
+    break;
+  }
+
+  // We should never see a loading instruction at this point, as those should
+  // have been unfolded.
+  assert(!MI.mayLoad() && "Found a lingering loading instruction!");
+
+  // If the first operand isn't a register, this is a branch or call
+  // instruction with an immediate operand which doesn't need to be hardened.
+  if (!MI.getOperand(0).isReg())
+    return;
+
+  // For all of these, the target register is the first operand of the
+  // instruction.
+  auto &TargetOp = MI.getOperand(0);
+  unsigned OldTargetReg = TargetOp.getReg();
+
+  // Try to lookup a hardened version of this register. We retain a reference
+  // here as we want to update the map to track any newly computed hardened
+  // register.
+  unsigned &HardenedTargetReg = AddrRegToHardenedReg[OldTargetReg];
+
+  // If we don't have a hardened register yet, compute one. Otherwise, just use
+  // the already hardened register.
+  //
+  // FIXME: It is a little suspect that we use partially hardened registers that
+  // only feed addresses. The complexity of partial hardening with SHRX
+  // continues to pile up. Should definitively measure its value and consider
+  // eliminating it.
+  if (!HardenedTargetReg)
+    HardenedTargetReg = hardenValueInRegister(
+        OldTargetReg, *MI.getParent(), MI.getIterator(), MI.getDebugLoc());
+
+  // Set the target operand to the hardened register.
+  TargetOp.setReg(HardenedTargetReg);
+
+  ++NumCallsOrJumpsHardened;
+}
+
 INITIALIZE_PASS_BEGIN(X86SpeculativeLoadHardeningPass, DEBUG_TYPE,
                       "X86 speculative load hardener", false, false)
 INITIALIZE_PASS_END(X86SpeculativeLoadHardeningPass, DEBUG_TYPE,
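Conceptually, the unfold-then-harden sequence above amounts to the following at runtime (a C++ sketch of the intended effect, not the pass's actual output; Fn, Slot, and PredicateState are hypothetical names used only for illustration):

    #include <cstdint>

    using Fn = void (*)();

    // Sketch: the target load is unfolded into a register, the SLH predicate
    // state (all-ones on misspeculated paths, zero otherwise) is OR-ed into
    // the loaded target, and only then is the indirect call issued, so a
    // misspeculated path sees a useless, non-canonical target.
    void hardenedIndirectCall(Fn *Slot, std::uintptr_t PredicateState) {
      std::uintptr_t Target = reinterpret_cast<std::uintptr_t>(*Slot); // unfolded load
      Target |= PredicateState;       // post-load hardening of the call target
      reinterpret_cast<Fn>(Target)(); // indirect call through the hardened register
    }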
