[WebAssembly] Add assembly support for final EH proposal #107917
This adds basic assembly generation support for the final EH proposal, which was adopted in Sep 2023 and advanced to Phase 4 in Jul 2024:
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This adds support for generating the new `try_table` and `throw_ref` instructions in the .s assembly format. This does NOT yet include
- Block annotation comment generation for .s format
- .o object file generation
- .s assembly parsing
- Type checking (AsmTypeCheck)
- Disassembler
- Fixing unwind mismatches in CFGStackify

These will be added as follow-up PRs.

---

The format for `TRY_TABLE`, both for `MachineInstr` and `MCInst`, is as follows:
```
TRY_TABLE type number_of_catches catch_clauses*
```
where `catch_clause` is
```
catch_opcode tag+ destination
```
`catch_opcode` should be one of 0/1/2/3, which denote `CATCH`/`CATCH_REF`/`CATCH_ALL`/`CATCH_ALL_REF` respectively. (See `BinaryFormat/Wasm.h`.) `tag` exists when the catch is one of `CATCH` or `CATCH_REF`.

The MIR format is printed as just the list of raw operands. The (stack-based) assembly instruction supports pretty-printing, including printing `catch` clauses by name, in InstPrinter.

In addition to the new instructions `TRY_TABLE` and `THROW_REF`, this adds four pseudo instructions: `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF`. These are pseudo instructions that simulate the block return values of the `catch`, `catch_ref`, `catch_all`, and `catch_all_ref` clauses in `try_table` respectively, given that we don't support block return values except for one case (`fixEndsAtEndOfFunction` in CFGStackify). These will be omitted when we lower the instructions to `MCInst` at the end.

LateEHPrepare now has one more stage that converts `CATCH`/`CATCH_ALL`s to `CATCH_REF`/`CATCH_ALL_REF`s when there is a `RETHROW` that rethrows their exception. The pass also converts `RETHROW`s into `THROW_REF`. Note that we still use `RETHROW` as an interim pseudo instruction until we convert them to `THROW_REF` in LateEHPrepare.

CFGStackify has a new `placeTryTableMarker` function, which places `try_table`/`end_try_table` markers with the necessary `catch` clause, and also `block`/`end_block` markers for the destination of the `catch` clause.

In MCInstLower, we now need to support one more case for the multivalue block signature (the `(i32, exnref)` return type of `catch_ref`'s destination).

InstPrinter has a new routine to print the `catch_list` type, which is used to print `try_table` instructions.

The source of the new test, `exception.ll`, is the same as that of `exception-legacy.ll`, with the FileCheck expectations changed. One difference is that the commands in this file have `-wasm-enable-exnref` to test the new format, and don't have `-wasm-disable-explicit-locals -wasm-keep-registers`, because the new custom InstPrinter routine that prints `catch_list` only works for the stack-based instructions (`_S`), and we can't use `-wasm-keep-registers` for them.

As in `exception-legacy.ll`, the FileCheck lines for the new tests do not contain the whole program; they mostly contain only the control flow instructions for readability.
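To make the operand layout concrete, here is a minimal standalone sketch of how a flat `number_of_catches catch_clauses*` list could be walked and pretty-printed. It does not use LLVM types: `Operand`, `imm`, `tag`, and `formatCatchList` are hypothetical names made up for illustration, and the sketch assumes a tag operand is present only for `catch` and `catch_ref`, as described above. The sub-opcode values are the ones this PR adds to `BinaryFormat/Wasm.h`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Sub-opcodes for catch clauses, with the values from BinaryFormat/Wasm.h.
enum CatchOpcode : unsigned {
  Catch = 0x00,
  CatchRef = 0x01,
  CatchAll = 0x02,
  CatchAllRef = 0x03,
};

// Hypothetical stand-in for an operand: either an immediate or a tag name.
struct Operand {
  bool IsTag;
  uint64_t Imm;
  std::string Tag;
};

static Operand imm(uint64_t V) { return {false, V, ""}; }
static Operand tag(std::string Name) { return {true, 0, std::move(Name)}; }

// Ops holds number_of_catches followed by, for each clause,
// catch_opcode, a tag (only for catch / catch_ref), and the destination.
std::string formatCatchList(const std::vector<Operand> &Ops) {
  std::size_t Idx = 0;
  uint64_t NumCatches = Ops.at(Idx++).Imm;
  std::string Out;
  for (uint64_t I = 0; I < NumCatches; ++I) {
    if (I != 0)
      Out += " ";
    Out += "(";
    switch (Ops.at(Idx++).Imm) {
    case Catch:
      Out += "catch " + Ops.at(Idx++).Tag + " ";
      break;
    case CatchRef:
      Out += "catch_ref " + Ops.at(Idx++).Tag + " ";
      break;
    case CatchAll:
      Out += "catch_all ";
      break;
    case CatchAllRef:
      Out += "catch_all_ref ";
      break;
    default:
      Out += "<unknown> ";
      break;
    }
    Out += std::to_string(Ops.at(Idx++).Imm); // destination
    Out += ")";
  }
  return Out;
}
```

With this sketch, `formatCatchList({imm(2), imm(Catch), tag("__cpp_exception"), imm(0), imm(CatchAllRef), imm(1)})` returns `(catch __cpp_exception 0) (catch_all_ref 1)`.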
@llvm/pr-subscribers-backend-webassembly @llvm/pr-subscribers-llvm-binary-utilities
Author: Heejin Ahn (aheejin)
Patch is 56.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107917.diff
14 files affected.
diff --git a/llvm/include/llvm/BinaryFormat/Wasm.h b/llvm/include/llvm/BinaryFormat/Wasm.h
index acf89885af6fdb..9b21d6d65c2a8e 100644
--- a/llvm/include/llvm/BinaryFormat/Wasm.h
+++ b/llvm/include/llvm/BinaryFormat/Wasm.h
@@ -144,6 +144,14 @@ enum : unsigned {
WASM_OPCODE_I32_RMW_CMPXCHG = 0x48,
};
+// Sub-opcodes for catch clauses in a try_table instruction
+enum : unsigned {
+ WASM_OPCODE_CATCH = 0x00,
+ WASM_OPCODE_CATCH_REF = 0x01,
+ WASM_OPCODE_CATCH_ALL = 0x02,
+ WASM_OPCODE_CATCH_ALL_REF = 0x03,
+};
+
enum : unsigned {
WASM_LIMITS_FLAG_NONE = 0x0,
WASM_LIMITS_FLAG_HAS_MAX = 0x1,
diff --git a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
index 24a9ad67cfe042..5299e6ea06f0bd 100644
--- a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
+++ b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
@@ -45,7 +45,7 @@ namespace {
/// WebAssemblyOperand - Instances of this class represent the operands in a
/// parsed Wasm machine instruction.
struct WebAssemblyOperand : public MCParsedAsmOperand {
- enum KindTy { Token, Integer, Float, Symbol, BrList } Kind;
+ enum KindTy { Token, Integer, Float, Symbol, BrList, CatchList } Kind;
SMLoc StartLoc, EndLoc;
@@ -99,6 +99,7 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
bool isMem() const override { return false; }
bool isReg() const override { return false; }
bool isBrList() const { return Kind == BrList; }
+ bool isCatchList() const { return Kind == CatchList; }
MCRegister getReg() const override {
llvm_unreachable("Assembly inspects a register operand");
@@ -151,6 +152,10 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
Inst.addOperand(MCOperand::createImm(Br));
}
+ void addCatchListOperands(MCInst &Inst, unsigned N) const {
+ // TODO
+ }
+
void print(raw_ostream &OS) const override {
switch (Kind) {
case Token:
@@ -168,6 +173,9 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
case BrList:
OS << "BrList:" << BrL.List.size();
break;
+ case CatchList:
+ // TODO
+ break;
}
}
};
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp
index b85ed1d93593bd..903dbcd21ea967 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp
@@ -367,3 +367,44 @@ void WebAssemblyInstPrinter::printWebAssemblySignatureOperand(const MCInst *MI,
}
}
}
+
+void WebAssemblyInstPrinter::printCatchList(const MCInst *MI, unsigned OpNo,
+ raw_ostream &O) {
+ unsigned OpIdx = OpNo;
+ const MCOperand &Op = MI->getOperand(OpIdx++);
+ unsigned NumCatches = Op.getImm();
+
+ auto PrintTagOp = [&](const MCOperand &Op) {
+ const MCSymbolRefExpr *TagExpr = nullptr;
+ const MCSymbolWasm *TagSym = nullptr;
+ assert(Op.isExpr());
+ TagExpr = dyn_cast<MCSymbolRefExpr>(Op.getExpr());
+ TagSym = cast<MCSymbolWasm>(&TagExpr->getSymbol());
+ O << TagSym->getName() << " ";
+ };
+
+ for (unsigned I = 0; I < NumCatches; I++) {
+ const MCOperand &Op = MI->getOperand(OpIdx++);
+ O << "(";
+ switch (Op.getImm()) {
+ case wasm::WASM_OPCODE_CATCH:
+ O << "catch ";
+ PrintTagOp(MI->getOperand(OpIdx++));
+ break;
+ case wasm::WASM_OPCODE_CATCH_REF:
+ O << "catch_ref ";
+ PrintTagOp(MI->getOperand(OpIdx++));
+ break;
+ case wasm::WASM_OPCODE_CATCH_ALL:
+ O << "catch_all ";
+ break;
+ case wasm::WASM_OPCODE_CATCH_ALL_REF:
+ O << "catch_all_ref ";
+ break;
+ }
+ O << MI->getOperand(OpIdx++).getImm(); // destination
+ O << ")";
+ if (I < NumCatches - 1)
+ O << " ";
+ }
+}
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.h
index 8fd54d16409059..b499926ab82965 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.h
@@ -47,6 +47,7 @@ class WebAssemblyInstPrinter final : public MCInstPrinter {
raw_ostream &O);
void printWebAssemblySignatureOperand(const MCInst *MI, unsigned OpNo,
raw_ostream &O);
+ void printCatchList(const MCInst *MI, unsigned OpNo, raw_ostream &O);
// Autogenerated by tblgen.
std::pair<const char *, uint64_t> getMnemonic(const MCInst *MI) override;
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
index 00f15e1db5e13a..e3a60fa4812d8f 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
@@ -87,6 +87,8 @@ enum OperandType {
OPERAND_BRLIST,
/// 32-bit unsigned table number.
OPERAND_TABLE,
+ /// A list of catch clauses for try_table.
+ OPERAND_CATCH_LIST,
};
} // end namespace WebAssembly
@@ -119,6 +121,10 @@ enum TOF {
// address relative the __table_base wasm global.
// Only applicable to function symbols.
MO_TABLE_BASE_REL,
+
+ // On a block signature operand this indicates that this is a destination
+ // block of a (catch_ref) clause in try_table.
+ MO_CATCH_BLOCK_SIG,
};
} // end namespace WebAssemblyII
@@ -462,6 +468,22 @@ inline bool isMarker(unsigned Opc) {
case WebAssembly::TRY_S:
case WebAssembly::END_TRY:
case WebAssembly::END_TRY_S:
+ case WebAssembly::TRY_TABLE:
+ case WebAssembly::TRY_TABLE_S:
+ case WebAssembly::END_TRY_TABLE:
+ case WebAssembly::END_TRY_TABLE_S:
+ return true;
+ default:
+ return false;
+ }
+}
+
+inline bool isTry(unsigned Opc) {
+ switch (Opc) {
+ case WebAssembly::TRY:
+ case WebAssembly::TRY_S:
+ case WebAssembly::TRY_TABLE:
+ case WebAssembly::TRY_TABLE_S:
return true;
default:
return false;
@@ -474,6 +496,14 @@ inline bool isCatch(unsigned Opc) {
case WebAssembly::CATCH_LEGACY_S:
case WebAssembly::CATCH_ALL_LEGACY:
case WebAssembly::CATCH_ALL_LEGACY_S:
+ case WebAssembly::CATCH:
+ case WebAssembly::CATCH_S:
+ case WebAssembly::CATCH_REF:
+ case WebAssembly::CATCH_REF_S:
+ case WebAssembly::CATCH_ALL:
+ case WebAssembly::CATCH_ALL_S:
+ case WebAssembly::CATCH_ALL_REF:
+ case WebAssembly::CATCH_ALL_REF_S:
return true;
default:
return false;
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTypeUtilities.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTypeUtilities.h
index 063ee4dba9068e..4aca092e0e4c44 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTypeUtilities.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTypeUtilities.h
@@ -33,11 +33,15 @@ enum class BlockType : unsigned {
Externref = unsigned(wasm::ValType::EXTERNREF),
Funcref = unsigned(wasm::ValType::FUNCREF),
Exnref = unsigned(wasm::ValType::EXNREF),
- // Multivalue blocks (and other non-void blocks) are only emitted when the
- // blocks will never be exited and are at the ends of functions (see
- // WebAssemblyCFGStackify::fixEndsAtEndOfFunction). They also are never made
- // to pop values off the stack, so the exact multivalue signature can always
- // be inferred from the return type of the parent function in MCInstLower.
+ // Multivalue blocks are emitted in two cases:
+ // 1. When the blocks will never be exited and are at the ends of functions
+ // (see WebAssemblyCFGStackify::fixEndsAtEndOfFunction). In this case the
+ // exact multivalue signature can always be inferred from the return type
+ // of the parent function.
+ // 2. (catch_ref ...) clause in try_table instruction. Currently all tags we
+ // support (cpp_exception and c_longjmp) throws a single i32, so the
+ // multivalue signature for this case will be (i32, exnref).
+ // The real multivalue siganture will be added in MCInstLower.
Multivalue = 0xffff,
};
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
index 6dd6145ed00573..14c0eaac17daaa 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
@@ -683,6 +683,17 @@ void WebAssemblyAsmPrinter::emitInstruction(const MachineInstr *MI) {
// This is a compiler barrier that prevents instruction reordering during
// backend compilation, and should not be emitted.
break;
+ case WebAssembly::CATCH:
+ case WebAssembly::CATCH_S:
+ case WebAssembly::CATCH_REF:
+ case WebAssembly::CATCH_REF_S:
+ case WebAssembly::CATCH_ALL:
+ case WebAssembly::CATCH_ALL_S:
+ case WebAssembly::CATCH_ALL_REF:
+ case WebAssembly::CATCH_ALL_REF_S:
+ // These are pseudo instructions to represent catch clauses in try_table
+ // instruction to simulate block return values.
+ break;
default: {
WebAssemblyMCInstLower MCInstLowering(OutContext, *this);
MCInst TmpInst;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index 3cccc57e629fd7..a5f73fabca3542 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -9,9 +9,9 @@
/// \file
/// This file implements a CFG stacking pass.
///
-/// This pass inserts BLOCK, LOOP, and TRY markers to mark the start of scopes,
-/// since scope boundaries serve as the labels for WebAssembly's control
-/// transfers.
+/// This pass inserts BLOCK, LOOP, TRY, and TRY_TABLE markers to mark the start
+/// of scopes, since scope boundaries serve as the labels for WebAssembly's
+/// control transfers.
///
/// This is sufficient to convert arbitrary CFGs into a form that works on
/// WebAssembly, provided that all loops are single-entry.
@@ -21,6 +21,7 @@
///
//===----------------------------------------------------------------------===//
+#include "MCTargetDesc/WebAssemblyMCTargetDesc.h"
#include "Utils/WebAssemblyTypeUtilities.h"
#include "WebAssembly.h"
#include "WebAssemblyExceptionInfo.h"
@@ -29,6 +30,7 @@
#include "WebAssemblySubtarget.h"
#include "WebAssemblyUtilities.h"
#include "llvm/ADT/Statistic.h"
+#include "llvm/BinaryFormat/Wasm.h"
#include "llvm/CodeGen/MachineDominators.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineLoopInfo.h"
@@ -74,6 +76,7 @@ class WebAssemblyCFGStackify final : public MachineFunctionPass {
void placeBlockMarker(MachineBasicBlock &MBB);
void placeLoopMarker(MachineBasicBlock &MBB);
void placeTryMarker(MachineBasicBlock &MBB);
+ void placeTryTableMarker(MachineBasicBlock &MBB);
// Exception handling related functions
bool fixCallUnwindMismatches(MachineFunction &MF);
@@ -97,11 +100,11 @@ class WebAssemblyCFGStackify final : public MachineFunctionPass {
void fixEndsAtEndOfFunction(MachineFunction &MF);
void cleanupFunctionData(MachineFunction &MF);
- // For each BLOCK|LOOP|TRY, the corresponding END_(BLOCK|LOOP|TRY) or DELEGATE
- // (in case of TRY).
+ // For each BLOCK|LOOP|TRY|TRY_TABLE, the corresponding
+ // END_(BLOCK|LOOP|TRY|TRY_TABLE) or DELEGATE (in case of TRY).
DenseMap<const MachineInstr *, MachineInstr *> BeginToEnd;
- // For each END_(BLOCK|LOOP|TRY) or DELEGATE, the corresponding
- // BLOCK|LOOP|TRY.
+ // For each END_(BLOCK|LOOP|TRY|TRY_TABLE) or DELEGATE, the corresponding
+ // BLOCK|LOOP|TRY|TRY_TABLE.
DenseMap<const MachineInstr *, MachineInstr *> EndToBegin;
// <TRY marker, EH pad> map
DenseMap<const MachineInstr *, MachineBasicBlock *> TryToEHPad;
@@ -150,9 +153,10 @@ class WebAssemblyCFGStackify final : public MachineFunctionPass {
} // end anonymous namespace
char WebAssemblyCFGStackify::ID = 0;
-INITIALIZE_PASS(WebAssemblyCFGStackify, DEBUG_TYPE,
- "Insert BLOCK/LOOP/TRY markers for WebAssembly scopes", false,
- false)
+INITIALIZE_PASS(
+ WebAssemblyCFGStackify, DEBUG_TYPE,
+ "Insert BLOCK/LOOP/TRY/TRY_TABLE markers for WebAssembly scopes", false,
+ false)
FunctionPass *llvm::createWebAssemblyCFGStackify() {
return new WebAssemblyCFGStackify();
@@ -314,12 +318,13 @@ void WebAssemblyCFGStackify::placeBlockMarker(MachineBasicBlock &MBB) {
#endif
}
- // If there is a previously placed BLOCK/TRY marker and its corresponding
- // END marker is before the current BLOCK's END marker, that should be
- // placed after this BLOCK. Otherwise it should be placed before this BLOCK
- // marker.
+ // If there is a previously placed BLOCK/TRY/TRY_TABLE marker and its
+ // corresponding END marker is before the current BLOCK's END marker, that
+ // should be placed after this BLOCK. Otherwise it should be placed before
+ // this BLOCK marker.
if (MI.getOpcode() == WebAssembly::BLOCK ||
- MI.getOpcode() == WebAssembly::TRY) {
+ MI.getOpcode() == WebAssembly::TRY ||
+ MI.getOpcode() == WebAssembly::TRY_TABLE) {
if (BeginToEnd[&MI]->getParent()->getNumber() <= MBB.getNumber())
AfterSet.insert(&MI);
#ifndef NDEBUG
@@ -329,10 +334,11 @@ void WebAssemblyCFGStackify::placeBlockMarker(MachineBasicBlock &MBB) {
}
#ifndef NDEBUG
- // All END_(BLOCK|LOOP|TRY) markers should be before the BLOCK.
+ // All END_(BLOCK|LOOP|TRY|TRY_TABLE) markers should be before the BLOCK.
if (MI.getOpcode() == WebAssembly::END_BLOCK ||
MI.getOpcode() == WebAssembly::END_LOOP ||
- MI.getOpcode() == WebAssembly::END_TRY)
+ MI.getOpcode() == WebAssembly::END_TRY ||
+ MI.getOpcode() == WebAssembly::END_TRY_TABLE)
BeforeSet.insert(&MI);
#endif
@@ -374,6 +380,11 @@ void WebAssemblyCFGStackify::placeBlockMarker(MachineBasicBlock &MBB) {
// loop is above this block's header, the END_LOOP should be placed after
// the END_BLOCK, because the loop contains this block. Otherwise the
// END_LOOP should be placed before the END_BLOCK. The same for END_TRY.
+ //
+ // Note that while there can be existing END_TRYs, there can't be
+ // END_TRY_TABLEs; END_TRYs are placed when its corresponding EH pad is
+ // processed, so they are placed below MBB (EH pad) in placeTryMarker. But
+ // END_TRY_TABLE is placed like a END_BLOCK, so they can't be here already.
if (MI.getOpcode() == WebAssembly::END_LOOP ||
MI.getOpcode() == WebAssembly::END_TRY) {
if (EndToBegin[&MI]->getParent()->getNumber() >= Header->getNumber())
@@ -657,7 +668,251 @@ void WebAssemblyCFGStackify::placeTryMarker(MachineBasicBlock &MBB) {
updateScopeTops(Header, End);
}
+void WebAssemblyCFGStackify::placeTryTableMarker(MachineBasicBlock &MBB) {
+ assert(MBB.isEHPad());
+ MachineFunction &MF = *MBB.getParent();
+ auto &MDT = getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree();
+ const auto &TII = *MF.getSubtarget<WebAssemblySubtarget>().getInstrInfo();
+ const auto &MLI = getAnalysis<MachineLoopInfoWrapperPass>().getLI();
+ const auto &WEI = getAnalysis<WebAssemblyExceptionInfo>();
+ SortRegionInfo SRI(MLI, WEI);
+ const auto &MFI = *MF.getInfo<WebAssemblyFunctionInfo>();
+
+ // Compute the nearest common dominator of all unwind predecessors
+ MachineBasicBlock *Header = nullptr;
+ int MBBNumber = MBB.getNumber();
+ for (auto *Pred : MBB.predecessors()) {
+ if (Pred->getNumber() < MBBNumber) {
+ Header = Header ? MDT.findNearestCommonDominator(Header, Pred) : Pred;
+ assert(!explicitlyBranchesTo(Pred, &MBB) &&
+ "Explicit branch to an EH pad!");
+ }
+ }
+ if (!Header)
+ return;
+
+ assert(&MBB != &MF.front() && "Header blocks shouldn't have predecessors");
+ MachineBasicBlock *LayoutPred = MBB.getPrevNode();
+
+ // If the nearest common dominator is inside a more deeply nested context,
+ // walk out to the nearest scope which isn't more deeply nested.
+ for (MachineFunction::iterator I(LayoutPred), E(Header); I != E; --I) {
+ if (MachineBasicBlock *ScopeTop = ScopeTops[I->getNumber()]) {
+ if (ScopeTop->getNumber() > Header->getNumber()) {
+ // Skip over an intervening scope.
+ I = std::next(ScopeTop->getIterator());
+ } else {
+ // We found a scope level at an appropriate depth.
+ Header = ScopeTop;
+ break;
+ }
+ }
+ }
+
+ // Decide where in Header to put the TRY_TABLE.
+
+ // Instructions that should go before the TRY_TABLE.
+ SmallPtrSet<const MachineInstr *, 4> BeforeSet;
+ // Instructions that should go after the TRY_TABLE.
+ SmallPtrSet<const MachineInstr *, 4> AfterSet;
+ for (const auto &MI : *Header) {
+ // If there is a previously placed LOOP marker and the bottom block of the
+ // loop is above MBB, it should be after the TRY_TABLE, because the loop is
+ // nested in this TRY_TABLE. Otherwise it should be before the TRY_TABLE.
+ if (MI.getOpcode() == WebAssembly::LOOP) {
+ auto *LoopBottom = BeginToEnd[&MI]->getParent()->getPrevNode();
+ if (MBB.getNumber() > LoopBottom->getNumber())
+ AfterSet.insert(&MI);
+#ifndef NDEBUG
+ else
+ BeforeSet.insert(&MI);
+#endif
+ }
+
+ // All previously inserted BLOCK/TRY_TABLE markers should be after the
+ // TRY_TABLE because they are all nested blocks/try_tables.
+ if (MI.getOpcode() == WebAssembly::BLOCK ||
+ MI.getOpcode() == WebAssembly::TRY_TABLE)
+ AfterSet.insert(&MI);
+
+#ifndef NDEBUG
+ // All END_(BLOCK/LOOP/TRY_TABLE) markers should be before the TRY_TABLE.
+ if (MI.getOpcode() == WebAssembly::END_BLOCK ||
+ MI.getOpcode() == WebAssembly::END_LOOP ||
+ MI.getOpcode() == WebAssembly::END_TRY_TABLE)
+ BeforeSet.insert(&MI);
+#endif
+
+ // Terminators should go after the TRY_TABLE.
+ if (MI.isTerminator())
+ AfterSet.insert(&MI);
+ }
+
+ // If Header unwinds to MBB (= Header contains 'invoke'), the try_table block
+ // should contain the call within it. So the call should go after the
+ // TRY_TABLE. The exception is when the header's terminator is a rethrow
+ // instruction, in which case that instruction, not a call instruction before
+ // it, is gonna throw.
+ MachineInstr *ThrowingCall = nullptr;
+ if (MBB.isPredecessor(Header)) {
+ auto TermPos = Header->getFirstTerminator();
+ if (TermPos == Header->end() ||
+ TermPos->getOpcode() != WebAssembly::RETHROW) {
+ for (auto &MI : reverse(*Header)) {
+ if (MI.isCall()) {
+ AfterSet.insert(&MI);
+ ThrowingCall = &MI;
+ // Possibly throwing calls are usually wrapped by EH_LABEL
+ // instructions. We don't want to split them and the call.
+ if (MI.getIterator() != Header->begin() &&
+ std::prev(MI.getIterator())->isEHLabel()) {
+ AfterSet.insert(&*std::prev(MI.getIterator()));
+ ThrowingCall = &*std::prev(MI.getIterator());
+ }
+ break;
+ }
+ }
+ }
+ }
+
+ // Local expression tree should go after the TRY_TABLE.
+ // For BLOCK placement, we start the search from the previous instruction of a
+ // BB's terminator, but in TRY_TABLE's case, we should start from the previous
+ // instruction of a call that can throw, or a EH_LABEL that precedes the call,
+ // because the return values of the call's previous instructions can be
+ // stackified and consumed by the throwing call.
+ auto SearchStartPt = ThrowingCall ? MachineBasicBlock::iterator(ThrowingCall)
+ : Header->getFirstTerminator();
+ for (auto I = SearchStartPt, E = Header->begin(); I != E; --I) {
+ if (std::prev(I)->isDebugInstr() || std::prev(I)->isPosition())
+ continue;
+ if (WebAssembly::isChild(*std::prev(I), MFI))
+ AfterSet.insert(&*std::prev(I));
+ else
+ break;
+ }
+
+ // Add the TRY_TABLE and a BLOCK for the catch destination. We currently
+ // generate only one CATCH clause for a TRY_TABLE, so we need one BLOCK for
+ // its destination.
+ //
+ // Header:
+ // block
+ // try_table (catch ... $MBB)
+ // ...
+ //
+ // MBB:
+ // end_try_table
+ // end_block ...
[truncated]
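The backward search for the throwing call in `placeTryTableMarker` above can be illustrated with a small standalone sketch. It models a basic block as a list of instruction kinds rather than `MachineInstr`s; `Kind` and `findThrowingCallStart` are hypothetical names, and the special case for a `RETHROW` terminator is deliberately left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds.
enum class Kind { Other, EHLabel, Call };

// Scans the block from the end for the last call. Returns the index of that
// call, or of the EH_LABEL immediately before it so the label and the call
// are not split. Returns -1 when the block contains no call.
int findThrowingCallStart(const std::vector<Kind> &Block) {
  for (std::size_t I = Block.size(); I-- > 0;) {
    if (Block[I] != Kind::Call)
      continue;
    if (I > 0 && Block[I - 1] == Kind::EHLabel)
      return static_cast<int>(I - 1);
    return static_cast<int>(I);
  }
  return -1;
}
```

For a block `{Other, EHLabel, Call, Other}` the sketch returns index 1 (the label), so everything from the label onward would be treated as belonging after the `TRY_TABLE` marker.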
@@ -147,6 +178,8 @@ let isTerminator = 1, hasSideEffects = 1, isBarrier = 1, hasCtrlDep = 1,
// usage gets low enough.
// Rethrowing an exception: rethrow
// The new exnref proposal also uses this instruction as an interim pseudo
Is this just because we want to be able to support both the old and new proposals for now, or is there a more fundamental reason why it's better to use rethrow early in the pipeline and then convert it?
No, it happens to be more convenient too. Until we match the instructions (the `llvm.wasm.rethrow` intrinsic or `cleanupret`) that later become `throw_ref` with the corresponding EH pad and its `catch_***` instruction, which happens in LateEHPrepare, we need a way to express "rethrow". If we really want to avoid the use of the pseudo `RETHROW`, we have to do this matching in isel, which I think will be very inconvenient.
If we want to be strict, maybe we can rename the old rethrow to `RETHROW_LEGACY` or something too and add a new pseudo `RETHROW`, but I didn't feel it was very necessary given that this is not very confusing. (In `CATCH`/`CATCH_ALL`'s case it was confusing because in CFGStackify or MCInstLower I had to check `WebAssembly::WasmEnableExnref` many times to check whether the current instruction was the legacy `CATCH` or the new pseudo `CATCH`.)
I think I see; before LateEHPrepare, there is no exnref that comes out of the catch, so it doesn't make sense to have a throw_ref either. So instead we have rethrow, which has semantics that are meaningful on their own in this phase of compilation even though they don't correspond to what we need at the end.
And yeah, given that rethrow is a meaningful concept independent of the legacy spec, I don't think we need to try to keep it labeled as legacy permanently.
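As a rough illustration of the conversion discussed in this thread, the following standalone sketch models one EH pad as a flat list of opcodes. This is a simplification and an assumption: the real LateEHPrepare pass works on `MachineInstr`s and has to match each rethrow with its EH pad, which the sketch takes as already done. `Op` and `convertForExnref` are hypothetical names.

```cpp
#include <vector>

// Hypothetical, simplified opcodes.
enum class Op { Catch, CatchRef, CatchAll, CatchAllRef, Rethrow, ThrowRef, Other };

// Models a single EH pad whose RETHROWs (if any) rethrow the exception caught
// by this pad's catch. If a RETHROW is present, the catch has to produce an
// exnref, so CATCH / CATCH_ALL become CATCH_REF / CATCH_ALL_REF, and each
// RETHROW becomes THROW_REF.
void convertForExnref(std::vector<Op> &Pad) {
  bool HasRethrow = false;
  for (Op O : Pad)
    if (O == Op::Rethrow)
      HasRethrow = true;
  if (!HasRethrow)
    return;

  for (Op &O : Pad) {
    if (O == Op::Catch)
      O = Op::CatchRef;
    else if (O == Op::CatchAll)
      O = Op::CatchAllRef;
    else if (O == Op::Rethrow)
      O = Op::ThrowRef;
  }
}
```

A pad without any `Rethrow` is left untouched, which mirrors the point above that the `_REF` variants are only needed when the caught exception is rethrown.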
Catch->eraseFromParent();
for (auto *Rethrow : Rethrows) {
auto InsertPos = Rethrow->getIterator()++;
The use of post-increment on the getter seems a little bit confusing to me (since it seems like it's being used on an rvalue rather than on a variable that is still live after the statement). Is this equivalent to `getIterator()->getNextNode()` like you used above?
`getNextNode` does not work here because it is very likely that there is no next node, because `RETHROW` is a terminator. But I can't just do `Rethrow->getParent()->end()` because there can be other non-functional instructions like `DBG_VALUE` after that.
Switched to `std::next(Rethrow->getIterator())`, given that this seems safer and is also frequently used in the codebase.
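For readers following the iterator question, here is a small standalone sketch using `std::list` as a stand-in for an instruction list. It only demonstrates standard C++ iterator behavior; whether LLVM's own iterators behave the same way is an assumption here, and `viaPostIncrement` / `viaStdNext` are hypothetical helper names. Post-increment on a temporary returns the old, un-advanced position, while `std::next` returns the following one.

```cpp
#include <iterator>
#include <list>

// Post-increment on a temporary iterator: the result is a copy of the
// iterator *before* the increment, so this reads the first element.
int viaPostIncrement(std::list<int> &L) {
  auto It = L.begin()++;
  return *It;
}

// std::next returns an iterator advanced by one, so this reads the
// second element. The list must have at least two elements.
int viaStdNext(std::list<int> &L) {
  auto It = std::next(L.begin());
  return *It;
}
```

For a list `{10, 20, 30}`, `viaPostIncrement` returns 10 and `viaStdNext` returns 20, which is why the `std::next` form states the intent more clearly.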
// Local expression tree should go after the TRY_TABLE.
// For BLOCK placement, we start the search from the previous instruction of a
// BB's terminator, but in TRY_TABLE's case, we should start from the previous
// instruction of a call that can throw, or a EH_LABEL that precedes the call,
I thought that throwing calls were also terminators. Is this not the case?
The `isTerminator` property checked by `MachineBasicBlock::getFirstTerminator`
llvm-project/llvm/lib/CodeGen/MachineBasicBlock.cpp
Lines 242 to 249 in 0f56ba1
MachineBasicBlock::iterator MachineBasicBlock::getFirstTerminator() {
  iterator B = begin(), E = end(), I = E;
  while (I != B && ((--I)->isTerminator() || I->isDebugInstr()))
    ; /*noop */
  while (I != E && !I->isTerminator())
    ++I;
  return I;
}
is this:
bit isTerminator = false; // Is this part of the terminator for a basic block?
And this is set only on branches, returns, and return calls. So calls don't have that property set.
This is basically copied from the existing `placeTryMarker`:
llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
Lines 582 to 587 in 0f56ba1
// Local expression tree should go after the TRY.
// For BLOCK placement, we start the search from the previous instruction of a
// BB's terminator, but in TRY's case, we should start from the previous
// instruction of a call that can throw, or a EH_LABEL that precedes the call,
// because the return values of the call's previous instructions can be
// stackified and consumed by the throwing call.
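The two-loop structure of `getFirstTerminator` quoted above can be tried out in isolation with the following index-based sketch. `Instr` and `firstTerminator` are hypothetical stand-ins, not LLVM types; the sketch assumes, as stated in the reply, that a call does not carry the terminator flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction with the flags the search looks at.
struct Instr {
  bool IsTerminator;
  bool IsDebug;
  bool IsCall;
};

// Returns the index of the first terminator, or BB.size() if there is none.
std::size_t firstTerminator(const std::vector<Instr> &BB) {
  std::size_t B = 0, E = BB.size(), I = E;
  // Walk backwards over the trailing run of terminators and debug instrs.
  while (I != B && (BB[I - 1].IsTerminator || BB[I - 1].IsDebug))
    --I;
  // Skip forward over debug instructions to the first real terminator.
  while (I != E && !BB[I].IsTerminator)
    ++I;
  return I;
}
```

A block that contains only a call yields `BB.size()`, i.e. "no terminator", so a search that starts from the first terminator would not find the throwing call on its own; this is consistent with the comment about starting from the call instead.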
Ah right, I guess at this point in the pipeline we don't have a distinction between call and invoke the way LLVM IR does.