Skip to content

[GlobalISel] Combiner: Observer-based DCE and retrying of combines #102163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions llvm/include/llvm/CodeGen/GlobalISel/Combiner.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,13 +15,13 @@
#ifndef LLVM_CODEGEN_GLOBALISEL_COMBINER_H
#define LLVM_CODEGEN_GLOBALISEL_COMBINER_H

#include "llvm/CodeGen/GlobalISel/CombinerInfo.h"
#include "llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h"
#include "llvm/CodeGen/GlobalISel/GISelChangeObserver.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"

namespace llvm {
class MachineRegisterInfo;
struct CombinerInfo;
class GISelCSEInfo;
class TargetPassConfig;
class MachineFunction;
Expand All @@ -33,8 +33,12 @@ class MachineIRBuilder;
/// TODO: Is it worth making this module-wide?
class Combiner : public GIMatchTableExecutor {
private:
using WorkListTy = GISelWorkList<512>;

class WorkListMaintainer;
GISelWorkList<512> WorkList;
template <CombinerInfo::ObserverLevel Lvl> class WorkListMaintainerImpl;

WorkListTy WorkList;

// We have a little hack here where keep the owned pointers private, and only
// expose a reference. This has two purposes:
Expand All @@ -48,6 +52,8 @@ class Combiner : public GIMatchTableExecutor {

bool HasSetupMF = false;

static bool tryDCE(MachineInstr &MI, MachineRegisterInfo &MRI);

public:
/// If CSEInfo is not null, then the Combiner will use CSEInfo as the observer
/// and also create a CSEMIRBuilder. Pass nullptr if CSE is not needed.
Expand Down
25 changes: 25 additions & 0 deletions llvm/include/llvm/CodeGen/GlobalISel/CombinerInfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,31 @@ struct CombinerInfo {
/// The maximum number of times the Combiner will iterate over the
/// MachineFunction. Setting this to 0 enables fixed-point iteration.
unsigned MaxIterations = 0;

enum class ObserverLevel {
/// Only retry combining created/changed instructions.
/// This replicates the legacy default Observer behavior for use with
/// fixed-point iteration.
Basic,
/// Enables Observer-based detection of dead instructions. This can save
/// some compile-time if full disabling of fixed-point iteration is not
/// desired. If the input IR doesn't contain dead instructions, consider
/// disabling \p EnableFullDCE.
DCE,
/// Enables Observer-based DCE and additional heuristics that retry
/// combining defined and used instructions of modified instructions.
/// This provides a good balance between compile-time and completeness of
/// combining without needing fixed-point iteration.
SinglePass,
};

/// Select how the Combiner acts on MIR changes.
ObserverLevel ObserverLvl = ObserverLevel::Basic;

/// Whether dead code elimination is performed before each Combiner iteration.
/// If Observer-based DCE is enabled, this controls if a full DCE pass is
/// performed before the first Combiner iteration.
bool EnableFullDCE = true;
};
} // namespace llvm

Expand Down
219 changes: 181 additions & 38 deletions llvm/lib/CodeGen/GlobalISel/Combiner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,61 +45,190 @@ cl::OptionCategory GICombinerOptionCategory(
);
} // end namespace llvm

/// This class acts as the glue the joins the CombinerHelper to the overall
/// This class acts as the glue that joins the CombinerHelper to the overall
/// Combine algorithm. The CombinerHelper is intended to report the
/// modifications it makes to the MIR to the GISelChangeObserver and the
/// observer subclass will act on these events. In this case, instruction
/// erasure will cancel any future visits to the erased instruction and
/// instruction creation will schedule that instruction for a future visit.
/// Other Combiner implementations may require more complex behaviour from
/// their GISelChangeObserver subclass.
/// observer subclass will act on these events.
class Combiner::WorkListMaintainer : public GISelChangeObserver {
using WorkListTy = GISelWorkList<512>;
WorkListTy &WorkList;
protected:
#ifndef NDEBUG
Copy link
Contributor

@arsenm arsenm Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need LLVM_ENABLE_ABI_BREAKING_CHECKS check?

Copy link
Contributor Author

@tobias-stadler tobias-stadler Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but I don't think it's needed in this case, because you can't get the complete type of WorkListMaintainer outside of Combiner.cpp. Combiner.h only contains the declaration of the class. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought this was in the header

/// The instructions that have been created but we want to report once they
/// have their operands. This is only maintained if debug output is requested.
#ifndef NDEBUG
SetVector<const MachineInstr *> CreatedInstrs;
SmallSetVector<const MachineInstr *, 32> CreatedInstrs;
#endif
using Level = CombinerInfo::ObserverLevel;

public:
WorkListMaintainer(WorkListTy &WorkList) : WorkList(WorkList) {}
static std::unique_ptr<WorkListMaintainer>
create(Level Lvl, WorkListTy &WorkList, MachineRegisterInfo &MRI);

virtual ~WorkListMaintainer() = default;

void reportFullyCreatedInstrs() {
LLVM_DEBUG({
for (auto *MI : CreatedInstrs) {
dbgs() << "Created: " << *MI;
}
CreatedInstrs.clear();
});
}

virtual void reset() = 0;
virtual void appliedCombine() = 0;
};

/// A configurable WorkListMaintainer implementation.
/// The ObserverLevel determines how the WorkListMaintainer reacts to MIR
/// changes.
template <CombinerInfo::ObserverLevel Lvl>
class Combiner::WorkListMaintainerImpl : public Combiner::WorkListMaintainer {
WorkListTy &WorkList;
MachineRegisterInfo &MRI;

// Defer handling these instructions until the combine finishes.
SmallSetVector<MachineInstr *, 32> DeferList;

// Track VRegs that (might) have lost a use.
SmallSetVector<Register, 32> LostUses;

public:
WorkListMaintainerImpl(WorkListTy &WorkList, MachineRegisterInfo &MRI)
: WorkList(WorkList), MRI(MRI) {}

virtual ~WorkListMaintainerImpl() = default;

void reset() override {
DeferList.clear();
LostUses.clear();
}

void erasingInstr(MachineInstr &MI) override {
LLVM_DEBUG(dbgs() << "Erasing: " << MI << "\n");
// MI will become dangling, remove it from all lists.
LLVM_DEBUG(dbgs() << "Erasing: " << MI; CreatedInstrs.remove(&MI));
WorkList.remove(&MI);
if constexpr (Lvl != Level::Basic) {
DeferList.remove(&MI);
noteLostUses(MI);
}
}

void createdInstr(MachineInstr &MI) override {
LLVM_DEBUG(dbgs() << "Creating: " << MI << "\n");
WorkList.insert(&MI);
LLVM_DEBUG(CreatedInstrs.insert(&MI));
LLVM_DEBUG(dbgs() << "Creating: " << MI; CreatedInstrs.insert(&MI));
if constexpr (Lvl == Level::Basic)
WorkList.insert(&MI);
else
// Defer handling newly created instructions, because they don't have
// operands yet. We also insert them into the WorkList in reverse
// order so that they will be combined top down.
DeferList.insert(&MI);
}

void changingInstr(MachineInstr &MI) override {
LLVM_DEBUG(dbgs() << "Changing: " << MI << "\n");
WorkList.insert(&MI);
LLVM_DEBUG(dbgs() << "Changing: " << MI);
// Some uses might get dropped when MI is changed.
// For now, overapproximate by assuming all uses will be dropped.
// TODO: Is a more precise heuristic or manual tracking of use count
// decrements worth it?
if constexpr (Lvl != Level::Basic)
noteLostUses(MI);
}

void changedInstr(MachineInstr &MI) override {
LLVM_DEBUG(dbgs() << "Changed: " << MI << "\n");
WorkList.insert(&MI);
LLVM_DEBUG(dbgs() << "Changed: " << MI);
if constexpr (Lvl == Level::Basic)
WorkList.insert(&MI);
else
// Defer this for DCE
DeferList.insert(&MI);
}

void reportFullyCreatedInstrs() {
LLVM_DEBUG(for (const auto *MI
: CreatedInstrs) {
dbgs() << "Created: ";
MI->print(dbgs());
});
LLVM_DEBUG(CreatedInstrs.clear());
// Only track changes during the combine and then walk the def/use-chains once
// the combine is finished, because:
// - instructions might have multiple defs during the combine.
// - use counts aren't accurate during the combine.
void appliedCombine() override {
if constexpr (Lvl == Level::Basic)
return;

// DCE deferred instructions and add them to the WorkList bottom up.
while (!DeferList.empty()) {
MachineInstr &MI = *DeferList.pop_back_val();
if (tryDCE(MI, MRI))
continue;

if constexpr (Lvl >= Level::SinglePass)
addUsersToWorkList(MI);

WorkList.insert(&MI);
}

// Handle instructions that have lost a user.
while (!LostUses.empty()) {
Register Use = LostUses.pop_back_val();
MachineInstr *UseMI = MRI.getVRegDef(Use);
if (!UseMI)
continue;

// If DCE succeeds, UseMI's uses are added back to LostUses by
// erasingInstr.
if (tryDCE(*UseMI, MRI))
continue;

if constexpr (Lvl >= Level::SinglePass) {
// OneUse checks are relatively common, so we might be able to combine
// the single remaining user of this Reg.
if (MRI.hasOneNonDBGUser(Use))
WorkList.insert(&*MRI.use_instr_nodbg_begin(Use));

WorkList.insert(UseMI);
}
}
}

void noteLostUses(MachineInstr &MI) {
for (auto &Use : MI.explicit_uses()) {
if (!Use.isReg() || !Use.getReg().isVirtual())
continue;
LostUses.insert(Use.getReg());
}
}

void addUsersToWorkList(MachineInstr &MI) {
for (auto &Def : MI.defs()) {
Register DefReg = Def.getReg();
if (!DefReg.isVirtual())
continue;
for (auto &UseMI : MRI.use_nodbg_instructions(DefReg)) {
WorkList.insert(&UseMI);
}
}
}
};

std::unique_ptr<Combiner::WorkListMaintainer>
Combiner::WorkListMaintainer::create(Level Lvl, WorkListTy &WorkList,
MachineRegisterInfo &MRI) {
switch (Lvl) {
case Level::Basic:
return std::make_unique<WorkListMaintainerImpl<Level::Basic>>(WorkList,
MRI);
case Level::DCE:
return std::make_unique<WorkListMaintainerImpl<Level::DCE>>(WorkList, MRI);
case Level::SinglePass:
return std::make_unique<WorkListMaintainerImpl<Level::SinglePass>>(WorkList,
MRI);
default:
llvm_unreachable("Illegal ObserverLevel");
}
}

Combiner::Combiner(MachineFunction &MF, CombinerInfo &CInfo,
const TargetPassConfig *TPC, GISelKnownBits *KB,
GISelCSEInfo *CSEInfo)
: Builder(CSEInfo ? std::make_unique<CSEMIRBuilder>()
: std::make_unique<MachineIRBuilder>()),
WLObserver(std::make_unique<WorkListMaintainer>(WorkList)),
WLObserver(WorkListMaintainer::create(CInfo.ObserverLvl, WorkList,
MF.getRegInfo())),
ObserverWrapper(std::make_unique<GISelObserverWrapper>()), CInfo(CInfo),
Observer(*ObserverWrapper), B(*Builder), MF(MF), MRI(MF.getRegInfo()),
KB(KB), TPC(TPC), CSEInfo(CSEInfo) {
Expand All @@ -115,6 +244,15 @@ Combiner::Combiner(MachineFunction &MF, CombinerInfo &CInfo,

Combiner::~Combiner() = default;

bool Combiner::tryDCE(MachineInstr &MI, MachineRegisterInfo &MRI) {
if (!isTriviallyDead(MI, MRI))
return false;
LLVM_DEBUG(dbgs() << "Dead: " << MI);
llvm::salvageDebugInfo(MRI, MI);
MI.eraseFromParent();
return true;
}

bool Combiner::combineMachineInstrs() {
// If the ISel pipeline failed, do not bother running this pass.
// FIXME: Should this be here or in individual combiner passes.
Expand All @@ -141,27 +279,29 @@ bool Combiner::combineMachineInstrs() {
++Iteration;
LLVM_DEBUG(dbgs() << "\n\nCombiner iteration #" << Iteration << '\n');

Changed = false;
WorkList.clear();
WLObserver->reset();
ObserverWrapper->clearObservers();
if (CSEInfo)
ObserverWrapper->addObserver(CSEInfo);

// If Observer-based DCE is enabled, perform full DCE only before the first
// iteration.
bool EnableDCE = CInfo.ObserverLvl >= CombinerInfo::ObserverLevel::DCE
? CInfo.EnableFullDCE && Iteration == 1
: CInfo.EnableFullDCE;

// Collect all instructions. Do a post order traversal for basic blocks and
// insert with list bottom up, so while we pop_back_val, we'll traverse top
// down RPOT.
Changed = false;

RAIIMFObsDelInstaller DelInstall(MF, *ObserverWrapper);
for (MachineBasicBlock *MBB : post_order(&MF)) {
for (MachineInstr &CurMI :
llvm::make_early_inc_range(llvm::reverse(*MBB))) {
// Erase dead insts before even adding to the list.
if (isTriviallyDead(CurMI, MRI)) {
LLVM_DEBUG(dbgs() << CurMI << "Is dead; erasing.\n");
llvm::salvageDebugInfo(MRI, CurMI);
CurMI.eraseFromParent();
if (EnableDCE && tryDCE(CurMI, MRI))
continue;
}
WorkList.deferred_insert(&CurMI);
}
}
Expand All @@ -171,10 +311,13 @@ bool Combiner::combineMachineInstrs() {
ObserverWrapper->addObserver(WLObserver.get());
// Main Loop. Process the instructions here.
while (!WorkList.empty()) {
MachineInstr *CurrInst = WorkList.pop_back_val();
LLVM_DEBUG(dbgs() << "\nTry combining " << *CurrInst;);
Changed |= tryCombineAll(*CurrInst);
WLObserver->reportFullyCreatedInstrs();
MachineInstr &CurrInst = *WorkList.pop_back_val();
LLVM_DEBUG(dbgs() << "\nTry combining " << CurrInst);
bool AppliedCombine = tryCombineAll(CurrInst);
LLVM_DEBUG(WLObserver->reportFullyCreatedInstrs());
Changed |= AppliedCombine;
if (AppliedCombine)
WLObserver->appliedCombine();
}
MFChanged |= Changed;

Expand Down
Loading