[clang] Redefine noconvergent and generate convergence control tokens #136282

Open
wants to merge 1 commit into
base: users/ssahasra/clang-convergence-spec
27 changes: 27 additions & 0 deletions clang/docs/ThreadConvergence.rst
Expand Up @@ -564,6 +564,33 @@ backwards ``goto`` instead of a ``while`` statement.
``outside_loop``. This includes threads that jumped from ``G2`` as well as
threads that reached ``outside_loop`` after executing ``C``.

.. _noconvergent-statement:

The ``noconvergent`` Statement
==============================

When a statement is marked as ``noconvergent``, the convergence of threads at
the start of this statement is not constrained by any convergent operations
inside the statement.

- When two threads execute a statement marked ``noconvergent``, it is
implementation-defined whether they are converged at that execution. [Note:
The resulting evaluations must still satisfy the strict partial order imposed
by convergence-before.]
- When two threads are converged at the start of this statement (as determined
by the implementation), whether they are converged at each convergent
operation inside this statement is determined by the usual rules.

For every label statement ``L`` occurring inside a ``noconvergent``
statement, every ``goto`` or ``switch`` statement that transfers control to
``L`` must also occur inside that statement.

.. note::

Convergence control tokens are necessary for correctly implementing the
``noconvergent`` statement attribute. When tokens are not in use, the legacy
behaviour is retained: the only effect of this attribute is that inline
``asm`` calls within the statement are not treated as convergent operations.

Implementation-defined Convergence
==================================
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Analysis/Analyses/ConvergenceCheck.h
Expand Up @@ -18,7 +18,8 @@ class AnalysisDeclContext;
class Sema;
class Stmt;

void analyzeForConvergence(Sema &S, AnalysisDeclContext &AC);
void analyzeForConvergence(Sema &S, AnalysisDeclContext &AC,
bool GenerateWarnings, bool GenerateTokens);

} // end namespace clang

Expand Down
15 changes: 9 additions & 6 deletions clang/include/clang/Basic/AttrDocs.td
Expand Up @@ -1700,13 +1700,12 @@ def NoConvergentDocs : Documentation {
This attribute prevents a function from being treated as convergent; when a
function is marked ``noconvergent``, calls to that function are not
automatically assumed to be convergent, unless such calls are explicitly marked
as ``convergent``. If a statement is marked as ``noconvergent``, any calls to
inline ``asm`` in that statement are no longer treated as convergent.
as ``convergent``.

In languages following SPMD/SIMT programming model, e.g., CUDA/HIP, function
declarations and inline asm calls are treated as convergent by default for
correctness. This ``noconvergent`` attribute is helpful for developers to
prevent them from being treated as convergent when it's safe.
If a statement is marked as ``noconvergent``, the semantics depend on whether
convergence control tokens are used in the generated LLVM IR. When convergence
control tokens are not in use, any calls to inline ``asm`` in that statement are
treated as not convergent.

.. code-block:: c

Expand All @@ -1719,6 +1718,10 @@ prevent them from being treated as convergent when it's safe.
[[clang::noconvergent]] { asm volatile ("nop"); } // the asm call is non-convergent
}

When tokens are in use, placing the ``noconvergent`` attribute on a statement
indicates that thread convergence at the entry to that statement is
:ref:`implementation-defined<noconvergent-statement>`.

}];
}

Expand Down
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Expand Up @@ -6514,6 +6514,8 @@ def note_goto_affects_convergence : Note<
"jump from this goto statement affects convergence">;
def note_switch_case_affects_convergence : Note<
"jump to this case statement affects convergence of loop">;
def err_jump_into_noconvergent : Error<
"cannot jump into a noconvergent statement from outside">;
def err_goto_into_protected_scope : Error<
"cannot jump from this goto statement to its label">;
def ext_goto_into_protected_scope : ExtWarn<
Expand Down
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/LangOptions.def
Expand Up @@ -306,6 +306,8 @@ LANGOPT(HIPUseNewLaunchAPI, 1, 0, "Use new kernel launching API for HIP")
LANGOPT(OffloadUniformBlock, 1, 0, "Assume that kernels are launched with uniform block sizes (default true for CUDA/HIP and false otherwise)")
LANGOPT(HIPStdPar, 1, 0, "Enable Standard Parallel Algorithm Acceleration for HIP (experimental)")
LANGOPT(HIPStdParInterposeAlloc, 1, 0, "Replace allocations / deallocations with HIP RT calls when Standard Parallel Algorithm Acceleration for HIP is enabled (Experimental)")
LANGOPT(ConvergenceControl, 1, 0,
"Generate explicit convergence control (experimental)")

LANGOPT(OpenACC , 1, 0, "OpenACC Enabled")

Expand Down
5 changes: 5 additions & 0 deletions clang/include/clang/Driver/Options.td
Expand Up @@ -1397,6 +1397,11 @@ def fhip_emit_relocatable : Flag<["-"], "fhip-emit-relocatable">,
HelpText<"Compile HIP source to relocatable">;
def fno_hip_emit_relocatable : Flag<["-"], "fno-hip-emit-relocatable">,
HelpText<"Do not override toolchain to compile HIP source to relocatable">;
defm convergence_control : BoolFOption<"convergence-control",
LangOpts<"ConvergenceControl">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption, CC1Option], "Generate">,
NegFlag<SetFalse, [], [ClangOption], "Don't generate">,
BothFlags<[], [ClangOption], " explicit convergence control tokens (experimental)">>;
}

// Clang specific/exclusive options for OpenACC.
Expand Down
43 changes: 33 additions & 10 deletions clang/lib/Analysis/ConvergenceCheck.cpp
Expand Up @@ -16,6 +16,11 @@
using namespace clang;
using namespace llvm;

static void errorJumpIntoNoConvergent(Sema &S, Stmt *From, Stmt *Parent) {
S.Diag(Parent->getBeginLoc(), diag::err_jump_into_noconvergent);
S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
}

static void warnGotoCycle(Sema &S, Stmt *From, Stmt *Parent) {
S.Diag(Parent->getBeginLoc(),
diag::warn_cycle_created_by_goto_affects_convergence);
Expand All @@ -27,7 +32,8 @@ static void warnJumpIntoLoop(Sema &S, Stmt *From, Stmt *Loop) {
S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
}

static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM,
bool GenerateWarnings, bool GenerateTokens) {
Stmt *To = From->getLabel()->getStmt();

unsigned ToDepth = PM.getParentDepth(To) + 1;
Expand All @@ -42,7 +48,7 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
}

// Special case: the goto statement is a descendant of the label statement.
if (ExpandedFrom == ExpandedTo) {
if (GenerateWarnings && ExpandedFrom == ExpandedTo) {
assert(ExpandedTo == To);
warnGotoCycle(S, From, To);
return;
Expand All @@ -60,10 +66,18 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {

SmallVector<Stmt *> Loops;
for (Stmt *I = To; I != ParentFrom; I = PM.getParent(I)) {
if (GenerateTokens)
if (const auto *AS = dyn_cast<AttributedStmt>(I))
if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
errorJumpIntoNoConvergent(S, From, I);
// Can't jump into a ranged-for, so we don't need to look for it here.
if (isa<ForStmt, WhileStmt, DoStmt>(I))
if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
Loops.push_back(I);
}

if (!GenerateWarnings)
return;

for (Stmt *I : reverse(Loops))
warnJumpIntoLoop(S, From, I);

Expand All @@ -88,21 +102,29 @@ static void warnSwitchIntoLoop(Sema &S, Stmt *Case, Stmt *Loop) {
}

static void checkConvergenceForSwitch(Sema &S, SwitchStmt *Switch,
ParentMap &PM) {
ParentMap &PM, bool GenerateWarnings,
bool GenerateTokens) {
for (SwitchCase *Case = Switch->getSwitchCaseList(); Case;
Case = Case->getNextSwitchCase()) {
SmallVector<Stmt *> Loops;
for (Stmt *I = Case; I != Switch; I = PM.getParent(I)) {
if (GenerateTokens)
if (const auto *AS = dyn_cast<AttributedStmt>(I))
if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
errorJumpIntoNoConvergent(S, Switch, I);
// Can't jump into a ranged-for, so we don't need to look for it here.
if (isa<ForStmt, WhileStmt, DoStmt>(I))
if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
Loops.push_back(I);
}
for (Stmt *I : reverse(Loops))
warnSwitchIntoLoop(S, Case, I);
if (GenerateWarnings) {
for (Stmt *I : reverse(Loops))
warnSwitchIntoLoop(S, Case, I);
}
}
}

void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC,
bool GenerateWarnings, bool GenerateTokens) {
// Iterating over the CFG helps trim unreachable blocks, and locates Goto
// statements faster than iterating over the whole body.
CFG *cfg = AC.getCFG();
Expand All @@ -111,9 +133,10 @@ void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
for (CFGBlock *BI : *cfg) {
Stmt *Term = BI->getTerminatorStmt();
if (GotoStmt *Goto = dyn_cast_or_null<GotoStmt>(Term)) {
checkConvergenceOnGoto(S, Goto, PM);
checkConvergenceOnGoto(S, Goto, PM, GenerateWarnings, GenerateTokens);
} else if (SwitchStmt *Switch = dyn_cast_or_null<SwitchStmt>(Term)) {
checkConvergenceForSwitch(S, Switch, PM);
checkConvergenceForSwitch(S, Switch, PM, GenerateWarnings,
GenerateTokens);
}
}
}
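The ancestor walk that this file uses to reject jumps into a ``noconvergent`` statement can be sketched outside Clang as follows. This is a simplified model under stated assumptions, not the patch's code: `Node`, `depth`, `commonAncestor` and `enteredNoConvergent` are hypothetical names, a plain parent pointer stands in for `ParentMap`, and a boolean stands in for an `AttributedStmt` carrying `NoConvergentAttr`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a statement node; not a Clang type.
struct Node {
  const Node *parent = nullptr; // enclosing statement, nullptr at the root
  bool noconvergent = false;    // models a statement marked noconvergent
};

// Number of ancestors between n and the root (the root has depth 0).
unsigned depth(const Node *n) {
  unsigned d = 0;
  for (; n->parent; n = n->parent)
    ++d;
  return d;
}

// Nearest statement that encloses both a and b. Assumes both nodes belong
// to the same tree; either node may itself be the result.
const Node *commonAncestor(const Node *a, const Node *b) {
  assert(a && b && "both nodes must be non-null");
  unsigned da = depth(a), db = depth(b);
  while (da > db) { a = a->parent; --da; }
  while (db > da) { b = b->parent; --db; }
  while (a != b) { a = a->parent; b = b->parent; }
  return a;
}

// Returns every noconvergent statement that a jump from `from` to `to`
// would enter from outside, innermost first. A statement that encloses
// both ends of the jump is never reported.
std::vector<const Node *> enteredNoConvergent(const Node *from,
                                              const Node *to) {
  std::vector<const Node *> entered;
  const Node *stop = commonAncestor(from, to);
  for (const Node *i = to; i != stop; i = i->parent)
    if (i->noconvergent)
      entered.push_back(i);
  return entered;
}
```

In this model a jump whose source and target both sit inside the same ``noconvergent`` statement yields an empty result, while a jump from a sibling statement reports the statement being entered, which corresponds to the case where the patch emits `err_jump_into_noconvergent`.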
8 changes: 7 additions & 1 deletion clang/lib/CodeGen/CGCall.cpp
Expand Up @@ -5773,7 +5773,13 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
Attrs =
Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline);

// Remove call-site convergent attribute if requested.
// Remove call-site convergent attribute if this call occurs inside a
// noconvergent statement. This is the legacy behaviour when convergence
// control tokens are not in use. It only affects inline asm calls, since all
// other function calls inherit the convergent attribute from the callee. When
// convergence control tokens are in use, any inline asm calls should be
// explicitly marked noconvergent, else they simply inherit whatever token is
// currently in scope.
if (InNoConvergentAttributedStmt)
Attrs =
Attrs.removeFnAttribute(getLLVMContext(), llvm::Attribute::Convergent);
Expand Down
44 changes: 31 additions & 13 deletions clang/lib/CodeGen/CGStmt.cpp
Expand Up @@ -829,14 +829,24 @@ void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) {
} break;
}
}
bool LegacyNoConvergent = noconvergent && !CGM.shouldEmitConvergenceTokens();
SaveAndRestore save_nomerge(InNoMergeAttributedStmt, nomerge);
SaveAndRestore save_noinline(InNoInlineAttributedStmt, noinline);
SaveAndRestore save_alwaysinline(InAlwaysInlineAttributedStmt, alwaysinline);
SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt, noconvergent);
SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt,
LegacyNoConvergent);
SaveAndRestore save_musttail(MustTailCall, musttail);
SaveAndRestore save_flattenOrBranch(HLSLControlFlowAttr, flattenOrBranch);
CGAtomicOptionsRAII AORAII(CGM, AA);
if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
EmitBlock(createBasicBlock("noconvergent.anchor"));
ConvergenceTokenStack.push_back(
emitConvergenceAnchorToken(Builder.GetInsertBlock()));
}
EmitStmt(S.getSubStmt(), S.getAttrs());
if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
ConvergenceTokenStack.pop_back();
}
}

void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) {
Expand Down Expand Up @@ -3317,16 +3327,6 @@ CodeGenFunction::GenerateCapturedStmtFunction(const CapturedStmt &S) {
return F;
}

// Returns the first convergence entry/loop/anchor instruction found in |BB|.
// std::nullptr otherwise.
static llvm::ConvergenceControlInst *getConvergenceToken(llvm::BasicBlock *BB) {
for (auto &I : *BB) {
if (auto *CI = dyn_cast<llvm::ConvergenceControlInst>(&I))
return CI;
}
return nullptr;
}

llvm::CallBase *
CodeGenFunction::addConvergenceControlToken(llvm::CallBase *Input) {
llvm::ConvergenceControlInst *ParentToken = ConvergenceTokenStack.back();
Expand All @@ -3348,15 +3348,33 @@ CodeGenFunction::emitConvergenceLoopToken(llvm::BasicBlock *BB) {
return llvm::ConvergenceControlInst::CreateLoop(*BB, ParentToken);
}

llvm::ConvergenceControlInst *
CodeGenFunction::emitConvergenceAnchorToken(llvm::BasicBlock *BB) {
return llvm::ConvergenceControlInst::CreateAnchor(*BB);
}

llvm::ConvergenceControlInst *
CodeGenFunction::getOrEmitConvergenceEntryToken(llvm::Function *F) {
llvm::BasicBlock *BB = &F->getEntryBlock();
llvm::ConvergenceControlInst *Token = getConvergenceToken(BB);
llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
if (Token)
return Token;

// Adding a convergence token requires the function to be marked as
// Adding a convergence entry token requires the function to be marked as
// convergent.
F->setConvergent();
return llvm::ConvergenceControlInst::CreateEntry(*BB);
}

llvm::ConvergenceControlInst *
CodeGenFunction::getOrEmitConvergenceAnchorToken(llvm::Function *F) {
llvm::BasicBlock *BB = &F->getEntryBlock();
llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
if (Token)
return Token;

// Adding a convergence anchor token requires the function to be marked as
// not convergent.
F->setNotConvergent();
return llvm::ConvergenceControlInst::CreateAnchor(*BB);
}
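The push/pop discipline that these hunks apply to `ConvergenceTokenStack` can be illustrated with a small standalone model. It is only a sketch: `TokenStackModel` and `StmtScope` are hypothetical names, strings stand in for `llvm::ConvergenceControlInst` values, and the RAII scope is an illustrative choice rather than what the patch does (the patch pushes and pops explicitly around `EmitStmt`).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the token stack; not part of Clang.
class TokenStackModel {
public:
  // Function entry: a noconvergent function starts from an anchor token,
  // any other function starts from an entry token.
  explicit TokenStackModel(bool functionIsNoConvergent) {
    stack_.push_back(functionIsNoConvergent ? "fn.anchor" : "fn.entry");
  }

  // Token that a convergent call emitted at this point would refer to.
  const std::string &current() const {
    assert(!stack_.empty() && "token stack must not be empty");
    return stack_.back();
  }

  std::size_t depth() const { return stack_.size(); }

  // Scope of one attributed statement. A fresh anchor is pushed only when
  // the statement is noconvergent AND tokens are enabled; otherwise the
  // stack is left untouched (the legacy path).
  class StmtScope {
  public:
    StmtScope(TokenStackModel &model, bool noconvergent, bool tokensEnabled)
        : model_(model), pushed_(noconvergent && tokensEnabled) {
      if (pushed_)
        model_.stack_.push_back("stmt.anchor");
    }
    ~StmtScope() {
      if (pushed_)
        model_.stack_.pop_back();
    }
    StmtScope(const StmtScope &) = delete;
    StmtScope &operator=(const StmtScope &) = delete;

  private:
    TokenStackModel &model_;
    bool pushed_;
  };

private:
  std::vector<std::string> stack_;
};
```

Inside a `StmtScope` created with both flags set, `current()` is the statement's own anchor, so operations in the statement no longer refer to the enclosing token. Once the scope ends, the stack is back at depth 1, which mirrors the "mismatched push/pop" assertion that the patch moves into `GenerateCode`.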
23 changes: 15 additions & 8 deletions clang/lib/CodeGen/CodeGenFunction.cpp
Expand Up @@ -47,6 +47,7 @@
#include "llvm/Support/CRC.h"
#include "llvm/Support/xxhash.h"
#include "llvm/Transforms/Scalar/LowerExpectIntrinsic.h"
#include "llvm/Transforms/Utils/FixConvergenceControl.h"
#include "llvm/Transforms/Utils/PromoteMemToReg.h"
#include <optional>

Expand Down Expand Up @@ -371,12 +372,6 @@ void CodeGenFunction::FinishFunction(SourceLocation EndLoc) {
assert(DeferredDeactivationCleanupStack.empty() &&
"mismatched activate/deactivate of cleanups!");

if (CGM.shouldEmitConvergenceTokens()) {
ConvergenceTokenStack.pop_back();
assert(ConvergenceTokenStack.empty() &&
"mismatched push/pop in convergence stack!");
}

bool OnlySimpleReturnStmts = NumSimpleReturnExprs > 0
&& NumSimpleReturnExprs == NumReturnExprs
&& ReturnBlock.getBlock()->use_empty();
Expand Down Expand Up @@ -1362,8 +1357,13 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
if (const auto *VecWidth = CurFuncDecl->getAttr<MinVectorWidthAttr>())
LargestVectorWidth = VecWidth->getVectorWidth();

if (CGM.shouldEmitConvergenceTokens())
ConvergenceTokenStack.push_back(getOrEmitConvergenceEntryToken(CurFn));
if (CGM.shouldEmitConvergenceTokens()) {
llvm::ConvergenceControlInst *Token =
(FD && FD->hasAttr<NoConvergentAttr>())
? getOrEmitConvergenceAnchorToken(CurFn)
: getOrEmitConvergenceEntryToken(CurFn);
ConvergenceTokenStack.push_back(Token);
}
}

void CodeGenFunction::EmitFunctionBody(const Stmt *Body) {
Expand Down Expand Up @@ -1647,6 +1647,13 @@ void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn,
}
}

if (CGM.shouldEmitConvergenceTokens()) {
ConvergenceTokenStack.pop_back();
assert(ConvergenceTokenStack.empty() &&
"mismatched push/pop in convergence stack!");
fixConvergenceControl(CurFn);
}

// Emit the standard function epilogue.
FinishFunction(BodyRange.getEnd());

Expand Down
13 changes: 11 additions & 2 deletions clang/lib/CodeGen/CodeGenFunction.h
Expand Up @@ -5339,15 +5339,24 @@ class CodeGenFunction : public CodeGenTypeCache {
// as its parent convergence instr.
llvm::ConvergenceControlInst *emitConvergenceLoopToken(llvm::BasicBlock *BB);

// Emits a convergence_anchor instruction for the given |BB|.
llvm::ConvergenceControlInst *
emitConvergenceAnchorToken(llvm::BasicBlock *BB);

// Adds a convergence_ctrl token with |ParentToken| as parent convergence
// instr to the call |Input|.
llvm::CallBase *addConvergenceControlToken(llvm::CallBase *Input);

// Find the convergence_entry instruction |F|, or emits ones if none exists.
// Returns the convergence instruction.
// Find the convergence control token in the entry block of |F|, or if none
// exists, create an entry token.
llvm::ConvergenceControlInst *
getOrEmitConvergenceEntryToken(llvm::Function *F);

// Find the convergence control token in the entry block of |F|, or if none
// exists, create an anchor token.
llvm::ConvergenceControlInst *
getOrEmitConvergenceAnchorToken(llvm::Function *F);

private:
llvm::MDNode *getRangeForLoadFromType(QualType Ty);
void EmitReturnOfRValue(RValue RV, QualType Ty);
Expand Down
2 changes: 1 addition & 1 deletion clang/lib/CodeGen/CodeGenModule.h
Expand Up @@ -1751,7 +1751,7 @@ class CodeGenModule : public CodeGenTypeCache {
bool shouldEmitConvergenceTokens() const {
// TODO: this should probably become unconditional once the controlled
// convergence becomes the norm.
return getTriple().isSPIRVLogical();
return getTriple().isSPIRVLogical() || getLangOpts().ConvergenceControl;
}

void addUndefinedGlobalForTailCall(
Expand Down
3 changes: 3 additions & 0 deletions clang/lib/Driver/ToolChains/Clang.cpp
Expand Up @@ -7098,6 +7098,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
if (Args.hasFlag(options::OPT_fhip_new_launch_api,
options::OPT_fno_hip_new_launch_api, true))
CmdArgs.push_back("-fhip-new-launch-api");
if (Args.hasFlag(options::OPT_fconvergence_control,
options::OPT_fno_convergence_control, false))
CmdArgs.push_back("-fconvergence-control");
Args.addOptInFlag(CmdArgs, options::OPT_fgpu_allow_device_init,
options::OPT_fno_gpu_allow_device_init);
Args.AddLastArg(CmdArgs, options::OPT_hipstdpar);
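The `Args.hasFlag(...)` call added above resolves a positive/negative flag pair: the last occurrence on the command line wins, and the default (here `false`) applies when neither flag is given. A minimal standalone sketch of that resolution rule, assuming a hypothetical helper `resolveBoolFlag` over plain strings rather than the real `llvm::opt::ArgList`:

```cpp
#include <string>
#include <vector>

// Hypothetical helper modelling "last flag wins, otherwise the default".
// It does not use or reproduce the LLVM option-parsing API.
bool resolveBoolFlag(const std::vector<std::string> &args,
                     const std::string &pos, const std::string &neg,
                     bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;  // a later positive flag overrides earlier ones
    else if (arg == neg)
      value = false; // a later negative flag overrides earlier ones
  }
  return value;
}
```

Under this rule the driver forwards `-fconvergence-control` to cc1 only when the positive form is the last of the two on the command line, so the option stays off by default.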
Expand Down