Commit 237adfc

[OpenMP] Rework handling of global ctor/dtors in OpenMP (#71739)
Summary: This patch reworks how we handle global constructors in OpenMP. Previously, we emitted individual kernels that were all registered and called individually. To provide more generic support, this patch moves all of this handling to the target backend and the runtime plugin. This has the benefit of supporting the GNU extensions for constructors and destructors, removing a class of failures related to shared library destruction order, and allowing targets other than OpenMP to use the same support without needing to change the frontend.

This is primarily done by calling kernels that the backend emits to iterate a list of ctor / dtor functions. For x64, this is automatic and we get it for free with the standard `dlopen` handling. For AMDGPU, we emit `amdgcn.device.init` and `amdgcn.device.fini` functions which handle everything automatically and simply need to be called. For NVPTX, patch #71549 provides the kernels to call, but the runtime needs to set up the array manually by pulling out all the known constructor / destructor functions.

One concession that this patch requires is that for GPU targets in OpenMP offloading we will use `llvm.global_dtors` instead of `atexit`. This is because `atexit` is a separate runtime function that does not mesh well with the handling we're trying to do here. This should be equivalent in all cases except those where we would need to destruct manually, such as:

```
struct S { ~S() { foo(); } };
void foo() { static S s; }
```

However, this is broken in many other ways on the GPU, so it is not regressing any support, simply increasing the scope of what we can handle.

This changes the handling of ctors / dtors. This patch now outputs an informational message regarding the deprecation if the old format is used. This will be completely removed in a later release.

Depends on: #71549
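The "iterate a list of ctor / dtor functions" idea from the summary can be sketched on the host as follows. This is only an illustration, not the code the backend or the plugins emit: `InitEntry`, `runCtors`, `runDtors`, and the demo callbacks are hypothetical names. The ordering follows the documented semantics of `llvm.global_ctors` (ascending priority) and `llvm.global_dtors` (descending priority).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of 'llvm.global_ctors' or
// 'llvm.global_dtors': a priority plus a nullary function.
struct InitEntry {
  uint32_t Priority;
  void (*Fn)();
};

// Records the order in which the demo callbacks run.
static std::vector<std::string> Log;

static void ctorA() { Log.push_back("ctorA"); }
static void ctorB() { Log.push_back("ctorB"); }
static void dtorA() { Log.push_back("dtorA"); }
static void dtorB() { Log.push_back("dtorB"); }

// Calls every constructor, lowest priority value first.
static void runCtors(std::vector<InitEntry> Entries) {
  std::stable_sort(Entries.begin(), Entries.end(),
                   [](const InitEntry &L, const InitEntry &R) {
                     return L.Priority < R.Priority;
                   });
  for (const InitEntry &E : Entries)
    E.Fn();
}

// Calls every destructor, highest priority value first.
static void runDtors(std::vector<InitEntry> Entries) {
  std::stable_sort(Entries.begin(), Entries.end(),
                   [](const InitEntry &L, const InitEntry &R) {
                     return L.Priority > R.Priority;
                   });
  for (const InitEntry &E : Entries)
    E.Fn();
}
```

With entries `{65535, ctorB}` and `{101, ctorA}`, `runCtors` calls `ctorA` before `ctorB`; `runDtors` on the matching destructor entries calls `dtorB` before `dtorA`.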
1 parent 133bcac commit 237adfc

20 files changed (+318, -217 lines changed)


clang/include/clang/Basic/LangOptions.h

Lines changed: 3 additions & 0 deletions
@@ -597,6 +597,9 @@ class LangOptions : public LangOptionsBase {
     return !requiresStrictPrototypes() && !OpenCL;
   }

+  /// Returns true if the language supports calling the 'atexit' function.
+  bool hasAtExit() const { return !(OpenMP && OpenMPIsTargetDevice); }
+
   /// Returns true if implicit int is part of the language requirements.
   bool isImplicitIntRequired() const { return !CPlusPlus && !C99; }

clang/lib/CodeGen/CGDeclCXX.cpp

Lines changed: 9 additions & 4 deletions
@@ -327,6 +327,15 @@ void CodeGenFunction::registerGlobalDtorWithAtExit(const VarDecl &VD,
   registerGlobalDtorWithAtExit(dtorStub);
 }

+/// Register a global destructor using the LLVM 'llvm.global_dtors' global.
+void CodeGenFunction::registerGlobalDtorWithLLVM(const VarDecl &VD,
+                                                 llvm::FunctionCallee Dtor,
+                                                 llvm::Constant *Addr) {
+  // Create a function which calls the destructor.
+  llvm::Function *dtorStub = createAtExitStub(VD, Dtor, Addr);
+  CGM.AddGlobalDtor(dtorStub);
+}
+
 void CodeGenFunction::registerGlobalDtorWithAtExit(llvm::Constant *dtorStub) {
   // extern "C" int atexit(void (*f)(void));
   assert(dtorStub->getType() ==
@@ -519,10 +528,6 @@ CodeGenModule::EmitCXXGlobalVarDeclInitFunc(const VarDecl *D,
        D->hasAttr<CUDASharedAttr>()))
     return;

-  if (getLangOpts().OpenMP &&
-      getOpenMPRuntime().emitDeclareTargetVarDefinition(D, Addr, PerformInit))
-    return;
-
   // Check if we've already initialized this decl.
   auto I = DelayedCXXInitPosition.find(D);
   if (I != DelayedCXXInitPosition.end() && I->second == ~0U)

clang/lib/CodeGen/CGOpenMPRuntime.cpp

Lines changed: 0 additions & 130 deletions
@@ -1747,136 +1747,6 @@ llvm::Function *CGOpenMPRuntime::emitThreadPrivateVarDefinition(
   return nullptr;
 }

-bool CGOpenMPRuntime::emitDeclareTargetVarDefinition(const VarDecl *VD,
-                                                     llvm::GlobalVariable *Addr,
-                                                     bool PerformInit) {
-  if (CGM.getLangOpts().OMPTargetTriples.empty() &&
-      !CGM.getLangOpts().OpenMPIsTargetDevice)
-    return false;
-  std::optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
-      OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  if (!Res || *Res == OMPDeclareTargetDeclAttr::MT_Link ||
-      ((*Res == OMPDeclareTargetDeclAttr::MT_To ||
-        *Res == OMPDeclareTargetDeclAttr::MT_Enter) &&
-       HasRequiresUnifiedSharedMemory))
-    return CGM.getLangOpts().OpenMPIsTargetDevice;
-  VD = VD->getDefinition(CGM.getContext());
-  assert(VD && "Unknown VarDecl");
-
-  if (!DeclareTargetWithDefinition.insert(CGM.getMangledName(VD)).second)
-    return CGM.getLangOpts().OpenMPIsTargetDevice;
-
-  QualType ASTTy = VD->getType();
-  SourceLocation Loc = VD->getCanonicalDecl()->getBeginLoc();
-
-  // Produce the unique prefix to identify the new target regions. We use
-  // the source location of the variable declaration which we know to not
-  // conflict with any target region.
-  llvm::TargetRegionEntryInfo EntryInfo =
-      getEntryInfoFromPresumedLoc(CGM, OMPBuilder, Loc, VD->getName());
-  SmallString<128> Buffer, Out;
-  OMPBuilder.OffloadInfoManager.getTargetRegionEntryFnName(Buffer, EntryInfo);
-
-  const Expr *Init = VD->getAnyInitializer();
-  if (CGM.getLangOpts().CPlusPlus && PerformInit) {
-    llvm::Constant *Ctor;
-    llvm::Constant *ID;
-    if (CGM.getLangOpts().OpenMPIsTargetDevice) {
-      // Generate function that re-emits the declaration's initializer into
-      // the threadprivate copy of the variable VD
-      CodeGenFunction CtorCGF(CGM);
-
-      const CGFunctionInfo &FI = CGM.getTypes().arrangeNullaryFunction();
-      llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FI);
-      llvm::Function *Fn = CGM.CreateGlobalInitOrCleanUpFunction(
-          FTy, Twine(Buffer, "_ctor"), FI, Loc, false,
-          llvm::GlobalValue::WeakODRLinkage);
-      Fn->setVisibility(llvm::GlobalValue::ProtectedVisibility);
-      if (CGM.getTriple().isAMDGCN())
-        Fn->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL);
-      auto NL = ApplyDebugLocation::CreateEmpty(CtorCGF);
-      CtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, Fn, FI,
-                            FunctionArgList(), Loc, Loc);
-      auto AL = ApplyDebugLocation::CreateArtificial(CtorCGF);
-      llvm::Constant *AddrInAS0 = Addr;
-      if (Addr->getAddressSpace() != 0)
-        AddrInAS0 = llvm::ConstantExpr::getAddrSpaceCast(
-            Addr, llvm::PointerType::get(CGM.getLLVMContext(), 0));
-      CtorCGF.EmitAnyExprToMem(Init,
-                               Address(AddrInAS0, Addr->getValueType(),
-                                       CGM.getContext().getDeclAlign(VD)),
-                               Init->getType().getQualifiers(),
-                               /*IsInitializer=*/true);
-      CtorCGF.FinishFunction();
-      Ctor = Fn;
-      ID = Fn;
-    } else {
-      Ctor = new llvm::GlobalVariable(
-          CGM.getModule(), CGM.Int8Ty, /*isConstant=*/true,
-          llvm::GlobalValue::PrivateLinkage,
-          llvm::Constant::getNullValue(CGM.Int8Ty), Twine(Buffer, "_ctor"));
-      ID = Ctor;
-    }
-
-    // Register the information for the entry associated with the constructor.
-    Out.clear();
-    auto CtorEntryInfo = EntryInfo;
-    CtorEntryInfo.ParentName = Twine(Buffer, "_ctor").toStringRef(Out);
-    OMPBuilder.OffloadInfoManager.registerTargetRegionEntryInfo(
-        CtorEntryInfo, Ctor, ID,
-        llvm::OffloadEntriesInfoManager::OMPTargetRegionEntryCtor);
-  }
-  if (VD->getType().isDestructedType() != QualType::DK_none) {
-    llvm::Constant *Dtor;
-    llvm::Constant *ID;
-    if (CGM.getLangOpts().OpenMPIsTargetDevice) {
-      // Generate function that emits destructor call for the threadprivate
-      // copy of the variable VD
-      CodeGenFunction DtorCGF(CGM);
-
-      const CGFunctionInfo &FI = CGM.getTypes().arrangeNullaryFunction();
-      llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FI);
-      llvm::Function *Fn = CGM.CreateGlobalInitOrCleanUpFunction(
-          FTy, Twine(Buffer, "_dtor"), FI, Loc, false,
-          llvm::GlobalValue::WeakODRLinkage);
-      Fn->setVisibility(llvm::GlobalValue::ProtectedVisibility);
-      if (CGM.getTriple().isAMDGCN())
-        Fn->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL);
-      auto NL = ApplyDebugLocation::CreateEmpty(DtorCGF);
-      DtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, Fn, FI,
-                            FunctionArgList(), Loc, Loc);
-      // Create a scope with an artificial location for the body of this
-      // function.
-      auto AL = ApplyDebugLocation::CreateArtificial(DtorCGF);
-      llvm::Constant *AddrInAS0 = Addr;
-      if (Addr->getAddressSpace() != 0)
-        AddrInAS0 = llvm::ConstantExpr::getAddrSpaceCast(
-            Addr, llvm::PointerType::get(CGM.getLLVMContext(), 0));
-      DtorCGF.emitDestroy(Address(AddrInAS0, Addr->getValueType(),
-                                  CGM.getContext().getDeclAlign(VD)),
-                          ASTTy, DtorCGF.getDestroyer(ASTTy.isDestructedType()),
-                          DtorCGF.needsEHCleanup(ASTTy.isDestructedType()));
-      DtorCGF.FinishFunction();
-      Dtor = Fn;
-      ID = Fn;
-    } else {
-      Dtor = new llvm::GlobalVariable(
-          CGM.getModule(), CGM.Int8Ty, /*isConstant=*/true,
-          llvm::GlobalValue::PrivateLinkage,
-          llvm::Constant::getNullValue(CGM.Int8Ty), Twine(Buffer, "_dtor"));
-      ID = Dtor;
-    }
-    // Register the information for the entry associated with the destructor.
-    Out.clear();
-    auto DtorEntryInfo = EntryInfo;
-    DtorEntryInfo.ParentName = Twine(Buffer, "_dtor").toStringRef(Out);
-    OMPBuilder.OffloadInfoManager.registerTargetRegionEntryInfo(
-        DtorEntryInfo, Dtor, ID,
-        llvm::OffloadEntriesInfoManager::OMPTargetRegionEntryDtor);
-  }
-  return CGM.getLangOpts().OpenMPIsTargetDevice;
-}
-
 void CGOpenMPRuntime::emitDeclareTargetFunction(const FunctionDecl *FD,
                                                 llvm::GlobalValue *GV) {
   std::optional<OMPDeclareTargetDeclAttr *> ActiveAttr =

clang/lib/CodeGen/CGOpenMPRuntime.h

Lines changed: 0 additions & 8 deletions
@@ -1089,14 +1089,6 @@ class CGOpenMPRuntime {
                                  SourceLocation Loc, bool PerformInit,
                                  CodeGenFunction *CGF = nullptr);

-  /// Emit a code for initialization of declare target variable.
-  /// \param VD Declare target variable.
-  /// \param Addr Address of the global variable \a VD.
-  /// \param PerformInit true if initialization expression is not constant.
-  virtual bool emitDeclareTargetVarDefinition(const VarDecl *VD,
-                                              llvm::GlobalVariable *Addr,
-                                              bool PerformInit);
-
   /// Emit code for handling declare target functions in the runtime.
   /// \param FD Declare target function.
   /// \param Addr Address of the global \a FD.

clang/lib/CodeGen/CodeGenFunction.h

Lines changed: 5 additions & 0 deletions
@@ -4536,6 +4536,11 @@ class CodeGenFunction : public CodeGenTypeCache {
   void registerGlobalDtorWithAtExit(const VarDecl &D, llvm::FunctionCallee fn,
                                     llvm::Constant *addr);

+  /// Registers the dtor using 'llvm.global_dtors' for platforms that do not
+  /// support an 'atexit()' function.
+  void registerGlobalDtorWithLLVM(const VarDecl &D, llvm::FunctionCallee fn,
+                                  llvm::Constant *addr);
+
   /// Call atexit() with function dtorStub.
   void registerGlobalDtorWithAtExit(llvm::Constant *dtorStub);

clang/lib/CodeGen/CodeGenModule.h

Lines changed: 7 additions & 7 deletions
@@ -1570,6 +1570,13 @@ class CodeGenModule : public CodeGenTypeCache {
                                            const VarDecl *D,
                                            ForDefinition_t IsForDefinition = NotForDefinition);

+  // FIXME: Hardcoding priority here is gross.
+  void AddGlobalCtor(llvm::Function *Ctor, int Priority = 65535,
+                     unsigned LexOrder = ~0U,
+                     llvm::Constant *AssociatedData = nullptr);
+  void AddGlobalDtor(llvm::Function *Dtor, int Priority = 65535,
+                     bool IsDtorAttrFunc = false);
+
 private:
   llvm::Constant *GetOrCreateLLVMFunction(
       StringRef MangledName, llvm::Type *Ty, GlobalDecl D, bool ForVTable,
@@ -1641,13 +1648,6 @@ class CodeGenModule : public CodeGenTypeCache {
   void EmitPointerToInitFunc(const VarDecl *VD, llvm::GlobalVariable *Addr,
                              llvm::Function *InitFunc, InitSegAttr *ISA);

-  // FIXME: Hardcoding priority here is gross.
-  void AddGlobalCtor(llvm::Function *Ctor, int Priority = 65535,
-                     unsigned LexOrder = ~0U,
-                     llvm::Constant *AssociatedData = nullptr);
-  void AddGlobalDtor(llvm::Function *Dtor, int Priority = 65535,
-                     bool IsDtorAttrFunc = false);
-
   /// EmitCtorList - Generates a global array of functions and priorities using
   /// the given list and name. This array will have appending linkage and is
   /// suitable for use as a LLVM constructor or destructor array. Clears Fns.

clang/lib/CodeGen/ItaniumCXXABI.cpp

Lines changed: 8 additions & 0 deletions
@@ -2794,6 +2794,14 @@ void ItaniumCXXABI::registerGlobalDtor(CodeGenFunction &CGF, const VarDecl &D,
   if (D.isNoDestroy(CGM.getContext()))
     return;

+  // OpenMP offloading supports C++ constructors and destructors but we do not
+  // always have 'atexit' available. Instead lower these to use the LLVM global
+  // destructors which we can handle directly in the runtime. Note that this is
+  // not strictly 1-to-1 with using `atexit` because we no longer tear down
+  // globals in reverse order of when they were constructed.
+  if (!CGM.getLangOpts().hasAtExit() && !D.isStaticLocal())
+    return CGF.registerGlobalDtorWithLLVM(D, dtor, addr);
+
   // emitGlobalDtorWithCXAAtExit will emit a call to either __cxa_thread_atexit
   // or __cxa_atexit depending on whether this VarDecl is a thread-local storage
   // or not. CXAAtExit controls only __cxa_atexit, so use it if it is enabled.
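The condition added to `registerGlobalDtor` in this hunk combines `hasAtExit()` from `LangOptions.h` with the static-local check. A minimal sketch of that decision, assuming simplified stand-ins: `MiniLangOpts`, `DtorRegistration`, and `chooseDtorRegistration` are hypothetical names for illustration and are not part of Clang.

```cpp
// Hypothetical, simplified model of the check added in this hunk; none of
// these types exist in Clang under these names.
struct MiniLangOpts {
  bool OpenMP = false;
  bool OpenMPIsTargetDevice = false;

  // Mirrors the LangOptions::hasAtExit() helper added by this commit.
  bool hasAtExit() const { return !(OpenMP && OpenMPIsTargetDevice); }
};

enum class DtorRegistration { LLVMGlobalDtors, AtExitFamily };

// Variables that are not static locals, on targets without 'atexit', are
// registered through 'llvm.global_dtors'; every other case keeps the
// existing atexit-style registration path.
inline DtorRegistration chooseDtorRegistration(const MiniLangOpts &Opts,
                                               bool IsStaticLocal) {
  if (!Opts.hasAtExit() && !IsStaticLocal)
    return DtorRegistration::LLVMGlobalDtors;
  return DtorRegistration::AtExitFamily;
}
```

In this model a static local on the device still falls through to the atexit-style path, which corresponds to the `static S s;` case the commit message calls out as the exception.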

clang/test/Headers/amdgcn_openmp_device_math_constexpr.cpp

Lines changed: 34 additions & 14 deletions
@@ -35,7 +35,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);


 #pragma omp end declare target
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fabsf_f32_l14_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init
 // CHECK-SAME: () #[[ATTR0:[0-9]+]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -49,7 +49,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fabs_f32_l15_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.1
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -69,7 +69,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_sinf_f32_l17_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.2
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -83,7 +83,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_sin_f32_l18_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.3
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -103,7 +103,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_cosf_f32_l20_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.4
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -117,7 +117,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_cos_f32_l21_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.5
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -137,7 +137,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fmaf_f32_l23_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.6
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -159,7 +159,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fma_f32_l24_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.7
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -195,7 +195,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_min_f32_l27_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.8
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -213,7 +213,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_max_f32_l28_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.9
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -231,23 +231,23 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fmin_f32_l30_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.10
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[CALL:%.*]] = call noundef float @_Z4fminff(float noundef 2.000000e+00, float noundef -4.000000e+00) #[[ATTR4:[0-9]+]]
 // CHECK-NEXT: store float [[CALL]], ptr addrspacecast (ptr addrspace(1) @_ZL18constexpr_fmin_f32 to ptr), align 4
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fmax_f32_l31_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.11
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[CALL:%.*]] = call noundef float @_Z4fmaxff(float noundef 2.000000e+00, float noundef -4.000000e+00) #[[ATTR4]]
 // CHECK-NEXT: store float [[CALL]], ptr addrspacecast (ptr addrspace(1) @_ZL18constexpr_fmax_f32 to ptr), align 4
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fminf_f32_l33_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.12
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -265,7 +265,7 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: ret void
 //
 //
-// CHECK-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_constexpr_fmaxf_f32_l34_ctor
+// CHECK-LABEL: define {{[^@]+}}@__cxx_global_var_init.13
 // CHECK-SAME: () #[[ATTR0]] {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[RETVAL_I:%.*]] = alloca float, align 4, addrspace(5)
@@ -282,3 +282,23 @@ const float constexpr_fmaxf_f32 = fmaxf(2.0f, -4.0f);
 // CHECK-NEXT: store float [[TMP2]], ptr addrspacecast (ptr addrspace(1) @_ZL19constexpr_fmaxf_f32 to ptr), align 4
 // CHECK-NEXT: ret void
 //
+//
+// CHECK-LABEL: define {{[^@]+}}@_GLOBAL__sub_I_amdgcn_openmp_device_math_constexpr.cpp
+// CHECK-SAME: () #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: call void @__cxx_global_var_init()
+// CHECK-NEXT: call void @__cxx_global_var_init.1()
+// CHECK-NEXT: call void @__cxx_global_var_init.2()
+// CHECK-NEXT: call void @__cxx_global_var_init.3()
+// CHECK-NEXT: call void @__cxx_global_var_init.4()
+// CHECK-NEXT: call void @__cxx_global_var_init.5()
+// CHECK-NEXT: call void @__cxx_global_var_init.6()
+// CHECK-NEXT: call void @__cxx_global_var_init.7()
+// CHECK-NEXT: call void @__cxx_global_var_init.8()
+// CHECK-NEXT: call void @__cxx_global_var_init.9()
+// CHECK-NEXT: call void @__cxx_global_var_init.10()
+// CHECK-NEXT: call void @__cxx_global_var_init.11()
+// CHECK-NEXT: call void @__cxx_global_var_init.12()
+// CHECK-NEXT: call void @__cxx_global_var_init.13()
+// CHECK-NEXT: ret void
+//
