
[AMDGPU][Attributor] Make AAAMDFlatWorkGroupSize honor existing attribute #114357


Merged: 1 commit merged into main from users/shiltian/honor-attributes on Dec 11, 2024

Conversation

shiltian
Contributor

@shiltian shiltian commented Oct 31, 2024

If a function has amdgpu-flat-work-group-size, honor it in initialize by
taking its value directly; otherwise, the default range is used as a starting
point. We no longer manipulate the known range, which could cause issues
because the known range acts as a "throttle" on the assumed range: the
assumed range cannot be widened properly in updateImpl if the known range is
not set correctly for whatever reason. Another benefit of not touching the
known range is that, if we indicate a pessimistic state, the AA is also
invalidated, so manifest will not be called. Since we honor the attribute, we
do not want, and will not add, any half-baked attribute on a function.
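For context: the attribute encodes a closed range as a "Min,Max" string, while the AA tracks it as a half-open ConstantRange [Min, Max + 1), which is why the patch constructs ConstantRange(APInt(32, Min), APInt(32, Max + 1)). Below is a minimal, self-contained sketch of that mapping; it is illustrative only, and the helper name parseFlatWorkGroupSize and the sample value "64,256" are assumptions rather than code from this PR.

#include <cstdio>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: parse an "amdgpu-flat-work-group-size"="Min,Max" value.
// Returns the closed range [Min, Max], or std::nullopt if the string is malformed.
static std::optional<std::pair<unsigned, unsigned>>
parseFlatWorkGroupSize(const std::string &Value) {
  size_t Comma = Value.find(',');
  if (Comma == std::string::npos)
    return std::nullopt;
  try {
    unsigned Min = std::stoul(Value.substr(0, Comma));
    unsigned Max = std::stoul(Value.substr(Comma + 1));
    if (Min == 0 || Min > Max)
      return std::nullopt;
    return std::make_pair(Min, Max);
  } catch (...) {
    return std::nullopt;
  }
}

int main() {
  auto Range = parseFlatWorkGroupSize("64,256"); // sample value, not a real default
  if (!Range)
    return 1;
  auto [Min, Max] = *Range;
  // The AA stores this as the half-open range [Min, Max + 1); when the range is
  // emitted back as an attribute, the upper bound is printed as Upper - 1.
  std::printf("attribute [%u, %u] -> ConstantRange [%u, %u)\n", Min, Max, Min, Max + 1);
  return 0;
}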


@shiltian shiltian requested review from arsenm and choikwa October 31, 2024 05:05
@llvmbot
Member

llvmbot commented Oct 31, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

Changes

If a function has amdgpu-flat-work-group-size, honor it in initialize by
taking its value directly, setting it as the known range, and indicating a
pessimistic fixpoint so that the known range is propagated to the assumed
range; otherwise, initialize simply does nothing. We no longer clamp the known
range (a real clamp, as opposed to the union performed by IntegerRangeState),
which could cause issues because the known range acts as a "throttle" on the
assumed range, so the assumed range cannot be widened properly in updateImpl.
Another benefit of not touching the known range in initialize is that, if we
indicate a pessimistic state in updateImpl, the AA is also invalidated, so
manifest will not be called. Since we honor the attribute, we do not want any
half-baked attribute added to a function.


Full diff: https://github.com/llvm/llvm-project/pull/114357.diff

1 file affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp (+54-11)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
index 6a69b9d2bfc716..53e44bb5e094fd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
@@ -168,7 +168,15 @@ class AMDGPUInformationCache : public InformationCache {
     return ST.supportsGetDoorbellID();
   }
 
-  std::pair<unsigned, unsigned> getFlatWorkGroupSizes(const Function &F) {
+  std::optional<std::pair<unsigned, unsigned>>
+  getFlatWorkGroupSizeAttr(const Function &F) const {
+    Attribute A = F.getFnAttribute("amdgpu-flat-work-group-size");
+    if (!A.isStringAttribute())
+      return std::nullopt;
+    return getFlatWorkGroupSizes(F);
+  }
+
+  std::pair<unsigned, unsigned> getFlatWorkGroupSizes(const Function &F) const {
     const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
     return ST.getFlatWorkGroupSizes(F);
   }
@@ -707,8 +715,7 @@ struct AAAMDSizeRangeAttribute
   /// See AbstractAttribute::trackStatistics()
   void trackStatistics() const override {}
 
-  template <class AttributeImpl>
-  ChangeStatus updateImplImpl(Attributor &A) {
+  template <class AttributeImpl> ChangeStatus updateImplImpl(Attributor &A) {
     ChangeStatus Change = ChangeStatus::UNCHANGED;
 
     auto CheckCallSite = [&](AbstractCallSite CS) {
@@ -728,12 +735,43 @@ struct AAAMDSizeRangeAttribute
     };
 
     bool AllCallSitesKnown = true;
-    if (!A.checkForAllCallSites(CheckCallSite, *this, true, AllCallSitesKnown))
+    if (!A.checkForAllCallSites(CheckCallSite, *this,
+                                /*RequireAllCallSites=*/true,
+                                AllCallSitesKnown))
       return indicatePessimisticFixpoint();
 
     return Change;
   }
 
+  /// Clamp the assumed range to the default value ([Min, Max]) and emit the
+  /// attribute if it is not same as default.
+  ChangeStatus
+  emitAttributeIfNotDefaultAfterClamp(Attributor &A,
+                                      std::pair<unsigned, unsigned> Default) {
+    auto [Min, Max] = Default;
+    unsigned Lower = getAssumed().getLower().getZExtValue();
+    unsigned Upper = getAssumed().getUpper().getZExtValue();
+
+    // Clamp the range to the default value.
+    if (Lower < Min)
+      Lower = Min;
+    if (Upper > Max + 1)
+      Upper = Max + 1;
+
+    // No manifest if the value is same as default after clamp.
+    if (Lower == Min && Upper == Max + 1)
+      return ChangeStatus::UNCHANGED;
+
+    Function *F = getAssociatedFunction();
+    LLVMContext &Ctx = F->getContext();
+    SmallString<10> Buffer;
+    raw_svector_ostream OS(Buffer);
+    OS << Lower << ',' << Upper - 1;
+    return A.manifestAttrs(getIRPosition(),
+                           {Attribute::get(Ctx, AttrName, OS.str())},
+                           /* ForceReplace=*/true);
+  }
+
   ChangeStatus emitAttributeIfNotDefault(Attributor &A, unsigned Min,
                                          unsigned Max) {
     // Don't add the attribute if it's the implied default.
@@ -767,11 +805,17 @@ struct AAAMDFlatWorkGroupSize : public AAAMDSizeRangeAttribute {
 
   void initialize(Attributor &A) override {
     Function *F = getAssociatedFunction();
+
     auto &InfoCache = static_cast<AMDGPUInformationCache &>(A.getInfoCache());
-    unsigned MinGroupSize, MaxGroupSize;
-    std::tie(MinGroupSize, MaxGroupSize) = InfoCache.getFlatWorkGroupSizes(*F);
-    intersectKnown(
-        ConstantRange(APInt(32, MinGroupSize), APInt(32, MaxGroupSize + 1)));
+    std::optional<std::pair<unsigned, unsigned>> Attr =
+        InfoCache.getFlatWorkGroupSizeAttr(*F);
+
+    if (Attr.has_value()) {
+      auto [Min, Max] = *Attr;
+      intersectKnown(ConstantRange(APInt(32, Min), APInt(32, Max + 1)));
+      indicatePessimisticFixpoint();
+      return;
+    }
 
     if (AMDGPU::isEntryFunctionCC(F->getCallingConv()))
       indicatePessimisticFixpoint();
@@ -788,9 +832,8 @@ struct AAAMDFlatWorkGroupSize : public AAAMDSizeRangeAttribute {
   ChangeStatus manifest(Attributor &A) override {
     Function *F = getAssociatedFunction();
     auto &InfoCache = static_cast<AMDGPUInformationCache &>(A.getInfoCache());
-    unsigned Min, Max;
-    std::tie(Min, Max) = InfoCache.getMaximumFlatWorkGroupRange(*F);
-    return emitAttributeIfNotDefault(A, Min, Max);
+    auto [Min, Max] = InfoCache.getMaximumFlatWorkGroupRange(*F);
+    return emitAttributeIfNotDefaultAfterClamp(A, {Min, Max});
   }
 
   /// See AbstractAttribute::getName()
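
For illustration only: the clamp in emitAttributeIfNotDefaultAfterClamp operates on the half-open assumed range [Lower, Upper) against the closed default [Min, Max], then prints closed bounds back. The following self-contained sketch reproduces that arithmetic with assumed numbers; the default range [1, 1024] and the sample inputs are placeholders, not values taken from the patch or from a real subtarget.

#include <cstdio>
#include <utility>

// Mirrors the clamp above: the assumed range is half-open [Lower, Upper), the
// default is the closed range [Min, Max].
static void clampAndPrint(std::pair<unsigned, unsigned> Default, unsigned Lower,
                          unsigned Upper) {
  auto [Min, Max] = Default;
  if (Lower < Min)
    Lower = Min;
  if (Upper > Max + 1)
    Upper = Max + 1;
  if (Lower == Min && Upper == Max + 1) {
    std::printf("no attribute emitted (same as default [%u, %u])\n", Min, Max);
    return;
  }
  // The emitted attribute string uses closed bounds, hence Upper - 1.
  std::printf("emit \"amdgpu-flat-work-group-size\"=\"%u,%u\"\n", Lower, Upper - 1);
}

int main() {
  clampAndPrint({1, 1024}, 64, 257);  // -> emit "64,256"
  clampAndPrint({1, 1024}, 1, 1025);  // -> no attribute emitted (equals default)
  clampAndPrint({1, 1024}, 0, 2048);  // -> clamped to the default, nothing emitted
  return 0;
}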

Contributor

@arsenm arsenm left a comment

Test changes?

@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 3 times, most recently from 19e0cce to f2314c9 Compare October 31, 2024 15:46
@shiltian shiltian marked this pull request as ready for review October 31, 2024 16:00
@shiltian shiltian changed the title [WIP][AMDGPU][Attributor] Make AAAMDFlatWorkGroupSize honor existing attribute [AMDGPU][Attributor] Make AAAMDFlatWorkGroupSize honor existing attribute Oct 31, 2024
@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 4 times, most recently from 642636d to f1aea50 Compare October 31, 2024 17:54
@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 4 times, most recently from c35f519 to 2803f73 Compare November 2, 2024 03:33
; NOINFS: attributes #0 = { nounwind "amdgpu-waves-per-eu"="4,10" "no-infs-fp-math"="true" "uniform-work-group-size"="false" }
; NOINFS: attributes #1 = { nounwind "less-precise-fpmad"="false" "no-infs-fp-math"="true" "no-nans-fp-math"="false" "uniform-work-group-size"="false" "unsafe-fp-math"="false" }
; NOINFS: attributes #0 = { nounwind "no-infs-fp-math"="true" "uniform-work-group-size"="false" "unsafe-fp-math"="true" }
; NOINFS: attributes #1 = { nounwind "less-precise-fpmad"="false" "no-infs-fp-math"="true" "no-nans-fp-math"="false" "uniform-work-group-size"="false" "unsafe-fp-math"="true" }
Contributor

Should this change "unsafe-fp-math"="true"?

Contributor Author

No, it should not and I don't see why this is changed...

@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 3 times, most recently from fe201bc to b6b851f Compare November 4, 2024 00:35
@shiltian
Contributor Author

shiltian commented Nov 5, 2024

bump

@choikwa
Contributor

choikwa commented Nov 12, 2024

LGTM

@shiltian
Contributor Author

shiltian commented Dec 9, 2024

@choikwa you might have forgotten to accept it.

@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 2 times, most recently from 43322a4 to 2d1641d Compare December 10, 2024 17:47
@shiltian
Contributor Author

ping, I prefer not to update tests again and again...

@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch from 2d1641d to d949823 Compare December 11, 2024 16:59
Contributor Author

shiltian commented Dec 11, 2024

Merge activity

  • Dec 11, 4:38 PM EST: A user started a stack merge that includes this pull request via Graphite.
  • Dec 11, 4:40 PM EST: Graphite rebased this pull request as part of a merge.
  • Dec 11, 4:42 PM EST: Graphite rebased this pull request as part of a merge.
  • Dec 11, 4:45 PM EST: Graphite rebased this pull request as part of a merge.
  • Dec 11, 4:47 PM EST: A user merged this pull request with Graphite.

@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch 2 times, most recently from 181386d to 768a316 Compare December 11, 2024 21:41
[AMDGPU][Attributor] Make AAAMDFlatWorkGroupSize honor existing attribute

If a function has `amdgpu-flat-work-group-size`, honor it in `initialize` by
taking its value directly; otherwise, the default range is used as a starting
point. We no longer manipulate the known range, which could cause issues
because the known range acts as a "throttle" on the assumed range: the
assumed range cannot be widened properly in `updateImpl` if the known range is
not set correctly for whatever reason. Another benefit of not touching the
known range is that, if we indicate a pessimistic state, the AA is also
invalidated, so `manifest` will not be called. Since we honor the attribute,
we do not want, and will not add, any half-baked attribute on a function.
@shiltian shiltian force-pushed the users/shiltian/honor-attributes branch from 768a316 to a062b29 Compare December 11, 2024 21:45
@shiltian shiltian merged commit 7dbd6cd into main Dec 11, 2024
5 of 7 checks passed
@shiltian shiltian deleted the users/shiltian/honor-attributes branch December 11, 2024 21:47
auto R = AMDGPU::getIntegerPairAttribute(F, "amdgpu-flat-work-group-size");
if (!R)
  return std::nullopt;
return std::make_pair(R->first, *(R->second));
Contributor

Extra parens?

/// \returns \p std::nullopt and emits error if one of the requested values
/// cannot be converted to integer, or \p OnlyFirstRequired is false and
/// "second" value is not present.
std::optional<std::pair<unsigned, std::optional<unsigned>>>
Contributor

Not sure why the second entry is optional. This is spreading a parsing detail to users of the value. Users should be able to just treat it as-if it always has 2 components, the one entry forms are just a compatibility thing with old IR

Contributor Author

Our front end can still emit "X,". I think we could potentially fix this in the front end and then remove the 2nd optional entry.

Contributor

That should never happen. But as a parsing function, that should be abstracted from the attributor's usage
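
To illustrate that separation (a sketch only: the wrapper name normalizePairAttr and the choice to fall back to the first value when the second is missing are assumptions for illustration, not existing API), a consumer-facing helper could normalize the parsed result so callers always see two components:

#include <optional>
#include <utility>

// Shape of the parser's result as declared above: the outer optional means the
// attribute was present and parseable, the inner optional means the second
// value was present.
using RawPairAttr = std::optional<std::pair<unsigned, std::optional<unsigned>>>;

// Hypothetical wrapper hiding the parsing detail from users such as the
// attributor. Falling back to the first value when the second is missing is an
// assumed policy, purely for illustration.
std::optional<std::pair<unsigned, unsigned>>
normalizePairAttr(const RawPairAttr &Raw) {
  if (!Raw)
    return std::nullopt;
  unsigned First = Raw->first;
  unsigned Second = Raw->second.value_or(First);
  return std::make_pair(First, Second);
}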

@jtramm

jtramm commented Jan 14, 2025

Looks like this PR resulted in about a 2x slowdown on MI250 for the most expensive kernel in OpenMC, resulting in an overall 20% performance degradation. Any ideas what might have caused it?

@shiltian
Contributor Author

You might have device-libs that still come with half-baked values. To verify, you could use llvm-dis to convert a library to a .ll file, remove all amdgpu-flat-work-group-size attributes, convert it back to a .bc file, and put it back in place. After that, recompile all of OpenMC.

@jtramm

jtramm commented Jan 15, 2025

Ah, interesting! Which file(s) should I be converting to .ll? And how do I remove attributes in a .ll file -- just delete the entire line or change to something else? Any specific commands to give to llvm-dis for converting between file types?

@arsenm
Contributor

arsenm commented Jan 15, 2025

Just rebuild the device libraries. They should be version locked to the llvm commit you are using

@jtramm

jtramm commented Jan 15, 2025

Here's what my current cmake line looks like for LLVM:

PROJECTS="clang;lld"
RUNTIMES="openmp;offload"
TARGETS="all"
USE_CCACHE=OFF

cmake \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DLLVM_ENABLE_PROJECTS=${PROJECTS}  \
    -DLLVM_ENABLE_RUNTIMES=${RUNTIMES}   \
    -DLLVM_TARGETS_TO_BUILD=${TARGETS}   \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLIBOMPTARGET_ENABLE_DEBUG=ON \
    -DLLVM_OPTIMIZED_TABLEGEN=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_CCACHE_BUILD=${USE_CCACHE} \
    -DLLVM_APPEND_VC_REV=OFF \
    -DCMAKE_BUILD_TYPE=RelWithDebInfo \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
    -DCMAKE_CXX_FLAGS="-Wno-address" \
    -DCMAKE_INSTALL_PREFIX=../llvm-install \
    ../llvm-project/llvm

make -j128 install

What would I need to change to specifically rebuild the device libraries, or am I already doing that?

@shiltian
Contributor Author

@jtramm You are building vanilla LLVM but the device-libs are in AMD's fork. You can simply use the LLVM you build to build device-libs. Here is the command I usually use to build device-libs:

cmake -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
    -DLLVM_ROOT=... \
    -DClang_ROOT=... \
    -B $BUILD_ROOT/device-libs/release \
    -S llvm-project/amd/device-libs

@jtramm

jtramm commented Jan 15, 2025

I don't understand -- are you saying I cannot use vanilla LLVM to compile for AMD anymore, that I need AMD's fork now instead?

@shiltian
Contributor Author

No, vanilla LLVM doesn't have device-libs; those are only available in AMD's fork. For this particular case, you will have to build device-libs because the device-libs on your system have half-baked values that can cause a performance regression. That's why I suggested simply removing the attributes from those /opt/rocm/amdgcn/bitcode/*.bc files.

You can still build vanilla LLVM and use that to build device-libs. The only thing you need from AMD's fork is the device-libs, which sit in amd/device-libs.

To make it simple, can you first verify whether amdgpu-flat-work-group-size or amdgpu-waves-per-eu appears in any /opt/rocm/amdgcn/bitcode/*.bc files? If not, then you don't need to build device-libs, and this would be a separate issue.

@jtramm

jtramm commented Jan 15, 2025

The rocm install on this cluster seems to be in a different location, but the .bc files don't appear to have either of those attributes in them:

jtramm@amdgpu04:/soft/compilers/rocm/rocm-6.3.0/amdgcn/bitcode$ ls
asanrtl.bc                          oclc_isa_version_1010.bc          oclc_isa_version_10-3-generic.bc  oclc_isa_version_600.bc  oclc_isa_version_805.bc  oclc_isa_version_942.bc
hip.bc                              oclc_isa_version_1011.bc          oclc_isa_version_1100.bc          oclc_isa_version_601.bc  oclc_isa_version_810.bc  oclc_isa_version_9-generic.bc
ockl.bc                             oclc_isa_version_1012.bc          oclc_isa_version_1101.bc          oclc_isa_version_602.bc  oclc_isa_version_900.bc  oclc_unsafe_math_off.bc
oclc_abi_version_400.bc             oclc_isa_version_1013.bc          oclc_isa_version_1102.bc          oclc_isa_version_700.bc  oclc_isa_version_902.bc  oclc_unsafe_math_on.bc
oclc_abi_version_500.bc             oclc_isa_version_10-1-generic.bc  oclc_isa_version_1103.bc          oclc_isa_version_701.bc  oclc_isa_version_904.bc  oclc_wavefrontsize64_off.bc
oclc_abi_version_600.bc             oclc_isa_version_1030.bc          oclc_isa_version_1150.bc          oclc_isa_version_702.bc  oclc_isa_version_906.bc  oclc_wavefrontsize64_on.bc
oclc_correctly_rounded_sqrt_off.bc  oclc_isa_version_1031.bc          oclc_isa_version_1151.bc          oclc_isa_version_703.bc  oclc_isa_version_908.bc  ocml.bc
oclc_correctly_rounded_sqrt_on.bc   oclc_isa_version_1032.bc          oclc_isa_version_1152.bc          oclc_isa_version_704.bc  oclc_isa_version_909.bc  opencl.bc
oclc_daz_opt_off.bc                 oclc_isa_version_1033.bc          oclc_isa_version_11-generic.bc    oclc_isa_version_705.bc  oclc_isa_version_90a.bc
oclc_daz_opt_on.bc                  oclc_isa_version_1034.bc          oclc_isa_version_1200.bc          oclc_isa_version_801.bc  oclc_isa_version_90c.bc
oclc_finite_only_off.bc             oclc_isa_version_1035.bc          oclc_isa_version_1201.bc          oclc_isa_version_802.bc  oclc_isa_version_940.bc
oclc_finite_only_on.bc              oclc_isa_version_1036.bc          oclc_isa_version_12-generic.bc    oclc_isa_version_803.bc  oclc_isa_version_941.bc
jtramm@amdgpu04:/soft/compilers/rocm/rocm-6.3.0/amdgcn/bitcode$ grep amdgpu-flat-work-group-size *.bc
jtramm@amdgpu04:/soft/compilers/rocm/rocm-6.3.0/amdgcn/bitcode$ grep amdgpu-waves-per-eu *.bc
jtramm@amdgpu04:/soft/compilers/rocm/rocm-6.3.0/amdgcn/bitcode$ 

@jtramm

jtramm commented Jan 15, 2025

Ah, it looks like the .bc files are binary files. Is there a specific program I need to use to check them for attributes?

@shiltian
Contributor Author

Can you disassemble the .bc to .ll and then check again? Directly checking .bc will not work because they are compressed/encoded. You can use /soft/compilers/rocm/rocm-6.3.0/llvm/bin/llvm-dis xxx.bc -o xxx.ll.

jdoerfert added a commit to jdoerfert/llvm-project that referenced this pull request Jan 17, 2025
This should replace llvm#114357.

The problem this resolves is that AAAMDWavesPerEU used "assumed"
information from AAAMDFlatWorkGroupSize to derive "known" information.
This is generally not valid. The canonical way is to use
"assumed"/"known" information to improve "assumed"/"known" information,
respectively. What happened before was that the new "known" information
(derived from "assumed") was manifested even though it should not have
been. That is how we ended up with external functions that had
"amdgpu-waves-per-eu" set. This won't happen if we only set the assumed
information, as the invalidation caused by unknown callers will fall
back to the "unset" known information.

In addition to the above, we will identify invalid waves-per-eu values,
emit a warning, and replace the invalid values with valid ones. This
also happens for kernels. See
`llvm/test/CodeGen/AMDGPU/propagate-waves-per-eu.ll`

This should hopefully fix llvm#123092.