[HIP] diagnose -mwavefrontsize64 for gfx10+ #140185

yxsamliu · 2025-05-16T04:09:39Z

wavefront size 64 is not fully supported for HIP on gfx10+ and it is not tested. As such, currently HIP header contains an pragma error to diagnose such use case by detecting predefined macro __AMDGCN_WAVEFRONT_SIZE. However, since clang will remove it and replace it with a builtin function, there is no good way to diagnose such invalid use of wavefront size 64 with gfx10+ in HIP header. Therefore this patch diagnoses uses of -mwavefrontsize64 with gfx10+ in HIPAMD toolchain.

This patch also introduces option -mforce-unsafe-wavefrontsize64 to allow power users to force wavefront size 64 on gfx10+, with an assumption that they understand its risk.

wavefront size 64 is not fully supported for HIP on gfx10+ and it is not tested. As such, currently HIP header contains an pragma erro to diagnose such use case by detecting predefined macro __AMDGCN_WAVEFRONT_SIZE. However, since clang will remove it and replace it with a builtin function, there is no good way to diagnose such invalid use of wavefront size 64 with gfx10+ in HIP header. Therefore this patch diagnoses uses of -mwavefrontsize64 with gfx10+ in HIPAMD toolchain. This patch also introduces option -mforce-unsafe-wavefrontsize64 to allow power users to force wavefront size 64 for gfx10+, with an assumption that they understand the its risk.

llvmbot · 2025-05-16T04:10:11Z

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

wavefront size 64 is not fully supported for HIP on gfx10+ and it is not tested. As such, currently HIP header contains an pragma error to diagnose such use case by detecting predefined macro __AMDGCN_WAVEFRONT_SIZE. However, since clang will remove it and replace it with a builtin function, there is no good way to diagnose such invalid use of wavefront size 64 with gfx10+ in HIP header. Therefore this patch diagnoses uses of -mwavefrontsize64 with gfx10+ in HIPAMD toolchain.

This patch also introduces option -mforce-unsafe-wavefrontsize64 to allow power users to force wavefront size 64 on gfx10+, with an assumption that they understand its risk.

Full diff: https://github.com/llvm/llvm-project/pull/140185.diff

7 Files Affected:

(modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+3)
(modified) clang/include/clang/Driver/Options.td (+2)
(modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+25-11)
(modified) clang/lib/Driver/ToolChains/AMDGPU.h (+7-2)
(modified) clang/lib/Driver/ToolChains/HIPAMD.cpp (+3)
(modified) clang/test/Driver/hip-macros.hip (+2-2)
(modified) clang/test/Driver/hip-toolchain-features.hip (+7-1)

diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 4da8f80345ddc..42c6f11e5cbdb 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -200,6 +200,9 @@ def err_drv_opt_unsupported_input_type : Error<
   "'%0' invalid for input of type %1">;
 def err_drv_amdgpu_ieee_without_no_honor_nans : Error<
   "invalid argument '-mno-amdgpu-ieee' only allowed with relaxed NaN handling">;
+def err_drv_wave64_requires_force_unsafe : Error<
+  "'-mwavefrontsize64' is not fully supported for HIP on AMD GPU architecture "
+  "gfx10 and above. Use '-mforce-unsafe-wavefrontsize64' to override">;
 def err_drv_argument_not_allowed_with : Error<
   "invalid argument '%0' not allowed with '%1'">;
 def err_drv_cannot_open_randomize_layout_seed_file : Error<
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index bd8df8f6a749a..41da9c3bff3bf 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5257,6 +5257,8 @@ defm tgsplit : SimpleMFlag<"tgsplit", "Enable", "Disable",
 defm wavefrontsize64 : SimpleMFlag<"wavefrontsize64",
   "Specify wavefront size 64", "Specify wavefront size 32",
   " mode (AMDGPU only)">;
+defm force_unsafe_wavefrontsize64 : SimpleMFlag<"force-unsafe-wavefrontsize64",
+  "Force", "Do not force", " wavefront size 64 for gfx10+ GPUs (AMDGPU only)">;
 defm amdgpu_precise_memory_op
     : SimpleMFlag<"amdgpu-precise-memory-op", "Enable", "Disable",
                   " precise memory mode (AMDGPU only)">;
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 35ca019795ddc..d7ce1ca1ca4e7 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -606,6 +606,7 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
   // Add target ID features to -target-feature options. No diagnostics should
   // be emitted here since invalid target ID is diagnosed at other places.
   StringRef TargetID;
+  StringRef GpuArch;
   if (Args.hasArg(options::OPT_mcpu_EQ))
     TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
   else if (Args.hasArg(options::OPT_march_EQ))
@@ -614,7 +615,7 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
     llvm::StringMap<bool> FeatureMap;
     auto OptionalGpuArch = parseTargetID(Triple, TargetID, &FeatureMap);
     if (OptionalGpuArch) {
-      StringRef GpuArch = *OptionalGpuArch;
+      GpuArch = *OptionalGpuArch;
       // Iterate through all possible target ID features for the given GPU.
       // If it is mapped to true, add +feature.
       // If it is mapped to false, add -feature.
@@ -628,9 +629,10 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
       }
     }
   }
-
-  if (Args.hasFlag(options::OPT_mwavefrontsize64,
-                   options::OPT_mno_wavefrontsize64, false))
+  if (toolchains::AMDGPUToolChain::isWave64(
+          D, Args, llvm::AMDGPU::parseArchAMDGCN(GpuArch),
+          /*DiagInvalid=*/false,
+          /*Explicit=*/true))
     Features.push_back("+wavefrontsize64");
 
   if (Args.hasFlag(options::OPT_mamdgpu_precise_memory_op,
@@ -767,15 +769,27 @@ llvm::DenormalMode AMDGPUToolChain::getDefaultDenormalModeForType(
                llvm::DenormalMode::getIEEE();
 }
 
-bool AMDGPUToolChain::isWave64(const llvm::opt::ArgList &DriverArgs,
-                               llvm::AMDGPU::GPUKind Kind) {
+bool AMDGPUToolChain::isWave64(const Driver &D,
+                               const llvm::opt::ArgList &DriverArgs,
+                               llvm::AMDGPU::GPUKind Kind, bool DiagInvalid,
+                               bool Explicit) {
   const unsigned ArchAttr = llvm::AMDGPU::getArchAttrAMDGCN(Kind);
   bool HasWave32 = (ArchAttr & llvm::AMDGPU::FEATURE_WAVE32);
 
-  return !HasWave32 || DriverArgs.hasFlag(
-    options::OPT_mwavefrontsize64, options::OPT_mno_wavefrontsize64, false);
-}
+  bool EnableWave64 = DriverArgs.hasFlag(
+      options::OPT_mwavefrontsize64, options::OPT_mno_wavefrontsize64, false);
+  bool ForceUnsafeWave64 =
+      DriverArgs.hasFlag(options::OPT_mforce_unsafe_wavefrontsize64,
+                         options::OPT_mno_force_unsafe_wavefrontsize64, false);
+
+  if (DiagInvalid && HasWave32 && EnableWave64 && !ForceUnsafeWave64)
+    D.Diag(diag::err_drv_wave64_requires_force_unsafe);
 
+  if (Explicit)
+    return EnableWave64 || ForceUnsafeWave64;
+
+  return !HasWave32 || EnableWave64 || ForceUnsafeWave64;
+}
 
 /// ROCM Toolchain
 ROCMToolChain::ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
@@ -884,7 +898,7 @@ void ROCMToolChain::addClangTargetOptions(
                                                 ABIVer))
     return;
 
-  bool Wave64 = isWave64(DriverArgs, Kind);
+  bool Wave64 = isWave64(getDriver(), DriverArgs, Kind);
   // TODO: There are way too many flags that change this. Do we need to check
   // them all?
   bool DAZ = DriverArgs.hasArg(options::OPT_cl_denorms_are_zero) ||
@@ -1012,7 +1026,7 @@ ROCMToolChain::getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
   bool CorrectSqrt = DriverArgs.hasFlag(
       options::OPT_fhip_fp32_correctly_rounded_divide_sqrt,
       options::OPT_fno_hip_fp32_correctly_rounded_divide_sqrt, true);
-  bool Wave64 = isWave64(DriverArgs, Kind);
+  bool Wave64 = isWave64(getDriver(), DriverArgs, Kind);
 
   // GPU Sanitizer currently only supports ASan and is enabled through host
   // ASan.
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.h b/clang/lib/Driver/ToolChains/AMDGPU.h
index 08bd4fa556f78..d9d0fac71eb91 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.h
+++ b/clang/lib/Driver/ToolChains/AMDGPU.h
@@ -89,8 +89,13 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUToolChain : public Generic_ELF {
       const llvm::opt::ArgList &DriverArgs, const JobAction &JA,
       const llvm::fltSemantics *FPType = nullptr) const override;
 
-  static bool isWave64(const llvm::opt::ArgList &DriverArgs,
-                       llvm::AMDGPU::GPUKind Kind);
+  /// Return whether the GPU of kind \p Kind has wavefront size 64 by driver
+  /// arguments \p DriverArgs. When \p Explicit is true, only return true
+  /// if wavefront size is set to 64 by arguments explicitly. When
+  /// \p DiagInvalid is true, diagnose invalid -mwavefrontsize64 arguments.
+  static bool isWave64(const Driver &D, const llvm::opt::ArgList &DriverArgs,
+                       llvm::AMDGPU::GPUKind Kind, bool DiagInvalid = false,
+                       bool Explicit = false);
   /// Needed for using lto.
   bool HasNativeLLVMSupport() const override {
     return true;
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index abda4eb453387..44f1a3e915283 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -229,6 +229,9 @@ void HIPAMDToolChain::addClangTargetOptions(
 
   assert(DeviceOffloadingKind == Action::OFK_HIP &&
          "Only HIP offloading kinds are supported for GPUs.");
+  const StringRef GpuArch = getGPUArch(DriverArgs);
+  auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
+  (void)isWave64(getDriver(), DriverArgs, Kind, /*DiagInvalid=*/true);
 
   CC1Args.append({"-fcuda-is-device", "-fno-threadsafe-statics"});
 
diff --git a/clang/test/Driver/hip-macros.hip b/clang/test/Driver/hip-macros.hip
index bd93f9985a774..37fdafc53af85 100644
--- a/clang/test/Driver/hip-macros.hip
+++ b/clang/test/Driver/hip-macros.hip
@@ -2,7 +2,7 @@
 // RUN: %clang -E -dM --offload-arch=gfx906 -mwavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
 // RUN:   %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
-// RUN: %clang -E -dM --offload-arch=gfx1010 -mwavefrontsize64 \
+// RUN: %clang -E -dM --offload-arch=gfx1010 -mforce-unsafe-wavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
 // RUN:   %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // RUN: %clang -E -dM --offload-arch=gfx906 -mwavefrontsize64 \
@@ -16,7 +16,7 @@
 // RUN:   -mwavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // RUN: %clang -E -dM --offload-arch=gfx1010 -mno-wavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
-// RUN:   -mwavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
+// RUN:   -mforce-unsafe-wavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // WAVE64-DAG: #define __AMDGCN_WAVEFRONT_SIZE__ 64
 // WAVE32-DAG: #define __AMDGCN_WAVEFRONT_SIZE__ 32
 // WAVE64-DAG: #define __AMDGCN_WAVEFRONT_SIZE 64
diff --git a/clang/test/Driver/hip-toolchain-features.hip b/clang/test/Driver/hip-toolchain-features.hip
index dbc007ac1335b..1de21d9a0ff06 100644
--- a/clang/test/Driver/hip-toolchain-features.hip
+++ b/clang/test/Driver/hip-toolchain-features.hip
@@ -67,9 +67,15 @@
 // DUP-NOT: "-target-feature" "{{.*}}wavefrontsize64"
 // DUP: {{.*}}lld{{.*}} "-plugin-opt=-mattr=+cumode"
 
-// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
+// RUN: not %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
 // RUN:   -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \
 // RUN:   -mno-wavefrontsize64 -mwavefrontsize64 2>&1 \
+// RUN:   | FileCheck %s -check-prefix=WAVE64-ERR
+// WAVE64-ERR: error: '-mwavefrontsize64' is not fully supported for HIP on AMD GPU architecture gfx10 and above. Use '-mforce-unsafe-wavefrontsize64' to override
+
+// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
+// RUN:   -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \
+// RUN:   -mno-wavefrontsize64 -mforce-unsafe-wavefrontsize64 2>&1 \
 // RUN:   | FileCheck %s -check-prefix=WAVE64
 // WAVE64: {{.*}}clang{{.*}} "-target-feature" "+wavefrontsize64"
 // WAVE64-NOT: "-target-feature" "{{.*}}wavefrontsize16"

llvmbot · 2025-05-16T04:10:11Z

@llvm/pr-subscribers-backend-amdgpu

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

wavefront size 64 is not fully supported for HIP on gfx10+ and it is not tested. As such, currently HIP header contains an pragma error to diagnose such use case by detecting predefined macro __AMDGCN_WAVEFRONT_SIZE. However, since clang will remove it and replace it with a builtin function, there is no good way to diagnose such invalid use of wavefront size 64 with gfx10+ in HIP header. Therefore this patch diagnoses uses of -mwavefrontsize64 with gfx10+ in HIPAMD toolchain.

This patch also introduces option -mforce-unsafe-wavefrontsize64 to allow power users to force wavefront size 64 on gfx10+, with an assumption that they understand its risk.

Full diff: https://github.com/llvm/llvm-project/pull/140185.diff

7 Files Affected:

(modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+3)
(modified) clang/include/clang/Driver/Options.td (+2)
(modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+25-11)
(modified) clang/lib/Driver/ToolChains/AMDGPU.h (+7-2)
(modified) clang/lib/Driver/ToolChains/HIPAMD.cpp (+3)
(modified) clang/test/Driver/hip-macros.hip (+2-2)
(modified) clang/test/Driver/hip-toolchain-features.hip (+7-1)

diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 4da8f80345ddc..42c6f11e5cbdb 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -200,6 +200,9 @@ def err_drv_opt_unsupported_input_type : Error<
   "'%0' invalid for input of type %1">;
 def err_drv_amdgpu_ieee_without_no_honor_nans : Error<
   "invalid argument '-mno-amdgpu-ieee' only allowed with relaxed NaN handling">;
+def err_drv_wave64_requires_force_unsafe : Error<
+  "'-mwavefrontsize64' is not fully supported for HIP on AMD GPU architecture "
+  "gfx10 and above. Use '-mforce-unsafe-wavefrontsize64' to override">;
 def err_drv_argument_not_allowed_with : Error<
   "invalid argument '%0' not allowed with '%1'">;
 def err_drv_cannot_open_randomize_layout_seed_file : Error<
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index bd8df8f6a749a..41da9c3bff3bf 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5257,6 +5257,8 @@ defm tgsplit : SimpleMFlag<"tgsplit", "Enable", "Disable",
 defm wavefrontsize64 : SimpleMFlag<"wavefrontsize64",
   "Specify wavefront size 64", "Specify wavefront size 32",
   " mode (AMDGPU only)">;
+defm force_unsafe_wavefrontsize64 : SimpleMFlag<"force-unsafe-wavefrontsize64",
+  "Force", "Do not force", " wavefront size 64 for gfx10+ GPUs (AMDGPU only)">;
 defm amdgpu_precise_memory_op
     : SimpleMFlag<"amdgpu-precise-memory-op", "Enable", "Disable",
                   " precise memory mode (AMDGPU only)">;
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 35ca019795ddc..d7ce1ca1ca4e7 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -606,6 +606,7 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
   // Add target ID features to -target-feature options. No diagnostics should
   // be emitted here since invalid target ID is diagnosed at other places.
   StringRef TargetID;
+  StringRef GpuArch;
   if (Args.hasArg(options::OPT_mcpu_EQ))
     TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
   else if (Args.hasArg(options::OPT_march_EQ))
@@ -614,7 +615,7 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
     llvm::StringMap<bool> FeatureMap;
     auto OptionalGpuArch = parseTargetID(Triple, TargetID, &FeatureMap);
     if (OptionalGpuArch) {
-      StringRef GpuArch = *OptionalGpuArch;
+      GpuArch = *OptionalGpuArch;
       // Iterate through all possible target ID features for the given GPU.
       // If it is mapped to true, add +feature.
       // If it is mapped to false, add -feature.
@@ -628,9 +629,10 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
       }
     }
   }
-
-  if (Args.hasFlag(options::OPT_mwavefrontsize64,
-                   options::OPT_mno_wavefrontsize64, false))
+  if (toolchains::AMDGPUToolChain::isWave64(
+          D, Args, llvm::AMDGPU::parseArchAMDGCN(GpuArch),
+          /*DiagInvalid=*/false,
+          /*Explicit=*/true))
     Features.push_back("+wavefrontsize64");
 
   if (Args.hasFlag(options::OPT_mamdgpu_precise_memory_op,
@@ -767,15 +769,27 @@ llvm::DenormalMode AMDGPUToolChain::getDefaultDenormalModeForType(
                llvm::DenormalMode::getIEEE();
 }
 
-bool AMDGPUToolChain::isWave64(const llvm::opt::ArgList &DriverArgs,
-                               llvm::AMDGPU::GPUKind Kind) {
+bool AMDGPUToolChain::isWave64(const Driver &D,
+                               const llvm::opt::ArgList &DriverArgs,
+                               llvm::AMDGPU::GPUKind Kind, bool DiagInvalid,
+                               bool Explicit) {
   const unsigned ArchAttr = llvm::AMDGPU::getArchAttrAMDGCN(Kind);
   bool HasWave32 = (ArchAttr & llvm::AMDGPU::FEATURE_WAVE32);
 
-  return !HasWave32 || DriverArgs.hasFlag(
-    options::OPT_mwavefrontsize64, options::OPT_mno_wavefrontsize64, false);
-}
+  bool EnableWave64 = DriverArgs.hasFlag(
+      options::OPT_mwavefrontsize64, options::OPT_mno_wavefrontsize64, false);
+  bool ForceUnsafeWave64 =
+      DriverArgs.hasFlag(options::OPT_mforce_unsafe_wavefrontsize64,
+                         options::OPT_mno_force_unsafe_wavefrontsize64, false);
+
+  if (DiagInvalid && HasWave32 && EnableWave64 && !ForceUnsafeWave64)
+    D.Diag(diag::err_drv_wave64_requires_force_unsafe);
 
+  if (Explicit)
+    return EnableWave64 || ForceUnsafeWave64;
+
+  return !HasWave32 || EnableWave64 || ForceUnsafeWave64;
+}
 
 /// ROCM Toolchain
 ROCMToolChain::ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
@@ -884,7 +898,7 @@ void ROCMToolChain::addClangTargetOptions(
                                                 ABIVer))
     return;
 
-  bool Wave64 = isWave64(DriverArgs, Kind);
+  bool Wave64 = isWave64(getDriver(), DriverArgs, Kind);
   // TODO: There are way too many flags that change this. Do we need to check
   // them all?
   bool DAZ = DriverArgs.hasArg(options::OPT_cl_denorms_are_zero) ||
@@ -1012,7 +1026,7 @@ ROCMToolChain::getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
   bool CorrectSqrt = DriverArgs.hasFlag(
       options::OPT_fhip_fp32_correctly_rounded_divide_sqrt,
       options::OPT_fno_hip_fp32_correctly_rounded_divide_sqrt, true);
-  bool Wave64 = isWave64(DriverArgs, Kind);
+  bool Wave64 = isWave64(getDriver(), DriverArgs, Kind);
 
   // GPU Sanitizer currently only supports ASan and is enabled through host
   // ASan.
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.h b/clang/lib/Driver/ToolChains/AMDGPU.h
index 08bd4fa556f78..d9d0fac71eb91 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.h
+++ b/clang/lib/Driver/ToolChains/AMDGPU.h
@@ -89,8 +89,13 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUToolChain : public Generic_ELF {
       const llvm::opt::ArgList &DriverArgs, const JobAction &JA,
       const llvm::fltSemantics *FPType = nullptr) const override;
 
-  static bool isWave64(const llvm::opt::ArgList &DriverArgs,
-                       llvm::AMDGPU::GPUKind Kind);
+  /// Return whether the GPU of kind \p Kind has wavefront size 64 by driver
+  /// arguments \p DriverArgs. When \p Explicit is true, only return true
+  /// if wavefront size is set to 64 by arguments explicitly. When
+  /// \p DiagInvalid is true, diagnose invalid -mwavefrontsize64 arguments.
+  static bool isWave64(const Driver &D, const llvm::opt::ArgList &DriverArgs,
+                       llvm::AMDGPU::GPUKind Kind, bool DiagInvalid = false,
+                       bool Explicit = false);
   /// Needed for using lto.
   bool HasNativeLLVMSupport() const override {
     return true;
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index abda4eb453387..44f1a3e915283 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -229,6 +229,9 @@ void HIPAMDToolChain::addClangTargetOptions(
 
   assert(DeviceOffloadingKind == Action::OFK_HIP &&
          "Only HIP offloading kinds are supported for GPUs.");
+  const StringRef GpuArch = getGPUArch(DriverArgs);
+  auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
+  (void)isWave64(getDriver(), DriverArgs, Kind, /*DiagInvalid=*/true);
 
   CC1Args.append({"-fcuda-is-device", "-fno-threadsafe-statics"});
 
diff --git a/clang/test/Driver/hip-macros.hip b/clang/test/Driver/hip-macros.hip
index bd93f9985a774..37fdafc53af85 100644
--- a/clang/test/Driver/hip-macros.hip
+++ b/clang/test/Driver/hip-macros.hip
@@ -2,7 +2,7 @@
 // RUN: %clang -E -dM --offload-arch=gfx906 -mwavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
 // RUN:   %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
-// RUN: %clang -E -dM --offload-arch=gfx1010 -mwavefrontsize64 \
+// RUN: %clang -E -dM --offload-arch=gfx1010 -mforce-unsafe-wavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
 // RUN:   %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // RUN: %clang -E -dM --offload-arch=gfx906 -mwavefrontsize64 \
@@ -16,7 +16,7 @@
 // RUN:   -mwavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // RUN: %clang -E -dM --offload-arch=gfx1010 -mno-wavefrontsize64 \
 // RUN:   --cuda-device-only -nogpuinc -nogpulib \
-// RUN:   -mwavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
+// RUN:   -mforce-unsafe-wavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s
 // WAVE64-DAG: #define __AMDGCN_WAVEFRONT_SIZE__ 64
 // WAVE32-DAG: #define __AMDGCN_WAVEFRONT_SIZE__ 32
 // WAVE64-DAG: #define __AMDGCN_WAVEFRONT_SIZE 64
diff --git a/clang/test/Driver/hip-toolchain-features.hip b/clang/test/Driver/hip-toolchain-features.hip
index dbc007ac1335b..1de21d9a0ff06 100644
--- a/clang/test/Driver/hip-toolchain-features.hip
+++ b/clang/test/Driver/hip-toolchain-features.hip
@@ -67,9 +67,15 @@
 // DUP-NOT: "-target-feature" "{{.*}}wavefrontsize64"
 // DUP: {{.*}}lld{{.*}} "-plugin-opt=-mattr=+cumode"
 
-// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
+// RUN: not %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
 // RUN:   -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \
 // RUN:   -mno-wavefrontsize64 -mwavefrontsize64 2>&1 \
+// RUN:   | FileCheck %s -check-prefix=WAVE64-ERR
+// WAVE64-ERR: error: '-mwavefrontsize64' is not fully supported for HIP on AMD GPU architecture gfx10 and above. Use '-mforce-unsafe-wavefrontsize64' to override
+
+// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
+// RUN:   -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \
+// RUN:   -mno-wavefrontsize64 -mforce-unsafe-wavefrontsize64 2>&1 \
 // RUN:   | FileCheck %s -check-prefix=WAVE64
 // WAVE64: {{.*}}clang{{.*}} "-target-feature" "+wavefrontsize64"
 // WAVE64-NOT: "-target-feature" "{{.*}}wavefrontsize16"

arsenm · 2025-05-16T07:42:33Z

I don't think we should be introducing an additional option that behaves identically to another option just differing by a warning. This also isn't really a problem with the language, but the library support. The actual codegen should work just fine. Is there some other existing mechanism to just silence the warning?

yxsamliu · 2025-05-16T13:00:49Z

I don't think we should be introducing an additional option that behaves identically to another option just differing by a warning. This also isn't really a problem with the language, but the library support. The actual codegen should work just fine. Is there some other existing mechanism to just silence the warning?

We could emit a warning for -mwavefrontsize64 for HIP on gfx10+, which can be disabled by -Wno-unsupported-wave64. Then we do not need introduce -mforce-unsafe-wavefrontsize64.

@b-sumner are you OK with this?

b-sumner · 2025-05-16T14:06:07Z

I don't think we should be introducing an additional option that behaves identically to another option just differing by a warning. This also isn't really a problem with the language, but the library support. The actual codegen should work just fine. Is there some other existing mechanism to just silence the warning?

We could emit a warning for -mwavefrontsize64 for HIP on gfx10+, which can be disabled by -Wno-unsupported-wave64. Then we do not need introduce -mforce-unsafe-wavefrontsize64.

@b-sumner are you OK with this?

I am not. I think we need to emit an error for -mwavefrontsize64 for HIP on gfx10+. If we do this as if -Werror=unsafe-wavefrontsize64 is the default, then users can use -Wno-error=unsafe-wavefrontsize64 I suppose. This seems like a weird use of -Wno-error though.

yxsamliu · 2025-05-16T14:58:45Z

I don't think we should be introducing an additional option that behaves identically to another option just differing by a warning. This also isn't really a problem with the language, but the library support. The actual codegen should work just fine. Is there some other existing mechanism to just silence the warning?

We could emit a warning for -mwavefrontsize64 for HIP on gfx10+, which can be disabled by -Wno-unsupported-wave64. Then we do not need introduce -mforce-unsafe-wavefrontsize64.
@b-sumner are you OK with this?

I am not. I think we need to emit an error for -mwavefrontsize64 for HIP on gfx10+. If we do this as if -Werror=unsafe-wavefrontsize64 is the default, then users can use -Wno-error=unsafe-wavefrontsize64 I suppose. This seems like a weird use of -Wno-error though.

We can make the warning an error by default, and users can use -Wno-error=unsafe-wavefrontsize64 to disable it. There is precedence of this kind of use case. Is that OK?

b-sumner · 2025-05-16T15:20:00Z

I don't think we should be introducing an additional option that behaves identically to another option just differing by a warning. This also isn't really a problem with the language, but the library support. The actual codegen should work just fine. Is there some other existing mechanism to just silence the warning?

We could emit a warning for -mwavefrontsize64 for HIP on gfx10+, which can be disabled by -Wno-unsupported-wave64. Then we do not need introduce -mforce-unsafe-wavefrontsize64.
@b-sumner are you OK with this?

I am not. I think we need to emit an error for -mwavefrontsize64 for HIP on gfx10+. If we do this as if -Werror=unsafe-wavefrontsize64 is the default, then users can use -Wno-error=unsafe-wavefrontsize64 I suppose. This seems like a weird use of -Wno-error though.

We can make the warning an error by default, and users can use -Wno-error=unsafe-wavefrontsize64 to disable it. There is precedence of this kind of use case. Is that OK?

That sounds good to me!

arsenm · 2025-05-16T15:28:00Z

I am not ok with this being an error by default.

arsenm · 2025-05-16T15:31:07Z

It's really a library problem, not a codegen issue. The library should still be emitting the warning. You should be able to write wave64 code unperturbed due to hip standard library bugs

b-sumner · 2025-05-16T15:43:45Z

I am not ok with this being an error by default.

I think compilation must fail when an unsupported and untested option is specified. And that includes when the target is SPIR-V.

arsenm · 2025-05-16T17:31:25Z

I am not ok with this being an error by default.

I think compilation must fail when an unsupported and untested option is specified. And that includes when the target is SPIR-V.

I would call it a supported and used option. The unsupported and untested pieces are in the HIP headers, and bugs in the downstream libraries should not infect the compiler. Wave64 codegen should always work. Regressing the option which already exists to require jumping through hoops to continue using is unreasonable.

The option should not work for SPIRV, since it doesn't apply for the machine.

In short I don't think the compiler should be doing anything here. The warning should be isolated to the problematic HIP headers

yxsamliu · 2025-05-20T17:10:36Z

I am not ok with this being an error by default.

I think compilation must fail when an unsupported and untested option is specified. And that includes when the target is SPIR-V.

I would call it a supported and used option. The unsupported and untested pieces are in the HIP headers, and bugs in the downstream libraries should not infect the compiler. Wave64 codegen should always work. Regressing the option which already exists to require jumping through hoops to continue using is unreasonable.

The option should not work for SPIRV, since it doesn't apply for the machine.

In short I don't think the compiler should be doing anything here. The warning should be isolated to the problematic HIP headers

In this case, we intentionally removed predefined macros for wave 64 so that it is impossible to know it is wave 64 in HIP headers at compilation time. It is also too expensive to do assert at run time in device functions that are not working with wave 64. The only feasible way to diagnose that seems to be in compiler, and we could instead of saying "amdgpu target does not work with wave 64", we could say "HIP is not supported with wave 64 due to lack of runtime support of certain device functions".

yxsamliu requested review from arsenm, Artem-B, b-sumner and AlexVlx May 16, 2025 04:09

llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 16, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[HIP] diagnose -mwavefrontsize64 for gfx10+ #140185

[HIP] diagnose -mwavefrontsize64 for gfx10+ #140185

yxsamliu commented May 16, 2025

Uh oh!

llvmbot commented May 16, 2025 •

edited

Loading

Uh oh!

llvmbot commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

yxsamliu commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

yxsamliu commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

yxsamliu commented May 20, 2025

Uh oh!

Uh oh!

[HIP] diagnose -mwavefrontsize64 for gfx10+ #140185

Are you sure you want to change the base?

[HIP] diagnose -mwavefrontsize64 for gfx10+ #140185

Conversation

yxsamliu commented May 16, 2025

Uh oh!

llvmbot commented May 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

yxsamliu commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

yxsamliu commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

b-sumner commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

yxsamliu commented May 20, 2025

Uh oh!

Uh oh!

llvmbot commented May 16, 2025 •

edited

Loading