Skip to content

[CUDA] Increment VTable index for device thunks #124989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 20, 2025

Conversation

gandhi56
Copy link
Contributor

@gandhi56 gandhi56 commented Jan 29, 2025

Currently, the clang frontend incorrectly emits the callee instead of the thunk for the callee in the VTable. This is the case because the thunk index is not incremented when their callees cannot be emitted. This patch fixes this bug.

@gandhi56 gandhi56 requested review from AlexVlx and yxsamliu January 29, 2025 21:20
@gandhi56 gandhi56 marked this pull request as ready for review January 29, 2025 21:22
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jan 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 29, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Anshil Gandhi (gandhi56)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/124989.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGVTables.cpp (+9-4)
  • (added) clang/test/CodeGenCUDA/increment-index-for-thunks.cu (+41)
diff --git a/clang/lib/CodeGen/CGVTables.cpp b/clang/lib/CodeGen/CGVTables.cpp
index 7f729d359b82b3..c5c4f522de94da 100644
--- a/clang/lib/CodeGen/CGVTables.cpp
+++ b/clang/lib/CodeGen/CGVTables.cpp
@@ -772,6 +772,10 @@ void CodeGenVTables::addVTableComponent(ConstantArrayBuilder &builder,
   case VTableComponent::CK_DeletingDtorPointer: {
     GlobalDecl GD = component.getGlobalDecl();
 
+    const bool IsThunk =
+        nextVTableThunkIndex < layout.vtable_thunks().size() &&
+        layout.vtable_thunks()[nextVTableThunkIndex].first == componentIndex;
+
     if (CGM.getLangOpts().CUDA) {
       // Emit NULL for methods we can't codegen on this
       // side. Otherwise we'd end up with vtable with unresolved
@@ -783,9 +787,12 @@ void CodeGenVTables::addVTableComponent(ConstantArrayBuilder &builder,
           CGM.getLangOpts().CUDAIsDevice
               ? MD->hasAttr<CUDADeviceAttr>()
               : (MD->hasAttr<CUDAHostAttr>() || !MD->hasAttr<CUDADeviceAttr>());
-      if (!CanEmitMethod)
+      if (!CanEmitMethod) {
+        if (IsThunk)
+          nextVTableThunkIndex++;
         return builder.add(
             llvm::ConstantExpr::getNullValue(CGM.GlobalsInt8PtrTy));
+      }
       // Method is acceptable, continue processing as usual.
     }
 
@@ -831,9 +838,7 @@ void CodeGenVTables::addVTableComponent(ConstantArrayBuilder &builder,
       fnPtr = DeletedVirtualFn;
 
     // Thunks.
-    } else if (nextVTableThunkIndex < layout.vtable_thunks().size() &&
-               layout.vtable_thunks()[nextVTableThunkIndex].first ==
-                   componentIndex) {
+    } else if (IsThunk) {
       auto &thunkInfo = layout.vtable_thunks()[nextVTableThunkIndex].second;
 
       nextVTableThunkIndex++;
diff --git a/clang/test/CodeGenCUDA/increment-index-for-thunks.cu b/clang/test/CodeGenCUDA/increment-index-for-thunks.cu
new file mode 100644
index 00000000000000..80631338dbdb8a
--- /dev/null
+++ b/clang/test/CodeGenCUDA/increment-index-for-thunks.cu
@@ -0,0 +1,41 @@
+// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -target-cpu gfx942 \
+// RUN:   -emit-llvm -xhip %s -o - | FileCheck %s --check-prefix=GCN
+// RUN: %clang_cc1 -fcuda-is-device -triple spirv64-amd-amdhsa \
+// RUN:   -emit-llvm -xhip %s -o - | FileCheck %s --check-prefix=SPIRV
+
+// GCN: @_ZTV1C = linkonce_odr unnamed_addr addrspace(1) constant { [5 x ptr addrspace(1)], [4 x ptr addrspace(1)] } { [5 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr @_ZN1B2f2Ev to ptr addrspace(1)), ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr @_ZN1C2f1Ev to ptr addrspace(1))], [4 x ptr addrspace(1)] [ptr addrspace(1) inttoptr (i64 -8 to ptr addrspace(1)), ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr @_ZThn8_N1C2f1Ev to ptr addrspace(1))] }, comdat, align 8
+// GCN: @_ZTV1B = linkonce_odr unnamed_addr addrspace(1) constant { [3 x ptr addrspace(1)] } { [3 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr @_ZN1B2f2Ev to ptr addrspace(1))] }, comdat, align 8
+// GCN: @_ZTV1A = linkonce_odr unnamed_addr addrspace(1) constant { [4 x ptr addrspace(1)] } { [4 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr @__cxa_pure_virtual to ptr addrspace(1))] }, comdat, align 8
+// GCN: @__hip_cuid_ = addrspace(1) global i8 0
+// GCN: @llvm.compiler.used = appending addrspace(1) global [1 x ptr] [ptr addrspacecast (ptr addrspace(1) @__hip_cuid_ to ptr)], section "llvm.metadata"
+// GCN: @__oclc_ABI_version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 500
+
+// SPIRV: @_ZTV1C = linkonce_odr unnamed_addr addrspace(1) constant { [5 x ptr addrspace(1)], [4 x ptr addrspace(1)] } { [5 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr addrspace(4) @_ZN1B2f2Ev to ptr addrspace(1)), ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr addrspace(4) @_ZN1C2f1Ev to ptr addrspace(1))], [4 x ptr addrspace(1)] [ptr addrspace(1) inttoptr (i64 -8 to ptr addrspace(1)), ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr addrspace(4) @_ZThn8_N1C2f1Ev to ptr addrspace(1))] }, comdat, align 8
+// SPIRV: @_ZTV1B = linkonce_odr unnamed_addr addrspace(1) constant { [3 x ptr addrspace(1)] } { [3 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr addrspace(4) @_ZN1B2f2Ev to ptr addrspace(1))] }, comdat, align 8
+// SPIRV: @_ZTV1A = linkonce_odr unnamed_addr addrspace(1) constant { [4 x ptr addrspace(1)] } { [4 x ptr addrspace(1)] [ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) addrspacecast (ptr addrspace(4) @__cxa_pure_virtual to ptr addrspace(1))] }, comdat, align 8
+// SPIRV: @__hip_cuid_ = addrspace(1) global i8 0
+// SPIRV: @llvm.compiler.used = appending addrspace(1) global [1 x ptr addrspace(4)] [ptr addrspace(4) addrspacecast (ptr addrspace(1) @__hip_cuid_ to ptr addrspace(4))], section "llvm.metadata"
+
+struct A {
+  __attribute__((device)) A() { }
+  virtual void neither_device_nor_host_f() = 0 ;
+  __attribute__((device)) virtual void f1() = 0;
+ 
+};
+ 
+struct B {
+  __attribute__((device)) B() { }
+  __attribute__((device)) virtual void f2() { };
+};
+ 
+struct C : public B, public A {
+  __attribute__((device)) C() : B(), A() { }
+ 
+   virtual void neither_device_nor_host_f() override { }
+  __attribute__((device)) virtual void f1() override { }
+ 
+};
+ 
+__attribute__((device)) void test() {
+  C obj;
+}

@gandhi56 gandhi56 requested a review from bcahoon January 29, 2025 21:26
@bcahoon bcahoon requested a review from Artem-B January 31, 2025 02:46
@yxsamliu
Copy link
Collaborator

yxsamliu commented Feb 4, 2025

_ No description provided. _

Pls add a commit message about the issue this PR is addressing and a summary of what it does.

@gandhi56 gandhi56 self-assigned this Feb 4, 2025
@Artem-B
Copy link
Member

Artem-B commented Feb 4, 2025

I'm out of my depth here and will leave it up to @yxsamliu.

@Artem-B Artem-B removed their request for review February 4, 2025 18:46
@gandhi56 gandhi56 requested a review from yxsamliu February 10, 2025 14:44
@yxsamliu yxsamliu requested a review from rjmccall February 10, 2025 18:04
@yxsamliu
Copy link
Collaborator

I'm out of my depth here and will leave it up to @yxsamliu.

It seems reasonable to me, but I would like @rjmccall to take a look.

@gandhi56
Copy link
Contributor Author

Internal tests have passed.

@gandhi56
Copy link
Contributor Author

ping

Copy link
Collaborator

@yxsamliu yxsamliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@gandhi56
Copy link
Contributor Author

Appreciate the review, @yxsamliu

gandhi56 added a commit that referenced this pull request Feb 20, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 20, 2025
Currently, the clang frontend incorrectly emits
the callee instead of the thunk for the callee
in the VTable. This is the case because the
thunk index is not incremented when their
callees cannot be emitted. This patch fixes
this bug.
@gandhi56 gandhi56 force-pushed the chai/increment-vtable-idx-for-thunks branch from c5cab77 to 8783d00 Compare February 20, 2025 04:44
@gandhi56 gandhi56 merged commit 95000fd into llvm:main Feb 20, 2025
5 of 7 checks passed
@gandhi56 gandhi56 deleted the chai/increment-vtable-idx-for-thunks branch February 20, 2025 15:47
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Mar 26, 2025
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Mar 26, 2025
Currently, the clang frontend incorrectly emits the callee instead of
the thunk for the callee in the VTable. This is the case because the
thunk index is not incremented when their callees cannot be emitted.
This patch fixes the bug.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Mar 26, 2025
rocm-ci pushed a commit to ROCm/llvm-project that referenced this pull request Apr 11, 2025
rocm-ci pushed a commit to ROCm/llvm-project that referenced this pull request Apr 11, 2025
Currently, the clang frontend incorrectly emits the callee instead of
the thunk for the callee in the VTable. This is the case because the
thunk index is not incremented when their callees cannot be emitted.
This patch fixes the bug.
rocm-ci pushed a commit to ROCm/llvm-project that referenced this pull request Apr 11, 2025
jrbyrnes pushed a commit to jrbyrnes/llvm-project that referenced this pull request May 27, 2025
Currently, the clang frontend incorrectly emits the callee instead of
the thunk for the callee in the VTable. This is the case because the
thunk index is not incremented when their callees cannot be emitted.
This patch fixes the bug.
jrbyrnes pushed a commit to jrbyrnes/llvm-project that referenced this pull request May 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants