Skip to content

[CUDA][HIP] make trivial ctor/dtor host device #72394

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 16, 2023

Conversation

yxsamliu
Copy link
Collaborator

Make trivial ctor/dtor implicitly host device functions so that they can be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: #72261

Fixes: SWDEV-432412

@yxsamliu yxsamliu requested a review from Artem-B November 15, 2023 14:28
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 15, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 15, 2023

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

Make trivial ctor/dtor implicitly host device functions so that they can be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: #72261

Fixes: SWDEV-432412


Full diff: https://github.com/llvm/llvm-project/pull/72394.diff

10 Files Affected:

  • (modified) clang/include/clang/Sema/Sema.h (+4)
  • (modified) clang/lib/Sema/SemaCUDA.cpp (+20)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+3)
  • (modified) clang/test/SemaCUDA/call-host-fn-from-device.cu (+1-1)
  • (modified) clang/test/SemaCUDA/default-ctor.cu (+1-1)
  • (modified) clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu (+2-1)
  • (modified) clang/test/SemaCUDA/implicit-member-target-collision.cu (+2-1)
  • (modified) clang/test/SemaCUDA/implicit-member-target-inherited.cu (+6-4)
  • (modified) clang/test/SemaCUDA/implicit-member-target.cu (+4-2)
  • (added) clang/test/SemaCUDA/trivial-ctor-dtor.cu (+40)
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index a35a3c2c26c22ad..44dcbbf7605a557 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -13450,6 +13450,10 @@ class Sema final {
   void maybeAddCUDAHostDeviceAttrs(FunctionDecl *FD,
                                    const LookupResult &Previous);
 
+  /// May add implicit CUDAHostAttr and CUDADeviceAttr attributes to a
+  /// trivial cotr/dtor that does not have host and device attributes.
+  void maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FunctionDecl *FD);
+
   /// May add implicit CUDAConstantAttr attribute to VD, depending on VD
   /// and current compilation settings.
   void MaybeAddCUDAConstantAttr(VarDecl *VD);
diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index 318174f7be8fa95..c376ab56dbef0e8 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -772,6 +772,26 @@ void Sema::maybeAddCUDAHostDeviceAttrs(FunctionDecl *NewD,
   NewD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
 }
 
+// If a trivial ctor/dtor has no host/device
+// attributes, make it implicitly host device function.
+void Sema::maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FunctionDecl *FD) {
+  auto IsTrivialCtor = [&](auto *D) {
+    if (auto *CD = dyn_cast<CXXConstructorDecl>(D))
+      return isEmptyCudaConstructor(SourceLocation(), CD);
+    return false;
+  };
+  auto IsTrivialDtor = [&](auto *D) {
+    if (auto *DD = dyn_cast<CXXDestructorDecl>(D))
+      return isEmptyCudaDestructor(SourceLocation(), DD);
+    return false;
+  };
+  if ((IsTrivialCtor(FD) || IsTrivialDtor(FD)) &&
+      !FD->hasAttr<CUDAHostAttr>() && !FD->hasAttr<CUDADeviceAttr>()) {
+    FD->addAttr(CUDAHostAttr::CreateImplicit(Context));
+    FD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+  }
+}
+
 // TODO: `__constant__` memory may be a limited resource for certain targets.
 // A safeguard may be needed at the end of compilation pipeline if
 // `__constant__` memory usage goes beyond limit.
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 3876eb501083acb..a6cd0bb9ea2a829 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -16232,6 +16232,9 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
   if (FD && !FD->isDeleted())
     checkTypeSupport(FD->getType(), FD->getLocation(), FD);
 
+  if (LangOpts.CUDA)
+    maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FD);
+
   return dcl;
 }
 
diff --git a/clang/test/SemaCUDA/call-host-fn-from-device.cu b/clang/test/SemaCUDA/call-host-fn-from-device.cu
index acdd291b664579b..203f4fcbdf1efa0 100644
--- a/clang/test/SemaCUDA/call-host-fn-from-device.cu
+++ b/clang/test/SemaCUDA/call-host-fn-from-device.cu
@@ -12,7 +12,7 @@ extern "C" void host_fn() {}
 struct Dummy {};
 
 struct S {
-  S() {}
+  S() { x = 1; }
   // expected-note@-1 2 {{'S' declared here}}
   ~S() { host_fn(); }
   // expected-note@-1 {{'~S' declared here}}
diff --git a/clang/test/SemaCUDA/default-ctor.cu b/clang/test/SemaCUDA/default-ctor.cu
index cbad7a1774c1501..31971fe6b3863c7 100644
--- a/clang/test/SemaCUDA/default-ctor.cu
+++ b/clang/test/SemaCUDA/default-ctor.cu
@@ -25,7 +25,7 @@ __device__ void fd() {
   InD ind;
   InH inh; // expected-error{{no matching constructor for initialization of 'InH'}}
   InHD inhd;
-  Out out; // expected-error{{no matching constructor for initialization of 'Out'}}
+  Out out;
   OutD outd;
   OutH outh; // expected-error{{no matching constructor for initialization of 'OutH'}}
   OutHD outhd;
diff --git a/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu b/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
index 06015ed0d6d8edc..0ee2e0963e40d59 100644
--- a/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
+++ b/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
@@ -6,7 +6,8 @@
 // Test 1: collision between two bases
 
 struct A1_with_host_ctor {
-  A1_with_host_ctor() {}
+  int x;
+  A1_with_host_ctor() { x = 1; }
 };
 
 struct B1_with_device_ctor {
diff --git a/clang/test/SemaCUDA/implicit-member-target-collision.cu b/clang/test/SemaCUDA/implicit-member-target-collision.cu
index a50fddaa4615b22..060443c639924fb 100644
--- a/clang/test/SemaCUDA/implicit-member-target-collision.cu
+++ b/clang/test/SemaCUDA/implicit-member-target-collision.cu
@@ -6,7 +6,8 @@
 // Test 1: collision between two bases
 
 struct A1_with_host_ctor {
-  A1_with_host_ctor() {}
+  int x;
+  A1_with_host_ctor() { int x = 1; }
 };
 
 struct B1_with_device_ctor {
diff --git a/clang/test/SemaCUDA/implicit-member-target-inherited.cu b/clang/test/SemaCUDA/implicit-member-target-inherited.cu
index 2178172ed01930d..8784135c0d6b66e 100644
--- a/clang/test/SemaCUDA/implicit-member-target-inherited.cu
+++ b/clang/test/SemaCUDA/implicit-member-target-inherited.cu
@@ -6,10 +6,11 @@
 // Test 1: infer inherited default ctor to be host.
 
 struct A1_with_host_ctor {
-  A1_with_host_ctor() {}
+  A1_with_host_ctor() { x = 1; }
+  int x;
 };
-// expected-note@-3 {{candidate constructor (the implicit copy constructor) not viable}}
-// expected-note@-4 {{candidate constructor (the implicit move constructor) not viable}}
+// expected-note@-4 {{candidate constructor (the implicit copy constructor) not viable}}
+// expected-note@-5 {{candidate constructor (the implicit move constructor) not viable}}
 
 // The inherited default constructor is inferred to be host, so we'll encounter
 // an error when calling it from a __device__ function, but not from a __host__
@@ -83,7 +84,8 @@ void hostfoo3() {
 // Test 4: infer inherited default ctor from a field, not a base
 
 struct A4_with_host_ctor {
-  A4_with_host_ctor() {}
+  int x;
+  A4_with_host_ctor() { int x = 1; }
 };
 
 struct B4_with_inherited_host_ctor : A4_with_host_ctor{
diff --git a/clang/test/SemaCUDA/implicit-member-target.cu b/clang/test/SemaCUDA/implicit-member-target.cu
index d87e69624043419..2d260c64636ac84 100644
--- a/clang/test/SemaCUDA/implicit-member-target.cu
+++ b/clang/test/SemaCUDA/implicit-member-target.cu
@@ -6,7 +6,8 @@
 // Test 1: infer default ctor to be host.
 
 struct A1_with_host_ctor {
-  A1_with_host_ctor() {}
+  int x;
+  A1_with_host_ctor() { x = 1; }
 };
 
 // The implicit default constructor is inferred to be host because it only needs
@@ -75,7 +76,8 @@ void hostfoo3() {
 // Test 4: infer default ctor from a field, not a base
 
 struct A4_with_host_ctor {
-  A4_with_host_ctor() {}
+  int x;
+  A4_with_host_ctor() { int x = 1; }
 };
 
 struct B4_with_implicit_default_ctor {
diff --git a/clang/test/SemaCUDA/trivial-ctor-dtor.cu b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
new file mode 100644
index 000000000000000..c7c0d33fe4c2d2e
--- /dev/null
+++ b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
@@ -0,0 +1,40 @@
+// RUN: %clang_cc1 -isystem %S/Inputs  -fsyntax-only -verify %s
+// RUN: %clang_cc1 -isystem %S/Inputs -fcuda-is-device -fsyntax-only -verify %s
+
+#include <cuda.h>
+
+// Check trivial ctor/dtor
+struct A {
+  int x;
+  A() {}
+  ~A() {}
+};
+
+__device__ A a;
+
+// Check trivial ctor/dtor of template class
+template<typename T>
+struct TA {
+  T x;
+  TA() {}
+  ~TA() {}
+};
+
+__device__ TA<int> ta;
+
+// Check non-trivial ctor/dtor in parent template class
+template<typename T>
+struct TB {
+  T x;
+  TB() { x = 1; }
+  ~TB() {}
+};
+
+template<typename T>
+struct TC : TB<T> {
+  T x;
+  TC() {}
+  ~TC() {}
+};
+
+__device__ TC<int> tc; //expected-error {{dynamic initialization is not supported for __device__, __constant__, __shared__, and __managed__ variables}}

Copy link
Member

@Artem-B Artem-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of nits.

Make trivial ctor/dtor implicitly host device functions
so that they can be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: llvm#72261

Fixes: SWDEV-432412
@yxsamliu yxsamliu merged commit 27e6e4a into llvm:main Nov 16, 2023
@Artem-B
Copy link
Member

Artem-B commented Nov 19, 2023

We've found a problem with the patch. https://godbolt.org/z/jcKo34vzG

template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
<source>:6:21: error: __host__ function 'C' cannot overload __host__ __device__ function 'C'
    6 | template <> C<int>::C() {};
      |                     ^
<source>:3:14: note: previous declaration is here
    3 |     explicit C() {};

@yxsamliu
Copy link
Collaborator Author

We've found a problem with the patch. https://godbolt.org/z/jcKo34vzG

template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
<source>:6:21: error: __host__ function 'C' cannot overload __host__ __device__ function 'C'
    6 | template <> C<int>::C() {};
      |                     ^
<source>:3:14: note: previous declaration is here
    3 |     explicit C() {};

I will take a look. Thanks.

@yxsamliu
Copy link
Collaborator Author

We've found a problem with the patch. https://godbolt.org/z/jcKo34vzG

template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
<source>:6:21: error: __host__ function 'C' cannot overload __host__ __device__ function 'C'
    6 | template <> C<int>::C() {};
      |                     ^
<source>:3:14: note: previous declaration is here
    3 |     explicit C() {};

fix by #72815

sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
Make trivial ctor/dtor implicitly host device functions so that they can
be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: llvm#72261

Fixes: SWDEV-432412
yxsamliu added a commit to yxsamliu/llvm-project that referenced this pull request Nov 20, 2023
When deciding whether a previous function declaration is an
overload or override, implicit host/device attrs should
not be considered.

This fixes the failure for the following code:

`template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
`

The issue was introduced by llvm#72394

sine the template specialization is treated as overload
due to implicit host/device attrs are considered for
overload/override differentiation.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
Make trivial ctor/dtor implicitly host device functions so that they can
be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: llvm#72261

Fixes: SWDEV-432412
yxsamliu added a commit that referenced this pull request Nov 20, 2023
When deciding whether a previous function declaration is an overload or
override, implicit host/device attrs should not be considered.

This fixes the failure for the following code:

`template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
`

The issue was introduced by
#72394

sine the template specialization is treated as overload due to implicit
host/device attrs are considered for overload/override differentiation.
@bgra8
Copy link
Contributor

bgra8 commented Nov 21, 2023

@yxsamliu we found another problem with the patch.

https://godbolt.org/z/5M9eexKKM

#include <memory>

template <class T>
class Abc {
 public:
  Abc();
  ~Abc();

 private:
  struct Impl;
  std::unique_ptr<Impl> impl_;
};

template <class T>
struct Abc<T>::Impl {
    int x;
};

template <class T>
Abc<T>::Abc() : impl_{std::make_unique<Impl>()} {}

template <class T>
Abc<T>::~Abc() {}

template class Abc<int>;
<source>:23:9: error: reference to __host__ function '~unique_ptr' in __host__ __device__ function
   23 | Abc<T>::~Abc() {}
      |         ^
<source>:25:16: note: in instantiation of member function 'Abc<int>::~Abc' requested here
   25 | template class Abc<int>;
      |                ^
/opt/compiler-explorer/clang-trunk-20231121/bin/../include/c++/v1/__memory/unique_ptr.h:263:59: note: '~unique_ptr' declared here
  263 |   _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_SINCE_CXX23 ~unique_ptr() { reset(); }
      |                                                           ^
1 error generated when compiling for sm_86.
Compiler returned: 1

@yxsamliu
Copy link
Collaborator Author

Thanks for reporting. A reduced testcase is https://godbolt.org/z/MY84az9xh

`template
struct ptr {
~ptr() { static int x = 1;}
};

template
struct Abc : ptr {
public:
Abc();
~Abc() {}
};

template
class Abc;
`

clang thinks Abc::~Abc() is trivial but it is not. It could be due to clang does not check base class for templates. Probably should only make instantiated ctor/dtor implicit.

@alexfh
Copy link
Contributor

alexfh commented Nov 22, 2023

@yxsamliu What's the plan here? This issue is blocking us. If there is no obvious fix very soon, we need to revert this.

@yxsamliu
Copy link
Collaborator Author

@yxsamliu What's the plan here? This issue is blocking us. If there is no obvious fix very soon, we need to revert this.

I will fix it.

@yxsamliu
Copy link
Collaborator Author

@yxsamliu What's the plan here? This issue is blocking us. If there is no obvious fix very soon, we need to revert this.

fix by #73140

yxsamliu added a commit that referenced this pull request Nov 23, 2023
This reverts commit 27e6e4a.

This patch is reverted due to regression. A testcase is:

`template <class T>
struct ptr {
    ~ptr() { static int x = 1;}
};

template <class T>
struct Abc : ptr<T> {
 public:
  Abc();
  ~Abc() {}
};

template
class Abc<int>;
`
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Nov 27, 2023
Make trivial ctor/dtor implicitly host device functions
so that they can be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: llvm#72261

Fixes: SWDEV-432412

cherry-pick of: llvm#72394

Change-Id: Ieac37f8fa35e035a84bae9c5d29c6f5acab6a766
rocm-ci pushed a commit to ROCm/llvm-project that referenced this pull request Dec 15, 2023
Make trivial ctor/dtor implicitly host device functions
so that they can be used to initialize file-scope
device variables to match nvcc behavior.

Fixes: llvm#72261

Fixes: SWDEV-432412

cherry-pick of: llvm#72394

[CUDA][HIP] ignore implicit host/device attr for override

When deciding whether a previous function declaration is an
overload or override, implicit host/device attrs should
not be considered.

This fixes the failure for the following code:

`template <typename T>
class C {
    explicit C() {};
};

template <> C<int>::C() {};
`

The issue was introduced by llvm#72394

sine the template specialization is treated as overload
due to implicit host/device attrs are considered for
overload/override differentiation.

cherry-pick of llvm#72815

Change-Id: Ie896cc0e4d5d82d5af48e08a996a3b392285d9ee
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CUDA][HIP] file-scope device variable not allowed with trivial ctor
5 participants