Skip to content

[LoongArch] Align stack objects passed to memory intrinsics #101309

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 2, 2024

Conversation

heiher
Copy link
Member

@heiher heiher commented Jul 31, 2024

Memcpy, and other memory intrinsics, typically try to use wider load/store if the source and destination addresses are aligned. In CodeGenPrepare, look for calls to memory intrinsics and, if the object is on the stack, align it to 4-byte (32-bit) or 8-byte (64-bit) boundaries if it is large enough that we expect memcpy to use wider load/store instructions to copy it.

Fixes #101295

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" backend:loongarch labels Jul 31, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 31, 2024

@llvm/pr-subscribers-backend-loongarch

@llvm/pr-subscribers-clang

Author: hev (heiher)

Changes

Fixes #101295


Full diff: https://github.com/llvm/llvm-project/pull/101309.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Targets/LoongArch.cpp (+17)
  • (modified) clang/lib/Basic/Targets/LoongArch.h (+3)
  • (modified) clang/test/CodeGen/LoongArch/align.c (+10-10)
diff --git a/clang/lib/Basic/Targets/LoongArch.cpp b/clang/lib/Basic/Targets/LoongArch.cpp
index cb3fd12c48ddb..63c2ed987442f 100644
--- a/clang/lib/Basic/Targets/LoongArch.cpp
+++ b/clang/lib/Basic/Targets/LoongArch.cpp
@@ -118,6 +118,23 @@ LoongArchTargetInfo::getGCCRegAliases() const {
   return llvm::ArrayRef(GCCRegAliases);
 }
 
+unsigned LoongArchTargetInfo::getMinGlobalAlign(uint64_t TypeSize,
+                                                bool HasNonWeakDef) const {
+  unsigned Align = TargetInfo::getMinGlobalAlign(TypeSize, HasNonWeakDef);
+
+  if (HasFeatureLASX && TypeSize >= 512) {       // TypeSize >= 64 bytes
+    Align = std::max(Align, 256u);               // align type at least 32 bytes
+  } else if (HasFeatureLSX && TypeSize >= 256) { // TypeSize >= 32 bytes
+    Align = std::max(Align, 128u);               // align type at least 16 bytes
+  } else if (TypeSize >= 64) {                   // TypeSize >= 8 bytes
+    Align = std::max(Align, 64u);                // align type at least 8 butes
+  } else if (TypeSize >= 16) {                   // TypeSize >= 2 bytes
+    Align = std::max(Align, 32u);                // align type at least 4 bytes
+  }
+
+  return Align;
+}
+
 bool LoongArchTargetInfo::validateAsmConstraint(
     const char *&Name, TargetInfo::ConstraintInfo &Info) const {
   // See the GCC definitions here:
diff --git a/clang/lib/Basic/Targets/LoongArch.h b/clang/lib/Basic/Targets/LoongArch.h
index c668ca7eca047..42aab7fe1aa49 100644
--- a/clang/lib/Basic/Targets/LoongArch.h
+++ b/clang/lib/Basic/Targets/LoongArch.h
@@ -82,6 +82,9 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
 
   ArrayRef<TargetInfo::GCCRegAlias> getGCCRegAliases() const override;
 
+  unsigned getMinGlobalAlign(uint64_t TypeSize,
+                             bool HasNonWeakDef) const override;
+
   bool validateAsmConstraint(const char *&Name,
                              TargetInfo::ConstraintInfo &Info) const override;
   std::string convertConstraint(const char *&Constraint) const override;
diff --git a/clang/test/CodeGen/LoongArch/align.c b/clang/test/CodeGen/LoongArch/align.c
index 1b171b74529d1..b353a98678a06 100644
--- a/clang/test/CodeGen/LoongArch/align.c
+++ b/clang/test/CodeGen/LoongArch/align.c
@@ -7,28 +7,28 @@
 #include <stdint.h>
 
 char *s1 = "1234";
-// LA32: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 4
+// LA64: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 4
 
 char *s2 = "12345678abcd";
-// LA32: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 8
+// LA64: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 8
 
 char *s3 = "123456789012345678901234567890ab";
-// LA32: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 16
+// LA64: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 16
 
 char *s4 = "123456789012345678901234567890123456789012345678901234567890abcdef";
-// LA32: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 32
+// LA64: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 32
 
 int8_t a;
 // LA32: @a ={{.*}} global i8 0, align 1
 // LA64: @a ={{.*}} global i8 0, align 1
 
 int16_t b;
-// LA32: @b ={{.*}} global i16 0, align 2
-// LA64: @b ={{.*}} global i16 0, align 2
+// LA32: @b ={{.*}} global i16 0, align 4
+// LA64: @b ={{.*}} global i16 0, align 4
 
 int32_t c;
 // LA32: @c ={{.*}} global i32 0, align 4

@s-barannikov
Copy link
Contributor

s-barannikov commented Jul 31, 2024

This looks wrong. Preferred and ABI alignments should be set in Triple.
ADD: Sorry, I meant DataLayout, of course.

@heiher
Copy link
Member Author

heiher commented Jul 31, 2024

This looks wrong. Preferred and ABI alignments should be set in Triple. ADD: Sorry, I meant DataLayout, of course.

Thanks for pointing this out. How does data layout affect the alignment of constant string global variable symbols?

@s-barannikov
Copy link
Contributor

This looks wrong. Preferred and ABI alignments should be set in Triple. ADD: Sorry, I meant DataLayout, of course.

Thanks for pointing this out. How does data layout affect the alignment of constant string global variable symbols?

They will be aligned to the preferred alignment specified as Y in the "-a:X:Y" part of the data layout, IIRC.

@s-barannikov s-barannikov requested a review from nikic July 31, 2024 15:20
@heiher
Copy link
Member Author

heiher commented Jul 31, 2024

This looks wrong. Preferred and ABI alignments should be set in Triple. ADD: Sorry, I meant DataLayout, of course.

Thanks for pointing this out. How does data layout affect the alignment of constant string global variable symbols?

They will be aligned to the preferred alignment specified as Y in the "-a:X:Y" part of the data layout, IIRC.

Thanks for the explanation. I understand that data layout can set default and preferred alignment for all globals. This patch chooses alignment based on the size of global objects. Can data layout do that?

@s-barannikov
Copy link
Contributor

This looks wrong. Preferred and ABI alignments should be set in Triple. ADD: Sorry, I meant DataLayout, of course.

Thanks for pointing this out. How does data layout affect the alignment of constant string global variable symbols?

They will be aligned to the preferred alignment specified as Y in the "-a:X:Y" part of the data layout, IIRC.

Thanks for the explanation. I understand that data layout can set default and preferred alignment for all globals. This patch chooses alignment based on the size of global objects. Can data layout do that?

You can specify ABI and preferred alignments for "first class" types, and separately for aggregates. It is all documented here.

@nikic
Copy link
Contributor

nikic commented Jul 31, 2024

I think for your use case you want to implement the shouldAlignPointerArgs() hook. It exists to allow raising GV/Alloca alignment for objects used inside memcpy and similar. Check out the ARM backend for an example.

Memcpy, and other memory intrinsics, typically try to use wider load/store
if the source and destination addresses are aligned. In CodeGenPrepare, look
for calls to memory intrinsics and, if the object is on the stack, align it to
4-byte (32-bit) or 8-byte (64-bit) boundaries if it is large enough that we
expect memcpy to use wider load/store instructions to copy it.

Fixes #101295
@heiher
Copy link
Member Author

heiher commented Aug 1, 2024

@nikic @s-barannikov Thanks for your guidance and help.

@heiher heiher changed the title [clang][LoongArch] Align global symbol by size [LoongArch] Align stack objects passed to memory intrinsics Aug 1, 2024
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but probably a loongarch maintainer should take a look as well.

@SixWeining SixWeining requested a review from wangleiat August 1, 2024 10:07
Copy link
Contributor

@wangleiat wangleiat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@heiher heiher merged commit 8b26c02 into llvm:main Aug 2, 2024
7 checks passed
@heiher heiher deleted the align-global branch August 2, 2024 03:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:loongarch clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang][LoongArch] Too many instructions are generated for const-string copy with -mstrict-align
5 participants