[LoongArch] Align stack objects passed to memory intrinsics #101309
Conversation
@llvm/pr-subscribers-backend-loongarch @llvm/pr-subscribers-clang

Author: hev (heiher)

Changes: Fixes #101295

Full diff: https://github.com/llvm/llvm-project/pull/101309.diff

3 Files Affected:
diff --git a/clang/lib/Basic/Targets/LoongArch.cpp b/clang/lib/Basic/Targets/LoongArch.cpp
index cb3fd12c48ddb..63c2ed987442f 100644
--- a/clang/lib/Basic/Targets/LoongArch.cpp
+++ b/clang/lib/Basic/Targets/LoongArch.cpp
@@ -118,6 +118,23 @@ LoongArchTargetInfo::getGCCRegAliases() const {
return llvm::ArrayRef(GCCRegAliases);
}
+unsigned LoongArchTargetInfo::getMinGlobalAlign(uint64_t TypeSize,
+ bool HasNonWeakDef) const {
+ unsigned Align = TargetInfo::getMinGlobalAlign(TypeSize, HasNonWeakDef);
+
+ if (HasFeatureLASX && TypeSize >= 512) { // TypeSize >= 64 bytes
+ Align = std::max(Align, 256u); // align type at least 32 bytes
+ } else if (HasFeatureLSX && TypeSize >= 256) { // TypeSize >= 32 bytes
+ Align = std::max(Align, 128u); // align type at least 16 bytes
+ } else if (TypeSize >= 64) { // TypeSize >= 8 bytes
+ Align = std::max(Align, 64u); // align type at least 8 bytes
+ } else if (TypeSize >= 16) { // TypeSize >= 2 bytes
+ Align = std::max(Align, 32u); // align type at least 4 bytes
+ }
+
+ return Align;
+}
+
bool LoongArchTargetInfo::validateAsmConstraint(
const char *&Name, TargetInfo::ConstraintInfo &Info) const {
// See the GCC definitions here:
diff --git a/clang/lib/Basic/Targets/LoongArch.h b/clang/lib/Basic/Targets/LoongArch.h
index c668ca7eca047..42aab7fe1aa49 100644
--- a/clang/lib/Basic/Targets/LoongArch.h
+++ b/clang/lib/Basic/Targets/LoongArch.h
@@ -82,6 +82,9 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
ArrayRef<TargetInfo::GCCRegAlias> getGCCRegAliases() const override;
+ unsigned getMinGlobalAlign(uint64_t TypeSize,
+ bool HasNonWeakDef) const override;
+
bool validateAsmConstraint(const char *&Name,
TargetInfo::ConstraintInfo &Info) const override;
std::string convertConstraint(const char *&Constraint) const override;
diff --git a/clang/test/CodeGen/LoongArch/align.c b/clang/test/CodeGen/LoongArch/align.c
index 1b171b74529d1..b353a98678a06 100644
--- a/clang/test/CodeGen/LoongArch/align.c
+++ b/clang/test/CodeGen/LoongArch/align.c
@@ -7,28 +7,28 @@
#include <stdint.h>
char *s1 = "1234";
-// LA32: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 4
+// LA64: @.str{{.*}} ={{.*}} constant [5 x i8] c"1234\00", align 4
char *s2 = "12345678abcd";
-// LA32: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 8
+// LA64: @.str{{.*}} ={{.*}} constant [13 x i8] c"12345678abcd\00", align 8
char *s3 = "123456789012345678901234567890ab";
-// LA32: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 16
+// LA64: @.str{{.*}} ={{.*}} constant [33 x i8] c"1234{{.*}}ab\00", align 16
char *s4 = "123456789012345678901234567890123456789012345678901234567890abcdef";
-// LA32: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 1
-// LA64: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 1
+// LA32: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 32
+// LA64: @.str{{.*}} ={{.*}} constant [67 x i8] c"1234{{.*}}cdef\00", align 32
int8_t a;
// LA32: @a ={{.*}} global i8 0, align 1
// LA64: @a ={{.*}} global i8 0, align 1
int16_t b;
-// LA32: @b ={{.*}} global i16 0, align 2
-// LA64: @b ={{.*}} global i16 0, align 2
+// LA32: @b ={{.*}} global i16 0, align 4
+// LA64: @b ={{.*}} global i16 0, align 4
int32_t c;
// LA32: @c ={{.*}} global i32 0, align 4
This looks wrong. Preferred and ABI alignments should be set in the data layout.
Thanks for pointing this out. How does data layout affect the alignment of constant string global variable symbols?
They will be aligned to the preferred alignment, specified as Y in the "-a:X:Y" part of the data layout, IIRC.
Thanks for the explanation. I understand that data layout can set default and preferred alignments for all globals. This patch chooses alignment based on the size of global objects. Can data layout do that?
You can specify ABI and preferred alignments for "first class" types, and separately for aggregates. It is all documented here. |
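For reference, the aggregate-alignment component mentioned above is the `a:<abi>[:<pref>]` piece of an LLVM data layout string. A minimal illustration (the layout string below is made up for this sketch and is not LoongArch's actual data layout):

```llvm
; "a:0:64" gives aggregate types an ABI alignment of 0 bits (no minimum)
; and a preferred alignment of 64 bits; globals with no explicit
; alignment may then be rounded up to the preferred value.
target datalayout = "e-m:e-p:64:64-i64:64-n32:64-S128-a:0:64"
```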
I think for your use case you want to implement the shouldAlignPointerArgs() hook. It exists to allow raising GV/Alloca alignment for objects used inside memcpy and similar. Check out the ARM backend for an example. |
Memcpy, and other memory intrinsics, typically try to use wider load/store if the source and destination addresses are aligned. In CodeGenPrepare, look for calls to memory intrinsics and, if the object is on the stack, align it to 4-byte (32-bit) or 8-byte (64-bit) boundaries if it is large enough that we expect memcpy to use wider load/store instructions to copy it. Fixes #101295
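The size/alignment rule described above can be sketched as a small standalone helper. Note this is illustrative only: the function name, parameters, and the "at least one register wide" threshold are assumptions for the sketch, not the actual CodeGenPrepare change in this PR.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the rule described above: for a stack
// object passed to a memory intrinsic, raise its alignment to the
// register width (4 bytes on a 32-bit target, 8 bytes on a 64-bit
// target) when the object is large enough that memcpy could use wider
// load/store instructions to copy it.
uint64_t preferredStackObjectAlign(uint64_t SizeInBytes, bool Is64Bit,
                                   uint64_t CurrentAlign) {
  uint64_t Target = Is64Bit ? 8 : 4;
  // Only raise alignment for objects at least one register wide;
  // smaller objects cannot benefit from a wider copy anyway.
  if (SizeInBytes >= Target && CurrentAlign < Target)
    return Target;
  return CurrentAlign;
}
```

An already sufficiently aligned object is left alone, so the transform only ever increases alignment and never pessimizes existing layout.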
@nikic @s-barannikov Thanks for your guidance and help.
This looks good to me, but probably a loongarch maintainer should take a look as well.
LGTM, thanks.