[InterleavedLoadCombine] Bail out on non-byte-sized vector element type #90705
Vectors are always tightly packed, and elements of a non-byte-sized type usually do not have a well-defined (byte) offset. Fixes llvm#90695.
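To make the packing argument concrete, here is a small standalone C++ sketch. It is not LLVM code: the helper names `elementBitOffset` and `startsOnByteBoundary` are made up for illustration, and it assumes elements are laid out back to back with no padding between them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bit offset of element `index` in a tightly packed
// vector whose elements are `elemBits` wide (no padding between elements).
inline std::uint64_t elementBitOffset(std::uint64_t elemBits, std::uint64_t index) {
  assert(elemBits > 0 && "element width must be non-zero");
  return elemBits * index;
}

// Hypothetical helper: true if that element starts on a byte boundary, so a
// whole-byte offset from the vector's base address can name it.
inline bool startsOnByteBoundary(std::uint64_t elemBits, std::uint64_t index) {
  return elementBitOffset(elemBits, index) % 8 == 0;
}
```

Under these assumptions, element 1 of a `<2 x i1>` vector starts at bit 1, so `startsOnByteBoundary(1, 1)` is false, whereas every element of a `<4 x i8>` vector starts on a byte boundary. A pass that reasons about elements in terms of byte offsets has nothing meaningful to compute in the first case.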
@llvm/pr-subscribers-backend-aarch64
Author: Nikita Popov (nikic)
Changes
Vectors are always tightly packed, and elements of a non-byte-sized type usually do not have a well-defined (byte) offset. Fixes #90695.
Full diff: https://github.com/llvm/llvm-project/pull/90705.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
index e5f164b182723f..a9b59e738c00bf 100644
--- a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
+++ b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
@@ -877,6 +877,9 @@ struct VectorInfo {
if (LI->isAtomic())
return false;
+ if (!DL.typeSizeEqualsStoreSize(Result.VTy->getElementType()))
+ return false;
+
// Get the base polynomial
computePolynomialFromPointer(*LI->getPointerOperand(), Offset, BasePtr, DL);
diff --git a/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
new file mode 100644
index 00000000000000..ee75b3a083f713
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-windows-gnu"
+
+; Make sure we don't crash on loads of vectors of non-byte-sized types.
+define <4 x i1> @test(ptr %p) {
+; CHECK-LABEL: define <4 x i1> @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[LOAD:%.*]] = load <2 x i1>, ptr [[P]], align 1
+; CHECK-NEXT: [[SHUF:%.*]] = shufflevector <2 x i1> [[LOAD]], <2 x i1> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+; CHECK-NEXT: ret <4 x i1> [[SHUF]]
+;
+entry:
+ %load = load <2 x i1>, ptr %p, align 1
+ %shuf = shufflevector <2 x i1> %load, <2 x i1> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+ ret <4 x i1> %shuf
+}
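The new guard relies on `DataLayout::typeSizeEqualsStoreSize`. As a rough standalone model of that check (not the LLVM implementation; `storeSizeInBits` and `sizeEqualsStoreSize` are illustrative names, and the sketch assumes the store size is simply the bit width rounded up to whole bytes), it can be pictured as:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model: number of bits written when storing a value that is
// `typeBits` wide, assuming stores always cover whole bytes.
inline std::uint64_t storeSizeInBits(std::uint64_t typeBits) {
  assert(typeBits > 0 && "type width must be non-zero");
  return (typeBits + 7) / 8 * 8;
}

// Illustrative model of the guard: the type qualifies only if its bit width
// already fills a whole number of bytes.
inline bool sizeEqualsStoreSize(std::uint64_t typeBits) {
  return typeBits == storeSizeInBits(typeBits);
}
```

In this model an `i1` element fails the check (1 bit versus an 8-bit store), which corresponds to the pass returning `false` for the `<2 x i1>` load in the test above, while `i8`, `i16`, and `i32` elements pass.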
LGTM, thanks!
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s

target triple = "aarch64-unknown-windows-gnu"
never seen this triple before :)
…pe (llvm#90705) Vectors are always tightly packed, and elements of non-byte-sized usually do not have a well-defined (byte) offset. Fixes llvm#90695. (cherry picked from commit d484c4d)