[AArch64] Handle XAR with v1i64 operand types #141754
@llvm/pr-subscribers-backend-aarch64
Author: David Green (davemgreen)
Changes
When converting ROTR(XOR(a, b)) to XAR(a, b), or ROTR(a, a) to XAR(a, zero), we were not handling v1i64 types, so illegal copies were generated. This patch addresses that by generating insert_subreg and extract_subreg for v1i64 types, which keeps the values in the correct types.
Fixes #141746
Full diff: https://github.com/llvm/llvm-project/pull/141754.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 2eb8c6008db0f..39a9df2a73275 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -4637,15 +4637,38 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
if (!IsXOROperand) {
SDValue Zero = CurDAG->getTargetConstant(0, DL, MVT::i64);
- SDNode *MOV = CurDAG->getMachineNode(AArch64::MOVIv2d_ns, DL, VT, Zero);
+ SDNode *MOV = CurDAG->getMachineNode(AArch64::MOVIv2d_ns, DL, MVT::v2i64, Zero);
SDValue MOVIV = SDValue(MOV, 0);
R1 = N1->getOperand(0);
R2 = MOVIV;
}
+ // If the input is a v1i64, widen to a v2i64 to use XAR.
+ assert((VT == MVT::v1i64 || VT == MVT::v2i64) && "Unexpected XAR type!");
+ if (VT == MVT::v1i64) {
+ EVT SVT = MVT::v2i64;
+ SDValue Undef =
+ SDValue(CurDAG->getMachineNode(AArch64::IMPLICIT_DEF, DL, SVT), 0);
+ SDValue DSub = CurDAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
+ R1 = SDValue(CurDAG->getMachineNode(AArch64::INSERT_SUBREG, DL, SVT, Undef,
+ R1, DSub),
+ 0);
+ if (R2.getValueType() == MVT::v1i64)
+ R2 = SDValue(CurDAG->getMachineNode(AArch64::INSERT_SUBREG, DL, SVT,
+ Undef, R2, DSub),
+ 0);
+ }
+
SDValue Ops[] = {R1, R2, Imm};
- CurDAG->SelectNodeTo(N, AArch64::XAR, N0.getValueType(), Ops);
+ SDNode *XAR =
+ CurDAG->getMachineNode(AArch64::XAR, DL, R1.getValueType(), Ops);
+ if (VT == MVT::v1i64) {
+ SDValue DSub = CurDAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
+ XAR = CurDAG->getMachineNode(AArch64::EXTRACT_SUBREG, DL, VT,
+ SDValue(XAR, 0), DSub);
+ }
+ ReplaceNode(N, XAR);
return true;
}
diff --git a/llvm/test/CodeGen/AArch64/xar.ll b/llvm/test/CodeGen/AArch64/xar.ll
index e15cb6a696aa5..d682f4f4a1bfb 100644
--- a/llvm/test/CodeGen/AArch64/xar.ll
+++ b/llvm/test/CodeGen/AArch64/xar.ll
@@ -19,6 +19,26 @@ define <2 x i64> @xar(<2 x i64> %x, <2 x i64> %y) {
ret <2 x i64> %b
}
+define <1 x i64> @xar_v1i64(<1 x i64> %a, <1 x i64> %b) {
+; SHA3-LABEL: xar_v1i64:
+; SHA3: // %bb.0:
+; SHA3-NEXT: // kill: def $d0 killed $d0 def $q0
+; SHA3-NEXT: // kill: def $d1 killed $d1 def $q1
+; SHA3-NEXT: xar v0.2d, v0.2d, v1.2d, #63
+; SHA3-NEXT: // kill: def $d0 killed $d0 killed $q0
+; SHA3-NEXT: ret
+;
+; NOSHA3-LABEL: xar_v1i64:
+; NOSHA3: // %bb.0:
+; NOSHA3-NEXT: eor v1.8b, v0.8b, v1.8b
+; NOSHA3-NEXT: shl d0, d1, #1
+; NOSHA3-NEXT: usra d0, d1, #63
+; NOSHA3-NEXT: ret
+ %v.val = xor <1 x i64> %a, %b
+ %fshl = tail call <1 x i64> @llvm.fshl.v1i64(<1 x i64> %v.val, <1 x i64> %v.val, <1 x i64> splat (i64 1))
+ ret <1 x i64> %fshl
+}
+
define <2 x i64> @xar_instead_of_or1(<2 x i64> %r) {
; SHA3-LABEL: xar_instead_of_or1:
; SHA3: // %bb.0: // %entry
@@ -37,6 +57,25 @@ entry:
ret <2 x i64> %or
}
+define <1 x i64> @xar_instead_of_or_v1i64(<1 x i64> %v.val) {
+; SHA3-LABEL: xar_instead_of_or_v1i64:
+; SHA3: // %bb.0:
+; SHA3-NEXT: movi v1.2d, #0000000000000000
+; SHA3-NEXT: // kill: def $d0 killed $d0 def $q0
+; SHA3-NEXT: xar v0.2d, v0.2d, v1.2d, #63
+; SHA3-NEXT: // kill: def $d0 killed $d0 killed $q0
+; SHA3-NEXT: ret
+;
+; NOSHA3-LABEL: xar_instead_of_or_v1i64:
+; NOSHA3: // %bb.0:
+; NOSHA3-NEXT: shl d1, d0, #1
+; NOSHA3-NEXT: usra d1, d0, #63
+; NOSHA3-NEXT: fmov d0, d1
+; NOSHA3-NEXT: ret
+ %fshl = tail call <1 x i64> @llvm.fshl.v1i64(<1 x i64> %v.val, <1 x i64> %v.val, <1 x i64> splat (i64 1))
+ ret <1 x i64> %fshl
+}
+
define <4 x i32> @xar_instead_of_or2(<4 x i32> %r) {
; SHA3-LABEL: xar_instead_of_or2:
; SHA3: // %bb.0: // %entry
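As a rough illustration of why the widening in this patch is safe, here is a small C++ model of the lowering. It is only a sketch: `V2I64`, `xar`, `xarV1I64`, and the `highGarbage` parameter are hypothetical names made up for this example and are not LLVM APIs. It assumes XAR rotates each 64-bit lane of `n ^ m` right by the immediate, and that a v1i64 value lives in the low lane (`dsub`) of the widened register while the high lane is undefined.

```cpp
#include <cstdint>

// Rotate a 64-bit value right by `amount` bits (amount in 0..63).
constexpr std::uint64_t rotr64(std::uint64_t value, unsigned amount) {
  return amount == 0 ? value
                     : (value >> amount) | (value << (64 - amount));
}

// Rotate left: the per-lane behaviour of llvm.fshl(x, x, amount).
constexpr std::uint64_t rotl64(std::uint64_t value, unsigned amount) {
  return amount == 0 ? value
                     : (value << amount) | (value >> (64 - amount));
}

// Hypothetical two-lane model of a 128-bit Q register holding a v2i64.
struct V2I64 {
  std::uint64_t lane0; // low 64 bits, the part addressed by dsub
  std::uint64_t lane1; // high 64 bits
};

// Model of "xar Vd.2d, Vn.2d, Vm.2d, #imm": per lane, rotr(n ^ m, imm).
constexpr V2I64 xar(V2I64 n, V2I64 m, unsigned imm) {
  return V2I64{rotr64(n.lane0 ^ m.lane0, imm),
               rotr64(n.lane1 ^ m.lane1, imm)};
}

// Model of the v1i64 path: place each operand in lane 0 of a wider value
// (INSERT_SUBREG into an undefined v2i64), run XAR, then read lane 0 back
// (EXTRACT_SUBREG). `highGarbage` stands in for the undefined high lane.
constexpr std::uint64_t xarV1I64(std::uint64_t a, std::uint64_t b,
                                 unsigned imm, std::uint64_t highGarbage) {
  return xar(V2I64{a, highGarbage}, V2I64{b, ~highGarbage}, imm).lane0;
}

// fshl(x, x, 1) is a rotate left by 1, i.e. a rotate right by 63,
// which matches the "#63" immediate in the SHA3 check lines.
static_assert(rotl64(0x8000000000000001ULL, 1) == 0x3ULL,
              "rotate left by 1 wraps the top bit around");
static_assert(rotl64(0x8000000000000001ULL, 1) ==
                  rotr64(0x8000000000000001ULL, 63),
              "rotl by 1 equals rotr by 63");
```

Because XAR works lane by lane, whatever ends up in the high lane never affects lane 0, so `xarV1I64(a, b, 63, g)` gives the same result for any `g`. Passing `b = 0` models the ROTR(a, a) case, where the second operand is the zero vector produced by `movi`.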
@Rajveer100 FYI too - not that you caused the issue here, let us know if you spot any problems.
✅ With the latest revision this PR passed the C/C++ code formatter.
d7b0c20 to 943db8f
quick fix! LGTM cheers
Sure!
410005c to 38aa26b