
[AArch64] Handle XAR with v1i64 operand types #141754


Merged
merged 2 commits into llvm:main from gh-a64-xarv1i64
May 29, 2025

Conversation

davemgreen
Collaborator

When converting ROTR(XOR(a, b)) to XAR(a, b), or ROTR(a, a) to XAR(a, zero), we were not handling v1i64 types, which meant that illegal copies were generated. This addresses that by generating insert_subreg and extract_subreg nodes for v1i64 types so that the values keep the correct types.

Fixes #141746
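
For context, here is a minimal sketch of the widening approach, distilled from the patch below. The helper names widenV1ToV2 and narrowV2ToV1 are hypothetical (the real change is inlined in AArch64DAGToDAGISel::trySelectXAR), and the snippet assumes it lives in AArch64ISelDAGToDAG.cpp, where the AArch64::* opcode and subregister enums are already in scope:

  static SDValue widenV1ToV2(SelectionDAG *DAG, const SDLoc &DL, SDValue V) {
    // Materialize an undefined v2i64 and insert the v1i64 value into its low
    // 64 bits (the dsub subregister), so v2i64-only instructions like XAR can
    // consume it without an illegal cross-class copy.
    SDValue Undef =
        SDValue(DAG->getMachineNode(AArch64::IMPLICIT_DEF, DL, MVT::v2i64), 0);
    SDValue DSub = DAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
    return SDValue(DAG->getMachineNode(AArch64::INSERT_SUBREG, DL, MVT::v2i64,
                                       Undef, V, DSub),
                   0);
  }

  static SDValue narrowV2ToV1(SelectionDAG *DAG, const SDLoc &DL, SDValue V) {
    // Extract the low 64 bits (dsub) of the widened v2i64 result back out as
    // a v1i64 value so the rest of the DAG sees the original type.
    SDValue DSub = DAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
    return SDValue(DAG->getMachineNode(AArch64::EXTRACT_SUBREG, DL, MVT::v1i64,
                                       V, DSub),
                   0);
  }

As the updated xar.ll checks show, these subregister operations typically lower to register-allocation bookkeeping (the "kill" comments around the xar instruction) rather than extra instructions.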

@llvmbot
Member

llvmbot commented May 28, 2025

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

When converting ROTR(XOR(a, b)) to XAR(a, b), or ROTR(a, a) to XAR(a, zero), we were not handling v1i64 types, which meant that illegal copies were generated. This addresses that by generating insert_subreg and extract_subreg nodes for v1i64 types so that the values keep the correct types.

Fixes #141746


Full diff: https://github.com/llvm/llvm-project/pull/141754.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+25-2)
  • (modified) llvm/test/CodeGen/AArch64/xar.ll (+39)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 2eb8c6008db0f..39a9df2a73275 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -4637,15 +4637,38 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
 
   if (!IsXOROperand) {
     SDValue Zero = CurDAG->getTargetConstant(0, DL, MVT::i64);
-    SDNode *MOV = CurDAG->getMachineNode(AArch64::MOVIv2d_ns, DL, VT, Zero);
+    SDNode *MOV = CurDAG->getMachineNode(AArch64::MOVIv2d_ns, DL, MVT::v2i64, Zero);
     SDValue MOVIV = SDValue(MOV, 0);
     R1 = N1->getOperand(0);
     R2 = MOVIV;
   }
 
+  // If the input is a v1i64, widen to a v2i64 to use XAR.
+  assert((VT == MVT::v1i64 || VT == MVT::v2i64) && "Unexpected XAR type!");
+  if (VT == MVT::v1i64) {
+    EVT SVT = MVT::v2i64;
+    SDValue Undef =
+        SDValue(CurDAG->getMachineNode(AArch64::IMPLICIT_DEF, DL, SVT), 0);
+    SDValue DSub = CurDAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
+    R1 = SDValue(CurDAG->getMachineNode(AArch64::INSERT_SUBREG, DL, SVT, Undef,
+                                        R1, DSub),
+                 0);
+    if (R2.getValueType() == MVT::v1i64)
+      R2 = SDValue(CurDAG->getMachineNode(AArch64::INSERT_SUBREG, DL, SVT,
+                                          Undef, R2, DSub),
+                   0);
+  }
+
   SDValue Ops[] = {R1, R2, Imm};
-  CurDAG->SelectNodeTo(N, AArch64::XAR, N0.getValueType(), Ops);
+  SDNode *XAR =
+      CurDAG->getMachineNode(AArch64::XAR, DL, R1.getValueType(), Ops);
 
+  if (VT == MVT::v1i64) {
+    SDValue DSub = CurDAG->getTargetConstant(AArch64::dsub, DL, MVT::i32);
+    XAR = CurDAG->getMachineNode(AArch64::EXTRACT_SUBREG, DL, VT,
+                                 SDValue(XAR, 0), DSub);
+  }
+  ReplaceNode(N, XAR);
   return true;
 }
 
diff --git a/llvm/test/CodeGen/AArch64/xar.ll b/llvm/test/CodeGen/AArch64/xar.ll
index e15cb6a696aa5..d682f4f4a1bfb 100644
--- a/llvm/test/CodeGen/AArch64/xar.ll
+++ b/llvm/test/CodeGen/AArch64/xar.ll
@@ -19,6 +19,26 @@ define <2 x i64> @xar(<2 x i64> %x, <2 x i64> %y) {
     ret <2 x i64> %b
 }
 
+define <1 x i64> @xar_v1i64(<1 x i64> %a, <1 x i64> %b) {
+; SHA3-LABEL: xar_v1i64:
+; SHA3:       // %bb.0:
+; SHA3-NEXT:    // kill: def $d0 killed $d0 def $q0
+; SHA3-NEXT:    // kill: def $d1 killed $d1 def $q1
+; SHA3-NEXT:    xar v0.2d, v0.2d, v1.2d, #63
+; SHA3-NEXT:    // kill: def $d0 killed $d0 killed $q0
+; SHA3-NEXT:    ret
+;
+; NOSHA3-LABEL: xar_v1i64:
+; NOSHA3:       // %bb.0:
+; NOSHA3-NEXT:    eor v1.8b, v0.8b, v1.8b
+; NOSHA3-NEXT:    shl d0, d1, #1
+; NOSHA3-NEXT:    usra d0, d1, #63
+; NOSHA3-NEXT:    ret
+  %v.val = xor <1 x i64> %a, %b
+  %fshl = tail call <1 x i64> @llvm.fshl.v1i64(<1 x i64> %v.val, <1 x i64> %v.val, <1 x i64> splat (i64 1))
+  ret <1 x i64> %fshl
+}
+
 define <2 x i64> @xar_instead_of_or1(<2 x i64> %r) {
 ; SHA3-LABEL: xar_instead_of_or1:
 ; SHA3:       // %bb.0: // %entry
@@ -37,6 +57,25 @@ entry:
   ret <2 x i64> %or
 }
 
+define <1 x i64> @xar_instead_of_or_v1i64(<1 x i64> %v.val) {
+; SHA3-LABEL: xar_instead_of_or_v1i64:
+; SHA3:       // %bb.0:
+; SHA3-NEXT:    movi v1.2d, #0000000000000000
+; SHA3-NEXT:    // kill: def $d0 killed $d0 def $q0
+; SHA3-NEXT:    xar v0.2d, v0.2d, v1.2d, #63
+; SHA3-NEXT:    // kill: def $d0 killed $d0 killed $q0
+; SHA3-NEXT:    ret
+;
+; NOSHA3-LABEL: xar_instead_of_or_v1i64:
+; NOSHA3:       // %bb.0:
+; NOSHA3-NEXT:    shl d1, d0, #1
+; NOSHA3-NEXT:    usra d1, d0, #63
+; NOSHA3-NEXT:    fmov d0, d1
+; NOSHA3-NEXT:    ret
+  %fshl = tail call <1 x i64> @llvm.fshl.v1i64(<1 x i64> %v.val, <1 x i64> %v.val, <1 x i64> splat (i64 1))
+  ret <1 x i64> %fshl
+}
+
 define <4 x i32> @xar_instead_of_or2(<4 x i32> %r) {
 ; SHA3-LABEL: xar_instead_of_or2:
 ; SHA3:       // %bb.0: // %entry

@davemgreen
Collaborator Author

@Rajveer100 FYI too - not that you caused the issue here, let us know if you spot any problems.


github-actions bot commented May 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Collaborator

@c-rhodes left a comment


quick fix! LGTM cheers

@Rajveer100
Contributor

@Rajveer100 FYI too - not that you caused the issue here, let us know if you spot any problems.

Sure!

davemgreen and others added 2 commits May 29, 2025 10:13
When converting ROTR(XOR(a, b)) to XAR(a, b), or ROTR(a, a) to XAR(a, zero), we
were not handling v1i64 types, which meant that illegal copies were generated.
This addresses that by generating insert_subreg and extract_subreg nodes for
v1i64 types so that the values keep the correct types.

Fixes llvm#141746
@davemgreen davemgreen merged commit 32837f3 into llvm:main May 29, 2025
6 of 11 checks passed
@davemgreen davemgreen deleted the gh-a64-xarv1i64 branch May 29, 2025 09:22
svkeerthy pushed a commit that referenced this pull request May 29, 2025
google-yfyang pushed a commit to google-yfyang/llvm-project that referenced this pull request May 29, 2025

Successfully merging this pull request may close these issues.

[AArch64] llvm.fshl with <1 x i64> asserts with unimplemented reg-to-reg copy