
[RISCV] Use BuildPairF64 and SplitF64 for bitcast i64<->f64 on rv32 regardless of Zfa. #85982


Merged
merged 1 commit into llvm:main on Mar 21, 2024

Conversation

@topperc (Collaborator) commented Mar 20, 2024

Previously we used BuildPairF64 and SplitF64 only if Zfa was supported, since they select moves between the integer and FP register files that are only available with Zfa.

We recently changed the handling of BuildPairF64/SplitF64 for Zdinx to not go through memory so we should use that for bitcast.

That leaves the D-without-Zfa case, which does need to go through memory. Previously we let type legalization expand to loads and stores using a new stack temporary created for each bitcast. After this patch we create the loads and stores in the custom inserter and share the same stack slot for all of them. This also allows DAGCombiner to optimize when a bitcast is mixed with BuildPairF64/SplitF64.
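As a minimal illustration (hypothetical functions, not taken from the PR's test files), these are the two bitcast shapes affected on rv32:

define double @build_f64(i64 %a) {
  ; i64 -> f64: now lowers via RISCVISD::BuildPairF64 on rv32 with D or Zdinx
  %d = bitcast i64 %a to double
  ret double %d
}

define i64 @split_f64(double %d) {
  ; f64 -> i64: now lowers via RISCVISD::SplitF64
  %i = bitcast double %d to i64
  ret i64 %i
}

With D but no Zfa, any loads and stores these still require are emitted in the custom inserter against one shared stack slot rather than a fresh temporary per bitcast.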

@llvmbot (Member) commented Mar 20, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)


Full diff: https://github.com/llvm/llvm-project/pull/85982.diff

5 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+5-6)
  • (modified) llvm/test/CodeGen/RISCV/double-convert.ll (+6-22)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll (+1-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/spill-fill-fold.ll (+8-6)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 3aa28215efc23c..a3ebfb34ad7aa8 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -559,11 +559,12 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   if (Subtarget.hasStdExtDOrZdinx()) {
     setOperationAction(FPLegalNodeTypes, MVT::f64, Legal);
 
+    if (!Subtarget.is64Bit())
+      setOperationAction(ISD::BITCAST, MVT::i64, Custom);
+
     if (Subtarget.hasStdExtZfa()) {
       setOperationAction(FPRndMode, MVT::f64, Legal);
       setOperationAction(ISD::FNEARBYINT, MVT::f64, Legal);
-      if (!Subtarget.is64Bit())
-        setOperationAction(ISD::BITCAST, MVT::i64, Custom);
     } else {
       if (Subtarget.is64Bit())
         setOperationAction(FPRndMode, MVT::f64, Custom);
@@ -6071,8 +6072,7 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
           DAG.getNode(RISCVISD::FMV_W_X_RV64, DL, MVT::f32, NewOp0);
       return FPConv;
     }
-    if (VT == MVT::f64 && Op0VT == MVT::i64 && XLenVT == MVT::i32 &&
-        Subtarget.hasStdExtZfa()) {
+    if (VT == MVT::f64 && Op0VT == MVT::i64 && XLenVT == MVT::i32) {
       SDValue Lo, Hi;
       std::tie(Lo, Hi) = DAG.SplitScalar(Op0, DL, MVT::i32, MVT::i32);
       SDValue RetReg =
@@ -12157,8 +12157,7 @@ void RISCVTargetLowering::ReplaceNodeResults(SDNode *N,
       SDValue FPConv =
           DAG.getNode(RISCVISD::FMV_X_ANYEXTW_RV64, DL, MVT::i64, Op0);
       Results.push_back(DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, FPConv));
-    } else if (VT == MVT::i64 && Op0VT == MVT::f64 && XLenVT == MVT::i32 &&
-               Subtarget.hasStdExtZfa()) {
+    } else if (VT == MVT::i64 && Op0VT == MVT::f64 && XLenVT == MVT::i32) {
       SDValue NewReg = DAG.getNode(RISCVISD::SplitF64, DL,
                                    DAG.getVTList(MVT::i32, MVT::i32), Op0);
       SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, MVT::i64,
diff --git a/llvm/test/CodeGen/RISCV/double-convert.ll b/llvm/test/CodeGen/RISCV/double-convert.ll
index 7a9439e5b322c6..c1429642962e89 100644
--- a/llvm/test/CodeGen/RISCV/double-convert.ll
+++ b/llvm/test/CodeGen/RISCV/double-convert.ll
@@ -1116,13 +1116,7 @@ define i64 @fmv_x_d(double %a, double %b) nounwind {
 ;
 ; RV32IZFINXZDINX-LABEL: fmv_x_d:
 ; RV32IZFINXZDINX:       # %bb.0:
-; RV32IZFINXZDINX-NEXT:    addi sp, sp, -16
 ; RV32IZFINXZDINX-NEXT:    fadd.d a0, a0, a2
-; RV32IZFINXZDINX-NEXT:    sw a0, 8(sp)
-; RV32IZFINXZDINX-NEXT:    sw a1, 12(sp)
-; RV32IZFINXZDINX-NEXT:    lw a0, 8(sp)
-; RV32IZFINXZDINX-NEXT:    lw a1, 12(sp)
-; RV32IZFINXZDINX-NEXT:    addi sp, sp, 16
 ; RV32IZFINXZDINX-NEXT:    ret
 ;
 ; RV64IZFINXZDINX-LABEL: fmv_x_d:
@@ -1257,13 +1251,13 @@ define double @fmv_d_x(i64 %a, i64 %b) nounwind {
 ; RV32IFD-LABEL: fmv_d_x:
 ; RV32IFD:       # %bb.0:
 ; RV32IFD-NEXT:    addi sp, sp, -16
-; RV32IFD-NEXT:    sw a3, 4(sp)
-; RV32IFD-NEXT:    sw a2, 0(sp)
-; RV32IFD-NEXT:    sw a1, 12(sp)
 ; RV32IFD-NEXT:    sw a0, 8(sp)
-; RV32IFD-NEXT:    fld fa5, 0(sp)
+; RV32IFD-NEXT:    sw a1, 12(sp)
+; RV32IFD-NEXT:    fld fa5, 8(sp)
+; RV32IFD-NEXT:    sw a2, 8(sp)
+; RV32IFD-NEXT:    sw a3, 12(sp)
 ; RV32IFD-NEXT:    fld fa4, 8(sp)
-; RV32IFD-NEXT:    fadd.d fa0, fa4, fa5
+; RV32IFD-NEXT:    fadd.d fa0, fa5, fa4
 ; RV32IFD-NEXT:    addi sp, sp, 16
 ; RV32IFD-NEXT:    ret
 ;
@@ -1276,17 +1270,7 @@ define double @fmv_d_x(i64 %a, i64 %b) nounwind {
 ;
 ; RV32IZFINXZDINX-LABEL: fmv_d_x:
 ; RV32IZFINXZDINX:       # %bb.0:
-; RV32IZFINXZDINX-NEXT:    addi sp, sp, -16
-; RV32IZFINXZDINX-NEXT:    sw a3, 4(sp)
-; RV32IZFINXZDINX-NEXT:    sw a2, 0(sp)
-; RV32IZFINXZDINX-NEXT:    sw a1, 12(sp)
-; RV32IZFINXZDINX-NEXT:    sw a0, 8(sp)
-; RV32IZFINXZDINX-NEXT:    lw a0, 0(sp)
-; RV32IZFINXZDINX-NEXT:    lw a1, 4(sp)
-; RV32IZFINXZDINX-NEXT:    lw a2, 8(sp)
-; RV32IZFINXZDINX-NEXT:    lw a3, 12(sp)
-; RV32IZFINXZDINX-NEXT:    fadd.d a0, a2, a0
-; RV32IZFINXZDINX-NEXT:    addi sp, sp, 16
+; RV32IZFINXZDINX-NEXT:    fadd.d a0, a0, a2
 ; RV32IZFINXZDINX-NEXT:    ret
 ;
 ; RV64IZFINXZDINX-LABEL: fmv_d_x:
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll b/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
index 71769a800c06bf..c480ba800c6904 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
@@ -75,24 +75,10 @@ define double @constraint_f_double_abi_name(double %a) nounwind {
 define double @constraint_gpr(double %x) {
 ; RV32F-LABEL: constraint_gpr:
 ; RV32F:       # %bb.0:
-; RV32F-NEXT:    addi sp, sp, -32
-; RV32F-NEXT:    .cfi_def_cfa_offset 32
-; RV32F-NEXT:    sw a0, 8(sp)
-; RV32F-NEXT:    sw a1, 12(sp)
-; RV32F-NEXT:    fld fa5, 8(sp)
-; RV32F-NEXT:    fsd fa5, 24(sp)
-; RV32F-NEXT:    lw a0, 24(sp)
-; RV32F-NEXT:    lw a1, 28(sp)
+; RV32F-NEXT:    .cfi_def_cfa_offset 0
 ; RV32F-NEXT:    #APP
 ; RV32F-NEXT:    mv a0, a0
 ; RV32F-NEXT:    #NO_APP
-; RV32F-NEXT:    sw a1, 20(sp)
-; RV32F-NEXT:    sw a0, 16(sp)
-; RV32F-NEXT:    fld fa5, 16(sp)
-; RV32F-NEXT:    fsd fa5, 8(sp)
-; RV32F-NEXT:    lw a0, 8(sp)
-; RV32F-NEXT:    lw a1, 12(sp)
-; RV32F-NEXT:    addi sp, sp, 32
 ; RV32F-NEXT:    ret
 ;
 ; RV64F-LABEL: constraint_gpr:
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll
index b7afee754f68b3..5252eb71c383db 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll
@@ -416,8 +416,8 @@ define double @bitcast_v1i64_f64(<1 x i64> %a) {
 ; RV32ELEN32:       # %bb.0:
 ; RV32ELEN32-NEXT:    addi sp, sp, -16
 ; RV32ELEN32-NEXT:    .cfi_def_cfa_offset 16
-; RV32ELEN32-NEXT:    sw a1, 12(sp)
 ; RV32ELEN32-NEXT:    sw a0, 8(sp)
+; RV32ELEN32-NEXT:    sw a1, 12(sp)
 ; RV32ELEN32-NEXT:    fld fa0, 8(sp)
 ; RV32ELEN32-NEXT:    addi sp, sp, 16
 ; RV32ELEN32-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/spill-fill-fold.ll b/llvm/test/CodeGen/RISCV/spill-fill-fold.ll
index a9a0cc5cf94d85..8cf5f55ad5c51d 100644
--- a/llvm/test/CodeGen/RISCV/spill-fill-fold.ll
+++ b/llvm/test/CodeGen/RISCV/spill-fill-fold.ll
@@ -290,8 +290,8 @@ define double @spill_i64_to_double(i64 %a) nounwind {
 ; RV32ID-NEXT:    fsd fs9, 40(sp) # 8-byte Folded Spill
 ; RV32ID-NEXT:    fsd fs10, 32(sp) # 8-byte Folded Spill
 ; RV32ID-NEXT:    fsd fs11, 24(sp) # 8-byte Folded Spill
-; RV32ID-NEXT:    sw a1, 20(sp)
 ; RV32ID-NEXT:    sw a0, 16(sp)
+; RV32ID-NEXT:    sw a1, 20(sp)
 ; RV32ID-NEXT:    fld fa5, 16(sp)
 ; RV32ID-NEXT:    fsd fa5, 8(sp) # 8-byte Folded Spill
 ; RV32ID-NEXT:    #APP
@@ -804,13 +804,15 @@ define double @fill_i64_to_double(i64 %a) nounwind {
 ; RV32ID-NEXT:    fsd fs9, 40(sp) # 8-byte Folded Spill
 ; RV32ID-NEXT:    fsd fs10, 32(sp) # 8-byte Folded Spill
 ; RV32ID-NEXT:    fsd fs11, 24(sp) # 8-byte Folded Spill
-; RV32ID-NEXT:    sw a1, 20(sp)
-; RV32ID-NEXT:    sw a0, 16(sp)
-; RV32ID-NEXT:    fld fa5, 16(sp)
-; RV32ID-NEXT:    fsd fa5, 8(sp) # 8-byte Folded Spill
+; RV32ID-NEXT:    sw a1, 12(sp) # 4-byte Folded Spill
+; RV32ID-NEXT:    sw a0, 8(sp) # 4-byte Folded Spill
 ; RV32ID-NEXT:    #APP
 ; RV32ID-NEXT:    #NO_APP
-; RV32ID-NEXT:    fld fa0, 8(sp) # 8-byte Folded Reload
+; RV32ID-NEXT:    lw a0, 8(sp) # 4-byte Folded Reload
+; RV32ID-NEXT:    sw a0, 16(sp)
+; RV32ID-NEXT:    lw a0, 12(sp) # 4-byte Folded Reload
+; RV32ID-NEXT:    sw a0, 20(sp)
+; RV32ID-NEXT:    fld fa0, 16(sp)
 ; RV32ID-NEXT:    lw ra, 172(sp) # 4-byte Folded Reload
 ; RV32ID-NEXT:    lw s0, 168(sp) # 4-byte Folded Reload
 ; RV32ID-NEXT:    lw s1, 164(sp) # 4-byte Folded Reload

@topperc (Collaborator, Author) commented Mar 20, 2024

This trips the subregister assertions in getRegAllocationHints for this test case with Zdinx on RV32.

define double @foo(i64 noundef %0) {
  %2 = add i64 %0, 1
  %3 = bitcast i64 %2 to double
  %4 = fadd double %3, 1.000000e+00
  ret double %4
}

topperc added a commit that referenced this pull request Mar 21, 2024
…nts (#85998)

With GPR pairs from Zdinx, we can't guarantee there are no subregisters
on integer instruction operands. I've been able to get these assertions
to fire after some other recent PRs.

I've added a FIXME to support this properly. I just wanted to prevent
the assertion failure for now.

No test case because my other patch #85982, which allowed me to hit the assert,
hasn't been approved yet, and I don't know for sure that that patch is
required to hit this assert. It's just what exposed it for me. So I
think this patch is a good precaution regardless.
; RV32ID-NEXT: #APP
; RV32ID-NEXT: #NO_APP
; RV32ID-NEXT: fld fa0, 8(sp) # 8-byte Folded Reload
; RV32ID-NEXT: lw a0, 8(sp) # 4-byte Folded Reload
Contributor

This is a regression?

Collaborator Author

I don't know what to do about it. We spilled the GPRs that were holding the double because of the inline assembly, then reloaded the GPRs and did the copy to the FPR. I think we must have gotten lucky with instruction scheduling before, such that the copy to the FPR happened before the inline assembly and we only had to spill and reload one FPR.
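For context, a rough reduction of the scenario (a hypothetical function, not the actual test in spill-fill-fold.ll, which clobbers more state): the double produced by the bitcast is live across inline assembly that clobbers every FPR, so where the GPR-to-FPR copy is scheduled relative to the asm decides whether one FPR or two GPRs get spilled:

define double @live_across_asm(i64 %a) nounwind {
  ; %d is live across asm that clobbers all FPRs (f0-f31)
  %d = bitcast i64 %a to double
  call void asm sideeffect "", "~{f0},~{f1},~{f2},~{f3},~{f4},~{f5},~{f6},~{f7},~{f8},~{f9},~{f10},~{f11},~{f12},~{f13},~{f14},~{f15},~{f16},~{f17},~{f18},~{f19},~{f20},~{f21},~{f22},~{f23},~{f24},~{f25},~{f26},~{f27},~{f28},~{f29},~{f30},~{f31}"()
  ret double %d
}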

@wangpc-pp (Contributor) left a comment

LGTM.

@topperc merged commit f5c90f3 into llvm:main on Mar 21, 2024
@topperc deleted the pr/bitcast-i64-f64 branch on March 21, 2024 at 15:52
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
…nts (llvm#85998)
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
…egardless of Zfa. (llvm#85982)