[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

Tazdevil971 · 2025-05-19T16:51:24Z

This patch aims at making the combination of single-float and N32/N64 ABI properly work.

Right now when both options are enabled the compiler chooses an incorrect ABI and in some cases even generates wrong instructions.

The floating point behavior on MIPS is controlled through 3 flags: soft-float, single-float, fp64. This makes things complicated because fp64 indicates the presence of 64bit floating point registers, but cannot be easily disabled (the mips3 feature require it, but mips3 CPUs with only 32bit floating point exist). Also if fp64 is missing it doesn't actually disable 64bit floating point operations, because certain MIPS1/2 CPUs support 64bit floating point with 32bit registers, hence the single-float option.

I'm guessing that originally single-float was only intended for the latter case, and that's the reason why it doesn't properly work on 64bit targets.

So this patch does the following:

Make single-float a "master disable", even if fp64 is enabled this should completely disable generation of 64bit floating point operations, making it available on targets which hard require fp64.
Add proper calling conventions for N32/N64 single-float combinations.
Fixup codegen to not generate certain 64bit floating point operations, apparently not assigning a register class to f64 values is not enough to prevent them from showing up.
Add tests for the new calling conventions and codegen.

Tazdevil971 · 2025-05-21T12:50:23Z

Ping @nikic @arsenm

llvmbot · 2025-05-21T14:53:17Z

@llvm/pr-subscribers-backend-mips

Author: Davide (Tazdevil971)

Changes

This patch aims at making the combination of single-float and N32/N64 ABI properly work.

Right now when both options are enabled the compiler chooses an incorrect ABI and in some cases even generates wrong instructions.

The floating point behavior on MIPS is controlled through 3 flags: soft-float, single-float, fp64. This makes things complicated because fp64 indicates the presence of 64bit floating point registers, but cannot be easily disabled (the mips3 feature require it, but mips3 CPUs with only 32bit floating point exist). Also if fp64 is missing it doesn't actually disable 64bit floating point operations, because certain MIPS1/2 CPUs support 64bit floating point with 32bit registers, hence the single-float option.

I'm guessing that originally single-float was only intended for the latter case, and that's the reason why it doesn't properly work on 64bit targets.

So this patch does the following:

Make single-float a "master disable", even if fp64 is enabled this should completely disable generation of 64bit floating point operations, making it available on targets which hard require fp64.
Add proper calling conventions for N32/N64 single-float combinations.
Fixup codegen to not generate certain 64bit floating point operations, apparently not assigning a register class to f64 values is not enough to prevent them from showing up.
Add tests for the new calling conventions and codegen.

Patch is 38.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140575.diff

11 Files Affected:

(modified) llvm/lib/Target/Mips/MipsCallingConv.td (+17-9)
(modified) llvm/lib/Target/Mips/MipsISelLowering.cpp (+10-4)
(modified) llvm/lib/Target/Mips/MipsRegisterInfo.cpp (+30-8)
(modified) llvm/lib/Target/Mips/MipsSEISelLowering.cpp (+15)
(added) llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll (+148)
(added) llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll (+224)
(added) llvm/test/CodeGen/Mips/cconv/arguments-hard-fp128-single.ll (+42)
(added) llvm/test/CodeGen/Mips/cconv/callee-saved-float-single.ll (+111)
(added) llvm/test/CodeGen/Mips/cconv/return-hard-float-single.ll (+43)
(added) llvm/test/CodeGen/Mips/cconv/return-hard-fp128-single.ll (+24)
(modified) llvm/test/CodeGen/Mips/int-to-float-conversion.ll (+34-18)

diff --git a/llvm/lib/Target/Mips/MipsCallingConv.td b/llvm/lib/Target/Mips/MipsCallingConv.td
index 3c60114f507b9..929eb7e391667 100644
--- a/llvm/lib/Target/Mips/MipsCallingConv.td
+++ b/llvm/lib/Target/Mips/MipsCallingConv.td
@@ -196,7 +196,8 @@ def RetCC_MipsN : CallingConv<[
   //
   // f128 should only occur for the N64 ABI where long double is 128-bit. On
   // N32, long double is equivalent to double.
-  CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>,
+  CCIfSubtargetNot<"isSingleFloat()",
+      CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>>,
 
   // Aggregate returns are positioned at the lowest address in the slot for
   // both little and big-endian targets. When passing in registers, this
@@ -333,9 +334,10 @@ def CC_Mips_FixedArg : CallingConv<[
   //
   // f128 should only occur for the N64 ABI where long double is 128-bit. On
   // N32, long double is equivalent to double.
-  CCIfType<[i64],
-      CCIfSubtargetNot<"useSoftFloat()",
-          CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>,
+  CCIfType<[i64], 
+      CCIfSubtargetNot<"isSingleFloat()",
+          CCIfSubtargetNot<"useSoftFloat()", 
+              CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>>,
 
   CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_Mips_FastCC>>,
 
@@ -359,8 +361,8 @@ def CC_Mips : CallingConv<[
 // Callee-saved register lists.
 //===----------------------------------------------------------------------===//
 
-def CSR_SingleFloatOnly : CalleeSavedRegs<(add (sequence "F%u", 31, 20), RA, FP,
-                                               (sequence "S%u", 7, 0))>;
+def CSR_O32_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 20), RA, FP,
+                              (sequence "S%u", 7, 0))>;
 
 def CSR_O32_FPXX : CalleeSavedRegs<(add (sequence "D%u", 15, 10), RA, FP,
                                         (sequence "S%u", 7, 0))> {
@@ -374,13 +376,19 @@ def CSR_O32_FP64 :
   CalleeSavedRegs<(add (decimate (sequence "D%u_64", 30, 20), 2), RA, FP,
                        (sequence "S%u", 7, 0))>;
 
-def CSR_N32 : CalleeSavedRegs<(add D20_64, D22_64, D24_64, D26_64, D28_64,
-                                   D30_64, RA_64, FP_64, GP_64,
-                                   (sequence "S%u_64", 7, 0))>;
+def CSR_N32 : CalleeSavedRegs<(add(decimate(sequence "D%u_64", 30, 20), 2),
+                  RA_64, FP_64, GP_64, (sequence "S%u_64", 7, 0))>;
+
+def CSR_N32_SingleFloat
+    : CalleeSavedRegs<(add(decimate(sequence "F%u", 30, 20), 2), RA_64, FP_64,
+          GP_64, (sequence "S%u_64", 7, 0))>;
 
 def CSR_N64 : CalleeSavedRegs<(add (sequence "D%u_64", 31, 24), RA_64, FP_64,
                                    GP_64, (sequence "S%u_64", 7, 0))>;
 
+def CSR_N64_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 24), RA_64,
+                              FP_64, GP_64, (sequence "S%u_64", 7, 0))>;
+
 def CSR_Mips16RetHelper :
   CalleeSavedRegs<(add V0, V1, FP,
                    (sequence "A%u", 3, 0), (sequence "S%u", 7, 0),
diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index e933e97ea3706..6cf35397f52ec 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -4295,10 +4295,16 @@ parseRegForInlineAsmConstraint(StringRef C, MVT VT) const {
     return std::make_pair(0U, nullptr);
 
   if (Prefix == "$f") { // Parse $f0-$f31.
-    // If the size of FP registers is 64-bit or Reg is an even number, select
-    // the 64-bit register class. Otherwise, select the 32-bit register class.
-    if (VT == MVT::Other)
-      VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
+    // If the targets is single float only, always select 32-bit registers,
+    // otherwise if the size of FP registers is 64-bit or Reg is an even number,
+    // select the 64-bit register class. Otherwise, select the 32-bit register
+    // class.
+    if (VT == MVT::Other) {
+      if (Subtarget.isSingleFloat())
+        VT = MVT::f32;
+      else
+        VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
+    }
 
     RC = getRegClassFor(VT);
 
diff --git a/llvm/lib/Target/Mips/MipsRegisterInfo.cpp b/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
index ae4b2377ad218..66288419e363d 100644
--- a/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
+++ b/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
@@ -102,14 +102,25 @@ MipsRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
                                      : CSR_Interrupt_32_SaveList;
   }
 
-  if (Subtarget.isSingleFloat())
-    return CSR_SingleFloatOnly_SaveList;
+  // N64 ABI
+  if (Subtarget.isABI_N64()) {
+    if (Subtarget.isSingleFloat())
+      return CSR_N64_SingleFloat_SaveList;
 
-  if (Subtarget.isABI_N64())
     return CSR_N64_SaveList;
+  }
+
+  // N32 ABI
+  if (Subtarget.isABI_N32()) {
+    if (Subtarget.isSingleFloat())
+      return CSR_N32_SingleFloat_SaveList;
 
-  if (Subtarget.isABI_N32())
     return CSR_N32_SaveList;
+  }
+
+  // O32 ABI
+  if (Subtarget.isSingleFloat())
+    return CSR_O32_SingleFloat_SaveList;
 
   if (Subtarget.isFP64bit())
     return CSR_O32_FP64_SaveList;
@@ -124,14 +135,25 @@ const uint32_t *
 MipsRegisterInfo::getCallPreservedMask(const MachineFunction &MF,
                                        CallingConv::ID) const {
   const MipsSubtarget &Subtarget = MF.getSubtarget<MipsSubtarget>();
-  if (Subtarget.isSingleFloat())
-    return CSR_SingleFloatOnly_RegMask;
+  // N64 ABI
+  if (Subtarget.isABI_N64()) {
+    if (Subtarget.isSingleFloat())
+      return CSR_N64_SingleFloat_RegMask;
 
-  if (Subtarget.isABI_N64())
     return CSR_N64_RegMask;
+  }
+
+  // N32 ABI
+  if (Subtarget.isABI_N32()) {
+    if (Subtarget.isSingleFloat())
+      return CSR_N32_SingleFloat_RegMask;
 
-  if (Subtarget.isABI_N32())
     return CSR_N32_RegMask;
+  }
+
+  // O32 ABI
+  if (Subtarget.isSingleFloat())
+    return CSR_O32_SingleFloat_RegMask;
 
   if (Subtarget.isFP64bit())
     return CSR_O32_FP64_RegMask;
diff --git a/llvm/lib/Target/Mips/MipsSEISelLowering.cpp b/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
index 71a70d9c2dd46..c5cb52dd91191 100644
--- a/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
@@ -28,6 +28,7 @@
 #include "llvm/CodeGen/SelectionDAG.h"
 #include "llvm/CodeGen/SelectionDAGNodes.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
+#include "llvm/CodeGen/TargetLowering.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
 #include "llvm/CodeGen/ValueTypes.h"
 #include "llvm/CodeGenTypes/MachineValueType.h"
@@ -211,6 +212,20 @@ MipsSETargetLowering::MipsSETargetLowering(const MipsTargetMachine &TM,
     }
   }
 
+  // Targets with 64bits integer registers, but no 64bit floating point register
+  // do not support conversion between them
+  if (Subtarget.isGP64bit() && Subtarget.isSingleFloat() &&
+      !Subtarget.useSoftFloat()) {
+    setOperationAction(ISD::FP_TO_SINT, MVT::i64, Expand);
+    setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);
+    setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Expand);
+    setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Expand);
+    setOperationAction(ISD::SINT_TO_FP, MVT::i64, Expand);
+    setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);
+    setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Expand);
+    setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Expand);
+  }
+
   setOperationAction(ISD::SMUL_LOHI,          MVT::i32, Custom);
   setOperationAction(ISD::UMUL_LOHI,          MVT::i32, Custom);
   setOperationAction(ISD::MULHS,              MVT::i32, Custom);
diff --git a/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll
new file mode 100644
index 0000000000000..8cbc879310f61
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll
@@ -0,0 +1,148 @@
+; RUN: llc -mtriple=mips -relocation-model=static -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+; RUN: llc -mtriple=mipsel -relocation-model=static -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWBE %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWLE %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWBE %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWLE %s
+
+@floats = global [11 x float] zeroinitializer
+@doubles = global [11 x double] zeroinitializer
+
+define void @double_args(double %a, ...)
+                         nounwind {
+entry:
+        %0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
+        store volatile double %a, ptr %0
+
+        %ap = alloca ptr
+        call void @llvm.va_start(ptr %ap)
+        %b = va_arg ptr %ap, double
+        %1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
+        store volatile double %b, ptr %1
+        call void @llvm.va_end(ptr %ap)
+        ret void
+}
+
+; ALL-LABEL: double_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG:         addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+; SYM64-DAG:         daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+
+; O32 forbids using floating point registers for the non-variable portion.
+; N32/N64 allow it.
+; O32-DAG:           sw $4, 8([[R2]])
+; O32-DAG:           sw $5, 12([[R2]])
+; NEW-DAG:           sd $4, 8([[R2]])
+
+; The varargs portion is dumped to stack
+; O32-DAG:           sw $6, 16($sp)
+; O32-DAG:           sw $7, 20($sp)
+; NEW-DAG:           sd $5, 8($sp)
+; NEW-DAG:           sd $6, 16($sp)
+; NEW-DAG:           sd $7, 24($sp)
+; NEW-DAG:           sd $8, 32($sp)
+; NEW-DAG:           sd $9, 40($sp)
+; NEW-DAG:           sd $10, 48($sp)
+; NEW-DAG:           sd $11, 56($sp)
+
+; Get the varargs pointer
+; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and 8 bytes reserved
+; for arguments 1 and 2.
+; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
+; O32-DAG:           addiu [[VAPTR:\$[0-9]+]], $sp, 16
+; O32-DAG:           sw [[VAPTR]], 4($sp)
+; N32-DAG:           addiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N32-DAG:           sw [[VAPTR]], 4($sp)
+; N64-DAG:           daddiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N64-DAG:           sd [[VAPTR]], 0($sp)
+
+; Increment the pointer then get the varargs arg
+; LLVM will rebind the load to the stack pointer instead of the varargs pointer
+; during lowering. This is fine and doesn't change the behaviour.
+; O32-DAG:           addiu [[VAPTR]], [[VAPTR]], 8
+; N32-DAG:           addiu [[VAPTR]], [[VAPTR]], 8
+; N64-DAG:           daddiu [[VAPTR]], [[VAPTR]], 8
+; O32-DAG:           lw	[[R3:\$[0-9]+]], 16($sp)
+; O32-DAG:           lw	[[R4:\$[0-9]+]], 20($sp)
+; O32-DAG:           sw [[R3]], 16([[R2]])
+; O32-DAG:           sw [[R4]], 20([[R2]])
+; NEW-DAG:           ld [[R3:\$[0-9]+]], 8($sp)
+; NEW-DAG:           sd [[R3]], 16([[R2]])
+
+define void @float_args(float %a, ...) nounwind {
+entry:
+        %0 = getelementptr [11 x float], ptr @floats, i32 0, i32 1
+        store volatile float %a, ptr %0
+
+        %ap = alloca ptr
+        call void @llvm.va_start(ptr %ap)
+        %b = va_arg ptr %ap, float
+        %1 = getelementptr [11 x float], ptr @floats, i32 0, i32 2
+        store volatile float %b, ptr %1
+        call void @llvm.va_end(ptr %ap)
+        ret void
+}
+
+; ALL-LABEL: float_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG:         addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)
+; SYM64-DAG:         daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)
+
+; The first four arguments are the same in O32/N32/N64.
+; The non-variable portion should be unaffected.
+; O32-DAG:           mtc1 $4, $f0
+; O32-DAG:           swc1 $f0, 4([[R2]])
+; NEW-DAG:           swc1 $f12, 4([[R2]])
+
+; The varargs portion is dumped to stack
+; O32-DAG:           sw $5, 12($sp)
+; O32-DAG:           sw $6, 16($sp)
+; O32-DAG:           sw $7, 20($sp)
+; NEW-DAG:           sd $5, 8($sp)
+; NEW-DAG:           sd $6, 16($sp)
+; NEW-DAG:           sd $7, 24($sp)
+; NEW-DAG:           sd $8, 32($sp)
+; NEW-DAG:           sd $9, 40($sp)
+; NEW-DAG:           sd $10, 48($sp)
+; NEW-DAG:           sd $11, 56($sp)
+
+; Get the varargs pointer
+; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and should have 8
+; bytes reserved for arguments 1 and 2 (the first float arg) but as discussed in
+; arguments-float.ll, GCC doesn't agree with MD00305 and treats floats as 4
+; bytes so we only have 12 bytes total.
+; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
+; O32-DAG:           addiu [[VAPTR:\$[0-9]+]], $sp, 12
+; O32-DAG:           sw [[VAPTR]], 4($sp)
+; N32-DAG:           addiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N32-DAG:           sw [[VAPTR]], 4($sp)
+; N64-DAG:           daddiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N64-DAG:           sd [[VAPTR]], 0($sp)
+
+; Increment the pointer then get the varargs arg
+; LLVM will rebind the load to the stack pointer instead of the varargs pointer
+; during lowering. This is fine and doesn't change the behaviour.
+; Also, in big-endian mode the offset must be increased by 4 to retrieve the
+; correct half of the argument slot.
+;
+; O32-DAG:           addiu [[VAPTR]], [[VAPTR]], 4
+; N32-DAG:           addiu [[VAPTR]], [[VAPTR]], 8
+; N64-DAG:           daddiu [[VAPTR]], [[VAPTR]], 8
+; O32-DAG:           lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
+; NEWLE-DAG:         lwc1 [[FTMP1:\$f[0-9]+]], 8($sp)
+; NEWBE-DAG:         lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
+; ALL-DAG:           swc1 [[FTMP1]], 8([[R2]])
+
+declare void @llvm.va_start(ptr)
+declare void @llvm.va_copy(ptr, ptr)
+declare void @llvm.va_end(ptr)
diff --git a/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll
new file mode 100644
index 0000000000000..6b7ad03c8e1c2
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll
@@ -0,0 +1,224 @@
+; RUN: llc -mtriple=mips -relocation-model=static -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+; RUN: llc -mtriple=mipsel -relocation-model=static -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,NEW %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM32,NEW %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM64,NEW %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN:   | FileCheck --check-prefixes=ALL,SYM64,NEW %s
+
+@bytes = global [11 x i8] zeroinitializer
+@dwords = global [11 x i64] zeroinitializer
+@floats = global [11 x float] zeroinitializer
+@doubles = global [11 x double] zeroinitializer
+
+define void @double_args(double %a, double %b, double %c, double %d, double %e,
+                         double %f, double %g, double %h, double %i) nounwind {
+entry:
+        %0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
+        store volatile double %a, ptr %0
+        %1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
+        store volatile double %b, ptr %1
+        %2 = getelementptr [11 x double], ptr @doubles, i32 0, i32 3
+        store volatile double %c, ptr %2
+        %3 = getelementptr [11 x double], ptr @doubles, i32 0, i32 4
+        store volatile double %d, ptr %3
+        %4 = getelementptr [11 x double], ptr @doubles, i32 0, i32 5
+        store volatile double %e, ptr %4
+        %5 = getelementptr [11 x double], ptr @doubles, i32 0, i32 6
+        store volatile double %f, ptr %5
+        %6 = getelementptr [11 x double], ptr @doubles, i32 0, i32 7
+        store volatile double %g, ptr %6
+        %7 = getelementptr [11 x double], ptr @doubles, i32 0, i32 8
+        store volatile double %h, ptr %7
+        %8 = getelementptr [11 x double], ptr @doubles, i32 0, i32 9
+        store volatile double %i, ptr %8
+        ret void
+}
+
+; ALL-LABEL: double_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG:         addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+; SYM64-DAG:         daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+
+; The first four arguments are the same in O32/N32/N64.
+; The first argument is floating point but single-float is enabled so floating
+; point registers are not used.
+; O32-DAG:           sw $4, 8([[R2]])
+; O32-DAG:           sw $5, 12([[R2]])
+; NEW-DAG:           sd $4, 8([[R2]])
+
+; O32-DAG:           sw $6, 16([[R2]])
+; O32-DAG:           sw $7, 20([[R2]])
+; NEW-DAG:           sd $5, 16([[R2]])
+
+; O32 has run out of argument registers and starts using the stack
+; O32-DAG:           lw [[R3:\$([0-9]+|gp)]], 16($sp)
+; O32-DAG:           lw [[R4:\$([0-9]+|gp)]], 20($sp)
+; O32-DAG:           sw [[R3]], 24([[R2]])
+; O32-DAG:           sw [[R4]], 28([[R2]])
+; NEW-DAG:           sd $6, 24([[R2]])
+
+; O32-DAG:           lw [[R3:\$([0-9]+|gp)]], 24($sp)
+; O32-DAG:           lw [[R4:\$([0-9]+|gp)]], 28($sp)
+; O32-DAG:           sw [[R3]], 32([[R2]])
+; O32-DAG:           sw [[R4]], 36([[R2]])
+; NEW-DAG:           sd $7, 32([[R2]])
+
+; O32-DAG:           lw [[R3:\$([0-9]+|gp)]], 32($sp)
+; O32-DAG:           lw [[R4:\$([0-9]+|gp)]], 36($sp)
+; O32-DAG:           sw [[R3]], 40([[R2]])
+; O32-DAG:           sw [[R4]], 44([[R2]])
+; NEW-DAG:           sd $8, 40([[R2]])
+
+; O32-DAG:           lw [[R3:\$([0-9]+|gp)]], 40($sp)
+; O32-DAG:           lw [[R4:\$([0-9]+|gp)]], 44($sp)
+; O32-DAG:           sw [[R3]], 48([[R2]])
+; O32-DAG:           sw [[R4]], 52([[R2]])
+; NEW-DAG:           sd $9, 48([[R2]])
+
+; O32-DAG:           lw [[R3:\$([0-9]+|gp)]], 48($sp)
+; O32-DAG:           lw [[R4:\$([0-9]+|gp)]], 52($sp)
+; O32-DAG:           sw [[R3]], 56([[R2]])
+; O32-DAG:           sw [[R4]], 60([[R2]])
+; NEW-DAG:           sd $10, 56([[R2]])
+
+; N32/N64 have run out of registers and starts using the stack too
+; O32-DAG:           lw [[R3:\$[0-9]+]], 56($sp)
+; O32-DAG:           lw [[R4:\$[0-9]+]], 60($sp)
+; O32-DAG:           sw [[R3]], 64([[R2]])
+; O32-DAG:           sw [[R4]], 68([[R2]])
+; NEW-DAG:           ld [[R3:\$[0-9]+]], 0($sp)
+; NEW-DAG:           sd $11, 64([[R2]])
+
+define void @float_args(float %a, float %b, float %c, float %d, float %e,
+                        float %f, float %g, float %h, float %i) nounwind {
+entry:
+        %0 = getelementptr [11 x float], ptr @floats, i32 0, i32 1
+        store volatile float %a, ptr %0
+        %1 = getelementptr [11 x float], ptr @floats, i32 0, i32 2
+        store volatile float %b, ptr %1
+        %2 = getelementptr [11 x float], ptr @floats, i32 0, i32 3
+        store volatile float %c, ptr %2
+        %3 = getelementptr [11 x float], ptr @floats, i32 0, i32 4
+        store volatile float %d, ptr %3
+        %4 = getelementptr [11 x float], ptr @floats, i32 0, i32 5
+        store volatile float %e, ptr %4
+        %5 = getelementptr [11 x float], ptr @floats, i32 0, i32 6
+        store volatile float %f, ptr %5
+        %6 = getelementptr [11 x float], ptr @floats, i32 0, i32 7
+        store volatile float %g, ptr %6
+        %7 = getelementptr [11 x float], ptr @floats, i32 0, i32 8
+        store volatile float %h, ptr %7
+        %8 = getelementptr [11 x float], ptr @floats, i32 0, i32 9
+        store volatile float %i, ptr %8
+        ret void
+}
+
+; ALL-LABEL: float_args:
+; We won...
[truncated]

arsenm · 2025-05-21T14:54:20Z

llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll

+        %0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
+        store volatile double %a, ptr %0
+
+        %ap = alloca ptr
+        call void @llvm.va_start(ptr %ap)
+        %b = va_arg ptr %ap, double
+        %1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
+        store volatile double %b, ptr %1
+        call void @llvm.va_end(ptr %ap)
+        ret void


Weird / too much indentation, and should use named values in tests

This is because this tests are updated versions of the soft-float/hard-float variants.

If you look at how tests are currently structured, for example arguments-float.ll, there are two versions arguments-float.ll which tests soft-float behavior, then arguments-hard-float.ll for hard-float behavior, but the underlying test is identical.

So what I did was add a third case for hard single-float, and updated the outcome. So if they look weird it's probably because the test I took this code from use outdated coding conventions.

I didn't want to update the old tests, so I think the best option is to just leave all of this as is.

The only thing I could change is naming, because now that I look at them, all the -hard tests put -hard right after the first word, while I put -single at the end of the path.

So to be more consistent I could change the name to arguments-hard-single-float-varargs.ll for example.

So the IR content is all the same? You could cleanup the old tests and switch them to generated tests, and add new run lines? I guess that will still be a very long test with a few too many run combinations

Yes the IR is the same.

This test need to already deal with 3 ABIs (O32/N32/N64) and 2 endianess, plus if we put all of the combinations of floating point flags this would cram 18 different run combinations in a single test file which I think is wayyyy to many.

But I could switch all of them to generated tests, unless there is a specific reason I'm missing for why they were hand-written. As you can see some registers are captured and matched multiple times, instead of being hardcoded, maybe this is to make the test agnostic to register allocation?

Also I don't quite understand what you mean by "named values in tests" in your first comment, sorry I'm not familiar with test code conventions.

I tried merging all of the different test cases, honestly it's not that bad, you can see it here. You can actually ignore BE/LE variants and end up with just 9 different run combinations.

If you are okay with it I could merge all of the test variants and do some refactoring of the old tests touched by this PR.

arsenm · 2025-05-21T14:55:17Z

llvm/lib/Target/Mips/MipsISelLowering.cpp

+    if (VT == MVT::Other) {
+      if (Subtarget.isSingleFloat())
+        VT = MVT::f32;
+      else
+        VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;


Not sure this is properly covered in the tests, the only asm is a full clobber list which I'm assuming is to stress calling convention handling

You are right, I'll prepare a test case for it

I added a new test case for it. Also while writing that test I discovered another small bug in the inline asm constraint logic, I fixed it in the latest revision and added test for it as well.

brad0 · 2025-05-22T03:02:48Z

cc @yingopq @wzssyqa

Tazdevil971 · 2025-05-23T19:51:50Z

Ok I changed a couple of things:

Added new test cases for the assembly register constraint logic
Renamed tests to be more consistent

I still haven't merged all of the variants of the calling conventions tests as I'm not sure it's a good idea, I'll first try on another branch and see if it could work.

arsenm added the backend:MIPS label May 21, 2025

arsenm reviewed May 21, 2025

View reviewed changes

[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets

2a030d8

Tazdevil971 force-pushed the tazdevil971/dev-single-float branch from 9db2755 to 2a030d8 Compare May 23, 2025 19:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

Tazdevil971 commented May 19, 2025

Uh oh!

Tazdevil971 commented May 21, 2025

Uh oh!

llvmbot commented May 21, 2025

Uh oh!

arsenm May 21, 2025

Uh oh!

Tazdevil971 May 21, 2025

Uh oh!

arsenm May 21, 2025

Uh oh!

Tazdevil971 May 21, 2025

Uh oh!

Tazdevil971 May 23, 2025

Uh oh!

arsenm May 21, 2025

Uh oh!

Tazdevil971 May 21, 2025

Uh oh!

Tazdevil971 May 23, 2025

Uh oh!

brad0 commented May 22, 2025

Uh oh!

Tazdevil971 commented May 23, 2025

Uh oh!

Uh oh!

[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

Are you sure you want to change the base?

[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

Conversation

Tazdevil971 commented May 19, 2025

Uh oh!

Tazdevil971 commented May 21, 2025

Uh oh!

llvmbot commented May 21, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

brad0 commented May 22, 2025

Uh oh!

Tazdevil971 commented May 23, 2025

Uh oh!

Uh oh!