-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-mips Author: Davide (Tazdevil971) ChangesThis patch aims at making the combination of single-float and N32/N64 ABI properly work. Right now when both options are enabled the compiler chooses an incorrect ABI and in some cases even generates wrong instructions. The floating point behavior on MIPS is controlled through 3 flags: soft-float, single-float, fp64. This makes things complicated because fp64 indicates the presence of 64bit floating point registers, but cannot be easily disabled (the mips3 feature require it, but mips3 CPUs with only 32bit floating point exist). Also if fp64 is missing it doesn't actually disable 64bit floating point operations, because certain MIPS1/2 CPUs support 64bit floating point with 32bit registers, hence the single-float option. I'm guessing that originally single-float was only intended for the latter case, and that's the reason why it doesn't properly work on 64bit targets. So this patch does the following:
Patch is 38.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140575.diff 11 Files Affected:
diff --git a/llvm/lib/Target/Mips/MipsCallingConv.td b/llvm/lib/Target/Mips/MipsCallingConv.td
index 3c60114f507b9..929eb7e391667 100644
--- a/llvm/lib/Target/Mips/MipsCallingConv.td
+++ b/llvm/lib/Target/Mips/MipsCallingConv.td
@@ -196,7 +196,8 @@ def RetCC_MipsN : CallingConv<[
//
// f128 should only occur for the N64 ABI where long double is 128-bit. On
// N32, long double is equivalent to double.
- CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>,
+ CCIfSubtargetNot<"isSingleFloat()",
+ CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>>,
// Aggregate returns are positioned at the lowest address in the slot for
// both little and big-endian targets. When passing in registers, this
@@ -333,9 +334,10 @@ def CC_Mips_FixedArg : CallingConv<[
//
// f128 should only occur for the N64 ABI where long double is 128-bit. On
// N32, long double is equivalent to double.
- CCIfType<[i64],
- CCIfSubtargetNot<"useSoftFloat()",
- CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>,
+ CCIfType<[i64],
+ CCIfSubtargetNot<"isSingleFloat()",
+ CCIfSubtargetNot<"useSoftFloat()",
+ CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>>,
CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_Mips_FastCC>>,
@@ -359,8 +361,8 @@ def CC_Mips : CallingConv<[
// Callee-saved register lists.
//===----------------------------------------------------------------------===//
-def CSR_SingleFloatOnly : CalleeSavedRegs<(add (sequence "F%u", 31, 20), RA, FP,
- (sequence "S%u", 7, 0))>;
+def CSR_O32_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 20), RA, FP,
+ (sequence "S%u", 7, 0))>;
def CSR_O32_FPXX : CalleeSavedRegs<(add (sequence "D%u", 15, 10), RA, FP,
(sequence "S%u", 7, 0))> {
@@ -374,13 +376,19 @@ def CSR_O32_FP64 :
CalleeSavedRegs<(add (decimate (sequence "D%u_64", 30, 20), 2), RA, FP,
(sequence "S%u", 7, 0))>;
-def CSR_N32 : CalleeSavedRegs<(add D20_64, D22_64, D24_64, D26_64, D28_64,
- D30_64, RA_64, FP_64, GP_64,
- (sequence "S%u_64", 7, 0))>;
+def CSR_N32 : CalleeSavedRegs<(add(decimate(sequence "D%u_64", 30, 20), 2),
+ RA_64, FP_64, GP_64, (sequence "S%u_64", 7, 0))>;
+
+def CSR_N32_SingleFloat
+ : CalleeSavedRegs<(add(decimate(sequence "F%u", 30, 20), 2), RA_64, FP_64,
+ GP_64, (sequence "S%u_64", 7, 0))>;
def CSR_N64 : CalleeSavedRegs<(add (sequence "D%u_64", 31, 24), RA_64, FP_64,
GP_64, (sequence "S%u_64", 7, 0))>;
+def CSR_N64_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 24), RA_64,
+ FP_64, GP_64, (sequence "S%u_64", 7, 0))>;
+
def CSR_Mips16RetHelper :
CalleeSavedRegs<(add V0, V1, FP,
(sequence "A%u", 3, 0), (sequence "S%u", 7, 0),
diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index e933e97ea3706..6cf35397f52ec 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -4295,10 +4295,16 @@ parseRegForInlineAsmConstraint(StringRef C, MVT VT) const {
return std::make_pair(0U, nullptr);
if (Prefix == "$f") { // Parse $f0-$f31.
- // If the size of FP registers is 64-bit or Reg is an even number, select
- // the 64-bit register class. Otherwise, select the 32-bit register class.
- if (VT == MVT::Other)
- VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
+ // If the targets is single float only, always select 32-bit registers,
+ // otherwise if the size of FP registers is 64-bit or Reg is an even number,
+ // select the 64-bit register class. Otherwise, select the 32-bit register
+ // class.
+ if (VT == MVT::Other) {
+ if (Subtarget.isSingleFloat())
+ VT = MVT::f32;
+ else
+ VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
+ }
RC = getRegClassFor(VT);
diff --git a/llvm/lib/Target/Mips/MipsRegisterInfo.cpp b/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
index ae4b2377ad218..66288419e363d 100644
--- a/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
+++ b/llvm/lib/Target/Mips/MipsRegisterInfo.cpp
@@ -102,14 +102,25 @@ MipsRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
: CSR_Interrupt_32_SaveList;
}
- if (Subtarget.isSingleFloat())
- return CSR_SingleFloatOnly_SaveList;
+ // N64 ABI
+ if (Subtarget.isABI_N64()) {
+ if (Subtarget.isSingleFloat())
+ return CSR_N64_SingleFloat_SaveList;
- if (Subtarget.isABI_N64())
return CSR_N64_SaveList;
+ }
+
+ // N32 ABI
+ if (Subtarget.isABI_N32()) {
+ if (Subtarget.isSingleFloat())
+ return CSR_N32_SingleFloat_SaveList;
- if (Subtarget.isABI_N32())
return CSR_N32_SaveList;
+ }
+
+ // O32 ABI
+ if (Subtarget.isSingleFloat())
+ return CSR_O32_SingleFloat_SaveList;
if (Subtarget.isFP64bit())
return CSR_O32_FP64_SaveList;
@@ -124,14 +135,25 @@ const uint32_t *
MipsRegisterInfo::getCallPreservedMask(const MachineFunction &MF,
CallingConv::ID) const {
const MipsSubtarget &Subtarget = MF.getSubtarget<MipsSubtarget>();
- if (Subtarget.isSingleFloat())
- return CSR_SingleFloatOnly_RegMask;
+ // N64 ABI
+ if (Subtarget.isABI_N64()) {
+ if (Subtarget.isSingleFloat())
+ return CSR_N64_SingleFloat_RegMask;
- if (Subtarget.isABI_N64())
return CSR_N64_RegMask;
+ }
+
+ // N32 ABI
+ if (Subtarget.isABI_N32()) {
+ if (Subtarget.isSingleFloat())
+ return CSR_N32_SingleFloat_RegMask;
- if (Subtarget.isABI_N32())
return CSR_N32_RegMask;
+ }
+
+ // O32 ABI
+ if (Subtarget.isSingleFloat())
+ return CSR_O32_SingleFloat_RegMask;
if (Subtarget.isFP64bit())
return CSR_O32_FP64_RegMask;
diff --git a/llvm/lib/Target/Mips/MipsSEISelLowering.cpp b/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
index 71a70d9c2dd46..c5cb52dd91191 100644
--- a/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsSEISelLowering.cpp
@@ -28,6 +28,7 @@
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
+#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/CodeGen/ValueTypes.h"
#include "llvm/CodeGenTypes/MachineValueType.h"
@@ -211,6 +212,20 @@ MipsSETargetLowering::MipsSETargetLowering(const MipsTargetMachine &TM,
}
}
+ // Targets with 64bits integer registers, but no 64bit floating point register
+ // do not support conversion between them
+ if (Subtarget.isGP64bit() && Subtarget.isSingleFloat() &&
+ !Subtarget.useSoftFloat()) {
+ setOperationAction(ISD::FP_TO_SINT, MVT::i64, Expand);
+ setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);
+ setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Expand);
+ setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Expand);
+ setOperationAction(ISD::SINT_TO_FP, MVT::i64, Expand);
+ setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);
+ setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Expand);
+ setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Expand);
+ }
+
setOperationAction(ISD::SMUL_LOHI, MVT::i32, Custom);
setOperationAction(ISD::UMUL_LOHI, MVT::i32, Custom);
setOperationAction(ISD::MULHS, MVT::i32, Custom);
diff --git a/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll
new file mode 100644
index 0000000000000..8cbc879310f61
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single-varargs.ll
@@ -0,0 +1,148 @@
+; RUN: llc -mtriple=mips -relocation-model=static -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+; RUN: llc -mtriple=mipsel -relocation-model=static -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWBE %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWLE %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWBE %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWLE %s
+
+@floats = global [11 x float] zeroinitializer
+@doubles = global [11 x double] zeroinitializer
+
+define void @double_args(double %a, ...)
+ nounwind {
+entry:
+ %0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
+ store volatile double %a, ptr %0
+
+ %ap = alloca ptr
+ call void @llvm.va_start(ptr %ap)
+ %b = va_arg ptr %ap, double
+ %1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
+ store volatile double %b, ptr %1
+ call void @llvm.va_end(ptr %ap)
+ ret void
+}
+
+; ALL-LABEL: double_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG: addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+; SYM64-DAG: daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+
+; O32 forbids using floating point registers for the non-variable portion.
+; N32/N64 allow it.
+; O32-DAG: sw $4, 8([[R2]])
+; O32-DAG: sw $5, 12([[R2]])
+; NEW-DAG: sd $4, 8([[R2]])
+
+; The varargs portion is dumped to stack
+; O32-DAG: sw $6, 16($sp)
+; O32-DAG: sw $7, 20($sp)
+; NEW-DAG: sd $5, 8($sp)
+; NEW-DAG: sd $6, 16($sp)
+; NEW-DAG: sd $7, 24($sp)
+; NEW-DAG: sd $8, 32($sp)
+; NEW-DAG: sd $9, 40($sp)
+; NEW-DAG: sd $10, 48($sp)
+; NEW-DAG: sd $11, 56($sp)
+
+; Get the varargs pointer
+; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and 8 bytes reserved
+; for arguments 1 and 2.
+; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
+; O32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 16
+; O32-DAG: sw [[VAPTR]], 4($sp)
+; N32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N32-DAG: sw [[VAPTR]], 4($sp)
+; N64-DAG: daddiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N64-DAG: sd [[VAPTR]], 0($sp)
+
+; Increment the pointer then get the varargs arg
+; LLVM will rebind the load to the stack pointer instead of the varargs pointer
+; during lowering. This is fine and doesn't change the behaviour.
+; O32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
+; N32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
+; N64-DAG: daddiu [[VAPTR]], [[VAPTR]], 8
+; O32-DAG: lw [[R3:\$[0-9]+]], 16($sp)
+; O32-DAG: lw [[R4:\$[0-9]+]], 20($sp)
+; O32-DAG: sw [[R3]], 16([[R2]])
+; O32-DAG: sw [[R4]], 20([[R2]])
+; NEW-DAG: ld [[R3:\$[0-9]+]], 8($sp)
+; NEW-DAG: sd [[R3]], 16([[R2]])
+
+define void @float_args(float %a, ...) nounwind {
+entry:
+ %0 = getelementptr [11 x float], ptr @floats, i32 0, i32 1
+ store volatile float %a, ptr %0
+
+ %ap = alloca ptr
+ call void @llvm.va_start(ptr %ap)
+ %b = va_arg ptr %ap, float
+ %1 = getelementptr [11 x float], ptr @floats, i32 0, i32 2
+ store volatile float %b, ptr %1
+ call void @llvm.va_end(ptr %ap)
+ ret void
+}
+
+; ALL-LABEL: float_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG: addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)
+; SYM64-DAG: daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)
+
+; The first four arguments are the same in O32/N32/N64.
+; The non-variable portion should be unaffected.
+; O32-DAG: mtc1 $4, $f0
+; O32-DAG: swc1 $f0, 4([[R2]])
+; NEW-DAG: swc1 $f12, 4([[R2]])
+
+; The varargs portion is dumped to stack
+; O32-DAG: sw $5, 12($sp)
+; O32-DAG: sw $6, 16($sp)
+; O32-DAG: sw $7, 20($sp)
+; NEW-DAG: sd $5, 8($sp)
+; NEW-DAG: sd $6, 16($sp)
+; NEW-DAG: sd $7, 24($sp)
+; NEW-DAG: sd $8, 32($sp)
+; NEW-DAG: sd $9, 40($sp)
+; NEW-DAG: sd $10, 48($sp)
+; NEW-DAG: sd $11, 56($sp)
+
+; Get the varargs pointer
+; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and should have 8
+; bytes reserved for arguments 1 and 2 (the first float arg) but as discussed in
+; arguments-float.ll, GCC doesn't agree with MD00305 and treats floats as 4
+; bytes so we only have 12 bytes total.
+; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
+; O32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 12
+; O32-DAG: sw [[VAPTR]], 4($sp)
+; N32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N32-DAG: sw [[VAPTR]], 4($sp)
+; N64-DAG: daddiu [[VAPTR:\$[0-9]+]], $sp, 8
+; N64-DAG: sd [[VAPTR]], 0($sp)
+
+; Increment the pointer then get the varargs arg
+; LLVM will rebind the load to the stack pointer instead of the varargs pointer
+; during lowering. This is fine and doesn't change the behaviour.
+; Also, in big-endian mode the offset must be increased by 4 to retrieve the
+; correct half of the argument slot.
+;
+; O32-DAG: addiu [[VAPTR]], [[VAPTR]], 4
+; N32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
+; N64-DAG: daddiu [[VAPTR]], [[VAPTR]], 8
+; O32-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
+; NEWLE-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 8($sp)
+; NEWBE-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
+; ALL-DAG: swc1 [[FTMP1]], 8([[R2]])
+
+declare void @llvm.va_start(ptr)
+declare void @llvm.va_copy(ptr, ptr)
+declare void @llvm.va_end(ptr)
diff --git a/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll
new file mode 100644
index 0000000000000..6b7ad03c8e1c2
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/cconv/arguments-hard-float-single.ll
@@ -0,0 +1,224 @@
+; RUN: llc -mtriple=mips -relocation-model=static -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+; RUN: llc -mtriple=mipsel -relocation-model=static -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,NEW %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n32 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM32,NEW %s
+
+; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM64,NEW %s
+; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n64 -mattr=single-float < %s \
+; RUN: | FileCheck --check-prefixes=ALL,SYM64,NEW %s
+
+@bytes = global [11 x i8] zeroinitializer
+@dwords = global [11 x i64] zeroinitializer
+@floats = global [11 x float] zeroinitializer
+@doubles = global [11 x double] zeroinitializer
+
+define void @double_args(double %a, double %b, double %c, double %d, double %e,
+ double %f, double %g, double %h, double %i) nounwind {
+entry:
+ %0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
+ store volatile double %a, ptr %0
+ %1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
+ store volatile double %b, ptr %1
+ %2 = getelementptr [11 x double], ptr @doubles, i32 0, i32 3
+ store volatile double %c, ptr %2
+ %3 = getelementptr [11 x double], ptr @doubles, i32 0, i32 4
+ store volatile double %d, ptr %3
+ %4 = getelementptr [11 x double], ptr @doubles, i32 0, i32 5
+ store volatile double %e, ptr %4
+ %5 = getelementptr [11 x double], ptr @doubles, i32 0, i32 6
+ store volatile double %f, ptr %5
+ %6 = getelementptr [11 x double], ptr @doubles, i32 0, i32 7
+ store volatile double %g, ptr %6
+ %7 = getelementptr [11 x double], ptr @doubles, i32 0, i32 8
+ store volatile double %h, ptr %7
+ %8 = getelementptr [11 x double], ptr @doubles, i32 0, i32 9
+ store volatile double %i, ptr %8
+ ret void
+}
+
+; ALL-LABEL: double_args:
+; We won't test the way the global address is calculated in this test. This is
+; just to get the register number for the other checks.
+; SYM32-DAG: addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+; SYM64-DAG: daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
+
+; The first four arguments are the same in O32/N32/N64.
+; The first argument is floating point but single-float is enabled so floating
+; point registers are not used.
+; O32-DAG: sw $4, 8([[R2]])
+; O32-DAG: sw $5, 12([[R2]])
+; NEW-DAG: sd $4, 8([[R2]])
+
+; O32-DAG: sw $6, 16([[R2]])
+; O32-DAG: sw $7, 20([[R2]])
+; NEW-DAG: sd $5, 16([[R2]])
+
+; O32 has run out of argument registers and starts using the stack
+; O32-DAG: lw [[R3:\$([0-9]+|gp)]], 16($sp)
+; O32-DAG: lw [[R4:\$([0-9]+|gp)]], 20($sp)
+; O32-DAG: sw [[R3]], 24([[R2]])
+; O32-DAG: sw [[R4]], 28([[R2]])
+; NEW-DAG: sd $6, 24([[R2]])
+
+; O32-DAG: lw [[R3:\$([0-9]+|gp)]], 24($sp)
+; O32-DAG: lw [[R4:\$([0-9]+|gp)]], 28($sp)
+; O32-DAG: sw [[R3]], 32([[R2]])
+; O32-DAG: sw [[R4]], 36([[R2]])
+; NEW-DAG: sd $7, 32([[R2]])
+
+; O32-DAG: lw [[R3:\$([0-9]+|gp)]], 32($sp)
+; O32-DAG: lw [[R4:\$([0-9]+|gp)]], 36($sp)
+; O32-DAG: sw [[R3]], 40([[R2]])
+; O32-DAG: sw [[R4]], 44([[R2]])
+; NEW-DAG: sd $8, 40([[R2]])
+
+; O32-DAG: lw [[R3:\$([0-9]+|gp)]], 40($sp)
+; O32-DAG: lw [[R4:\$([0-9]+|gp)]], 44($sp)
+; O32-DAG: sw [[R3]], 48([[R2]])
+; O32-DAG: sw [[R4]], 52([[R2]])
+; NEW-DAG: sd $9, 48([[R2]])
+
+; O32-DAG: lw [[R3:\$([0-9]+|gp)]], 48($sp)
+; O32-DAG: lw [[R4:\$([0-9]+|gp)]], 52($sp)
+; O32-DAG: sw [[R3]], 56([[R2]])
+; O32-DAG: sw [[R4]], 60([[R2]])
+; NEW-DAG: sd $10, 56([[R2]])
+
+; N32/N64 have run out of registers and starts using the stack too
+; O32-DAG: lw [[R3:\$[0-9]+]], 56($sp)
+; O32-DAG: lw [[R4:\$[0-9]+]], 60($sp)
+; O32-DAG: sw [[R3]], 64([[R2]])
+; O32-DAG: sw [[R4]], 68([[R2]])
+; NEW-DAG: ld [[R3:\$[0-9]+]], 0($sp)
+; NEW-DAG: sd $11, 64([[R2]])
+
+define void @float_args(float %a, float %b, float %c, float %d, float %e,
+ float %f, float %g, float %h, float %i) nounwind {
+entry:
+ %0 = getelementptr [11 x float], ptr @floats, i32 0, i32 1
+ store volatile float %a, ptr %0
+ %1 = getelementptr [11 x float], ptr @floats, i32 0, i32 2
+ store volatile float %b, ptr %1
+ %2 = getelementptr [11 x float], ptr @floats, i32 0, i32 3
+ store volatile float %c, ptr %2
+ %3 = getelementptr [11 x float], ptr @floats, i32 0, i32 4
+ store volatile float %d, ptr %3
+ %4 = getelementptr [11 x float], ptr @floats, i32 0, i32 5
+ store volatile float %e, ptr %4
+ %5 = getelementptr [11 x float], ptr @floats, i32 0, i32 6
+ store volatile float %f, ptr %5
+ %6 = getelementptr [11 x float], ptr @floats, i32 0, i32 7
+ store volatile float %g, ptr %6
+ %7 = getelementptr [11 x float], ptr @floats, i32 0, i32 8
+ store volatile float %h, ptr %7
+ %8 = getelementptr [11 x float], ptr @floats, i32 0, i32 9
+ store volatile float %i, ptr %8
+ ret void
+}
+
+; ALL-LABEL: float_args:
+; We won...
[truncated]
|
%0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1 | ||
store volatile double %a, ptr %0 | ||
|
||
%ap = alloca ptr | ||
call void @llvm.va_start(ptr %ap) | ||
%b = va_arg ptr %ap, double | ||
%1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2 | ||
store volatile double %b, ptr %1 | ||
call void @llvm.va_end(ptr %ap) | ||
ret void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird / too much indentation, and should use named values in tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because this tests are updated versions of the soft-float/hard-float variants.
If you look at how tests are currently structured, for example arguments-float.ll
, there are two versions arguments-float.ll
which tests soft-float behavior, then arguments-hard-float.ll
for hard-float behavior, but the underlying test is identical.
So what I did was add a third case for hard single-float, and updated the outcome. So if they look weird it's probably because the test I took this code from use outdated coding conventions.
I didn't want to update the old tests, so I think the best option is to just leave all of this as is.
The only thing I could change is naming, because now that I look at them, all the -hard tests put -hard right after the first word, while I put -single at the end of the path.
So to be more consistent I could change the name to arguments-hard-single-float-varargs.ll
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the IR content is all the same? You could cleanup the old tests and switch them to generated tests, and add new run lines? I guess that will still be a very long test with a few too many run combinations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the IR is the same.
This test need to already deal with 3 ABIs (O32/N32/N64) and 2 endianess, plus if we put all of the combinations of floating point flags this would cram 18 different run combinations in a single test file which I think is wayyyy to many.
But I could switch all of them to generated tests, unless there is a specific reason I'm missing for why they were hand-written. As you can see some registers are captured and matched multiple times, instead of being hardcoded, maybe this is to make the test agnostic to register allocation?
Also I don't quite understand what you mean by "named values in tests" in your first comment, sorry I'm not familiar with test code conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried merging all of the different test cases, honestly it's not that bad, you can see it here. You can actually ignore BE/LE variants and end up with just 9 different run combinations.
If you are okay with it I could merge all of the test variants and do some refactoring of the old tests touched by this PR.
if (VT == MVT::Other) { | ||
if (Subtarget.isSingleFloat()) | ||
VT = MVT::f32; | ||
else | ||
VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is properly covered in the tests, the only asm is a full clobber list which I'm assuming is to stress calling convention handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I'll prepare a test case for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new test case for it. Also while writing that test I discovered another small bug in the inline asm constraint logic, I fixed it in the latest revision and added test for it as well.
9db2755
to
2a030d8
Compare
Ok I changed a couple of things:
I still haven't merged all of the variants of the calling conventions tests as I'm not sure it's a good idea, I'll first try on another branch and see if it could work. |
This patch aims at making the combination of single-float and N32/N64 ABI properly work.
Right now when both options are enabled the compiler chooses an incorrect ABI and in some cases even generates wrong instructions.
The floating point behavior on MIPS is controlled through 3 flags: soft-float, single-float, fp64. This makes things complicated because fp64 indicates the presence of 64bit floating point registers, but cannot be easily disabled (the mips3 feature require it, but mips3 CPUs with only 32bit floating point exist). Also if fp64 is missing it doesn't actually disable 64bit floating point operations, because certain MIPS1/2 CPUs support 64bit floating point with 32bit registers, hence the single-float option.
I'm guessing that originally single-float was only intended for the latter case, and that's the reason why it doesn't properly work on 64bit targets.
So this patch does the following: