[RISCV][GlobalISel] Fix legalizing ‘llvm.va_copy’ intrinsic #86863
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-globalisel
Author: None (bvlgah)
Changes
Hi, I spotted a problem when running benchmarking programs on a RISCV64 device.
Issue
Segmentation faults only occurred while running the programs compiled with GlobalISel enabled.
Here is a small but complete example, adapted from Google's benchmark framework, to reproduce the issue:
#include <cstdarg>
#include <cstdio>
#include <iostream>
#include <memory>
#include <string>

std::string FormatString(const char* msg, va_list args) {
  // we might need a second shot at this, so pre-emptively make a copy
  va_list args_cp;
  va_copy(args_cp, args);
  std::size_t size = 256;
  char local_buff[256];
  auto ret = vsnprintf(local_buff, size, msg, args_cp);
  va_end(args_cp);
  // currently there is no error handling for failure, so this is a hack.
  // BM_CHECK(ret >= 0);
  if (ret == 0)  // handle empty expansion
    return {};
  else if (static_cast<size_t>(ret) < size)
    return local_buff;
  else {
    // we did not provide a long enough buffer on our first attempt.
    size = static_cast<size_t>(ret) + 1;  // + 1 for the null byte
    std::unique_ptr<char[]> buff(new char[size]);
    ret = vsnprintf(buff.get(), size, msg, args);
    // BM_CHECK(ret > 0 && (static_cast<size_t>(ret)) < size);
    return buff.get();
  }
}

std::string FormatString(const char* msg, ...) {
  va_list args;
  va_start(args, msg);
  auto tmp = FormatString(msg, args);
  va_end(args);
  return tmp;
}

int main() {
  std::string Str =
      FormatString("%-*s %13s %15s %12s", static_cast<int>(20),
                   "Benchmark", "Time", "CPU", "Iterations");
  std::cout << Str << std::endl;
}
Use clang++ -fglobal-isel -o main main.cpp to compile it.
Cause
I have examined the MIR; it shows that these segmentation faults resulted from a small mistake in legalizing the intrinsic function llvm.va_copy.
llvm-project/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
Lines 451 to 453 in 36e74cf
DstLst and Tmp are placed in the wrong order.
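The effect of the operand swap can be illustrated without LLVM. The sketch below is a toy model, not LLVM code: `Memory`, `initial`, `copyFixed`, `copySwapped`, and the slot numbers are all made up for illustration, and it assumes a `va_list` is a single pointer-sized slot holding the address of the save area. Its `store` takes the value first and the address second, mirroring the corrected `buildStore(Tmp, DstLst, ...)` call in the patch.

```cpp
#include <cstddef>

// Toy word-addressed memory; an "address" is just an index into Words.
struct Memory {
  std::size_t Words[8] = {};
  constexpr std::size_t load(std::size_t Addr) const { return Words[Addr]; }
  // Value first, address second.
  constexpr void store(std::size_t Val, std::size_t Addr) { Words[Addr] = Val; }
};

// Hypothetical layout: slot 0 holds the source va_list, slot 1 the
// destination va_list, and slot 4 is the first word of the save area.
constexpr std::size_t SrcLst = 0, DstLst = 1, SaveArea = 4;

constexpr Memory initial() {
  Memory M{};
  M.store(SaveArea, SrcLst);  // source list points at the save area
  M.store(7, DstLst);         // stand-in for an uninitialised destination
  M.store(42, SaveArea);      // first saved variadic argument
  return M;
}

// Corrected order: load the pointer from the source, store it to the destination.
constexpr Memory copyFixed() {
  Memory M = initial();
  std::size_t Tmp = M.load(SrcLst);
  M.store(Tmp, DstLst);
  return M;
}

// Swapped order: the destination's address is written to where Tmp points.
constexpr Memory copySwapped() {
  Memory M = initial();
  std::size_t Tmp = M.load(SrcLst);
  M.store(DstLst, Tmp);
  return M;
}

static_assert(copyFixed().load(DstLst) == SaveArea, "destination now points at the save area");
static_assert(copyFixed().load(SaveArea) == 42, "saved argument is untouched");
static_assert(copySwapped().load(DstLst) == 7, "destination is never written");
static_assert(copySwapped().load(SaveArea) == DstLst, "first saved argument is clobbered");
```

In this model the swapped store leaves the destination list holding whatever it held before and overwrites the first saved argument, which is consistent with a crash once the copied list is walked.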
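Changes
I have tweaked the test case CodeGen/RISCV/GlobalISel/vararg.ll so that s0 is used as the frame pointer (not in all checks), which points to the starting address of the save area. I believe that it helps reason about how llvm.va_copy is handled.
Patch is 41.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86863.diff
3 Files Affected: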
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index 9a388f4cd2717b..22cae389cc3354 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -450,7 +450,7 @@ bool RISCVLegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
// Store the result in the destination va_list
MachineMemOperand *StoreMMO = MF.getMachineMemOperand(
MachinePointerInfo(), MachineMemOperand::MOStore, PtrTy, Alignment);
- MIRBuilder.buildStore(DstLst, Tmp, *StoreMMO);
+ MIRBuilder.buildStore(Tmp, DstLst, *StoreMMO);
MI.eraseFromParent();
return true;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-vacopy.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-vacopy.mir
index f9eda1252937e8..16542f58001212 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-vacopy.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-vacopy.mir
@@ -14,7 +14,7 @@ body: |
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x10
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x11
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p0) = G_LOAD [[COPY1]](p0) :: (load (p0))
- ; CHECK-NEXT: G_STORE [[COPY]](p0), [[LOAD]](p0) :: (store (p0))
+ ; CHECK-NEXT: G_STORE [[LOAD]](p0), [[COPY]](p0) :: (store (p0))
; CHECK-NEXT: PseudoRET
%0:_(p0) = COPY $x10
%1:_(p0) = COPY $x11
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll b/llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll
index 7b110e562e0533..d55adf371119b5 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll
@@ -17,6 +17,12 @@
; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -global-isel -mattr=+d -target-abi lp64d \
; RUN: -verify-machineinstrs \
; RUN: | FileCheck -check-prefixes=RV64,LP64D %s
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv32 -global-isel \
+; RUN: -frame-pointer=all -target-abi ilp32 -verify-machineinstrs \
+; RUN: | FileCheck -check-prefixes=RV32-WITHFP %s
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -global-isel \
+; RUN: -frame-pointer=all -target-abi lp64 -verify-machineinstrs \
+; RUN: | FileCheck -check-prefixes=RV64-WITHFP %s
; The same vararg calling convention is used for ilp32/ilp32f/ilp32d and for
; lp64/lp64f/lp64d. Different CHECK lines are required due to slight
@@ -79,6 +85,67 @@ define i32 @va1(ptr %fmt, ...) {
; RV64-NEXT: lw a0, 0(a0)
; RV64-NEXT: addi sp, sp, 80
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va1:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: .cfi_def_cfa_offset 48
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: .cfi_offset ra, -36
+; RV32-WITHFP-NEXT: .cfi_offset s0, -40
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: .cfi_def_cfa s0, 32
+; RV32-WITHFP-NEXT: sw a1, 4(s0)
+; RV32-WITHFP-NEXT: sw a2, 8(s0)
+; RV32-WITHFP-NEXT: sw a3, 12(s0)
+; RV32-WITHFP-NEXT: sw a4, 16(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, -12(s0)
+; RV32-WITHFP-NEXT: sw a5, 20(s0)
+; RV32-WITHFP-NEXT: sw a6, 24(s0)
+; RV32-WITHFP-NEXT: sw a7, 28(s0)
+; RV32-WITHFP-NEXT: addi a1, a0, 4
+; RV32-WITHFP-NEXT: sw a1, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, 0(a0)
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va1:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -96
+; RV64-WITHFP-NEXT: .cfi_def_cfa_offset 96
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: .cfi_offset ra, -72
+; RV64-WITHFP-NEXT: .cfi_offset s0, -80
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: .cfi_def_cfa s0, 64
+; RV64-WITHFP-NEXT: sd a1, 8(s0)
+; RV64-WITHFP-NEXT: sd a2, 16(s0)
+; RV64-WITHFP-NEXT: sd a3, 24(s0)
+; RV64-WITHFP-NEXT: sd a4, 32(s0)
+; RV64-WITHFP-NEXT: sd a5, 40(s0)
+; RV64-WITHFP-NEXT: addi a0, s0, 8
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: lw a0, -20(s0)
+; RV64-WITHFP-NEXT: lwu a1, -24(s0)
+; RV64-WITHFP-NEXT: sd a6, 48(s0)
+; RV64-WITHFP-NEXT: sd a7, 56(s0)
+; RV64-WITHFP-NEXT: slli a0, a0, 32
+; RV64-WITHFP-NEXT: or a0, a0, a1
+; RV64-WITHFP-NEXT: addi a1, a0, 4
+; RV64-WITHFP-NEXT: srli a2, a1, 32
+; RV64-WITHFP-NEXT: sw a1, -24(s0)
+; RV64-WITHFP-NEXT: sw a2, -20(s0)
+; RV64-WITHFP-NEXT: lw a0, 0(a0)
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 96
+; RV64-WITHFP-NEXT: ret
%va = alloca ptr
call void @llvm.va_start(ptr %va)
%argp.cur = load ptr, ptr %va, align 4
@@ -131,6 +198,58 @@ define i32 @va1_va_arg(ptr %fmt, ...) nounwind {
; RV64-NEXT: lw a0, 0(a0)
; RV64-NEXT: addi sp, sp, 80
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va1_va_arg:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: sw a1, 4(s0)
+; RV32-WITHFP-NEXT: sw a2, 8(s0)
+; RV32-WITHFP-NEXT: sw a3, 12(s0)
+; RV32-WITHFP-NEXT: sw a4, 16(s0)
+; RV32-WITHFP-NEXT: sw a5, 20(s0)
+; RV32-WITHFP-NEXT: sw a6, 24(s0)
+; RV32-WITHFP-NEXT: sw a7, 28(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, -12(s0)
+; RV32-WITHFP-NEXT: addi a0, a0, 3
+; RV32-WITHFP-NEXT: andi a0, a0, -4
+; RV32-WITHFP-NEXT: addi a1, a0, 4
+; RV32-WITHFP-NEXT: sw a1, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, 0(a0)
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va1_va_arg:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -96
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: sd a1, 8(s0)
+; RV64-WITHFP-NEXT: sd a2, 16(s0)
+; RV64-WITHFP-NEXT: sd a3, 24(s0)
+; RV64-WITHFP-NEXT: sd a4, 32(s0)
+; RV64-WITHFP-NEXT: sd a5, 40(s0)
+; RV64-WITHFP-NEXT: sd a6, 48(s0)
+; RV64-WITHFP-NEXT: sd a7, 56(s0)
+; RV64-WITHFP-NEXT: addi a0, s0, 8
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, -24(s0)
+; RV64-WITHFP-NEXT: addi a0, a0, 3
+; RV64-WITHFP-NEXT: andi a0, a0, -4
+; RV64-WITHFP-NEXT: addi a1, a0, 4
+; RV64-WITHFP-NEXT: sd a1, -24(s0)
+; RV64-WITHFP-NEXT: lw a0, 0(a0)
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 96
+; RV64-WITHFP-NEXT: ret
%va = alloca ptr
call void @llvm.va_start(ptr %va)
%1 = va_arg ptr %va, i32
@@ -212,6 +331,78 @@ define i32 @va1_va_arg_alloca(ptr %fmt, ...) nounwind {
; RV64-NEXT: ld s1, 8(sp) # 8-byte Folded Reload
; RV64-NEXT: addi sp, sp, 96
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va1_va_arg_alloca:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s1, 4(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: sw a1, 4(s0)
+; RV32-WITHFP-NEXT: sw a2, 8(s0)
+; RV32-WITHFP-NEXT: sw a3, 12(s0)
+; RV32-WITHFP-NEXT: sw a4, 16(s0)
+; RV32-WITHFP-NEXT: sw a5, 20(s0)
+; RV32-WITHFP-NEXT: sw a6, 24(s0)
+; RV32-WITHFP-NEXT: sw a7, 28(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -16(s0)
+; RV32-WITHFP-NEXT: lw a0, -16(s0)
+; RV32-WITHFP-NEXT: addi a0, a0, 3
+; RV32-WITHFP-NEXT: andi a0, a0, -4
+; RV32-WITHFP-NEXT: addi a1, a0, 4
+; RV32-WITHFP-NEXT: sw a1, -16(s0)
+; RV32-WITHFP-NEXT: lw s1, 0(a0)
+; RV32-WITHFP-NEXT: addi a0, s1, 15
+; RV32-WITHFP-NEXT: andi a0, a0, -16
+; RV32-WITHFP-NEXT: sub a0, sp, a0
+; RV32-WITHFP-NEXT: mv sp, a0
+; RV32-WITHFP-NEXT: call notdead
+; RV32-WITHFP-NEXT: mv a0, s1
+; RV32-WITHFP-NEXT: addi sp, s0, -16
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s1, 4(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va1_va_arg_alloca:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -96
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s1, 8(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: sd a1, 8(s0)
+; RV64-WITHFP-NEXT: sd a2, 16(s0)
+; RV64-WITHFP-NEXT: sd a3, 24(s0)
+; RV64-WITHFP-NEXT: sd a4, 32(s0)
+; RV64-WITHFP-NEXT: sd a5, 40(s0)
+; RV64-WITHFP-NEXT: sd a6, 48(s0)
+; RV64-WITHFP-NEXT: sd a7, 56(s0)
+; RV64-WITHFP-NEXT: addi a0, s0, 8
+; RV64-WITHFP-NEXT: sd a0, -32(s0)
+; RV64-WITHFP-NEXT: ld a0, -32(s0)
+; RV64-WITHFP-NEXT: addi a0, a0, 3
+; RV64-WITHFP-NEXT: andi a0, a0, -4
+; RV64-WITHFP-NEXT: addi a1, a0, 4
+; RV64-WITHFP-NEXT: sd a1, -32(s0)
+; RV64-WITHFP-NEXT: lw s1, 0(a0)
+; RV64-WITHFP-NEXT: slli a0, s1, 32
+; RV64-WITHFP-NEXT: srli a0, a0, 32
+; RV64-WITHFP-NEXT: addi a0, a0, 15
+; RV64-WITHFP-NEXT: andi a0, a0, -16
+; RV64-WITHFP-NEXT: sub a0, sp, a0
+; RV64-WITHFP-NEXT: mv sp, a0
+; RV64-WITHFP-NEXT: call notdead
+; RV64-WITHFP-NEXT: mv a0, s1
+; RV64-WITHFP-NEXT: addi sp, s0, -32
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s1, 8(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 96
+; RV64-WITHFP-NEXT: ret
%va = alloca ptr
call void @llvm.va_start(ptr %va)
%1 = va_arg ptr %va, i32
@@ -273,6 +464,36 @@ define void @va1_caller() nounwind {
; LP64D-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; LP64D-NEXT: addi sp, sp, 16
; LP64D-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va1_caller:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -16
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: lui a3, 261888
+; RV32-WITHFP-NEXT: li a4, 2
+; RV32-WITHFP-NEXT: li a2, 0
+; RV32-WITHFP-NEXT: call va1
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 16
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va1_caller:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -16
+; RV64-WITHFP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 0(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 16
+; RV64-WITHFP-NEXT: lui a0, %hi(.LCPI3_0)
+; RV64-WITHFP-NEXT: ld a1, %lo(.LCPI3_0)(a0)
+; RV64-WITHFP-NEXT: li a2, 2
+; RV64-WITHFP-NEXT: call va1
+; RV64-WITHFP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 0(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 16
+; RV64-WITHFP-NEXT: ret
%1 = call i32 (ptr, ...) @va1(ptr undef, double 1.0, i32 2)
ret void
}
@@ -395,6 +616,59 @@ define i64 @va2(ptr %fmt, ...) nounwind {
; RV64-NEXT: ld a0, 0(a1)
; RV64-NEXT: addi sp, sp, 80
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va2:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: sw a1, 4(s0)
+; RV32-WITHFP-NEXT: sw a2, 8(s0)
+; RV32-WITHFP-NEXT: sw a3, 12(s0)
+; RV32-WITHFP-NEXT: sw a4, 16(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, -12(s0)
+; RV32-WITHFP-NEXT: sw a5, 20(s0)
+; RV32-WITHFP-NEXT: sw a6, 24(s0)
+; RV32-WITHFP-NEXT: sw a7, 28(s0)
+; RV32-WITHFP-NEXT: addi a0, a0, 7
+; RV32-WITHFP-NEXT: andi a1, a0, -8
+; RV32-WITHFP-NEXT: addi a0, a0, 8
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, 0(a1)
+; RV32-WITHFP-NEXT: lw a1, 4(a1)
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va2:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -96
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: sd a1, 8(s0)
+; RV64-WITHFP-NEXT: sd a2, 16(s0)
+; RV64-WITHFP-NEXT: sd a3, 24(s0)
+; RV64-WITHFP-NEXT: sd a4, 32(s0)
+; RV64-WITHFP-NEXT: addi a0, s0, 8
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, -24(s0)
+; RV64-WITHFP-NEXT: sd a5, 40(s0)
+; RV64-WITHFP-NEXT: sd a6, 48(s0)
+; RV64-WITHFP-NEXT: sd a7, 56(s0)
+; RV64-WITHFP-NEXT: addi a1, a0, 7
+; RV64-WITHFP-NEXT: andi a1, a1, -8
+; RV64-WITHFP-NEXT: addi a0, a0, 15
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, 0(a1)
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 96
+; RV64-WITHFP-NEXT: ret
%va = alloca ptr
call void @llvm.va_start(ptr %va)
%argp.cur = load ptr, ptr %va
@@ -459,6 +733,61 @@ define i64 @va2_va_arg(ptr %fmt, ...) nounwind {
; RV64-NEXT: srli a0, a0, 32
; RV64-NEXT: addi sp, sp, 80
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va2_va_arg:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: sw a1, 4(s0)
+; RV32-WITHFP-NEXT: sw a2, 8(s0)
+; RV32-WITHFP-NEXT: sw a3, 12(s0)
+; RV32-WITHFP-NEXT: sw a4, 16(s0)
+; RV32-WITHFP-NEXT: sw a5, 20(s0)
+; RV32-WITHFP-NEXT: sw a6, 24(s0)
+; RV32-WITHFP-NEXT: sw a7, 28(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, -12(s0)
+; RV32-WITHFP-NEXT: addi a0, a0, 3
+; RV32-WITHFP-NEXT: andi a0, a0, -4
+; RV32-WITHFP-NEXT: addi a1, a0, 4
+; RV32-WITHFP-NEXT: sw a1, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, 0(a0)
+; RV32-WITHFP-NEXT: li a1, 0
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va2_va_arg:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -96
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: sd a1, 8(s0)
+; RV64-WITHFP-NEXT: sd a2, 16(s0)
+; RV64-WITHFP-NEXT: sd a3, 24(s0)
+; RV64-WITHFP-NEXT: sd a4, 32(s0)
+; RV64-WITHFP-NEXT: sd a5, 40(s0)
+; RV64-WITHFP-NEXT: sd a6, 48(s0)
+; RV64-WITHFP-NEXT: sd a7, 56(s0)
+; RV64-WITHFP-NEXT: addi a0, s0, 8
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, -24(s0)
+; RV64-WITHFP-NEXT: addi a0, a0, 3
+; RV64-WITHFP-NEXT: andi a0, a0, -4
+; RV64-WITHFP-NEXT: addi a1, a0, 4
+; RV64-WITHFP-NEXT: sd a1, -24(s0)
+; RV64-WITHFP-NEXT: lw a0, 0(a0)
+; RV64-WITHFP-NEXT: slli a0, a0, 32
+; RV64-WITHFP-NEXT: srli a0, a0, 32
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 96
+; RV64-WITHFP-NEXT: ret
%va = alloca ptr
call void @llvm.va_start(ptr %va)
%1 = va_arg ptr %va, i32
@@ -487,6 +816,32 @@ define void @va2_caller() nounwind {
; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64-NEXT: addi sp, sp, 16
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va2_caller:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -16
+; RV32-WITHFP-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 16
+; RV32-WITHFP-NEXT: li a1, 1
+; RV32-WITHFP-NEXT: call va2
+; RV32-WITHFP-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 16
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va2_caller:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -16
+; RV64-WITHFP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 0(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 16
+; RV64-WITHFP-NEXT: li a1, 1
+; RV64-WITHFP-NEXT: call va2
+; RV64-WITHFP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 0(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 16
+; RV64-WITHFP-NEXT: ret
%1 = call i64 (ptr, ...) @va2(ptr undef, i32 1)
ret void
}
@@ -617,6 +972,61 @@ define i64 @va3(i32 %a, i64 %b, ...) nounwind {
; RV64-NEXT: add a0, a1, a0
; RV64-NEXT: addi sp, sp, 64
; RV64-NEXT: ret
+;
+; RV32-WITHFP-LABEL: va3:
+; RV32-WITHFP: # %bb.0:
+; RV32-WITHFP-NEXT: addi sp, sp, -48
+; RV32-WITHFP-NEXT: sw ra, 20(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: sw s0, 16(sp) # 4-byte Folded Spill
+; RV32-WITHFP-NEXT: addi s0, sp, 24
+; RV32-WITHFP-NEXT: sw a3, 4(s0)
+; RV32-WITHFP-NEXT: sw a4, 8(s0)
+; RV32-WITHFP-NEXT: addi a0, s0, 4
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a0, -12(s0)
+; RV32-WITHFP-NEXT: sw a5, 12(s0)
+; RV32-WITHFP-NEXT: sw a6, 16(s0)
+; RV32-WITHFP-NEXT: sw a7, 20(s0)
+; RV32-WITHFP-NEXT: addi a0, a0, 7
+; RV32-WITHFP-NEXT: andi a3, a0, -8
+; RV32-WITHFP-NEXT: addi a0, a0, 8
+; RV32-WITHFP-NEXT: sw a0, -12(s0)
+; RV32-WITHFP-NEXT: lw a4, 0(a3)
+; RV32-WITHFP-NEXT: lw a3, 4(a3)
+; RV32-WITHFP-NEXT: add a0, a1, a4
+; RV32-WITHFP-NEXT: sltu a1, a0, a4
+; RV32-WITHFP-NEXT: add a2, a2, a3
+; RV32-WITHFP-NEXT: add a1, a2, a1
+; RV32-WITHFP-NEXT: lw ra, 20(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: lw s0, 16(sp) # 4-byte Folded Reload
+; RV32-WITHFP-NEXT: addi sp, sp, 48
+; RV32-WITHFP-NEXT: ret
+;
+; RV64-WITHFP-LABEL: va3:
+; RV64-WITHFP: # %bb.0:
+; RV64-WITHFP-NEXT: addi sp, sp, -80
+; RV64-WITHFP-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
+; RV64-WITHFP-NEXT: addi s0, sp, 32
+; RV64-WITHFP-NEXT: sd a2, 0(s0)
+; RV64-WITHFP-NEXT: sd a3, 8(s0)
+; RV64-WITHFP-NEXT: sd a4, 16(s0)
+; RV64-WITHFP-NEXT: mv a0, s0
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, -24(s0)
+; RV64-WITHFP-NEXT: sd a5, 24(s0)
+; RV64-WITHFP-NEXT: sd a6, 32(s0)
+; RV64-WITHFP-NEXT: sd a7, 40(s0)
+; RV64-WITHFP-NEXT: addi a2, a0, 7
+; RV64-WITHFP-NEXT: andi a2, a2, -8
+; RV64-WITHFP-NEXT: addi a0, a0, 15
+; RV64-WITHFP-NEXT: sd a0, -24(s0)
+; RV64-WITHFP-NEXT: ld a0, 0(a2)
+; RV64-WITHFP-NEXT: add a0, a1, a0
+; RV64-WITHFP-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
+; RV64-WITHFP-NEXT: addi sp, sp, 80
+; RV64-WITHFP-NEXT: ret
%va ...
[truncated]
LGTM
Lgtm
Thanks. I have locally compiled and tested the example that reproduces the problem, and it works fine now.
Hi, I spotted a problem when running benchmarking programs on a RISCV64 device.
Issue
Segmentation faults only occurred while running the programs compiled with GlobalISel enabled.
Here is a small but complete example, adapted from Google's benchmark framework, to reproduce the issue (the listing appears in full earlier in this thread).
Use clang++ -fglobal-isel -o main main.cpp to compile it.
Cause
I have examined the MIR; it shows that these segmentation faults resulted from a small mistake in legalizing the intrinsic function llvm.va_copy.
llvm-project/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
Lines 451 to 453 in 36e74cf
DstLst and Tmp are placed in the wrong order.
Changes
I have tweaked the test case CodeGen/RISCV/GlobalISel/vararg.ll so that s0 is used as the frame pointer (not in all checks), which points to the starting address of the save area. I believe that it helps reason about how llvm.va_copy is handled.
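When reasoning about what llvm.va_copy must guarantee, a check smaller than the full reproducer may be handy. The sketch below is only an illustration (the helper name `sumBothPasses` is made up): it walks the same variadic arguments once through a copied list and once through the original, so a correct va_copy should make both passes see identical values.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments twice, first through a
// va_copy'd list and then through the original list. Returns the sum when
// both passes agree and -1 otherwise.
int sumBothPasses(int count, ...) {
  va_list args;
  va_start(args, count);

  // The copy must start at the same argument as the original list.
  va_list args_cp;
  va_copy(args_cp, args);

  int first = 0;
  for (int i = 0; i < count; ++i)
    first += va_arg(args_cp, int);
  va_end(args_cp);

  // Walking the copy must not have advanced the original list.
  int second = 0;
  for (int i = 0; i < count; ++i)
    second += va_arg(args, int);
  va_end(args);

  return first == second ? first : -1;
}
```

Calling `sumBothPasses(3, 1, 2, 3)` from a small `main` should yield 6 on a correct toolchain. This sketch has not been run against the broken legalizer, so how it would fail there (a wrong result or a crash) is not claimed here.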