[RISCV][GISel] Add calling convention support for half #94110

Merged
merged 3 commits into from
Jun 8, 2024
Changes from 2 commits
4 changes: 2 additions & 2 deletions llvm/lib/Target/RISCV/GISel/RISCVCallLowering.cpp
@@ -340,7 +340,7 @@ static bool isSupportedArgumentType(Type *T, const RISCVSubtarget &Subtarget,
// supported yet.
if (T->isIntegerTy())
return T->getIntegerBitWidth() <= Subtarget.getXLen() * 2;
- if (T->isFloatTy() || T->isDoubleTy())
+ if (T->isHalfTy() || T->isFloatTy() || T->isDoubleTy())
Contributor

I still think this whole function should just be dropped and get the default treatment

return true;
if (T->isPointerTy())
return true;
@@ -361,7 +361,7 @@ static bool isSupportedReturnType(Type *T, const RISCVSubtarget &Subtarget,
// supported yet.
if (T->isIntegerTy())
return T->getIntegerBitWidth() <= Subtarget.getXLen() * 2;
- if (T->isFloatTy() || T->isDoubleTy())
+ if (T->isHalfTy() || T->isFloatTy() || T->isDoubleTy())
return true;
if (T->isPointerTy())
return true;
2 changes: 2 additions & 0 deletions llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
@@ -849,6 +849,8 @@ const TargetRegisterClass *RISCVInstructionSelector::getRegClassForTypeOnBank(
}

if (RB.getID() == RISCV::FPRBRegBankID) {
+ if (Ty.getSizeInBits() == 16)
+ return &RISCV::FPR16RegClass;
if (Ty.getSizeInBits() == 32)
return &RISCV::FPR32RegClass;
if (Ty.getSizeInBits() == 64)
343 changes: 343 additions & 0 deletions llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/calling-conv-half.ll
@@ -0,0 +1,343 @@
; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
; RUN: llc -mtriple=riscv32 -global-isel -stop-after=irtranslator < %s \
; RUN: | FileCheck -check-prefix=RV32I %s
; RUN: llc -mtriple=riscv32 -mattr=+zfh -global-isel -stop-after=irtranslator < %s \
; RUN: | FileCheck -check-prefix=RV32IZFH %s

define half @callee_half_in_regs(half %x) nounwind {
; RV32I-LABEL: name: callee_half_in_regs
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
@tschuett tschuett Jun 1, 2024

I cannot read RISC-V, but should this case be rejected? It looks as if it loses precision due to the trunc/anyext pair.

Member Author

> I cannot read RISC-V, but should this case be rejected? It looks as if it loses precision due to the trunc/anyext pair.

The half-precision FP type occupies only 16 bits, so truncating the 32-bit register value to s16 does not lose precision.
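
The point above can be illustrated with a minimal sketch that models the two generic opcodes as plain integer operations. The helper names (`truncToS16`, `anyExtToS32`, `checkRoundTrip`) and the `upperBits` parameter are hypothetical and exist only for this illustration; they are not LLVM APIs. `G_ANYEXT` leaves the upper bits unspecified, so the sketch lets the caller choose them.

```cpp
#include <cassert>
#include <cstdint>

// Model of G_TRUNC s32 -> s16: keep only the low 16 bits of the register.
constexpr std::uint16_t truncToS16(std::uint32_t gpr) {
  return static_cast<std::uint16_t>(gpr & 0xFFFFu);
}

// Model of G_ANYEXT s16 -> s32. The upper 16 bits are unspecified in the
// real opcode; the hypothetical `upperBits` parameter stands in for them.
constexpr std::uint32_t anyExtToS32(std::uint16_t halfBits,
                                    std::uint16_t upperBits) {
  return (static_cast<std::uint32_t>(upperBits) << 16) | halfBits;
}

// Whatever ends up in the upper bits, the 16-bit half pattern survives.
// 0x3C00 and 0x4200 are the half constants that appear in the test file.
static_assert(truncToS16(anyExtToS32(0x3C00, 0x0000)) == 0x3C00, "");
static_assert(truncToS16(anyExtToS32(0x3C00, 0xFFFF)) == 0x3C00, "");
static_assert(truncToS16(anyExtToS32(0x4200, 0xBEEF)) == 0x4200, "");

// Runtime spot check for arbitrary inputs; call it from a test driver.
inline void checkRoundTrip(std::uint16_t halfBits, std::uint16_t upperBits) {
  assert(truncToS16(anyExtToS32(halfBits, upperBits)) == halfBits);
}
```

The sketch only models bit patterns. It says nothing about whether the upper bits should carry a NaN-box pattern, which is the separate question raised further down in this thread.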

Never mind.

Member Author

I am wondering whether we should handle NaN-boxing here.
See https://dtcxzyw.github.io/riscv-isa-manual-host/unpriv-isa-asciidoc.html#nanboxing

Collaborator

SelectionDAG has this NaN-boxing code in RISCVTargetLowering::splitValueIntoRegisterParts. I think it's only used when f32 is legal.

  if (IsABIRegCopy && (ValueVT == MVT::f16 || ValueVT == MVT::bf16) &&
      PartVT == MVT::f32) {
    // Cast the [b]f16 to i16, extend to i32, pad with ones to make a float
    // nan, and cast to f32.
    Val = DAG.getNode(ISD::BITCAST, DL, MVT::i16, Val);
    Val = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i32, Val);
    Val = DAG.getNode(ISD::OR, DL, MVT::i32, Val,
                      DAG.getConstant(0xFFFF0000, DL, MVT::i32));
    Val = DAG.getNode(ISD::BITCAST, DL, MVT::f32, Val);
    Parts[0] = Val;
    return true;
  }
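
As a rough illustration of what the OR with 0xFFFF0000 in the snippet above does to the bits, here is a standalone sketch on plain integers. The helper names (`nanBoxHalf`, `isNaNBoxedHalf`, `unboxHalf`, `isF32NaNPattern`, `checkNaNBox`) are made up for this example and are not part of LLVM; the NaN test simply uses the IEEE 754 binary32 layout (exponent all ones, non-zero mantissa).

```cpp
#include <cassert>
#include <cstdint>

// Pad a 16-bit [b]f16 pattern with ones in the upper half of a 32-bit word,
// mirroring the OR with 0xFFFF0000 in the quoted snippet.
constexpr std::uint32_t nanBoxHalf(std::uint16_t halfBits) {
  return 0xFFFF0000u | halfBits;
}

// A 32-bit word holds a boxed 16-bit value when its upper 16 bits are all ones.
constexpr bool isNaNBoxedHalf(std::uint32_t bits) {
  return (bits & 0xFFFF0000u) == 0xFFFF0000u;
}

// Recover the 16-bit payload from a boxed word.
constexpr std::uint16_t unboxHalf(std::uint32_t bits) {
  return static_cast<std::uint16_t>(bits & 0xFFFFu);
}

// binary32 NaN: exponent field all ones and a non-zero mantissa.
constexpr bool isF32NaNPattern(std::uint32_t bits) {
  return (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0;
}

// 0x4000 is half 2.0, the constant passed in the test case in this thread.
static_assert(nanBoxHalf(0x4000) == 0xFFFF4000u, "");
static_assert(isF32NaNPattern(nanBoxHalf(0x4000)), "");
static_assert(unboxHalf(nanBoxHalf(0x4000)) == 0x4000, "");

// Runtime spot check: boxing always yields an f32 NaN and is reversible.
inline void checkNaNBox(std::uint16_t halfBits) {
  const std::uint32_t boxed = nanBoxHalf(halfBits);
  assert(isNaNBoxedHalf(boxed));
  assert(isF32NaNPattern(boxed));
  assert(unboxHalf(boxed) == halfBits);
}
```

Because the upper 16 bits are all ones, the exponent field and the top mantissa bits of the 32-bit word are always set, so every boxed value reads as an f32 NaN regardless of the half payload.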

Member Author

I think it should be handled during the legalization of G_FCONSTANT half. Can we merge this PR first?

Member Author

Test case:

; llc -mtriple=riscv32 -mattr=+f -target-abi=ilp32 -verify-machineinstrs test.ll -o -
declare i32 @callee_half_in_regs(i32 %a, half %b) nounwind

define i32 @caller_half_in_regs() nounwind {
  %1 = call i32 @callee_half_in_regs(i32 1, half 2.0)
  ret i32 %1
}

Collaborator

@topperc topperc Jun 8, 2024

> I think it should be handled during the legalization of G_FCONSTANT half. Can we merge this PR first?

I don't think that's what SelectionDAG does. It's handled during call handling. splitValueIntoRegisterParts isn't called just for constants.

But having said that, I don't know that SelectionDAG even needs to do it. I'm not sure anything really cares that the value is nan-boxed.

; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: callee_half_in_regs
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
ret half %x
}

define half @caller_half_in_regs(half %x) nounwind {
; RV32I-LABEL: name: caller_half_in_regs
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
; RV32I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoCALL target-flags(riscv-call) @caller_half_in_regs, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
; RV32I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; RV32I-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT1]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: caller_half_in_regs
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: $f10_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: PseudoCALL target-flags(riscv-call) @caller_half_in_regs, csr_ilp32f_lp64f, implicit-def $x1, implicit $f10_h, implicit-def $f10_h
; RV32IZFH-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
%y = call half @caller_half_in_regs(half %x)
ret half %y
}

define half @callee_half_mixed_with_int(i32 %x0, half %x) nounwind {
; RV32I-LABEL: name: callee_half_mixed_with_int
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: callee_half_mixed_with_int
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $x10, $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
ret half %x
}

define half @caller_half_mixed_with_int(half %x, i32 %x0) nounwind {
; RV32I-LABEL: name: caller_half_mixed_with_int
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: $x10 = COPY [[COPY1]](s32)
; RV32I-NEXT: $x11 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_mixed_with_int, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
; RV32I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32)
; RV32I-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT1]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: caller_half_mixed_with_int
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $x10, $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
; RV32IZFH-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: $x10 = COPY [[COPY1]](s32)
; RV32IZFH-NEXT: $f10_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_mixed_with_int, csr_ilp32f_lp64f, implicit-def $x1, implicit $x10, implicit $f10_h, implicit-def $f10_h
; RV32IZFH-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: [[COPY2:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY2]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
%y = call half @callee_half_mixed_with_int(i32 %x0, half %x)
ret half %y
}

define half @callee_half_return_stack1(i32 %v1, i32 %v2, i32 %v3, i32 %v4, i32 %v5, i32 %v6, i32 %v7, i32 %v8, half %x) nounwind {
; RV32I-LABEL: name: callee_half_return_stack1
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
; RV32I-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
; RV32I-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x15
; RV32I-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x16
; RV32I-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY $x17
; RV32I-NEXT: [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.0
; RV32I-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[FRAME_INDEX]](p0) :: (load (s32) from %fixed-stack.0, align 16)
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: callee_half_return_stack1
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17, $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32IZFH-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
; RV32IZFH-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
; RV32IZFH-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
; RV32IZFH-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x15
; RV32IZFH-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x16
; RV32IZFH-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY $x17
; RV32IZFH-NEXT: [[COPY8:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY8]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
ret half %x
}

define half @caller_half_return_stack1(i32 %v1, half %x) nounwind {
; RV32I-LABEL: name: caller_half_return_stack1
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; RV32I-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
; RV32I-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; RV32I-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; RV32I-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
; RV32I-NEXT: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
; RV32I-NEXT: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
; RV32I-NEXT: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; RV32I-NEXT: ADJCALLSTACKDOWN 4, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
; RV32I-NEXT: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
; RV32I-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY2]], [[C7]](s32)
; RV32I-NEXT: G_STORE [[ANYEXT]](s32), [[PTR_ADD]](p0) :: (store (s32) into stack, align 16)
; RV32I-NEXT: $x10 = COPY [[C]](s32)
; RV32I-NEXT: $x11 = COPY [[C1]](s32)
; RV32I-NEXT: $x12 = COPY [[C2]](s32)
; RV32I-NEXT: $x13 = COPY [[COPY]](s32)
; RV32I-NEXT: $x14 = COPY [[C3]](s32)
; RV32I-NEXT: $x15 = COPY [[C4]](s32)
; RV32I-NEXT: $x16 = COPY [[C5]](s32)
; RV32I-NEXT: $x17 = COPY [[C6]](s32)
; RV32I-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_return_stack1, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit $x12, implicit $x13, implicit $x14, implicit $x15, implicit $x16, implicit $x17, implicit-def $x10
; RV32I-NEXT: ADJCALLSTACKUP 4, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32)
; RV32I-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT1]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: caller_half_return_stack1
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $x10, $f10_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
; RV32IZFH-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; RV32IZFH-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; RV32IZFH-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
; RV32IZFH-NEXT: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
; RV32IZFH-NEXT: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
; RV32IZFH-NEXT: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; RV32IZFH-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: $x10 = COPY [[C]](s32)
; RV32IZFH-NEXT: $x11 = COPY [[C1]](s32)
; RV32IZFH-NEXT: $x12 = COPY [[C2]](s32)
; RV32IZFH-NEXT: $x13 = COPY [[COPY]](s32)
; RV32IZFH-NEXT: $x14 = COPY [[C3]](s32)
; RV32IZFH-NEXT: $x15 = COPY [[C4]](s32)
; RV32IZFH-NEXT: $x16 = COPY [[C5]](s32)
; RV32IZFH-NEXT: $x17 = COPY [[C6]](s32)
; RV32IZFH-NEXT: $f10_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_return_stack1, csr_ilp32f_lp64f, implicit-def $x1, implicit $x10, implicit $x11, implicit $x12, implicit $x13, implicit $x14, implicit $x15, implicit $x16, implicit $x17, implicit $f10_h, implicit-def $f10_h
; RV32IZFH-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: [[COPY2:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY2]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
%y = call half @callee_half_return_stack1(i32 0, i32 1, i32 2, i32 %v1, i32 5, i32 6, i32 7, i32 8, half %x)
ret half %y
}

define half @callee_half_return_stack2(half %v1, half %v2, half %v3, half %v4, half %v5, half %v6, half %v7, half %v8, half %x) nounwind {
; RV32I-LABEL: name: callee_half_return_stack2
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
; RV32I-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32)
; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
; RV32I-NEXT: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32)
; RV32I-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
; RV32I-NEXT: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[COPY4]](s32)
; RV32I-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x15
; RV32I-NEXT: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[COPY5]](s32)
; RV32I-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x16
; RV32I-NEXT: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[COPY6]](s32)
; RV32I-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY $x17
; RV32I-NEXT: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[COPY7]](s32)
; RV32I-NEXT: [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.0
; RV32I-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[FRAME_INDEX]](p0) :: (load (s32) from %fixed-stack.0, align 16)
; RV32I-NEXT: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC8]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: callee_half_return_stack2
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $x10, $f10_h, $f11_h, $f12_h, $f13_h, $f14_h, $f15_h, $f16_h, $f17_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s16) = COPY $f11_h
; RV32IZFH-NEXT: [[COPY2:%[0-9]+]]:_(s16) = COPY $f12_h
; RV32IZFH-NEXT: [[COPY3:%[0-9]+]]:_(s16) = COPY $f13_h
; RV32IZFH-NEXT: [[COPY4:%[0-9]+]]:_(s16) = COPY $f14_h
; RV32IZFH-NEXT: [[COPY5:%[0-9]+]]:_(s16) = COPY $f15_h
; RV32IZFH-NEXT: [[COPY6:%[0-9]+]]:_(s16) = COPY $f16_h
; RV32IZFH-NEXT: [[COPY7:%[0-9]+]]:_(s16) = COPY $f17_h
; RV32IZFH-NEXT: [[COPY8:%[0-9]+]]:_(s32) = COPY $x10
; RV32IZFH-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY8]](s32)
; RV32IZFH-NEXT: $f10_h = COPY [[TRUNC]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
ret half %x
}

define half @caller_half_return_stack2(half %x, half %y) nounwind {
; RV32I-LABEL: name: caller_half_return_stack2
; RV32I: bb.1 (%ir-block.0):
; RV32I-NEXT: liveins: $x10, $x11
; RV32I-NEXT: {{ $}}
; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
; RV32I-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; RV32I-NEXT: [[C:%[0-9]+]]:_(s16) = G_FCONSTANT half 0xH3C00
; RV32I-NEXT: [[C1:%[0-9]+]]:_(s16) = G_FCONSTANT half 0xH4200
; RV32I-NEXT: ADJCALLSTACKDOWN 4, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[C]](s16)
; RV32I-NEXT: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: [[ANYEXT3:%[0-9]+]]:_(s32) = G_ANYEXT [[C1]](s16)
; RV32I-NEXT: [[ANYEXT4:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: [[ANYEXT5:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: [[ANYEXT6:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: [[ANYEXT7:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC1]](s16)
; RV32I-NEXT: [[ANYEXT8:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC]](s16)
; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
; RV32I-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
; RV32I-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY2]], [[C2]](s32)
; RV32I-NEXT: G_STORE [[ANYEXT8]](s32), [[PTR_ADD]](p0) :: (store (s32) into stack, align 16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT]](s32)
; RV32I-NEXT: $x11 = COPY [[ANYEXT1]](s32)
; RV32I-NEXT: $x12 = COPY [[ANYEXT2]](s32)
; RV32I-NEXT: $x13 = COPY [[ANYEXT3]](s32)
; RV32I-NEXT: $x14 = COPY [[ANYEXT4]](s32)
; RV32I-NEXT: $x15 = COPY [[ANYEXT5]](s32)
; RV32I-NEXT: $x16 = COPY [[ANYEXT6]](s32)
; RV32I-NEXT: $x17 = COPY [[ANYEXT7]](s32)
; RV32I-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_return_stack2, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit $x12, implicit $x13, implicit $x14, implicit $x15, implicit $x16, implicit $x17, implicit-def $x10
; RV32I-NEXT: ADJCALLSTACKUP 4, 0, implicit-def $x2, implicit $x2
; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
; RV32I-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32)
; RV32I-NEXT: [[ANYEXT9:%[0-9]+]]:_(s32) = G_ANYEXT [[TRUNC2]](s16)
; RV32I-NEXT: $x10 = COPY [[ANYEXT9]](s32)
; RV32I-NEXT: PseudoRET implicit $x10
;
; RV32IZFH-LABEL: name: caller_half_return_stack2
; RV32IZFH: bb.1 (%ir-block.0):
; RV32IZFH-NEXT: liveins: $f10_h, $f11_h
; RV32IZFH-NEXT: {{ $}}
; RV32IZFH-NEXT: [[COPY:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: [[COPY1:%[0-9]+]]:_(s16) = COPY $f11_h
; RV32IZFH-NEXT: [[C:%[0-9]+]]:_(s16) = G_FCONSTANT half 0xH3C00
; RV32IZFH-NEXT: [[C1:%[0-9]+]]:_(s16) = G_FCONSTANT half 0xH4200
; RV32IZFH-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: $f10_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: $f11_h = COPY [[C]](s16)
; RV32IZFH-NEXT: $f12_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: $f13_h = COPY [[C1]](s16)
; RV32IZFH-NEXT: $f14_h = COPY [[COPY]](s16)
; RV32IZFH-NEXT: $f15_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: $f16_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: $f17_h = COPY [[COPY1]](s16)
; RV32IZFH-NEXT: $x10 = COPY [[COPY]](s16)
; RV32IZFH-NEXT: PseudoCALL target-flags(riscv-call) @callee_half_return_stack2, csr_ilp32f_lp64f, implicit-def $x1, implicit $f10_h, implicit $f11_h, implicit $f12_h, implicit $f13_h, implicit $f14_h, implicit $f15_h, implicit $f16_h, implicit $f17_h, implicit $x10, implicit-def $f10_h
; RV32IZFH-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
; RV32IZFH-NEXT: [[COPY2:%[0-9]+]]:_(s16) = COPY $f10_h
; RV32IZFH-NEXT: $f10_h = COPY [[COPY2]](s16)
; RV32IZFH-NEXT: PseudoRET implicit $f10_h
%z = call half @callee_half_return_stack2(half %x, half 1.0, half %x, half 3.0, half %x, half %y, half %y, half %y, half %x)
ret half %z
}
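
The register and stack locations in the checks above follow a simple pattern that can be modeled in a few lines. The sketch below is a deliberately simplified, hypothetical model (the names `ArgKind` and `assignArgs` are invented here) that covers only scalar `i32` and `half` arguments. It is not LLVM's calling-convention code; it only reproduces the locations visible in this test file: with FP argument registers available, halves go to `f10_h`..`f17_h` first, then fall back to `x10`..`x17`, then to 4-byte stack slots.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ArgKind { I32, Half };

// Hypothetical, simplified model of where each scalar argument lands.
// `hasFPArgRegs` stands for the RV32IZFH runs in this test, false for RV32I.
std::vector<std::string> assignArgs(const std::vector<ArgKind> &args,
                                    bool hasFPArgRegs) {
  std::vector<std::string> locs;
  unsigned nextGPR = 10;     // x10..x17 are the integer argument registers
  unsigned nextFPR = 10;     // f10..f17 are the FP argument registers
  unsigned stackOffset = 0;  // byte offset of the next stack slot
  for (ArgKind kind : args) {
    if (kind == ArgKind::Half && hasFPArgRegs && nextFPR <= 17) {
      locs.push_back("f" + std::to_string(nextFPR++) + "_h");
    } else if (nextGPR <= 17) {
      // Integers, and halves that found no FP register, use a GPR.
      locs.push_back("x" + std::to_string(nextGPR++));
    } else {
      // Out of registers: use a 4-byte stack slot.
      locs.push_back("stack+" + std::to_string(stackOffset));
      stackOffset += 4;
    }
  }
  assert(locs.size() == args.size());  // one location per argument
  return locs;
}
```

For example, nine `half` arguments with `hasFPArgRegs` set yield `f10_h` through `f17_h` followed by `x10`, matching the RV32IZFH checks for `callee_half_return_stack2`; without FP argument registers the ninth lands in the first stack slot, matching the RV32I checks.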