Skip to content

[RISCV][GISel] Do libcall for G_FPTOSI, G_FPTOUI when no D or F support #94613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 7, 2024

Conversation

spaits
Copy link
Contributor

@spaits spaits commented Jun 6, 2024

When compiling the following code:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdbool.h>

int main() {
    int a;
    float f;
    scanf("%d", &a);

    scanf("%f", &f);
    a += (int)f;
    
    return a;
}

for -march=rv32ima_zbb we get a libcall:

call    scanf
        lw      a0, -20(s0)
        call    __fixsfsi
        mv      a1, a0

When we try to use GlobalISel we get this error:

 error in backend: unable to legalize instruction: %9:_(s32) = G_FPTOSI %8:_(s32) (in function: main)

(Here is a link to a reproducer in Godblot: https://godbolt.org/z/f67vEEb41 )

The goal of this PR is to do a libcall for the legalization of G_FPTOSI and G_FPTOUI instead of doing a fallback to Selection DAG to do the same libcall later.

@spaits spaits marked this pull request as ready for review June 6, 2024 14:03
@llvmbot
Copy link
Member

llvmbot commented Jun 6, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: Gábor Spaits (spaits)

Changes

When compiling the following code:

#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stddef.h&gt;
#include &lt;stdbool.h&gt;

int main() {
    int a;
    float f;
    scanf("%d", &amp;a);

    scanf("%f", &amp;f);
    a += (int)f;
    
    return a;
}

for -march=rv32ima_zbb we get a libcall:

call    scanf
        lw      a0, -20(s0)
        call    __fixsfsi
        mv      a1, a0

When we try to use GlobalISel we get this error:

 error in backend: unable to legalize instruction: %9:_(s32) = G_FPTOSI %8:_(s32) (in function: main)

(Here is a link to a reproducer in Godblot: https://godbolt.org/z/f67vEEb41 )

The goal of this PR is to do a libcall for the legalization of G_FPTOSI and G_FPTOUI instead of doing a fallback to Selection DAG to do the same libcall later.


Full diff: https://github.com/llvm/llvm-project/pull/94613.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+2-1)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-fptoi-rv32-libcall.mir (+194)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index dbfcab7233bf8..69a139957ff23 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -409,7 +409,8 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
   getActionDefinitionsBuilder({G_FPTOSI, G_FPTOUI})
       .legalIf(all(typeInSet(0, {s32, sXLen}), typeIsScalarFPArith(1, ST)))
       .widenScalarToNextPow2(0)
-      .clampScalar(0, s32, sXLen);
+      .clampScalar(0, s32, sXLen)
+      .libcallFor({s32, s32});
 
   getActionDefinitionsBuilder({G_SITOFP, G_UITOFP})
       .legalIf(all(typeIsScalarFPArith(0, ST), typeInSet(1, {s32, sXLen})))
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-fptoi-rv32-libcall.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-fptoi-rv32-libcall.mir
new file mode 100644
index 0000000000000..b22338d521a46
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-fptoi-rv32-libcall.mir
@@ -0,0 +1,194 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=riscv32 -mattr=-d -mattr=-f -run-pass=legalizer %s -o - \
+# RUN: | FileCheck %s
+
+---
+name:            fptosi_s1_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptosi_s1_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixsfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s1) = G_FPTOSI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s1)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptoui_s1_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptoui_s1_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixunssfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s1) = G_FPTOUI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s1)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptosi_s8_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptosi_s8_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixsfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s8) = G_FPTOSI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s8)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptoui_s8_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptoui_s8_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixunssfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s8) = G_FPTOUI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s8)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptosi_s16_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptosi_s16_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixsfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s16) = G_FPTOSI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s16)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptoui_s16_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptoui_s16_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixunssfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s16) = G_FPTOUI %0(s32)
+    %2:_(s32) = G_ANYEXT %1(s16)
+    $x10 = COPY %2(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptosi_s32_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptosi_s32_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixsfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s32) = G_FPTOSI %0(s32)
+    $x10 = COPY %1(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            fptoui_s32_s32
+body:             |
+  bb.1:
+    liveins: $f10_f
+
+    ; CHECK-LABEL: name: fptoui_s32_s32
+    ; CHECK: liveins: $f10_f
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $f10_f
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-NEXT: PseudoCALL target-flags(riscv-call) &__fixunssfsi, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit-def $x10
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-NEXT: $x10 = COPY [[COPY1]](s32)
+    ; CHECK-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $f10_f
+    %1:_(s32) = G_FPTOUI %0(s32)
+    $x10 = COPY %1(s32)
+    PseudoRET implicit $x10
+
+...

Copy link

github-actions bot commented Jun 6, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -17,6 +17,7 @@
#include "llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h"
#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
#include "llvm/CodeGen/GlobalISel/LegalizerHelper.h"
#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a new include?

@@ -169,7 +170,8 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
MergeUnmergeActions.legalIf(
all(typeIs(BigTyIdx, s64), typeIs(LitTyIdx, s32)));
}
MergeUnmergeActions.widenScalarToNextPow2(LitTyIdx, XLen)
MergeUnmergeActions.legalIf(all(typeIs(BigTyIdx, sDoubleXLen), typeIs(LitTyIdx, sXLen)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we isel this UNMERGE?

; CHECK-NOFLOAT-LABEL: name: fptosi_s1_s64
; CHECK-NOFLOAT: liveins: $f10_d
; CHECK-NOFLOAT-NEXT: {{ $}}
; CHECK-NOFLOAT-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $f10_d
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is broken here. We're passing f10_d, but we don't have floating point?

Copy link
Contributor Author

@spaits spaits Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So f10_d is a register only available when D and F are enabled. So I should be using for example x10 as a livein register right?

This also mean that I will have to use a different file for these tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For rv32, you would need to use x10 and x11 to hold all 64 bits. I recommend writing an IR test and using -stop-before=legalizer to get the MIR

@spaits
Copy link
Contributor Author

spaits commented Jun 6, 2024

@topperc I have updated the PR. I have discovered, that I could move the test for the libcall cases into the same file in which the non-libcall test are with the help of file check. When running the tests fail in some cases with this:

LLVM ERROR: unable to legalize instruction: %4:_(s32), %5:_(s32) = G_UNMERGE_VALUES %0:_(s64) (in function: fptosi_s1_s64)

So it says that we have an illegal unmerge that can not be legalized. This does not makes sense to me. I was editing the test for merge a few hours back today and I think this should be legalized by the current rules.

I have then realized, that when we are not on an rv32 system that has D extension we never have a legalIf or legalFor call. When I add a legalFor call then the legalization will be fine.

This is the commit, that introduced that if that made the legalFor call for only a specific case: dbb9043 .
Looking at the instruction selection and register bank info I don't really see why we shouldn't have a legalFor when the D extension is not present.

How should I proceed with this PR? Why is there no legal G_UNMERGE call when we have a CPU that is not rv32 and has no D extension?

I would really appreciate your help.

@spaits spaits force-pushed the riscv-libcall-for-G_FPTOSI branch from d389266 to 669868b Compare June 7, 2024 09:26
@spaits
Copy link
Contributor Author

spaits commented Jun 7, 2024

@topperc I have removed the legalFor call for G_UNMERGE_VALUES. Also I have added the tests as you have suggested. Now they are using the correct liveins. Now there is no problem with the legality of G_UNMERGE_VALUES for libcalls. Thank you for your help and review.

Could you please re-review this PR?

@spaits spaits requested a review from topperc June 7, 2024 09:29
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@spaits spaits merged commit 28dd55b into llvm:main Jun 7, 2024
7 checks passed
nekoshirro pushed a commit to nekoshirro/Alchemist-LLVM that referenced this pull request Jun 9, 2024
…rt (llvm#94613)

When compiling the following code:
```cpp
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdbool.h>

int main() {
    int a;
    float f;
    scanf("%d", &a);

    scanf("%f", &f);
    a += (int)f;

    return a;
}
```
for `-march=rv32ima_zbb` we get a libcall:
```
call    scanf
        lw      a0, -20(s0)
        call    __fixsfsi
        mv      a1, a0
```
When we try to use GlobalISel we get this error:
```
 error in backend: unable to legalize instruction: %9:_(s32) = G_FPTOSI %8:_(s32) (in function: main)
```

(Here is a link to a reproducer in Godblot:
https://godbolt.org/z/f67vEEb41 )

The goal of this PR is to do a libcall for the legalization of
`G_FPTOSI` and `G_FPTOUI` instead of doing a fallback to Selection DAG
to do the same libcall later.

Signed-off-by: Hafidz Muzakky <[email protected]>
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants