[RISCV][GISEL] Add legalizer for G_BSWAP #70226
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-llvm-globalisel

Author: Michael Maitland (michaelmaitland)

Changes

Lower G_BSWAP into simpler instructions that can be selected in instruction selection. A future patch can handle when there is Zbb.

Full diff: https://github.com/llvm/llvm-project/pull/70226.diff

3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index 3aae38a7d18de98..fff6f46caa39cd0 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -81,6 +81,8 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST) {
.clampScalar(BigTyIdx, XLenLLT, XLenLLT);
}
+ getActionDefinitionsBuilder(G_BSWAP).lower();
+
getActionDefinitionsBuilder({G_CONSTANT, G_IMPLICIT_DEF})
.legalFor({s32, XLenLLT, p0})
.widenScalarToNextPow2(0)
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap32.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap32.mir
new file mode 100644
index 000000000000000..3344429de88be58
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap32.mir
@@ -0,0 +1,180 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -mtriple=riscv32 -run-pass=legalizer %s -o - | FileCheck %s
+
+---
+name: bswap_i8
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i8
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; CHECK-NEXT: [[ASSERT_ZEXT:%[0-9]+]]:_(s32) = G_ASSERT_ZEXT [[COPY]], 8
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ASSERT_ZEXT]], [[C]](s32)
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[ASSERT_ZEXT]], [[C2]]
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[AND]], [[C1]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[LSHR]], [[SHL]]
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C3]]
+ ; CHECK-NEXT: $x10 = COPY [[AND1]](s32)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s32) = COPY $x10
+ %1:_(s32) = G_ASSERT_ZEXT %0, 8
+ %2:_(s8) = G_TRUNC %1(s32)
+ %3:_(s8) = G_BSWAP %2
+ %4:_(s32) = G_ZEXT %3(s8)
+ $x10 = COPY %4(s32)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i16
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i16
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; CHECK-NEXT: [[ASSERT_ZEXT:%[0-9]+]]:_(s32) = G_ASSERT_ZEXT [[COPY]], 16
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ASSERT_ZEXT]], [[C]](s32)
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[ASSERT_ZEXT]], [[C2]]
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[AND]], [[C1]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[LSHR]], [[SHL]]
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C3]]
+ ; CHECK-NEXT: $x10 = COPY [[AND1]](s32)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s32) = COPY $x10
+ %1:_(s32) = G_ASSERT_ZEXT %0, 16
+ %2:_(s16) = G_TRUNC %1(s32)
+ %3:_(s16) = G_BSWAP %2
+ %4:_(s32) = G_ZEXT %3(s16)
+ $x10 = COPY %4(s32)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i32
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i32
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY]], [[C]](s32)
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[LSHR]], [[SHL]]
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65280
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C1]]
+ ; CHECK-NEXT: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C2]](s32)
+ ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(s32) = G_OR [[OR]], [[SHL1]]
+ ; CHECK-NEXT: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C2]](s32)
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[LSHR1]], [[C1]]
+ ; CHECK-NEXT: [[OR2:%[0-9]+]]:_(s32) = G_OR [[OR1]], [[AND1]]
+ ; CHECK-NEXT: $x10 = COPY [[OR2]](s32)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s32) = COPY $x10
+ %1:_(s32) = G_BSWAP %0
+ $x10 = COPY %1(s32)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i64
+body: |
+ bb.0:
+ liveins: $x10, $x11
+ ; CHECK-LABEL: name: bswap_i64
+ ; CHECK: liveins: $x10, $x11
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY]], [[C1]](s32)
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C2]](s32)
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[LSHR]], [[C]]
+ ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(s32) = G_OR [[C3]], [[SHL]]
+ ; CHECK-NEXT: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 65280
+ ; CHECK-NEXT: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C4]]
+ ; CHECK-NEXT: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C7]](s32)
+ ; CHECK-NEXT: [[OR2:%[0-9]+]]:_(s32) = G_OR [[OR]], [[C6]]
+ ; CHECK-NEXT: [[OR3:%[0-9]+]]:_(s32) = G_OR [[OR1]], [[SHL1]]
+ ; CHECK-NEXT: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C8]](s32)
+ ; CHECK-NEXT: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[LSHR1]], [[C4]]
+ ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[C9]], [[C5]]
+ ; CHECK-NEXT: [[OR4:%[0-9]+]]:_(s32) = G_OR [[OR2]], [[AND1]]
+ ; CHECK-NEXT: [[OR5:%[0-9]+]]:_(s32) = G_OR [[OR3]], [[AND2]]
+ ; CHECK-NEXT: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16711680
+ ; CHECK-NEXT: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C10]]
+ ; CHECK-NEXT: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C11]]
+ ; CHECK-NEXT: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C12]](s32)
+ ; CHECK-NEXT: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND4]], [[C13]](s32)
+ ; CHECK-NEXT: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[AND3]], [[C14]](s32)
+ ; CHECK-NEXT: [[OR6:%[0-9]+]]:_(s32) = G_OR [[SHL3]], [[LSHR2]]
+ ; CHECK-NEXT: [[OR7:%[0-9]+]]:_(s32) = G_OR [[OR4]], [[SHL2]]
+ ; CHECK-NEXT: [[OR8:%[0-9]+]]:_(s32) = G_OR [[OR5]], [[OR6]]
+ ; CHECK-NEXT: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C15]](s32)
+ ; CHECK-NEXT: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C16]](s32)
+ ; CHECK-NEXT: [[OR9:%[0-9]+]]:_(s32) = G_OR [[LSHR3]], [[SHL4]]
+ ; CHECK-NEXT: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C15]](s32)
+ ; CHECK-NEXT: [[AND5:%[0-9]+]]:_(s32) = G_AND [[OR9]], [[C10]]
+ ; CHECK-NEXT: [[AND6:%[0-9]+]]:_(s32) = G_AND [[LSHR4]], [[C11]]
+ ; CHECK-NEXT: [[OR10:%[0-9]+]]:_(s32) = G_OR [[OR7]], [[AND5]]
+ ; CHECK-NEXT: [[OR11:%[0-9]+]]:_(s32) = G_OR [[OR8]], [[AND6]]
+ ; CHECK-NEXT: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 -16777216
+ ; CHECK-NEXT: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1
+ ; CHECK-NEXT: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C17]]
+ ; CHECK-NEXT: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C18]]
+ ; CHECK-NEXT: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C19]](s32)
+ ; CHECK-NEXT: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND8]], [[C20]](s32)
+ ; CHECK-NEXT: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[AND7]], [[C21]](s32)
+ ; CHECK-NEXT: [[OR12:%[0-9]+]]:_(s32) = G_OR [[SHL6]], [[LSHR5]]
+ ; CHECK-NEXT: [[OR13:%[0-9]+]]:_(s32) = G_OR [[OR10]], [[SHL5]]
+ ; CHECK-NEXT: [[OR14:%[0-9]+]]:_(s32) = G_OR [[OR11]], [[OR12]]
+ ; CHECK-NEXT: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C22]](s32)
+ ; CHECK-NEXT: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C23]](s32)
+ ; CHECK-NEXT: [[OR15:%[0-9]+]]:_(s32) = G_OR [[LSHR6]], [[SHL7]]
+ ; CHECK-NEXT: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C22]](s32)
+ ; CHECK-NEXT: [[AND9:%[0-9]+]]:_(s32) = G_AND [[OR15]], [[C17]]
+ ; CHECK-NEXT: [[AND10:%[0-9]+]]:_(s32) = G_AND [[LSHR7]], [[C18]]
+ ; CHECK-NEXT: [[OR16:%[0-9]+]]:_(s32) = G_OR [[OR13]], [[AND9]]
+ ; CHECK-NEXT: [[OR17:%[0-9]+]]:_(s32) = G_OR [[OR14]], [[AND10]]
+ ; CHECK-NEXT: $x10 = COPY [[OR16]](s32)
+ ; CHECK-NEXT: $x11 = COPY [[OR17]](s32)
+ ; CHECK-NEXT: PseudoRET implicit $x10, implicit $x11
+ %0:_(s32) = COPY $x10
+ %1:_(s32) = COPY $x11
+ %2:_(s64) = G_MERGE_VALUES %0(s32), %1(s32)
+ %3:_(s64) = G_BSWAP %2
+ %4:_(s32), %5:_(s32) = G_UNMERGE_VALUES %3(s64)
+ $x10 = COPY %4(s32)
+ $x11 = COPY %5(s32)
+ PseudoRET implicit $x10, implicit $x11
+...
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap64.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap64.mir
new file mode 100644
index 000000000000000..da8d2b8ee1563d8
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/bswap64.mir
@@ -0,0 +1,155 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=riscv64 -run-pass=legalizer %s -o - | FileCheck %s
+
+---
+name: bswap_i8
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i8
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+ ; CHECK-NEXT: [[ASSERT_ZEXT:%[0-9]+]]:_(s64) = G_ASSERT_ZEXT [[COPY]], 8
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[ASSERT_ZEXT]](s64)
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[TRUNC]], [[C]](s32)
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 255
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[ASSERT_ZEXT]], [[C2]]
+ ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[AND]](s64)
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[TRUNC1]], [[C1]](s32)
+ ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[LSHR]](s32)
+ ; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s64) = G_ANYEXT [[SHL]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[ANYEXT]], [[ANYEXT1]]
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 255
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[OR]], [[C3]]
+ ; CHECK-NEXT: $x10 = COPY [[AND1]](s64)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s64) = COPY $x10
+ %1:_(s64) = G_ASSERT_ZEXT %0, 8
+ %2:_(s8) = G_TRUNC %1(s64)
+ %3:_(s8) = G_BSWAP %2
+ %4:_(s64) = G_ZEXT %3(s8)
+ $x10 = COPY %4(s64)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i16
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i16
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+ ; CHECK-NEXT: [[ASSERT_ZEXT:%[0-9]+]]:_(s64) = G_ASSERT_ZEXT [[COPY]], 16
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[ASSERT_ZEXT]](s64)
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[TRUNC]], [[C]](s32)
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 65535
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[ASSERT_ZEXT]], [[C2]]
+ ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[AND]](s64)
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[TRUNC1]], [[C1]](s32)
+ ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[LSHR]](s32)
+ ; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s64) = G_ANYEXT [[SHL]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[ANYEXT]], [[ANYEXT1]]
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 65535
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[OR]], [[C3]]
+ ; CHECK-NEXT: $x10 = COPY [[AND1]](s64)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s64) = COPY $x10
+ %1:_(s64) = G_ASSERT_ZEXT %0, 16
+ %2:_(s16) = G_TRUNC %1(s64)
+ %3:_(s16) = G_BSWAP %2
+ %4:_(s64) = G_ZEXT %3(s16)
+ $x10 = COPY %4(s64)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i32
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i32
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+ ; CHECK-NEXT: [[ASSERT_ZEXT:%[0-9]+]]:_(s64) = G_ASSERT_ZEXT [[COPY]], 32
+ ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[ASSERT_ZEXT]](s64)
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[TRUNC]], [[C]](s32)
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[TRUNC]], [[C]](s32)
+ ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[LSHR]](s32)
+ ; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s64) = G_ANYEXT [[SHL]](s32)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[ANYEXT]], [[ANYEXT1]]
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 65280
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[ASSERT_ZEXT]], [[C2]]
+ ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[AND]](s64)
+ ; CHECK-NEXT: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[TRUNC1]], [[C1]](s32)
+ ; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s64) = G_ANYEXT [[SHL1]](s32)
+ ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(s64) = G_OR [[OR]], [[ANYEXT2]]
+ ; CHECK-NEXT: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[TRUNC]], [[C1]](s32)
+ ; CHECK-NEXT: [[ANYEXT3:%[0-9]+]]:_(s64) = G_ANYEXT [[LSHR1]](s32)
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 65280
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[ANYEXT3]], [[C3]]
+ ; CHECK-NEXT: [[OR2:%[0-9]+]]:_(s64) = G_OR [[OR1]], [[AND1]]
+ ; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4294967295
+ ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s64) = G_AND [[OR2]], [[C4]]
+ ; CHECK-NEXT: $x10 = COPY [[AND2]](s64)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s64) = COPY $x10
+ %1:_(s64) = G_ASSERT_ZEXT %0, 32
+ %2:_(s32) = G_TRUNC %1(s64)
+ %3:_(s32) = G_BSWAP %2
+ %4:_(s64) = G_ZEXT %3(s32)
+ $x10 = COPY %4(s64)
+ PseudoRET implicit $x10
+...
+---
+name: bswap_i64
+body: |
+ bb.0:
+ liveins: $x10
+ ; CHECK-LABEL: name: bswap_i64
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 56
+ ; CHECK-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[COPY]], [[C]](s64)
+ ; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY]], [[C]](s64)
+ ; CHECK-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[LSHR]], [[SHL]]
+ ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 65280
+ ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 40
+ ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C1]]
+ ; CHECK-NEXT: [[SHL1:%[0-9]+]]:_(s64) = G_SHL [[AND]], [[C2]](s64)
+ ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(s64) = G_OR [[OR]], [[SHL1]]
+ ; CHECK-NEXT: [[LSHR1:%[0-9]+]]:_(s64) = G_LSHR [[COPY]], [[C2]](s64)
+ ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[LSHR1]], [[C1]]
+ ; CHECK-NEXT: [[OR2:%[0-9]+]]:_(s64) = G_OR [[OR1]], [[AND1]]
+ ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 16711680
+ ; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 24
+ ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C3]]
+ ; CHECK-NEXT: [[SHL2:%[0-9]+]]:_(s64) = G_SHL [[AND2]], [[C4]](s64)
+ ; CHECK-NEXT: [[OR3:%[0-9]+]]:_(s64) = G_OR [[OR2]], [[SHL2]]
+ ; CHECK-NEXT: [[LSHR2:%[0-9]+]]:_(s64) = G_LSHR [[COPY]], [[C4]](s64)
+ ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(s64) = G_AND [[LSHR2]], [[C3]]
+ ; CHECK-NEXT: [[OR4:%[0-9]+]]:_(s64) = G_OR [[OR3]], [[AND3]]
+ ; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 -16777216
+ ; CHECK-NEXT: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
+ ; CHECK-NEXT: [[AND4:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C5]]
+ ; CHECK-NEXT: [[SHL3:%[0-9]+]]:_(s64) = G_SHL [[AND4]], [[C6]](s64)
+ ; CHECK-NEXT: [[OR5:%[0-9]+]]:_(s64) = G_OR [[OR4]], [[SHL3]]
+ ; CHECK-NEXT: [[LSHR3:%[0-9]+]]:_(s64) = G_LSHR [[COPY]], [[C6]](s64)
+ ; CHECK-NEXT: [[AND5:%[0-9]+]]:_(s64) = G_AND [[LSHR3]], [[C5]]
+ ; CHECK-NEXT: [[OR6:%[0-9]+]]:_(s64) = G_OR [[OR5]], [[AND5]]
+ ; CHECK-NEXT: $x10 = COPY [[OR6]](s64)
+ ; CHECK-NEXT: PseudoRET implicit $x10
+ %0:_(s64) = COPY $x10
+ %1:_(s64) = G_BSWAP %0
+ $x10 = COPY %1(s64)
+ PseudoRET implicit $x10
+...
+
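For readers who want to follow the `bswap_i32` CHECK lines for riscv32 in the diff above, the sketch below writes the same shift/AND/OR sequence as plain C++. It is only an illustration of what the lowered generic opcodes compute, not code from the patch; the function name `bswap32_lowered` is hypothetical.

```cpp
#include <cstdint>

// Byte-swap a 32-bit value using only shifts, ANDs and ORs, in the same
// order as the generic opcodes in the riscv32 bswap_i32 CHECK lines.
// The name bswap32_lowered is hypothetical and does not appear in LLVM.
std::uint32_t bswap32_lowered(std::uint32_t x) {
  std::uint32_t shl = x << 24;        // G_SHL  x, 24: byte 0 moves to byte 3
  std::uint32_t lshr = x >> 24;       // G_LSHR x, 24: byte 3 moves to byte 0
  std::uint32_t result = lshr | shl;  // G_OR
  const std::uint32_t mask = 0xFF00u; // G_CONSTANT i32 65280
  result |= (x & mask) << 8;          // G_AND, G_SHL 8, G_OR: byte 1 to byte 2
  result |= (x >> 8) & mask;          // G_LSHR 8, G_AND, G_OR: byte 2 to byte 1
  return result;
}
```

For example, `bswap32_lowered(0x11223344u)` yields `0x44332211u`.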
✅ With the latest revision this PR passed the C/C++ code formatter.
@@ -81,6 +81,11 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST) {
.clampScalar(BigTyIdx, XLenLLT, XLenLLT);
}

getActionDefinitionsBuilder(G_BSWAP)
.widenScalarToNextPow2(0)
Why are the widenScalar and clampScalar needed? It doesn't look like they are being tested.
I grabbed widenScalarToNextPow2 from AArch64, which does not seem to test it either. I agree that we can drop it because the intrinsic documentation says it will always be a power of 2. I think the clampScalar is needed for the case where there is a bswap.i64 on rv32. The last commit shows the impact on the diff. Do you think it was correct before the last commit?
Any plans to limit the legal types?
@tschuett did you have something else in mind?
As far as I know bswap isn't required to be a power of 2. Just multiple of 16 bits.
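To make the width constraint in the comment above concrete, here is a minimal sketch of a byte swap over any width that is a positive multiple of 16 bits, so a non-power-of-2 width such as 48 is accepted. The helper `bswap_bits` and its 64-bit cap are assumptions made for illustration; nothing like it appears in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bytes of the low `bits` bits of `value`.
// Hypothetical helper: `bits` must be a positive multiple of 16 and, in
// this sketch only, at most 64 so the value fits in a uint64_t.
std::uint64_t bswap_bits(std::uint64_t value, unsigned bits) {
  assert(bits != 0 && bits % 16 == 0 && bits <= 64);
  std::uint64_t result = 0;
  // Take bytes from least to most significant and append each one to the
  // low end of result, which reverses their order.
  for (unsigned i = 0; i < bits; i += 8) {
    result = (result << 8) | ((value >> i) & 0xFFu);
  }
  return result;
}
```

With a 48-bit width, `bswap_bits(0x112233445566, 48)` gives `0x665544332211`.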
I guess the clamp causes it to be split as a BSWAP instead of creating a bunch of i64 shifts, ands, ors that need to be independently split?
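The splitting described in the comment above can be sketched in plain C++: a 64-bit byte swap on a 32-bit target becomes two 32-bit byte swaps whose results trade places. This is an illustration under that assumption, not the legalizer's code; `bswap32` and `bswap64_split` are hypothetical names.

```cpp
#include <cstdint>

// Plain 32-bit byte swap built from shifts and masks (hypothetical helper).
std::uint32_t bswap32(std::uint32_t x) {
  return (x << 24) | ((x & 0xFF00u) << 8) | ((x >> 8) & 0xFF00u) | (x >> 24);
}

// 64-bit byte swap expressed as two 32-bit swaps, roughly what narrowing
// an s64 G_BSWAP to s32 pieces amounts to (hypothetical helper).
std::uint64_t bswap64_split(std::uint64_t x) {
  // Split into halves, like G_UNMERGE_VALUES.
  std::uint32_t lo = static_cast<std::uint32_t>(x);
  std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
  // Swap the bytes within each half, then exchange the halves.
  std::uint64_t new_lo = bswap32(hi);
  std::uint64_t new_hi = bswap32(lo);
  // Recombine, like G_MERGE_VALUES.
  return (new_hi << 32) | new_lo;
}
```

Splitting at the G_BSWAP level keeps each half a single 32-bit operation, rather than first expanding to 64-bit shifts, ANDs and ORs that would each need splitting afterwards.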
I thought of legalFor, but ...
My understanding is that legalFor would mark those types as legal and not need to lower them. At least that is what it looks like when I try it. Am I misunderstanding something?
No. You currently have no instructions, so you must clamp and lower.
@@ -81,6 +81,10 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST) {
.clampScalar(BigTyIdx, XLenLLT, XLenLLT);
}

getActionDefinitionsBuilder(G_BSWAP)
.clampScalar(0, s16, XLenLLT)
Use maxScalar to avoid the bottom clamp.
@@ -82,7 +82,7 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST) {
}

getActionDefinitionsBuilder(G_BSWAP)
.clampScalar(0, s16, XLenLLT)
.minScalar(0, s16)
I was suggesting maxScalar(0, XLenLLT). G_BSWAP already requires at least s16 so there's no need for minScalar with s16.
Updated. Dropped the previous change to show it had no impact on diff.
Change to sXLen
c1ff04a to 5e1e315
LGTM
Lower G_BSWAP into simpler instructions that can be selected in instruction selection.
5e1e315 to dcf767c
The test was not updated correctly in #70226. This patch resolves that problem.
Lower G_BSWAP into simpler instructions that can be selected in instruction selection. A future patch can handle when there is Zbb.