[AMDGPU] Precommit si-fold-bitmask.mir #131310
Conversation
@llvm/pr-subscribers-backend-amdgpu

Author: Pierre van Houtryve (Pierre-vh)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/131310.diff

1 file affected:
diff --git a/llvm/test/CodeGen/AMDGPU/si-fold-bitmasks.mir b/llvm/test/CodeGen/AMDGPU/si-fold-bitmasks.mir
new file mode 100644
index 0000000000000..1edf970591179
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-fold-bitmasks.mir
@@ -0,0 +1,429 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -run-pass=si-fold-operands -verify-machineinstrs -o - %s | FileCheck --check-prefix=GCN %s
+
+# Test supported instructions
+
+---
+name: v_ashr_i32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_ashr_i32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_ASHR_I32_e64 %src, %shiftmask, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_ASHR_I32_e64 %src, %shiftmask, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: v_lshr_b32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_lshr_b32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_LSHR_B32_e64 %src, %shiftmask, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_LSHR_B32_e64 %src, %shiftmask, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: v_lshr_b32_e32__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_lshr_b32_e32__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e64 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_LSHR_B32_e32 %src, %shiftmask, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e64 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_LSHR_B32_e32 %src, %shiftmask, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: v_lshl_b32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_lshl_b32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e64 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_LSHL_B32_e64 %src, %shiftmask, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e64 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_LSHL_B32_e64 %src, %shiftmask, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: s_lshl_b32__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0, $sgpr1
+
+ ; GCN-LABEL: name: s_lshl_b32__s_and_b32
+ ; GCN: liveins: $sgpr0, $sgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_32 = COPY $sgpr0
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr1
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0 = COPY %ret
+ %src:sgpr_32 = COPY $sgpr0
+ %shift:sgpr_32 = COPY $sgpr1
+ %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ $sgpr0 = COPY %ret
+...
+
+---
+name: s_lshr_b32__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0, $sgpr1
+
+ ; GCN-LABEL: name: s_lshr_b32__s_and_b32
+ ; GCN: liveins: $sgpr0, $sgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_32 = COPY $sgpr0
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr1
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_32 = S_LSHR_B32 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0 = COPY %ret
+ %src:sgpr_32 = COPY $sgpr0
+ %shift:sgpr_32 = COPY $sgpr1
+ %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ %ret:sgpr_32 = S_LSHR_B32 %src, %shiftmask, implicit-def $scc
+ $sgpr0 = COPY %ret
+...
+
+---
+name: s_ashr_i32__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0, $sgpr1
+
+ ; GCN-LABEL: name: s_ashr_i32__s_and_b32
+ ; GCN: liveins: $sgpr0, $sgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_32 = COPY $sgpr0
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr1
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_32 = S_ASHR_I32 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0 = COPY %ret
+ %src:sgpr_32 = COPY $sgpr0
+ %shift:sgpr_32 = COPY $sgpr1
+ %shiftmask:sgpr_32 = S_AND_B32 65535, %shift, implicit-def $scc
+ %ret:sgpr_32 = S_ASHR_I32 %src, %shiftmask, implicit-def $scc
+ $sgpr0 = COPY %ret
+...
+
+---
+name: s_lshl_b64__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0_sgpr1, $sgpr2
+
+ ; GCN-LABEL: name: s_lshl_b64__s_and_b32
+ ; GCN: liveins: $sgpr0_sgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_64 = COPY $sgpr0_sgpr1
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr2
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_64 = S_LSHL_B64 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0_sgpr1 = COPY %ret
+ %src:sgpr_64 = COPY $sgpr0_sgpr1
+ %shift:sgpr_32 = COPY $sgpr2
+ %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ %ret:sgpr_64 = S_LSHL_B64 %src, %shiftmask, implicit-def $scc
+ $sgpr0_sgpr1 = COPY %ret
+...
+
+---
+name: s_lshr_b64__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0_sgpr1, $sgpr2
+
+ ; GCN-LABEL: name: s_lshr_b64__s_and_b32
+ ; GCN: liveins: $sgpr0_sgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_64 = COPY $sgpr0_sgpr1
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr2
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0_sgpr1 = COPY %ret
+ %src:sgpr_64 = COPY $sgpr0_sgpr1
+ %shift:sgpr_32 = COPY $sgpr2
+ %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ $sgpr0_sgpr1 = COPY %ret
+...
+
+---
+name: s_ashr_i64__s_and_b32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0_sgpr1, $sgpr2
+
+ ; GCN-LABEL: name: s_ashr_i64__s_and_b32
+ ; GCN: liveins: $sgpr0_sgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_64 = COPY $sgpr0_sgpr1
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr2
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_64 = S_ASHR_I64 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0_sgpr1 = COPY %ret
+ %src:sgpr_64 = COPY $sgpr0_sgpr1
+ %shift:sgpr_32 = COPY $sgpr2
+ %shiftmask:sgpr_32 = S_AND_B32 63, %shift, implicit-def $scc
+ %ret:sgpr_64 = S_ASHR_I64 %src, %shiftmask, implicit-def $scc
+ $sgpr0_sgpr1 = COPY %ret
+...
+
+---
+name: v_lshlrev_b32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_lshlrev_b32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_LSHLREV_B32_e64 %shiftmask, %src, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_LSHLREV_B32_e64 %shiftmask, %src, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: v_lshrrev_b32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_lshrrev_b32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_LSHRREV_B32_e64 %shiftmask, %src, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_LSHRREV_B32_e64 %shiftmask, %src, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: v_ashrrev_i32_e64__v_and_b32_e32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: v_ashrrev_i32_e64__v_and_b32_e32
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_ASHRREV_I32_e64 %shiftmask, %src, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_ASHRREV_I32_e64 %shiftmask, %src, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+# Test interesting cases
+
+---
+name: flipped_operands
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: flipped_operands
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_ASHR_I32_e64 %shiftmask, %src, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_ASHR_I32_e64 %shiftmask, %src, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+---
+name: flipped_operands_rev
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+
+ ; GCN-LABEL: name: flipped_operands_rev
+ ; GCN: liveins: $vgpr0, $vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:vgpr_32 = COPY $vgpr0
+ ; GCN-NEXT: %shift:vgpr_32 = COPY $vgpr1
+ ; GCN-NEXT: %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ ; GCN-NEXT: %ret:vgpr_32 = V_ASHRREV_I32_e64 %src, %shiftmask, implicit $exec
+ ; GCN-NEXT: $vgpr0 = COPY %ret
+ %src:vgpr_32 = COPY $vgpr0
+ %shift:vgpr_32 = COPY $vgpr1
+ %shiftmask:vgpr_32 = V_AND_B32_e32 65535, %shift, implicit $exec
+ %ret:vgpr_32 = V_ASHRREV_I32_e64 %src, %shiftmask, implicit $exec
+ $vgpr0 = COPY %ret
+...
+
+# 30 = 0b11110: doesn't cover all of the low shift-amount bits.
+---
+name: shift32_mask_too_small
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0, $sgpr1
+
+ ; GCN-LABEL: name: shift32_mask_too_small
+ ; GCN: liveins: $sgpr0, $sgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_32 = COPY $sgpr0
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr1
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 30, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0 = COPY %ret
+ %src:sgpr_32 = COPY $sgpr0
+ %shift:sgpr_32 = COPY $sgpr1
+ %shiftmask:sgpr_32 = S_AND_B32 30, %shift, implicit-def $scc
+ %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ $sgpr0 = COPY %ret
+...
+
+# 90 = 0b1011010: has holes, so it doesn't cover all of the low shift-amount bits.
+---
+name: shift32_mask_has_holes
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0, $sgpr1
+
+ ; GCN-LABEL: name: shift32_mask_has_holes
+ ; GCN: liveins: $sgpr0, $sgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_32 = COPY $sgpr0
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr1
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 90, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0 = COPY %ret
+ %src:sgpr_32 = COPY $sgpr0
+ %shift:sgpr_32 = COPY $sgpr1
+ %shiftmask:sgpr_32 = S_AND_B32 90, %shift, implicit-def $scc
+ %ret:sgpr_32 = S_LSHL_B32 %src, %shiftmask, implicit-def $scc
+ $sgpr0 = COPY %ret
+...
+
+# 30 = 0b11110: doesn't cover all of the low shift-amount bits.
+---
+name: shift64_mask_too_small
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0_sgpr1, $sgpr2
+
+ ; GCN-LABEL: name: shift64_mask_too_small
+ ; GCN: liveins: $sgpr0_sgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_64 = COPY $sgpr0_sgpr1
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr2
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 30, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0_sgpr1 = COPY %ret
+ %src:sgpr_64 = COPY $sgpr0_sgpr1
+ %shift:sgpr_32 = COPY $sgpr2
+ %shiftmask:sgpr_32 = S_AND_B32 30, %shift, implicit-def $scc
+ %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ $sgpr0_sgpr1 = COPY %ret
+...
+
+
+# 90 = 0b1011010: has holes, so it doesn't cover all of the low shift-amount bits.
+---
+name: shift64_mask_has_holes
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0_sgpr1, $sgpr2
+
+ ; GCN-LABEL: name: shift64_mask_has_holes
+ ; GCN: liveins: $sgpr0_sgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: %src:sgpr_64 = COPY $sgpr0_sgpr1
+ ; GCN-NEXT: %shift:sgpr_32 = COPY $sgpr2
+ ; GCN-NEXT: %shiftmask:sgpr_32 = S_AND_B32 90, %shift, implicit-def $scc
+ ; GCN-NEXT: %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ ; GCN-NEXT: $sgpr0_sgpr1 = COPY %ret
+ %src:sgpr_64 = COPY $sgpr0_sgpr1
+ %shift:sgpr_32 = COPY $sgpr2
+ %shiftmask:sgpr_32 = S_AND_B32 90, %shift, implicit-def $scc
+ %ret:sgpr_64 = S_LSHR_B64 %src, %shiftmask, implicit-def $scc
+ $sgpr0_sgpr1 = COPY %ret
+...
Hopefully this is to deal with DAG regressions? I would hope GlobalISel doesn't need this.
@@ -0,0 +1,429 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -run-pass=si-fold-operands -verify-machineinstrs -o - %s | FileCheck --check-prefix=GCN %s
Suggested change:
- # RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -run-pass=si-fold-operands -verify-machineinstrs -o - %s | FileCheck --check-prefix=GCN %s
+ # RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -run-pass=si-fold-operands -o - %s | FileCheck --check-prefix=GCN %s
GlobalISel unfortunately needs it. We can end up with things like a […]. I tried fixing this elsewhere but it's messy. I don't think we can do it before lowering unless we start replacing all […]. Doing it during ISel by looking through zext is another option, but after spending a day trying to bend the DAG emitter to do what I want, I gave up on that. We have so many patterns using our […].
It should always be combinable post-legalize / post-regbankselect. Things are strictly more difficult after selection.
The main issue I was having was with code that had <32-bit arguments in registers:

[…]

with %2 then being used as the shift amount. We can't eliminate the zext/trunc because the generic opcode has no mention of reading only the lower bits, AFAIK. I tried experimenting with multiple approaches but I didn't find anything better than doing it in SIFoldOperands.
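For illustration, here is a hypothetical generic-MIR sketch (not taken from the PR) of the situation being described: a sub-32-bit shift amount that is truncated and zero-extended around the shift, so the extension survives to selection. All names below are made up; the widened amount is what later shows up as the 65535 mask in the tests above.

    # Hypothetical pre-selection generic MIR; register names and types are illustrative only.
    name: shl_with_zext_shift_amount
    tracksRegLiveness: true
    body: |
      bb.0:
        liveins: $vgpr0, $vgpr1
        %val:_(s32) = COPY $vgpr0
        %arg:_(s32) = COPY $vgpr1
        ; the ABI passes a 16-bit shift amount, so it is narrowed here...
        %amt16:_(s16) = G_TRUNC %arg(s32)
        ; ...and widened again because the 32-bit shift wants a 32-bit amount
        %amt:_(s32) = G_ZEXT %amt16(s16)
        ; G_SHL is defined to read the whole amount, so the trunc/zext pair can't simply be deleted;
        ; after selection the zext becomes a V_AND_B32 with 65535, as in the tests above
        %res:_(s32) = G_SHL %val, %amt(s32)
        $vgpr0 = COPY %res(s32)
    ...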
We can fold the clamp of the shift amount into the shift instruction during selection, as we know the instruction ignores the high bits. We do that in the DAG path already. I think it special-cases the and (bitwidth - 1) pattern, which should form canonically. In principle it could do a general simplify-demanded-bits.
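As a concrete, hedged illustration of that fold, using the same selected-MIR shapes as this test: the 32-bit VALU shift only reads the low 5 bits of its shift amount, so a clamp of the amount is redundant and can be dropped. The precommit only adds the tests; the fold itself is what the rest of the stack is expected to implement.

    ; before: the shift amount is clamped with an AND (31 = bitwidth - 1)
    %shiftmask:vgpr_32 = V_AND_B32_e32 31, %shift, implicit $exec
    %ret:vgpr_32 = V_LSHLREV_B32_e64 %shiftmask, %src, implicit $exec

    ; after: V_LSHLREV_B32 only uses bits [4:0] of the amount, so the mask is unneeded
    %ret:vgpr_32 = V_LSHLREV_B32_e64 %shift, %src, implicit $exec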
Where and how should that be implemented? I struggled with that. I tried adding a new special case in TableGen but I just couldn't find the right way to do it.
It already exists as a complex pattern, isUnneededShiftMask. The combiners should be trying to get the clamping code into this form, which expects the and […].
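For reference, a sketch of what that canonical masked form looks like at the MIR level, assuming (as above) that the hardware only reads the low log2(bitwidth) bits of the amount: for a 32-bit shift the canonical mask is 31, for a 64-bit shift it is 63. The 65535 and 63 masks in this test are supersets of, or equal to, those.

    ; 32-bit shift: only bits [4:0] of the amount matter, canonical mask is 31
    %m32:sgpr_32 = S_AND_B32 31, %amt, implicit-def $scc
    %r32:sgpr_32 = S_LSHL_B32 %x, %m32, implicit-def $scc
    ; 64-bit shift: only bits [5:0] matter, canonical mask is 63
    %m64:sgpr_32 = S_AND_B32 63, %amt, implicit-def $scc
    %r64:sgpr_64 = S_LSHL_B64 %y, %m64, implicit-def $scc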
I tried it, but the DAG immediately transforms […].
Then isUnneededShiftMask should probably recognize more forms of the pattern. I'd assume this only forms the zext if it is legal.
Yes, but with a cast operation it gets tricky. If I add […].