AMDGPU: No live subrange at use for an undef subreg right after LIS creation with partially defined register #125948

Open
@rampitec

The following testcase fails verification:

# RUN: llc -march=amdgcn -mcpu=gfx1100 -verify-coalescing -o - -run-pass=register-coalescer %s

---
name:            test
tracksRegLiveness: true
liveins:
  - { reg: '$vgpr0', virtual-reg: '%0' }
body:             |
  bb.0:
    liveins: $vgpr0

    %0:vgpr_32(s32) = COPY killed $vgpr0
    %1:vgpr_32 = V_AND_B32_e32 1023, killed %0(s32), implicit $exec
    undef %2.sub0:vreg_128_align2 = COPY %1
    undef %3.sub1:vreg_256_align2 = COPY killed %1
    %4:sgpr_32 = V_READFIRSTLANE_B32 killed %2.sub0, implicit $exec
    %5:sgpr_32 = S_MOV_B32 0
    undef %6.sub0:sgpr_128 = COPY killed %4
    %6.sub1:sgpr_128 = COPY %5
    %6.sub2:sgpr_128 = COPY %5
    %6.sub3:sgpr_128 = COPY %5
    %7:sgpr_32 = S_MOV_B32 65536
    %8:sgpr_32 = V_READFIRSTLANE_B32 %3.sub1, implicit $exec
    %9:sgpr_32 = V_READFIRSTLANE_B32 %3.sub3, implicit $exec
    %10:sgpr_32 = V_READFIRSTLANE_B32 %3.sub5, implicit $exec
    %11:sgpr_32 = V_READFIRSTLANE_B32 %3.sub6, implicit $exec
    %12:sgpr_32 = V_READFIRSTLANE_B32 killed %3.sub7, implicit $exec
    undef %13.sub0:sgpr_256 = COPY killed %7
    %13.sub1:sgpr_256 = COPY killed %8
    %13.sub2:sgpr_256 = COPY %5
    %13.sub3:sgpr_256 = COPY killed %9
    %13.sub4:sgpr_256 = COPY killed %5
    %13.sub5:sgpr_256 = COPY killed %10
    %13.sub6:sgpr_256 = COPY killed %11
    %13.sub7:sgpr_256 = COPY killed %12
    S_ENDPGM 0, implicit killed %6, implicit killed %13

...

It has multiple verification errors, essentially like this:

# Before register coalescing
...
*** Bad machine code: No live subrange at use ***
- function:    test
- basic block: %bb.0  (0x57249c37bb48) [0B;416B)
- instruction: 208B     %9:sgpr_32 = V_READFIRSTLANE_B32 %3.sub3:vreg_256_align2, implicit $exec
- operand 1:   %3.sub3:vreg_256_align2
- interval:    %3 [64r,256r:0) 0@64r  L000000000000000C [64r,192r:0) 0@64r  weight:0.000000e+00
- at:          208B

This is indeed true: only %3.sub1 is defined. If these uses were marked undef, no error would occur, but apparently nothing sets the undef flag on them. There is the detect-dead-lanes pass, but it only handles a small set of opcodes. In similar cases the register coalescer marks operands undef, but not here.
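
As a sketch of what the paragraph above describes, the reads of the never-defined lanes in the first testcase would carry the undef flag, while the read of %3.sub1 stays as it is. This fragment is hand-edited and has not been run through llc. Dropping the killed flag from the %3.sub7 use is an assumption made here, so that an undef read does not also claim to end the live range.

```mir
    ; %3.sub1 is the only defined lane, so this use is left unchanged.
    %8:sgpr_32 = V_READFIRSTLANE_B32 %3.sub1, implicit $exec
    ; Lanes sub3, sub5, sub6 and sub7 are never written; their reads are marked undef.
    %9:sgpr_32 = V_READFIRSTLANE_B32 undef %3.sub3, implicit $exec
    %10:sgpr_32 = V_READFIRSTLANE_B32 undef %3.sub5, implicit $exec
    %11:sgpr_32 = V_READFIRSTLANE_B32 undef %3.sub6, implicit $exec
    %12:sgpr_32 = V_READFIRSTLANE_B32 undef %3.sub7, implicit $exec
```

With the flags written this way, the verifier should no longer expect a live subrange at those uses. The open question is which pass ought to set them.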

This is the same testcase as it looks before detect-dead-lanes runs:

# RUN: llc -march=amdgcn -mcpu=gfx1100 -start-before=detect-dead-lanes -verify-coalescing -o - %s

---
name:            test
tracksRegLiveness: true
liveins:
  - { reg: '$vgpr0', virtual-reg: '%0' }
body:             |
  bb.0:
    liveins: $vgpr0

    %0:vgpr_32(s32) = COPY $vgpr0
    %6:vgpr_32 = V_AND_B32_e32 1023, %0(s32), implicit $exec
    %30:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    %29:vreg_128_align2 = REG_SEQUENCE %6, %subreg.sub0, %30, %subreg.sub1, %30, %subreg.sub2, %30, %subreg.sub3
    %11:sreg_32 = IMPLICIT_DEF
    %13:sreg_32 = IMPLICIT_DEF
    %15:sreg_32 = IMPLICIT_DEF
    %17:sreg_32 = IMPLICIT_DEF
    %22:vgpr_32 = V_MOV_B32_e32 65536, implicit $exec
    %24:vgpr_32 = COPY %11
    %26:vgpr_32 = COPY %13
    %27:vgpr_32 = COPY %15
    %28:vgpr_32 = COPY %17
    %21:vreg_256_align2 = REG_SEQUENCE killed %22, %subreg.sub0, %6, %subreg.sub1, %30, %subreg.sub2, killed %24, %subreg.sub3, %30, %subreg.sub4, killed %26, %subreg.sub5, killed %27, %subreg.sub6, killed %28, %subreg.sub7
    %34:sgpr_32 = V_READFIRSTLANE_B32 %29.sub0, implicit $exec
    %35:sgpr_32 = S_MOV_B32 0
    %33:sgpr_128 = REG_SEQUENCE %34, %subreg.sub0, %35, %subreg.sub1, %35, %subreg.sub2, %35, %subreg.sub3
    %39:sgpr_32 = S_MOV_B32 65536
    %40:sgpr_32 = V_READFIRSTLANE_B32 %21.sub1, implicit $exec
    %42:sgpr_32 = V_READFIRSTLANE_B32 %21.sub3, implicit $exec
    %44:sgpr_32 = V_READFIRSTLANE_B32 %21.sub5, implicit $exec
    %45:sgpr_32 = V_READFIRSTLANE_B32 %21.sub6, implicit $exec
    %46:sgpr_32 = V_READFIRSTLANE_B32 %21.sub7, implicit $exec
    %38:sgpr_256 = REG_SEQUENCE %39, %subreg.sub0, %40, %subreg.sub1, %35, %subreg.sub2, %42, %subreg.sub3, %35, %subreg.sub4, %44, %subreg.sub5, %45, %subreg.sub6, %46, %subreg.sub7
    S_ENDPGM 0, implicit killed %33, implicit killed %38

...

Should these uses be marked undef during LIS (LiveIntervals) creation itself?
