AMDGPU generates v_cndmask/readfirstlane for uniform select

Test case:
```ll
define amdgpu_ps i32 @_amdgpu_ps_main(i32 inreg %arg) {
bb:
  %i = icmp eq i32 %arg, 0
  %i1 = zext i1 %i to i64
  %i2 = getelementptr i8, ptr addrspace(4) null, i64 %i1
  %i3 = load i32, ptr addrspace(4) %i2, align 8
  ret i32 %i3
}
```
Compiling this with `llc -march=amdgcn -mcpu=gfx900` produces:
```asm
_amdgpu_ps_main:                        ; @_amdgpu_ps_main
; %bb.0:                                ; %bb
	s_cmp_eq_u32 s0, 0
	s_cselect_b64 s[2:3], -1, 0
	v_cndmask_b32_e64 v0, 0, 1, s[2:3]
	s_mov_b32 s1, 0
	v_readfirstlane_b32 s0, v0
	s_load_dword s0, s[0:1], 0x0
	s_waitcnt lgkmcnt(0)
	; return to shader part epilog
```
All of these computations are uniform, so the round trip through the VALU (`v_cndmask_b32_e64` followed by `v_readfirstlane_b32`) is unnecessary: the selected value could be computed directly on the scalar unit.
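
For comparison, here is a hand-written sketch of roughly what a fully scalar sequence could look like. This is not compiler output; the register assignment and instruction order are my assumptions, and it only illustrates that `s_cselect_b32` can materialize the 0/1 value from SCC without touching a VGPR.
```asm
_amdgpu_ps_main:
	s_cmp_eq_u32 s0, 0              ; SCC = (%arg == 0)
	s_cselect_b32 s0, 1, 0          ; s0 = SCC ? 1 : 0 (low half of the address)
	s_mov_b32 s1, 0                 ; high half of the address
	s_load_dword s0, s[0:1], 0x0
	s_waitcnt lgkmcnt(0)
	; return to shader part epilog
```
This drops both VALU instructions and the 64-bit lane mask in `s[2:3]`.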

AMDGPU generates v_cndmask/readfirstlane for uniform select #59869