Skip to content

AMDGPU: Move attributor into optimization pipeline #83131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jul 14, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
13 changes: 8 additions & 5 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -781,6 +781,14 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(

PM.addPass(createCGSCCToFunctionPassAdaptor(std::move(FPM)));
});

// FIXME: Why is AMDGPUAttributor not in CGSCC?
PB.registerOptimizerLastEPCallback(
[this](ModulePassManager &MPM, OptimizationLevel Level) {
if (Level != OptimizationLevel::O0) {
MPM.addPass(AMDGPUAttributorPass(*this));
}
});
}

int64_t AMDGPUTargetMachine::getNullPointerValue(unsigned AddrSpace) {
Expand Down Expand Up @@ -1043,11 +1051,6 @@ void AMDGPUPassConfig::addIRPasses() {
addPass(createAMDGPULowerModuleLDSLegacyPass(&TM));
}

// AMDGPUAttributor infers lack of llvm.amdgcn.lds.kernel.id calls, so run
// after their introduction
if (TM.getOptLevel() > CodeGenOptLevel::None)
addPass(createAMDGPUAttributorLegacyPass());

if (TM.getOptLevel() > CodeGenOptLevel::None)
addPass(createInferAddressSpacesPass());

Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -678,6 +678,12 @@ void SIFrameLowering::emitEntryFunctionPrologue(MachineFunction &MF,
break;
}
}

// FIXME: We can spill incoming arguments and restore at the end of the
// prolog.
if (!ScratchWaveOffsetReg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change related?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of. One test hit the assert due to the sgpr limiting attribute

report_fatal_error(
"could not find temporary scratch offset register in prolog");
} else {
ScratchWaveOffsetReg = PreloadedScratchWaveOffsetReg;
}
Expand Down
16 changes: 8 additions & 8 deletions llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@ define amdgpu_kernel void @s_add_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
; GFX11-LABEL: s_add_u64:
; GFX11: ; %bb.0: ; %entry
; GFX11-NEXT: s_clause 0x1
; GFX11-NEXT: s_load_b128 s[4:7], s[0:1], 0x24
; GFX11-NEXT: s_load_b64 s[0:1], s[0:1], 0x34
; GFX11-NEXT: s_load_b128 s[4:7], s[2:3], 0x24
; GFX11-NEXT: s_load_b64 s[0:1], s[2:3], 0x34
; GFX11-NEXT: v_mov_b32_e32 v2, 0
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_add_u32 s0, s6, s0
Expand All @@ -22,8 +22,8 @@ define amdgpu_kernel void @s_add_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
; GFX12-LABEL: s_add_u64:
; GFX12: ; %bb.0: ; %entry
; GFX12-NEXT: s_clause 0x1
; GFX12-NEXT: s_load_b128 s[4:7], s[0:1], 0x24
; GFX12-NEXT: s_load_b64 s[0:1], s[0:1], 0x34
; GFX12-NEXT: s_load_b128 s[4:7], s[2:3], 0x24
; GFX12-NEXT: s_load_b64 s[0:1], s[2:3], 0x34
; GFX12-NEXT: v_mov_b32_e32 v2, 0
; GFX12-NEXT: s_wait_kmcnt 0x0
; GFX12-NEXT: s_add_nc_u64 s[0:1], s[6:7], s[0:1]
Expand Down Expand Up @@ -58,8 +58,8 @@ define amdgpu_kernel void @s_sub_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
; GFX11-LABEL: s_sub_u64:
; GFX11: ; %bb.0: ; %entry
; GFX11-NEXT: s_clause 0x1
; GFX11-NEXT: s_load_b128 s[4:7], s[0:1], 0x24
; GFX11-NEXT: s_load_b64 s[0:1], s[0:1], 0x34
; GFX11-NEXT: s_load_b128 s[4:7], s[2:3], 0x24
; GFX11-NEXT: s_load_b64 s[0:1], s[2:3], 0x34
; GFX11-NEXT: v_mov_b32_e32 v2, 0
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_sub_u32 s0, s6, s0
Expand All @@ -74,8 +74,8 @@ define amdgpu_kernel void @s_sub_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
; GFX12-LABEL: s_sub_u64:
; GFX12: ; %bb.0: ; %entry
; GFX12-NEXT: s_clause 0x1
; GFX12-NEXT: s_load_b128 s[4:7], s[0:1], 0x24
; GFX12-NEXT: s_load_b64 s[0:1], s[0:1], 0x34
; GFX12-NEXT: s_load_b128 s[4:7], s[2:3], 0x24
; GFX12-NEXT: s_load_b64 s[0:1], s[2:3], 0x34
; GFX12-NEXT: v_mov_b32_e32 v2, 0
; GFX12-NEXT: s_wait_kmcnt 0x0
; GFX12-NEXT: s_sub_nc_u64 s[0:1], s[6:7], s[0:1]
Expand Down
Loading