[clang][CodeGen] sret args should always point to the alloca AS, so use that #114062

Merged
merged 55 commits into from
Feb 14, 2025
Changes from 4 commits
Commits
d2d2d3d
`sret` args should always point to the `alloca` AS, so we can use that.
AlexVlx Oct 29, 2024
693253d
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Oct 29, 2024
b5a7df0
Fix broken tests.
AlexVlx Oct 29, 2024
f6cff66
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Oct 29, 2024
2de33d4
Handle passing an `alloca`ed `sret` arg directly to a callee that exp…
AlexVlx Oct 30, 2024
6d9cb89
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 1, 2024
b209d67
Add query for a possible target specific indirect arg AS.
AlexVlx Nov 2, 2024
ac6367b
Add more context to test.
AlexVlx Nov 2, 2024
c8f03e7
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 4, 2024
24d8edb
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 5, 2024
5ccd554
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 6, 2024
9ff1d0d
Extend Indirect Args to carry an address space.
AlexVlx Nov 6, 2024
1c3e67c
Fix formatting.
AlexVlx Nov 6, 2024
c9288fc
Drop vestigial target hook.
AlexVlx Nov 7, 2024
99e03a2
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 13, 2024
013790c
Tweak handling potential AS mismatches.
AlexVlx Nov 15, 2024
c4bdeab
Fix formatting.
AlexVlx Nov 15, 2024
5afb40e
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 18, 2024
d07d63d
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Nov 22, 2024
eeb54e4
Remove lie.
AlexVlx Nov 24, 2024
6c0ef88
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 4, 2024
abab201
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 4, 2024
6e78db1
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 4, 2024
7d45638
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 5, 2024
f16d1d9
Generalise placing `sret`/returns in the alloca AS; remove risky defa…
AlexVlx Dec 5, 2024
0277516
Fix formatting.
AlexVlx Dec 5, 2024
056c9ec
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 28, 2024
207a2ae
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Dec 29, 2024
d8bd7ab
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 5, 2025
f6c8e01
Add helper accessor for `LangAS::Default -> TargetAS` queries.
AlexVlx Jan 5, 2025
0f724f8
Align AMDGPU argument classification.
AlexVlx Jan 5, 2025
7158b8d
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 7, 2025
8f472f3
Tweak Swift's use of AS aware `getIndirect`.
AlexVlx Jan 7, 2025
2bdb085
Fix formatting.
AlexVlx Jan 7, 2025
99101fb
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 8, 2025
86093c2
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 8, 2025
4b47cd7
Remove helper, switch to using the AllocaAS for all indirects.
AlexVlx Jan 8, 2025
d103255
Fix Swift mismatch.
AlexVlx Jan 8, 2025
e325239
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 8, 2025
27ef889
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 14, 2025
260e96d
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 23, 2025
5227aef
Fix leftover LangAS::Default.
AlexVlx Jan 23, 2025
94b51d5
Fix leftover use of LangAS::Default.
AlexVlx Jan 23, 2025
53d8462
Apply formatting suggestions.
AlexVlx Jan 23, 2025
4d2b9f7
Fix formatting.
AlexVlx Jan 23, 2025
d9595fc
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 23, 2025
3acc4ff
Fix typo.
AlexVlx Jan 23, 2025
69b7937
Add test.
AlexVlx Jan 23, 2025
ddaccb8
Fix formatting (again).
AlexVlx Jan 23, 2025
f442024
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Jan 28, 2025
939af07
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Feb 3, 2025
05f0701
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Feb 10, 2025
3e10da3
Update clang/test/CodeGenOpenCL/implicit-addrspacecast-function-param…
arsenm Feb 13, 2025
0867735
Merge branch 'main' into sret_fixes
arsenm Feb 13, 2025
553ac57
Merge branch 'main' of https://github.com/llvm/llvm-project into sret…
AlexVlx Feb 13, 2025
15 changes: 8 additions & 7 deletions clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {

   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-    QualType Ret = FI.getReturnType();
-    unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+    unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
     ArgTypes[IRFunctionArgs.getSRetArgNo()] =
         llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
     // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
     } else if (!ReturnValue.isNull()) {
       SRetPtr = ReturnValue.getAddress();
     } else {
-      SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+      SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
       if (HaveInsertPoint() && ReturnValue.isUnused()) {
         llvm::TypeSize size =
             CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-        UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+        UnusedReturnSizePtr = EmitLifetimeStart(size, SRetPtr.getBasePointer());
       }
     }
     if (IRFunctionArgs.hasSRetArg()) {
+      // If the caller allocated the return slot, it is possible that the
+      // alloca was AS casted to the default as, so we ensure the cast is
+      // stripped before binding to the sret arg, which is in the allocaAS.
Contributor

Sorry, what? It seems really wrong to be blindly stripping pointer casts here. Can you explain what code pattern is leading to us not having a pointer in the right address space?

Contributor Author

AlexVlx Nov 14, 2024

It's not really blind (although it may couple sret to alloca somewhat tightly); this is captured in current tests, e.g. CodeGen/sret.c; please see: https://gcc.godbolt.org/z/TWd83dbdE. This currently works because sret gets arbitrarily placed in the default AS; switching it to any other AS will break it. The mismatch arises when we receive a pre-allocated return value slot, which gets created in AggExprEmitter::withReturnValueSlot iff we cannot elide the temporary. That function uses CreateMemTemp, which inserts a cast to the default AS. An alternative would be to use CreateMemTempWithoutCast instead and to also handle the case where the slot has been pre-allocated.

       IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-          getAsNaturalPointerTo(SRetPtr, RetTy);
+          getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
     } else if (RetAI.isInAlloca()) {
       Address Addr =
           Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-    pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker, SRetAlloca,
+    pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker, SRetPtr,
                                          UnusedReturnSizePtr);

   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
4 changes: 2 additions & 2 deletions clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])

   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
11 changes: 11 additions & 0 deletions clang/test/CodeGen/sret.c
@@ -1,23 +1,34 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa -emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s

 struct abc {
   long a;
   long b;
   long c;
   long d;
   long e;
+  long f;
+  long g;
+  long h;
+  long i;
+  long j;
 };

 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writable sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo2(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc)
 struct abc foo3(void){}
 // CHECK-DAG: define {{.*}} @foo3(ptr dead_on_unwind noalias writable sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: define {{.*}} @foo3(ptr addrspace(5) dead_on_unwind noalias writable sret(%struct.abc)

 void bar(void) {
   struct abc dummy1 = foo1();
   // CHECK-DAG: call {{.*}} @foo1(ptr dead_on_unwind writable sret(%struct.abc)
+  // NONZEROALLOCAAS-DAG: call {{.*}} @foo1(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc)
   struct abc dummy2 = foo2();
   // CHECK-DAG: call {{.*}} @foo2(ptr dead_on_unwind writable sret(%struct.abc)
+  // NONZEROALLOCAAS-DAG: call {{.*}} @foo2(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc)
 }
4 changes: 2 additions & 2 deletions clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
@@ -250,7 +250,7 @@ kernel void ker(global Mat3X3 *in, global Mat4X4 *out) {
 // AMDGCN-NEXT: ret void
 //
 // AMDGCN20-LABEL: define dso_local void @foo_large(
-// AMDGCN20-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_MAT64X64:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32:%.*]]) align 4 [[TMP0:%.*]]) #[[ATTR0]] {
+// AMDGCN20-SAME: ptr addrspace(5) dead_on_unwind noalias writable sret([[STRUCT_MAT64X64:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32:%.*]]) align 4 [[TMP0:%.*]]) #[[ATTR0]] {
 // AMDGCN20-NEXT: [[ENTRY:.*:]]
 // AMDGCN20-NEXT: [[COERCE:%.*]] = alloca [[STRUCT_MAT32X32]], align 4, addrspace(5)
 // AMDGCN20-NEXT: [[IN:%.*]] = addrspacecast ptr addrspace(5) [[COERCE]] to ptr
@@ -335,7 +335,7 @@ Mat64X64 __attribute__((noinline)) foo_large(Mat32X32 in) {
 // AMDGCN20-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR_ASCAST]], align 8
 // AMDGCN20-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds [[STRUCT_MAT32X32]], ptr addrspace(1) [[TMP1]], i64 1
 // AMDGCN20-NEXT: call void @llvm.memcpy.p5.p1.i64(ptr addrspace(5) align 4 [[BYVAL_TEMP]], ptr addrspace(1) align 4 [[ARRAYIDX1]], i64 4096, i1 false)
-// AMDGCN20-NEXT: call void @foo_large(ptr dead_on_unwind writable sret([[STRUCT_MAT64X64]]) align 4 [[TMP_ASCAST]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32]]) align 4 [[BYVAL_TEMP]]) #[[ATTR3]]
+// AMDGCN20-NEXT: call void @foo_large(ptr addrspace(5) dead_on_unwind writable sret([[STRUCT_MAT64X64]]) align 4 [[TMP]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32]]) align 4 [[BYVAL_TEMP]]) #[[ATTR3]]
 // AMDGCN20-NEXT: call void @llvm.memcpy.p1.p0.i64(ptr addrspace(1) align 4 [[ARRAYIDX]], ptr align 4 [[TMP_ASCAST]], i64 16384, i1 false)
 // AMDGCN20-NEXT: ret void
 //
4 changes: 2 additions & 2 deletions clang/test/CodeGenOpenCL/amdgpu-abi-struct-arg-byref.cl
@@ -91,7 +91,7 @@ kernel void ker(global Mat3X3 *in, global Mat4X4 *out) {
 }

 // AMDGCN-LABEL: define dso_local void @foo_large(
-// AMDGCN-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_MAT64X64:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32:%.*]]) align 4 [[TMP0:%.*]]) #[[ATTR0]] {
+// AMDGCN-SAME: ptr addrspace(5) dead_on_unwind noalias writable sret([[STRUCT_MAT64X64:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32:%.*]]) align 4 [[TMP0:%.*]]) #[[ATTR0]] {
 // AMDGCN-NEXT: [[ENTRY:.*:]]
 // AMDGCN-NEXT: [[COERCE:%.*]] = alloca [[STRUCT_MAT32X32]], align 4, addrspace(5)
 // AMDGCN-NEXT: [[IN:%.*]] = addrspacecast ptr addrspace(5) [[COERCE]] to ptr
@@ -120,7 +120,7 @@ Mat64X64 __attribute__((noinline)) foo_large(Mat32X32 in) {
 // AMDGCN-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR_ASCAST]], align 8
 // AMDGCN-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds [[STRUCT_MAT32X32]], ptr addrspace(1) [[TMP1]], i64 1
 // AMDGCN-NEXT: call void @llvm.memcpy.p5.p1.i64(ptr addrspace(5) align 4 [[BYVAL_TEMP]], ptr addrspace(1) align 4 [[ARRAYIDX1]], i64 4096, i1 false)
-// AMDGCN-NEXT: call void @foo_large(ptr dead_on_unwind writable sret([[STRUCT_MAT64X64]]) align 4 [[TMP_ASCAST]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32]]) align 4 [[BYVAL_TEMP]]) #[[ATTR3]]
+// AMDGCN-NEXT: call void @foo_large(ptr addrspace(5) dead_on_unwind writable sret([[STRUCT_MAT64X64]]) align 4 [[TMP]], ptr addrspace(5) noundef byref([[STRUCT_MAT32X32]]) align 4 [[BYVAL_TEMP]]) #[[ATTR3]]
 // AMDGCN-NEXT: call void @llvm.memcpy.p1.p0.i64(ptr addrspace(1) align 4 [[ARRAYIDX]], ptr align 4 [[TMP_ASCAST]], i64 16384, i1 false)
 // AMDGCN-NEXT: ret void
 //