-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[clang][CodeGen] sret
args should always point to the alloca
AS, so use that
#114062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
d2d2d3d
693253d
b5a7df0
f6cff66
2de33d4
6d9cb89
b209d67
ac6367b
c8f03e7
24d8edb
5ccd554
9ff1d0d
1c3e67c
c9288fc
99e03a2
013790c
c4bdeab
5afb40e
d07d63d
eeb54e4
6c0ef88
abab201
6e78db1
7d45638
f16d1d9
0277516
056c9ec
207a2ae
d8bd7ab
f6c8e01
0f724f8
7158b8d
8f472f3
2bdb085
99101fb
86093c2
4b47cd7
d103255
e325239
27ef889
260e96d
5227aef
94b51d5
53d8462
4d2b9f7
d9595fc
3acc4ff
69b7937
ddaccb8
f442024
939af07
05f0701
3e10da3
0867735
553ac57
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1672,10 +1672,11 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) { | |
|
||
// Add type for sret argument. | ||
if (IRFunctionArgs.hasSRetArg()) { | ||
QualType Ret = FI.getReturnType(); | ||
unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret); | ||
auto AddressSpace = CGM.getTarget().getIndirectArgAddressSpace(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would expect this to come from the ABIArgInfo/retAI, for the specific value and not a new side hook. Actually, is the address space already correct in retAI.getIndirectAddrSpace? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sadly no, that's not usable here as is, that's only for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's what I was thinking, yeah. There should be plenty of space for that without inflating There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've taken a swipe at doing this; it's a bit noisier than I'd have hoped but it's mostly NFC except for AMDGPU, with other targets retaining current behaviour and having the option to adjust in the future. It does annoyingly add another relatively blind default use of AS0, but that was already there. Perhaps we can flip to defaulting to the AllocaAS with targets having the option to override the indirect AS when they classify a return type. |
||
if (!AddressSpace) | ||
AddressSpace = getDataLayout().getAllocaAddrSpace(); | ||
ArgTypes[IRFunctionArgs.getSRetArgNo()] = | ||
llvm::PointerType::get(getLLVMContext(), AddressSpace); | ||
llvm::PointerType::get(getLLVMContext(), *AddressSpace); | ||
} | ||
|
||
// Add type for inalloca argument. | ||
|
@@ -5145,7 +5146,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, | |
// If the call returns a temporary with struct return, create a temporary | ||
// alloca to hold the result, unless one is given to us. | ||
Address SRetPtr = Address::invalid(); | ||
RawAddress SRetAlloca = RawAddress::invalid(); | ||
llvm::Value *UnusedReturnSizePtr = nullptr; | ||
if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) { | ||
// For virtual function pointer thunks and musttail calls, we must always | ||
|
@@ -5159,16 +5159,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, | |
} else if (!ReturnValue.isNull()) { | ||
SRetPtr = ReturnValue.getAddress(); | ||
} else { | ||
SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca); | ||
SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp"); | ||
if (HaveInsertPoint() && ReturnValue.isUnused()) { | ||
llvm::TypeSize size = | ||
CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy)); | ||
UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer()); | ||
UnusedReturnSizePtr = EmitLifetimeStart(size, SRetPtr.getBasePointer()); | ||
} | ||
} | ||
if (IRFunctionArgs.hasSRetArg()) { | ||
// If the caller allocated the return slot, it is possible that the | ||
AlexVlx marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// alloca was AS casted to the default as, so we ensure the cast is | ||
// stripped before binding to the sret arg, which is in the allocaAS. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, what? It seems really wrong to be blindly stripping pointer casts here. Can you explain what code pattern is leading to us not having a pointer in the right address space? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not really blind (albeit it might be somewhat tightly coupling |
||
IRCallArgs[IRFunctionArgs.getSRetArgNo()] = | ||
getAsNaturalPointerTo(SRetPtr, RetTy); | ||
getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts(); | ||
} else if (RetAI.isInAlloca()) { | ||
Address Addr = | ||
Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex()); | ||
|
@@ -5390,11 +5393,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, | |
V->getType()->isIntegerTy()) | ||
V = Builder.CreateZExt(V, ArgInfo.getCoerceToType()); | ||
|
||
// If the argument doesn't match, perform a bitcast to coerce it. This | ||
// can happen due to trivial type mismatches. | ||
// If the argument doesn't match, we are either trying to pass an | ||
// alloca-ed sret argument directly, and the alloca AS does not match | ||
// the default AS, case in which we AS cast it, or we have a trivial | ||
// type mismatch, and thus perform a bitcast to coerce it. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Inserting the cast might not be correct. Might need to create another temporary with the other address space, and memcpy. Is this only the inalloca case? That's the weird windows only thing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No this is not the Re: not inserting the cast, you're right it's probably not correct to insert it blindly. I think the only thing we can safely handle is if the mismatched arg is a pointer to the default AS, and should error out otherwise. The only mechanism we have for creating temporaries is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The cast is probably unavoidable here. You need to support flat addressing for any of c++ to work anyway, so that's fine for the GPU case |
||
if (FirstIRArg < IRFuncTy->getNumParams() && | ||
V->getType() != IRFuncTy->getParamType(FirstIRArg)) | ||
V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg)); | ||
V->getType() != IRFuncTy->getParamType(FirstIRArg)) { | ||
auto IRTy = IRFuncTy->getParamType(FirstIRArg); | ||
auto MaybeSRetArg = dyn_cast_or_null<llvm::Argument>(V); | ||
if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr()) | ||
V = Builder.CreateAddrSpaceCast(V, IRTy); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a general rule, we try to use the target hook to perform address-space conversions. That target hook is expressed in terms of AST address spaces, which is one reason I think we need to thread a LangAS through. If we need to do the same for the other indirect cases, so be it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fair, but I'm not entirely sure that isn't simply excessive here - we already have the types, and the only mismatch for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the prevailing existing practice in Clang CodeGen; you'll notice we do the same thing in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, I think that I've found a possibly acceptable middle ground (both for this and your other objection). Note that I am not rejecting the fact that we probably want |
||
else | ||
V = Builder.CreateBitCast(V, IRTy); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm assuming this is a pointer bitcast, which isn't necessary anymore |
||
} | ||
|
||
if (ArgHasMaybeUndefAttr) | ||
V = Builder.CreateFreeze(V); | ||
|
@@ -5740,7 +5751,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, | |
// pop this cleanup later on. Being eager about this is OK, since this | ||
// temporary is 'invisible' outside of the callee. | ||
if (UnusedReturnSizePtr) | ||
pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker, SRetAlloca, | ||
pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker, SRetPtr, | ||
UnusedReturnSizePtr); | ||
|
||
llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest(); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,34 @@ | ||
// RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s | ||
// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa -emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s | ||
|
||
struct abc { | ||
long a; | ||
long b; | ||
long c; | ||
long d; | ||
long e; | ||
long f; | ||
long g; | ||
long h; | ||
long i; | ||
long j; | ||
}; | ||
|
||
struct abc foo1(void); | ||
// CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable sret(%struct.abc) | ||
// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc) | ||
struct abc foo2(); | ||
// CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writable sret(%struct.abc) | ||
// NONZEROALLOCAAS-DAG: declare {{.*}} @foo2(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc) | ||
struct abc foo3(void){} | ||
// CHECK-DAG: define {{.*}} @foo3(ptr dead_on_unwind noalias writable sret(%struct.abc) | ||
// NONZEROALLOCAAS-DAG: define {{.*}} @foo3(ptr addrspace(5) dead_on_unwind noalias writable sret(%struct.abc) | ||
|
||
void bar(void) { | ||
struct abc dummy1 = foo1(); | ||
// CHECK-DAG: call {{.*}} @foo1(ptr dead_on_unwind writable sret(%struct.abc) | ||
// NONZEROALLOCAAS-DAG: call {{.*}} @foo1(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc) | ||
struct abc dummy2 = foo2(); | ||
// CHECK-DAG: call {{.*}} @foo2(ptr dead_on_unwind writable sret(%struct.abc) | ||
// NONZEROALLOCAAS-DAG: call {{.*}} @foo2(ptr addrspace(5) dead_on_unwind writable sret(%struct.abc) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect this to be a
LangAS
, since that's what our address-space conversion lowerings are generally expressed in terms of. This also has the advantage of avoiding a lot of heartache with implicit conversions aroundABIInfo::getIndirect
, sinceLangAS
is a scoped enum. AndLangAS::Default
is a much more reasonable default argument for things likeABIArgInfo::getIndirect
than IR addrspace 0.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also shouldn't be optional. There always must be a definitive IR address space.
Also I'm not sure I follow why this is still necessary if you've modified getIndirect to carry the address space. RetAI should have this info now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, I forgot to delete this. RE:
unsigned
vsLangAS
, I used the former for symmetry with other interfaces. I agree with @rjmccall thatLangAS
would be safer / would make more sense, howevergetIndirectAliased
already uses numeric vs typed.