-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[clang][CodeGen][OpenCL] Fix alloca
handling
#113930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 2 commits
5a2ea20
59ebd50
945535c
819392b
966029b
4cb702b
baacb4b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -108,11 +108,15 @@ RawAddress CodeGenFunction::CreateTempAlloca(llvm::Type *Ty, CharUnits Align, | |
if (AllocaAddr) | ||
*AllocaAddr = Alloca; | ||
llvm::Value *V = Alloca.getPointer(); | ||
assert((!getLangOpts().OpenCL || | ||
CGM.getTarget().getTargetAddressSpace(getASTAllocaAddressSpace()) == | ||
CGM.getTarget().getTargetAddressSpace(LangAS::opencl_private)) && | ||
"For OpenCL allocas must allocate in the private address space!"); | ||
Comment on lines
+106
to
+109
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This assertion probably does not hold for nvptx. Alloca returns a generic pointer and is later lowered to the whatever they call private + cast in the backend There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are you certain? That's not how I'm reading its ASMAP, think this'd work just fine it aliases generic and private (and appears they designed their ASMap to handle precisely this case). Am I missing something in particular? The above is only a concern in Clang, once it's made it to the BE it's free to do whatever, really. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That is the nature of the PTX hack. They pretend everything is 0, and then fix up stack later. The backend then runs NVPTXLowerAlloca to introduce a bunch of casts to show that it really was an addrspace(5) pointer to begin with around most use contexts There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Then this is safe / the assert won't flare and it matches existing behaviour, does it not? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes (but will be invalid if they ever fix up that hack, as there have been posts suggesting favor of in the past) |
||
// Alloca always returns a pointer in alloca address space, which may | ||
// be different from the type defined by the language. For example, | ||
// in C++ the auto variables are in the default address space. Therefore | ||
// cast alloca to the default address space when necessary. | ||
if (getASTAllocaAddressSpace() != LangAS::Default) { | ||
if (!getLangOpts().OpenCL && getASTAllocaAddressSpace() != LangAS::Default) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't see why the language mode would factor in. The below code also seems more complex than necessary. I would expect there to be a language typed value, and there to be an IR value. You would simply insert the cast if they didn't match, so not sure what the performAddrSpaceCast hook is for. Is that a holdover from before addrspacecast existed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The language factors in here because you cannot multi-purpose / multi-map LangAS::Default, which is why we (and others) used the hacked AS map that had private as default. Maybe there's a more elegant solution to this that does things upstream, and perhaps we'll implement it later, but this is borderline NFC to deal with dropping the privateIsDefault hacked AS maps. As for the rest of the code, I've not authored it, so I cannot comment one way or the other. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What language type value? The issue is not with the allocated value's type, it's around the allocation's type, or rather how it is presented. This code has been there for years, I've not added it, I'm not going to touch it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This whole casting block should be deleted. A memory temporary should be the natural alloca type. Any user that wants the cast should handle this in the narrow use case. This cannot be language dependent. The current behavior is ABI breaking for aggregates passed on the stack (see the device enqueue tests for an example, the byref temporary argument is broken to be a flat pointer). We are also just creating a lot of unnecessary casting spam for later passes to clean up in most contexts There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok , this is the version with the cast. I think the naming / default should be inverted, but that's an issue for another day. There still shouldn't be modality here, and at least one other context is using the wrong version There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Switching CreateMemTemp to use CreateTempAllocaWithoutCast seems to help There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Baby step that fixes my immediate issue: #129837 |
||
auto DestAddrSpace = getContext().getTargetAddressSpace(LangAS::Default); | ||
llvm::IRBuilderBase::InsertPointGuard IPG(Builder); | ||
// When ArraySize is nullptr, alloca is inserted at AllocaInsertPt, | ||
|
Large diffs are not rendered by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like extra steps around directly calling getContext().getTargetAddressSpace(LangAS::opencl_private)
But more importantly, I would expect this to have been handled by the target API lowering code. The language wouldn't factor in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a reasonable expectation but it's not handled at the moment, and it appears that people just gave up on it and relied on the terrible PrivateIsDefault AS maps. I'm not really adding much functionality here, just ensuring that we can safely survive moving away from that.