Skip to content

[OpenMP][LLVM] Update alloca IP after PrivCB in OMPIRBUIlder #93920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 5, 2024
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
! Tests the OMPIRBuilder can handle multiple privatization regions that contain
! multiple BBs (for example, for allocatables).

! RUN: %flang -S -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \
! RUN: -o - %s 2>&1 | FileCheck %s

subroutine foo(x)
integer, allocatable :: x, y
!$omp parallel private(x, y)
x = y
!$omp end parallel
end

! CHECK-LABEL: define void @foo_
! CHECK: ret void
! CHECK-NEXT: }

! CHECK-LABEL: define internal void @foo_..omp_par
! CHECK-DAG: call ptr @malloc
! CHECK-DAG: call ptr @malloc
! CHECK-DAG: call void @free
! CHECK-DAG: call void @free
! CHECK: }
3 changes: 3 additions & 0 deletions llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1583,6 +1583,9 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
} else {
Builder.restoreIP(
PrivCB(InnerAllocaIP, Builder.saveIP(), V, *Inner, ReplacementValue));
InnerAllocaIP = {InnerAllocaIP.getPoint()->getParent(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the problem leading to this is that we've split the Alloca block [because the variable is allocatable], and the actual insert point is now the branch of the second block.

I don't think we want the new AllocaIP to be in that second block, in case someone adds more alloca ops - they should go into the first block.

My PR, that Kiran mentioned, has several of this kind of fixes [I will check if it fixes this case too,]. Whenever possible, I try to make the new location at the terminator of the original AllocaIP block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR, that Kiran mentioned, has several of this kind of fixes [I will check if it fixes this case too,]. Whenever possible, I try to make the new location at the terminator of the original AllocaIP block.

Thanks for looking at this @Leporacanthicus. I tried your PR with a reproducing example and so far it does not handle this case. If you like, you can take the test I added in this PR and add it to yours and handle that case in your PR as well. Either way is fine with me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kiranchandramohan @Leporacanthicus based on the above comment, and since neither PRs solves the "1 BB for all allocas" problem and since #92430 does not solve the crash handled by this PR, I think there is not conflict between this PR and #92430. I think we can proceed with both. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hitting further problems (in the real code that we're trying to make work), so I'm OK with this going in if it is helping you get the job done. I personally would prefer it going into the oriiginal allocaIP BB - I'm well aware that not ALL things go in the original BB, but we don't need to make it worse... :)

Also, if all places that need this fix use code like builder.SetInsertPoint(allocaIP.getBlock()->getTerminator()); or perhaps:

    allocaIP =
        InsertPointTy(allocaIP.getBlock(),
                      allocaIP.getBlock()->getTerminator()->getIterator());

then it makes it a little easier to find ALL such places, and not have to find half a dozen different variants that do slightly different things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    allocaIP =
        InsertPointTy(allocaIP.getBlock(),
                      allocaIP.getBlock()->getTerminator()->getIterator());

I can do that. I actually tried that locally and did not commit it since it did not make much difference in terms of generated IR. However, it makes sense to commit it given the point you mentioned about consistency.

InnerAllocaIP.getPoint()};

assert(ReplacementValue &&
"Expected copy/create callback to set replacement value!");
if (ReplacementValue == &V)
Expand Down
Loading