Skip to content

[WIP] Delayed privatization. #79862

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions flang/include/flang/Lower/SymbolMap.h
Original file line number Diff line number Diff line change
Expand Up @@ -312,6 +312,7 @@ class SymMap {
lookupVariableDefinition(semantics::SymbolRef sym) {
if (auto symBox = lookupSymbol(sym))
return symBox.getIfFortranVariableOpInterface();

return std::nullopt;
}

Expand Down
5 changes: 4 additions & 1 deletion flang/lib/Lower/Bridge.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1052,7 +1052,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
if (sym.detailsIf<Fortran::semantics::CommonBlockDetails>())
return symMap->lookupSymbol(sym);

return {};
// With delayed privatization, Fortran symbols might now be mapped to
// simple `mlir::Value`s (arguments to the `omp.private` ops in this
// case). Therefore, it is possible that none of the above cases applies.
// return {};
}
if (Fortran::lower::SymbolBox v = symMap->lookupSymbol(sym))
return v;
Expand Down
18 changes: 16 additions & 2 deletions flang/lib/Optimizer/CodeGen/CodeGen.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3505,6 +3505,18 @@ struct ZeroOpConversion : public FIROpConversion<fir::ZeroOp> {
}
};

class DeclareOpConversion : public FIROpConversion<fir::DeclareOp> {
Copy link
Member Author

@ergawy ergawy Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While experimenting with delayed privation for hlfir (so far I have been experimenting with fir), I came across a small issue which is that fir.declare is not support when converting from fir to mlir dialect. That's why I had to copy this conversion pattern from PreCGRewrite.cpp. Since DeclareOpConversion does not convert to op in the fir::cg dialect, I think it would be nice to move it here to make conversion from fir to llvm dialect more complete.

Do you folks have any input on this?

If you agree to share the pass, I will open a separate PR with proper testing of course.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised by this. In a quick experiment flang-new -mmlir -mlir-pass-statistics -c [...] gives me the same output both with and without -fopenmp. Why isn't CodeGenRewrite doing the job for you? This pass should run earlier than FIRToLLVMLowering (CodeGen.cpp).

In @jeanPerier's commit adding fir::DeclareOp to PreCGRewrite.cpp, he mentions that it was placed here to avoid adding support for fir.shape to codegen. See c852174

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is fir::createOpenMPFIRPassPipeline, so I expect that I should be seeing some extra passes for OpenMP. It is weird that I didn't.

Anyway that runs immediately after lowering so I don't think it should effect codegen for fir.declare.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tblah for taking a look. I was referring to lowering using fir-opt --fir-to-llvm-ir for testing purposes. I expected that the fir-opt --fir-to-llvm-ir should provide a more-or-less complete conversion pipeline from fir to llvm. This is mostly to make testing and experimenting easier.

I am a bit of newbie to the project so I might not be using the different tools and their flags properly yet.

Copy link
Contributor

@tblah tblah Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No fir-oppt --fir-to-llvm-ir is not sufficient. The FIR pass pipeline is quite long (see https://github.com/llvm/llvm-project/blob/main/flang/test/Driver/mlir-pass-pipeline.f90). A few of these are optimizations (e.g. SimplifyIntrinsics, CSE, AliasTags), but several different passes are needed to convert FIR into LLVM (even more for HLFIR).

Roughly, to get from FIR to LLVM you need the passes in createDefaultFIRCodeGenPassPipeline:

inline void createDefaultFIRCodeGenPassPipeline(

From the command line you can pipe FIR into tco and it will output LLVM IR. fir-opt is more for running individual passes. The help text is a bit hard to follow, you can find tco specific options at the top of this file https://github.com/llvm/llvm-project/blob/main/flang/tools/tco/tco.cpp#L41

For example, bbc -emit-hlfir -o - file.f90 | tco is a bit like running flang-new -fc1 -O2 -emit-llvm -o - file.f90

public:
using FIROpConversion::FIROpConversion;

mlir::LogicalResult
matchAndRewrite(fir::DeclareOp declareOp, OpAdaptor,
mlir::ConversionPatternRewriter &rewriter) const override {
rewriter.replaceOp(declareOp, declareOp.getMemref());
return mlir::success();
}
};

/// `fir.unreachable` --> `llvm.unreachable`
struct UnreachableOpConversion : public FIROpConversion<fir::UnreachableOp> {
using FIROpConversion::FIROpConversion;
Expand Down Expand Up @@ -3856,6 +3868,7 @@ class RenameMSVCLibmFuncs
return mlir::success();
}
};

} // namespace

namespace {
Expand Down Expand Up @@ -3949,7 +3962,7 @@ class FIRToLLVMLowering
UnboxCharOpConversion, UnboxProcOpConversion, UndefOpConversion,
UnreachableOpConversion, UnrealizedConversionCastOpConversion,
XArrayCoorOpConversion, XEmboxOpConversion, XReboxOpConversion,
ZeroOpConversion>(typeConverter, options);
ZeroOpConversion, DeclareOpConversion>(typeConverter, options);
mlir::populateFuncToLLVMConversionPatterns(typeConverter, pattern);
mlir::populateOpenMPToLLVMConversionPatterns(typeConverter, pattern);
mlir::arith::populateArithToLLVMConversionPatterns(typeConverter, pattern);
Expand Down Expand Up @@ -4002,7 +4015,8 @@ class FIRToLLVMLowering
signalPassFailure();
}

// Run pass to add comdats to functions that have weak linkage on relevant platforms
// Run pass to add comdats to functions that have weak linkage on relevant
// platforms
if (fir::getTargetTriple(mod).supportsCOMDAT()) {
mlir::OpPassManager comdatPM("builtin.module");
comdatPM.addPass(mlir::LLVM::createLLVMAddComdats());
Expand Down
1 change: 1 addition & 0 deletions flang/test/Lower/OpenMP/FIR/delayed_privatization.f90
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
! RUN: bbc -fopenmp -emit-fir --openmp-enable-delayed-privatization -hlfir=false %s -o -

subroutine delayed_privatization()
implicit none
integer :: var1
integer :: var2

Expand Down
71 changes: 71 additions & 0 deletions flang/test/Lower/OpenMP/FIR/delayed_privatization_hlfir.f90
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
! TODO Convert this file into a bunch of lit tests for each conversion step.

! RUN: bbc -fopenmp -emit-hlfir --openmp-enable-delayed-privatization %s -o -

subroutine delayed_privatization()
implicit none
integer :: var1
integer :: var2

var1 = 111
var2 = 222

!$OMP PARALLEL FIRSTPRIVATE(var1, var2)
var1 = var1 + var2 + 2
!$OMP END PARALLEL

end subroutine


! -----------------------------------------
! ## This is what flang emits with the PoC:
! -----------------------------------------
!
! ----------------------------
! ### Conversion to HLFIR + OMP:
! ----------------------------
!module {
! func.func @_QPdelayed_privatization() {
! %0 = fir.alloca i32 {bindc_name = "var1", uniq_name = "_QFdelayed_privatizationEvar1"}
! %1:2 = hlfir.declare %0 {uniq_name = "_QFdelayed_privatizationEvar1"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %2 = fir.alloca i32 {bindc_name = "var2", uniq_name = "_QFdelayed_privatizationEvar2"}
! %3:2 = hlfir.declare %2 {uniq_name = "_QFdelayed_privatizationEvar2"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %c111_i32 = arith.constant 111 : i32
! hlfir.assign %c111_i32 to %1#0 : i32, !fir.ref<i32>
! %c222_i32 = arith.constant 222 : i32
! hlfir.assign %c222_i32 to %3#0 : i32, !fir.ref<i32>
! omp.parallel private(@var1.privatizer_0 %1#0, @var2.privatizer_0 %3#0 : !fir.ref<i32>, !fir.ref<i32>) {
! ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
! %4:2 = hlfir.declare %arg0 {uniq_name = "_QFdelayed_privatizationEvar1"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %5:2 = hlfir.declare %arg1 {uniq_name = "_QFdelayed_privatizationEvar2"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %6 = fir.load %4#0 : !fir.ref<i32>
! %7 = fir.load %5#0 : !fir.ref<i32>
! %8 = arith.addi %6, %7 : i32
! %c2_i32 = arith.constant 2 : i32
! %9 = arith.addi %8, %c2_i32 : i32
! hlfir.assign %9 to %4#0 : i32, !fir.ref<i32>
! omp.terminator
! }
! return
! }
! "omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "var1.privatizer_0"}> ({
! ^bb0(%arg0: !fir.ref<i32>):
! %0 = fir.alloca i32 {bindc_name = "var1", pinned, uniq_name = "_QFdelayed_privatizationEvar1"}
! %1:2 = hlfir.declare %0 {uniq_name = "_QFdelayed_privatizationEvar1"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %2 = fir.load %arg0 : !fir.ref<i32>
! hlfir.assign %2 to %1#0 temporary_lhs : i32, !fir.ref<i32>
! omp.yield(%1#0 : !fir.ref<i32>)
! }) : () -> ()
! "omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "var2.privatizer_0"}> ({
! ^bb0(%arg0: !fir.ref<i32>):
! %0 = fir.alloca i32 {bindc_name = "var2", pinned, uniq_name = "_QFdelayed_privatizationEvar2"}
! %1:2 = hlfir.declare %0 {uniq_name = "_QFdelayed_privatizationEvar2"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
! %2 = fir.load %arg0 : !fir.ref<i32>
! hlfir.assign %2 to %1#0 temporary_lhs : i32, !fir.ref<i32>
! omp.yield(%1#0 : !fir.ref<i32>)
! }) : () -> ()
!}
!
!
! ### After lowring `hlfir` to `fir`, conversion to LLVM + OMP -> LLVM IR produces the exact same result as for
! `delayed_privatization.f90`.