-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[WIP] Delayed privatization. #79862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Delayed privatization. #79862
Conversation
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-mlir-func Author: Kareem Ergawy (ergawy) ChangesThis is a PoC for delayed privatization in OpenMP. Instead of directly emitting privatization code in the frontend, we add a new op to outline the privatization logic for a symbol and call-like mapping that maps from the host symbol to a block argument in the OpenMP region. Example:
Would be code-generated by flang as:
Later, we would inline the delayed privatizer function-like op in the OpenMP region to basically get the same code generated directly by the fronend at the moment. So far this PoC implements the following:
Still TODO:
Patch is 22.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79862.diff 8 Files Affected:
diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h
index c19dcbdcdb39022..b3ca804256ee093 100644
--- a/flang/include/flang/Lower/AbstractConverter.h
+++ b/flang/include/flang/Lower/AbstractConverter.h
@@ -16,6 +16,7 @@
#include "flang/Common/Fortran.h"
#include "flang/Lower/LoweringOptions.h"
#include "flang/Lower/PFTDefs.h"
+#include "flang/Lower/SymbolMap.h"
#include "flang/Optimizer/Builder/BoxValue.h"
#include "flang/Semantics/symbol.h"
#include "mlir/IR/Builders.h"
@@ -92,7 +93,8 @@ class AbstractConverter {
/// Binds the symbol to an fir extended value. The symbol binding will be
/// added or replaced at the inner-most level of the local symbol map.
- virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval) = 0;
+ virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
/// Override lowering of expression with pre-lowered values.
/// Associate mlir::Value to evaluate::Expr. All subsequent call to
@@ -111,14 +113,16 @@ class AbstractConverter {
/// For a given symbol which is host-associated, create a clone using
/// parameters from the host-associated symbol.
virtual bool
- createHostAssociateVarClone(const Fortran::semantics::Symbol &sym) = 0;
+ createHostAssociateVarClone(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
virtual void
createHostAssociateVarCloneDealloc(const Fortran::semantics::Symbol &sym) = 0;
- virtual void copyHostAssociateVar(
- const Fortran::semantics::Symbol &sym,
- mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr) = 0;
+ virtual void
+ copyHostAssociateVar(const Fortran::semantics::Symbol &sym,
+ mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
/// For a given symbol, check if it is present in the inner-most
/// level of the symbol map.
@@ -295,6 +299,10 @@ class AbstractConverter {
return loweringOptions;
}
+ virtual Fortran::lower::SymbolBox
+ lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
+
private:
/// Options controlling lowering behavior.
const Fortran::lower::LoweringOptions &loweringOptions;
diff --git a/flang/include/flang/Lower/SymbolMap.h b/flang/include/flang/Lower/SymbolMap.h
index a55e4b133fe0a8d..834ab747e1a4ad9 100644
--- a/flang/include/flang/Lower/SymbolMap.h
+++ b/flang/include/flang/Lower/SymbolMap.h
@@ -101,6 +101,9 @@ struct SymbolBox : public fir::details::matcher<SymbolBox> {
[](const fir::FortranVariableOpInterface &x) {
return fir::FortranVariableOpInterface(x).getBase();
},
+ [](const fir::MutableBoxValue &x) {
+ return x.getAddr();
+ },
[](const auto &x) { return x.getAddr(); });
}
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index b3aeb99fc5d57c5..38c9c76b6793fda 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -498,16 +498,18 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Add the symbol binding to the inner-most level of the symbol map and
/// return true if it is not already present. Otherwise, return false.
bool bindIfNewSymbol(Fortran::lower::SymbolRef sym,
- const fir::ExtendedValue &exval) {
- if (shallowLookupSymbol(sym))
+ const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ if (shallowLookupSymbol(sym, symMap))
return false;
- bindSymbol(sym, exval);
+ bindSymbol(sym, exval, symMap);
return true;
}
void bindSymbol(Fortran::lower::SymbolRef sym,
- const fir::ExtendedValue &exval) override final {
- addSymbol(sym, exval, /*forced=*/true);
+ const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
+ addSymbol(sym, exval, /*forced=*/true, symMap);
}
void
@@ -610,14 +612,15 @@ class FirConverter : public Fortran::lower::AbstractConverter {
}
bool createHostAssociateVarClone(
- const Fortran::semantics::Symbol &sym) override final {
+ const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
mlir::Location loc = genLocation(sym.name());
mlir::Type symType = genType(sym);
const auto *details = sym.detailsIf<Fortran::semantics::HostAssocDetails>();
assert(details && "No host-association found");
const Fortran::semantics::Symbol &hsym = details->symbol();
mlir::Type hSymType = genType(hsym);
- Fortran::lower::SymbolBox hsb = lookupSymbol(hsym);
+ Fortran::lower::SymbolBox hsb = lookupSymbol(hsym, symMap);
auto allocate = [&](llvm::ArrayRef<mlir::Value> shape,
llvm::ArrayRef<mlir::Value> typeParams) -> mlir::Value {
@@ -720,7 +723,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
// Do nothing
});
- return bindIfNewSymbol(sym, exv);
+ return bindIfNewSymbol(sym, exv, symMap);
}
void createHostAssociateVarCloneDealloc(
@@ -745,16 +748,17 @@ class FirConverter : public Fortran::lower::AbstractConverter {
void copyHostAssociateVar(
const Fortran::semantics::Symbol &sym,
- mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr) override final {
+ mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
// 1) Fetch the original copy of the variable.
assert(sym.has<Fortran::semantics::HostAssocDetails>() &&
"No host-association found");
const Fortran::semantics::Symbol &hsym = sym.GetUltimate();
- Fortran::lower::SymbolBox hsb = lookupOneLevelUpSymbol(hsym);
+ Fortran::lower::SymbolBox hsb = lookupOneLevelUpSymbol(hsym, symMap);
assert(hsb && "Host symbol box not found");
// 2) Fetch the copied one that will mask the original.
- Fortran::lower::SymbolBox sb = shallowLookupSymbol(sym);
+ Fortran::lower::SymbolBox sb = shallowLookupSymbol(sym, symMap);
assert(sb && "Host-associated symbol box not found");
assert(hsb.getAddr() != sb.getAddr() &&
"Host and associated symbol boxes are the same");
@@ -763,8 +767,9 @@ class FirConverter : public Fortran::lower::AbstractConverter {
mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint();
if (copyAssignIP && copyAssignIP->isSet())
builder->restoreInsertionPoint(*copyAssignIP);
- else
+ else {
builder->setInsertionPointAfter(sb.getAddr().getDefiningOp());
+ }
Fortran::lower::SymbolBox *lhs_sb, *rhs_sb;
if (copyAssignIP && copyAssignIP->isSet() &&
@@ -1060,8 +1065,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Find the symbol in the inner-most level of the local map or return null.
Fortran::lower::SymbolBox
- shallowLookupSymbol(const Fortran::semantics::Symbol &sym) {
- if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym))
+ shallowLookupSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (Fortran::lower::SymbolBox v = map.shallowLookupSymbol(sym))
return v;
return {};
}
@@ -1069,8 +1076,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Find the symbol in one level up of symbol map such as for host-association
/// in OpenMP code or return null.
Fortran::lower::SymbolBox
- lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym) {
- if (Fortran::lower::SymbolBox v = localSymbols.lookupOneLevelUpSymbol(sym))
+ lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) override {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (Fortran::lower::SymbolBox v = map.lookupOneLevelUpSymbol(sym))
return v;
return {};
}
@@ -1079,15 +1088,16 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// already in the map and \p forced is `false`, the map is not updated.
/// Instead the value `false` is returned.
bool addSymbol(const Fortran::semantics::SymbolRef sym,
- fir::ExtendedValue val, bool forced = false) {
- if (!forced && lookupSymbol(sym))
+ fir::ExtendedValue val, bool forced = false,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (!forced && lookupSymbol(sym, &map))
return false;
if (lowerToHighLevelFIR()) {
- Fortran::lower::genDeclareSymbol(*this, localSymbols, sym, val,
- fir::FortranVariableFlagsEnum::None,
- forced);
+ Fortran::lower::genDeclareSymbol(
+ *this, map, sym, val, fir::FortranVariableFlagsEnum::None, forced);
} else {
- localSymbols.addSymbol(sym, val, forced);
+ map.addSymbol(sym, val, forced);
}
return true;
}
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index be2117efbabc0a0..edf0e6b2c0d90cd 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -169,12 +169,15 @@ class DataSharingProcessor {
void collectSymbolsForPrivatization();
void insertBarrier();
void collectDefaultSymbols();
- void privatize();
+ void
+ privatize(llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers);
void defaultPrivatize();
void copyLastPrivatize(mlir::Operation *op);
void insertLastPrivateCompare(mlir::Operation *op);
- void cloneSymbol(const Fortran::semantics::Symbol *sym);
- void copyFirstPrivateSymbol(const Fortran::semantics::Symbol *sym);
+ void cloneSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap = nullptr);
+ void copyFirstPrivateSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap);
void copyLastPrivateSymbol(const Fortran::semantics::Symbol *sym,
mlir::OpBuilder::InsertPoint *lastPrivIP);
void insertDeallocs();
@@ -197,7 +200,8 @@ class DataSharingProcessor {
// Step2 performs the copying for lastprivates and requires knowledge of the
// MLIR operation to insert the last private update. Step2 adds
// dealocation code as well.
- void processStep1();
+ void processStep1(llvm::SetVector<mlir::omp::PrivateClauseOp>
+ *privateInitializers = nullptr);
void processStep2(mlir::Operation *op, bool isLoop);
void setLoopIV(mlir::Value iv) {
@@ -206,10 +210,11 @@ class DataSharingProcessor {
}
};
-void DataSharingProcessor::processStep1() {
+void DataSharingProcessor::processStep1(
+ llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers) {
collectSymbolsForPrivatization();
collectDefaultSymbols();
- privatize();
+ privatize(privateInitializers);
defaultPrivatize();
insertBarrier();
}
@@ -239,20 +244,23 @@ void DataSharingProcessor::insertDeallocs() {
}
}
-void DataSharingProcessor::cloneSymbol(const Fortran::semantics::Symbol *sym) {
+void DataSharingProcessor::cloneSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap) {
// Privatization for symbols which are pre-determined (like loop index
// variables) happen separately, for everything else privatize here.
if (sym->test(Fortran::semantics::Symbol::Flag::OmpPreDetermined))
return;
- bool success = converter.createHostAssociateVarClone(*sym);
+ bool success = converter.createHostAssociateVarClone(*sym, symMap);
(void)success;
assert(success && "Privatization failed due to existing binding");
}
void DataSharingProcessor::copyFirstPrivateSymbol(
- const Fortran::semantics::Symbol *sym) {
- if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate))
- converter.copyHostAssociateVar(*sym);
+ const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate)) {
+ converter.copyHostAssociateVar(*sym, nullptr, symMap);
+ }
}
void DataSharingProcessor::copyLastPrivateSymbol(
@@ -487,8 +495,11 @@ void DataSharingProcessor::collectDefaultSymbols() {
}
}
-void DataSharingProcessor::privatize() {
+void DataSharingProcessor::privatize(
+ llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers) {
+
for (const Fortran::semantics::Symbol *sym : privatizedSymbols) {
+
if (const auto *commonDet =
sym->detailsIf<Fortran::semantics::CommonBlockDetails>()) {
for (const auto &mem : commonDet->objects()) {
@@ -496,6 +507,41 @@ void DataSharingProcessor::privatize() {
copyFirstPrivateSymbol(&*mem);
}
} else {
+ if (privateInitializers != nullptr) {
+ auto ip = firOpBuilder.saveInsertionPoint();
+
+ auto moduleOp = firOpBuilder.getInsertionBlock()
+ ->getParentOp()
+ ->getParentOfType<mlir::ModuleOp>();
+
+ firOpBuilder.setInsertionPoint(&moduleOp.getBodyRegion().front(),
+ moduleOp.getBodyRegion().front().end());
+
+ Fortran::lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*sym);
+ assert(hsb && "Host symbol box not found");
+
+ auto clauseOp = firOpBuilder.create<mlir::omp::PrivateClauseOp>(
+ hsb.getAddr().getLoc(), hsb.getAddr().getType(),
+ sym->name().ToString());
+ firOpBuilder.setInsertionPointToEnd(&clauseOp.getBody().front());
+
+ Fortran::semantics::Symbol cp = *sym;
+ Fortran::lower::SymMap map;
+ map.addSymbol(cp, clauseOp.getArgument(0));
+ map.pushScope();
+
+ cloneSymbol(&cp, &map);
+ copyFirstPrivateSymbol(&cp, &map);
+
+ firOpBuilder.create<mlir::omp::YieldOp>(
+ hsb.getAddr().getLoc(), map.shallowLookupSymbol(cp).getAddr());
+
+ firOpBuilder.restoreInsertionPoint(ip);
+ }
+
+ // TODO: This will eventually be an else to the `if` above it. For now, I
+ // emit both the outlined privatizer AND directly emitted cloning and
+ // copying ops while I am testing.
cloneSymbol(sym);
copyFirstPrivateSymbol(sym);
}
@@ -2272,6 +2318,7 @@ static void createBodyOfOp(
llvm::SmallVector<mlir::Type> tiv(args.size(), loopVarType);
llvm::SmallVector<mlir::Location> locs(args.size(), loc);
firOpBuilder.createBlock(&op.getRegion(), {}, tiv, locs);
+
// The argument is not currently in memory, so make a temporary for the
// argument, and store it there, then bind that location to the argument.
mlir::Operation *storeOp = nullptr;
@@ -2291,10 +2338,11 @@ static void createBodyOfOp(
// If it is an unstructured region and is not the outer region of a combined
// construct, create empty blocks for all evaluations.
- if (eval.lowerAsUnstructured() && !outerCombined)
+ if (eval.lowerAsUnstructured() && !outerCombined) {
Fortran::lower::createEmptyRegionBlocks<mlir::omp::TerminatorOp,
mlir::omp::YieldOp>(
firOpBuilder, eval.getNestedEvaluations());
+ }
// Start with privatization, so that the lowering of the nested
// code will use the right symbols.
@@ -2307,12 +2355,14 @@ static void createBodyOfOp(
if (privatize) {
if (!dsp) {
tempDsp.emplace(converter, *clauses, eval);
- tempDsp->processStep1();
+ llvm::SetVector<mlir::omp::PrivateClauseOp> privateInitializers;
+ tempDsp->processStep1(&privateInitializers);
}
}
if constexpr (std::is_same_v<Op, mlir::omp::ParallelOp>) {
threadPrivatizeVars(converter, eval);
+
if (clauses) {
firOpBuilder.setInsertionPoint(marker);
ClauseProcessor(converter, *clauses).processCopyin();
@@ -2361,6 +2411,7 @@ static void createBodyOfOp(
if (exits.size() == 1)
return exits[0];
mlir::Block *exit = firOpBuilder.createBlock(®ion);
+
for (mlir::Block *b : exits) {
firOpBuilder.setInsertionPointToEnd(b);
firOpBuilder.create<mlir::cf::BranchOp>(loc, exit);
@@ -2382,8 +2433,9 @@ static void createBodyOfOp(
assert(tempDsp.has_value());
tempDsp->processStep2(op, isLoop);
} else {
- if (isLoop && args.size() > 0)
+ if (isLoop && args.size() > 0) {
dsp->setLoopIV(converter.getSymbolAddress(*args[0]));
+ }
dsp->processStep2(op, isLoop);
}
}
diff --git a/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90 b/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90
new file mode 100644
index 000000000000000..7b4d9135c6e070c
--- /dev/null
+++ b/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90
@@ -0,0 +1,42 @@
+subroutine private_clause_allocatable()
+ integer :: xxx
+ integer :: yyy
+
+!$OMP PARALLEL FIRSTPRIVATE(xxx, yyy)
+!$OMP END PARALLEL
+
+end subroutine
+
+! This is what flang emits with the PoC:
+! --------------------------------------
+!
+!func.func @_QPprivate_clause_allocatable() {
+! %0 = fir.alloca i32 {bindc_name = "xxx", uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %1 = fir.alloca i32 {bindc_name = "yyy", uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! omp.parallel {
+! %2 = fir.alloca i32 {bindc_name = "xxx", pinned, uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %3 = fir.load %0 : !fir.ref<i32>
+! fir.store %3 to %2 : !fir.ref<i32>
+! %4 = fir.alloca i32 {bindc_name = "yyy", pinned, uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! %5 = fir.load %1 : !fir.ref<i32>
+! fir.store %5 to %4 : !fir.ref<i32>
+! omp.terminator
+! }
+! return
+!}
+!
+!"omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "xxx.privatizer"}> ({
+!^bb0(%arg0: !fir.ref<i32>):
+! %0 = fir.alloca i32 {bindc_name = "xxx", pinned, uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %1 = fir.load %arg0 : !fir.ref<i32>
+! fir.store %1 to %0 : !fir.ref<i32>
+! omp.yield(%0 : !fir.ref<i32>)
+!}) : () -> ()
+!
+!"omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "yyy.privatizer"}> ({
+!^bb0(%arg0: !fir.ref<i32>):
+! %0 = fir.alloca i32 {bindc_name = "yyy", pinned, uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! %1 = fir.load %arg0 : !fir.ref<i32>
+! fir.store %1 to %0 : !fir.ref<i32>
+! omp.yield(%0 : !fir.ref<i32>)
+!}) : () -> ()
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 96c15e775a3024b..cc1bc97faaebd4d 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -16,6 +16,7 @@
include "mlir/IR/EnumAttr.td"
include "mlir/IR/OpBase.td"
+include "mlir/Interfaces/FunctionInterfaces.td"
include "mlir/Interfaces/SideEffectInterfaces.td"
include "mlir/Interfaces/ControlFlowInterfaces.td"
include "mlir/IR/SymbolInterfaces.td"
@@ -621,7 +622,7 @@ def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
def YieldOp : OpenMP_Op<"yield",
[Pure, ReturnLike, Terminator,
ParentOneOf<["WsLoopOp", "ReductionDeclareOp",
- "AtomicUpdateOp", "SimdLoopOp"]>]> {
+ "AtomicUpdateOp", "SimdLoopOp", "PrivateClauseOp"]>]> {
let summary = "loop yield and termination operation";
let description = [{
"omp.yield" yields SSA values from the OpenMP dialect op region and
@@ -1478,6 +1479,38 @@ def Target_UpdateDataOp: OpenMP_Op<"target_update_data",
//===----------------------------------------------------------------------===//
// 2.14.5 target construct
//===----------------------------------------------------------------------===//
+def PrivateClauseOp : OpenMP_Op<"private", [
+ IsolatedFromAbove, FunctionOpInterface
+ ]> {
+ let summary = "xxxx";
+ ...
[truncated]
|
@llvm/pr-subscribers-mlir-openmp Author: Kareem Ergawy (ergawy) ChangesThis is a PoC for delayed privatization in OpenMP. Instead of directly emitting privatization code in the frontend, we add a new op to outline the privatization logic for a symbol and call-like mapping that maps from the host symbol to a block argument in the OpenMP region. Example:
Would be code-generated by flang as:
Later, we would inline the delayed privatizer function-like op in the OpenMP region to basically get the same code generated directly by the fronend at the moment. So far this PoC implements the following:
Still TODO:
Patch is 22.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79862.diff 8 Files Affected:
diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h
index c19dcbdcdb39022..b3ca804256ee093 100644
--- a/flang/include/flang/Lower/AbstractConverter.h
+++ b/flang/include/flang/Lower/AbstractConverter.h
@@ -16,6 +16,7 @@
#include "flang/Common/Fortran.h"
#include "flang/Lower/LoweringOptions.h"
#include "flang/Lower/PFTDefs.h"
+#include "flang/Lower/SymbolMap.h"
#include "flang/Optimizer/Builder/BoxValue.h"
#include "flang/Semantics/symbol.h"
#include "mlir/IR/Builders.h"
@@ -92,7 +93,8 @@ class AbstractConverter {
/// Binds the symbol to an fir extended value. The symbol binding will be
/// added or replaced at the inner-most level of the local symbol map.
- virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval) = 0;
+ virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
/// Override lowering of expression with pre-lowered values.
/// Associate mlir::Value to evaluate::Expr. All subsequent call to
@@ -111,14 +113,16 @@ class AbstractConverter {
/// For a given symbol which is host-associated, create a clone using
/// parameters from the host-associated symbol.
virtual bool
- createHostAssociateVarClone(const Fortran::semantics::Symbol &sym) = 0;
+ createHostAssociateVarClone(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
virtual void
createHostAssociateVarCloneDealloc(const Fortran::semantics::Symbol &sym) = 0;
- virtual void copyHostAssociateVar(
- const Fortran::semantics::Symbol &sym,
- mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr) = 0;
+ virtual void
+ copyHostAssociateVar(const Fortran::semantics::Symbol &sym,
+ mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
/// For a given symbol, check if it is present in the inner-most
/// level of the symbol map.
@@ -295,6 +299,10 @@ class AbstractConverter {
return loweringOptions;
}
+ virtual Fortran::lower::SymbolBox
+ lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) = 0;
+
private:
/// Options controlling lowering behavior.
const Fortran::lower::LoweringOptions &loweringOptions;
diff --git a/flang/include/flang/Lower/SymbolMap.h b/flang/include/flang/Lower/SymbolMap.h
index a55e4b133fe0a8d..834ab747e1a4ad9 100644
--- a/flang/include/flang/Lower/SymbolMap.h
+++ b/flang/include/flang/Lower/SymbolMap.h
@@ -101,6 +101,9 @@ struct SymbolBox : public fir::details::matcher<SymbolBox> {
[](const fir::FortranVariableOpInterface &x) {
return fir::FortranVariableOpInterface(x).getBase();
},
+ [](const fir::MutableBoxValue &x) {
+ return x.getAddr();
+ },
[](const auto &x) { return x.getAddr(); });
}
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index b3aeb99fc5d57c5..38c9c76b6793fda 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -498,16 +498,18 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Add the symbol binding to the inner-most level of the symbol map and
/// return true if it is not already present. Otherwise, return false.
bool bindIfNewSymbol(Fortran::lower::SymbolRef sym,
- const fir::ExtendedValue &exval) {
- if (shallowLookupSymbol(sym))
+ const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ if (shallowLookupSymbol(sym, symMap))
return false;
- bindSymbol(sym, exval);
+ bindSymbol(sym, exval, symMap);
return true;
}
void bindSymbol(Fortran::lower::SymbolRef sym,
- const fir::ExtendedValue &exval) override final {
- addSymbol(sym, exval, /*forced=*/true);
+ const fir::ExtendedValue &exval,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
+ addSymbol(sym, exval, /*forced=*/true, symMap);
}
void
@@ -610,14 +612,15 @@ class FirConverter : public Fortran::lower::AbstractConverter {
}
bool createHostAssociateVarClone(
- const Fortran::semantics::Symbol &sym) override final {
+ const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
mlir::Location loc = genLocation(sym.name());
mlir::Type symType = genType(sym);
const auto *details = sym.detailsIf<Fortran::semantics::HostAssocDetails>();
assert(details && "No host-association found");
const Fortran::semantics::Symbol &hsym = details->symbol();
mlir::Type hSymType = genType(hsym);
- Fortran::lower::SymbolBox hsb = lookupSymbol(hsym);
+ Fortran::lower::SymbolBox hsb = lookupSymbol(hsym, symMap);
auto allocate = [&](llvm::ArrayRef<mlir::Value> shape,
llvm::ArrayRef<mlir::Value> typeParams) -> mlir::Value {
@@ -720,7 +723,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
// Do nothing
});
- return bindIfNewSymbol(sym, exv);
+ return bindIfNewSymbol(sym, exv, symMap);
}
void createHostAssociateVarCloneDealloc(
@@ -745,16 +748,17 @@ class FirConverter : public Fortran::lower::AbstractConverter {
void copyHostAssociateVar(
const Fortran::semantics::Symbol &sym,
- mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr) override final {
+ mlir::OpBuilder::InsertPoint *copyAssignIP = nullptr,
+ Fortran::lower::SymMap *symMap = nullptr) override final {
// 1) Fetch the original copy of the variable.
assert(sym.has<Fortran::semantics::HostAssocDetails>() &&
"No host-association found");
const Fortran::semantics::Symbol &hsym = sym.GetUltimate();
- Fortran::lower::SymbolBox hsb = lookupOneLevelUpSymbol(hsym);
+ Fortran::lower::SymbolBox hsb = lookupOneLevelUpSymbol(hsym, symMap);
assert(hsb && "Host symbol box not found");
// 2) Fetch the copied one that will mask the original.
- Fortran::lower::SymbolBox sb = shallowLookupSymbol(sym);
+ Fortran::lower::SymbolBox sb = shallowLookupSymbol(sym, symMap);
assert(sb && "Host-associated symbol box not found");
assert(hsb.getAddr() != sb.getAddr() &&
"Host and associated symbol boxes are the same");
@@ -763,8 +767,9 @@ class FirConverter : public Fortran::lower::AbstractConverter {
mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint();
if (copyAssignIP && copyAssignIP->isSet())
builder->restoreInsertionPoint(*copyAssignIP);
- else
+ else {
builder->setInsertionPointAfter(sb.getAddr().getDefiningOp());
+ }
Fortran::lower::SymbolBox *lhs_sb, *rhs_sb;
if (copyAssignIP && copyAssignIP->isSet() &&
@@ -1060,8 +1065,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Find the symbol in the inner-most level of the local map or return null.
Fortran::lower::SymbolBox
- shallowLookupSymbol(const Fortran::semantics::Symbol &sym) {
- if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym))
+ shallowLookupSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (Fortran::lower::SymbolBox v = map.shallowLookupSymbol(sym))
return v;
return {};
}
@@ -1069,8 +1076,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// Find the symbol in one level up of symbol map such as for host-association
/// in OpenMP code or return null.
Fortran::lower::SymbolBox
- lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym) {
- if (Fortran::lower::SymbolBox v = localSymbols.lookupOneLevelUpSymbol(sym))
+ lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym,
+ Fortran::lower::SymMap *symMap = nullptr) override {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (Fortran::lower::SymbolBox v = map.lookupOneLevelUpSymbol(sym))
return v;
return {};
}
@@ -1079,15 +1088,16 @@ class FirConverter : public Fortran::lower::AbstractConverter {
/// already in the map and \p forced is `false`, the map is not updated.
/// Instead the value `false` is returned.
bool addSymbol(const Fortran::semantics::SymbolRef sym,
- fir::ExtendedValue val, bool forced = false) {
- if (!forced && lookupSymbol(sym))
+ fir::ExtendedValue val, bool forced = false,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ auto &map = (symMap == nullptr ? localSymbols : *symMap);
+ if (!forced && lookupSymbol(sym, &map))
return false;
if (lowerToHighLevelFIR()) {
- Fortran::lower::genDeclareSymbol(*this, localSymbols, sym, val,
- fir::FortranVariableFlagsEnum::None,
- forced);
+ Fortran::lower::genDeclareSymbol(
+ *this, map, sym, val, fir::FortranVariableFlagsEnum::None, forced);
} else {
- localSymbols.addSymbol(sym, val, forced);
+ map.addSymbol(sym, val, forced);
}
return true;
}
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index be2117efbabc0a0..edf0e6b2c0d90cd 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -169,12 +169,15 @@ class DataSharingProcessor {
void collectSymbolsForPrivatization();
void insertBarrier();
void collectDefaultSymbols();
- void privatize();
+ void
+ privatize(llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers);
void defaultPrivatize();
void copyLastPrivatize(mlir::Operation *op);
void insertLastPrivateCompare(mlir::Operation *op);
- void cloneSymbol(const Fortran::semantics::Symbol *sym);
- void copyFirstPrivateSymbol(const Fortran::semantics::Symbol *sym);
+ void cloneSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap = nullptr);
+ void copyFirstPrivateSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap);
void copyLastPrivateSymbol(const Fortran::semantics::Symbol *sym,
mlir::OpBuilder::InsertPoint *lastPrivIP);
void insertDeallocs();
@@ -197,7 +200,8 @@ class DataSharingProcessor {
// Step2 performs the copying for lastprivates and requires knowledge of the
// MLIR operation to insert the last private update. Step2 adds
// dealocation code as well.
- void processStep1();
+ void processStep1(llvm::SetVector<mlir::omp::PrivateClauseOp>
+ *privateInitializers = nullptr);
void processStep2(mlir::Operation *op, bool isLoop);
void setLoopIV(mlir::Value iv) {
@@ -206,10 +210,11 @@ class DataSharingProcessor {
}
};
-void DataSharingProcessor::processStep1() {
+void DataSharingProcessor::processStep1(
+ llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers) {
collectSymbolsForPrivatization();
collectDefaultSymbols();
- privatize();
+ privatize(privateInitializers);
defaultPrivatize();
insertBarrier();
}
@@ -239,20 +244,23 @@ void DataSharingProcessor::insertDeallocs() {
}
}
-void DataSharingProcessor::cloneSymbol(const Fortran::semantics::Symbol *sym) {
+void DataSharingProcessor::cloneSymbol(const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap) {
// Privatization for symbols which are pre-determined (like loop index
// variables) happen separately, for everything else privatize here.
if (sym->test(Fortran::semantics::Symbol::Flag::OmpPreDetermined))
return;
- bool success = converter.createHostAssociateVarClone(*sym);
+ bool success = converter.createHostAssociateVarClone(*sym, symMap);
(void)success;
assert(success && "Privatization failed due to existing binding");
}
void DataSharingProcessor::copyFirstPrivateSymbol(
- const Fortran::semantics::Symbol *sym) {
- if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate))
- converter.copyHostAssociateVar(*sym);
+ const Fortran::semantics::Symbol *sym,
+ Fortran::lower::SymMap *symMap = nullptr) {
+ if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate)) {
+ converter.copyHostAssociateVar(*sym, nullptr, symMap);
+ }
}
void DataSharingProcessor::copyLastPrivateSymbol(
@@ -487,8 +495,11 @@ void DataSharingProcessor::collectDefaultSymbols() {
}
}
-void DataSharingProcessor::privatize() {
+void DataSharingProcessor::privatize(
+ llvm::SetVector<mlir::omp::PrivateClauseOp> *privateInitializers) {
+
for (const Fortran::semantics::Symbol *sym : privatizedSymbols) {
+
if (const auto *commonDet =
sym->detailsIf<Fortran::semantics::CommonBlockDetails>()) {
for (const auto &mem : commonDet->objects()) {
@@ -496,6 +507,41 @@ void DataSharingProcessor::privatize() {
copyFirstPrivateSymbol(&*mem);
}
} else {
+ if (privateInitializers != nullptr) {
+ auto ip = firOpBuilder.saveInsertionPoint();
+
+ auto moduleOp = firOpBuilder.getInsertionBlock()
+ ->getParentOp()
+ ->getParentOfType<mlir::ModuleOp>();
+
+ firOpBuilder.setInsertionPoint(&moduleOp.getBodyRegion().front(),
+ moduleOp.getBodyRegion().front().end());
+
+ Fortran::lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*sym);
+ assert(hsb && "Host symbol box not found");
+
+ auto clauseOp = firOpBuilder.create<mlir::omp::PrivateClauseOp>(
+ hsb.getAddr().getLoc(), hsb.getAddr().getType(),
+ sym->name().ToString());
+ firOpBuilder.setInsertionPointToEnd(&clauseOp.getBody().front());
+
+ Fortran::semantics::Symbol cp = *sym;
+ Fortran::lower::SymMap map;
+ map.addSymbol(cp, clauseOp.getArgument(0));
+ map.pushScope();
+
+ cloneSymbol(&cp, &map);
+ copyFirstPrivateSymbol(&cp, &map);
+
+ firOpBuilder.create<mlir::omp::YieldOp>(
+ hsb.getAddr().getLoc(), map.shallowLookupSymbol(cp).getAddr());
+
+ firOpBuilder.restoreInsertionPoint(ip);
+ }
+
+ // TODO: This will eventually be an else to the `if` above it. For now, I
+ // emit both the outlined privatizer AND directly emitted cloning and
+ // copying ops while I am testing.
cloneSymbol(sym);
copyFirstPrivateSymbol(sym);
}
@@ -2272,6 +2318,7 @@ static void createBodyOfOp(
llvm::SmallVector<mlir::Type> tiv(args.size(), loopVarType);
llvm::SmallVector<mlir::Location> locs(args.size(), loc);
firOpBuilder.createBlock(&op.getRegion(), {}, tiv, locs);
+
// The argument is not currently in memory, so make a temporary for the
// argument, and store it there, then bind that location to the argument.
mlir::Operation *storeOp = nullptr;
@@ -2291,10 +2338,11 @@ static void createBodyOfOp(
// If it is an unstructured region and is not the outer region of a combined
// construct, create empty blocks for all evaluations.
- if (eval.lowerAsUnstructured() && !outerCombined)
+ if (eval.lowerAsUnstructured() && !outerCombined) {
Fortran::lower::createEmptyRegionBlocks<mlir::omp::TerminatorOp,
mlir::omp::YieldOp>(
firOpBuilder, eval.getNestedEvaluations());
+ }
// Start with privatization, so that the lowering of the nested
// code will use the right symbols.
@@ -2307,12 +2355,14 @@ static void createBodyOfOp(
if (privatize) {
if (!dsp) {
tempDsp.emplace(converter, *clauses, eval);
- tempDsp->processStep1();
+ llvm::SetVector<mlir::omp::PrivateClauseOp> privateInitializers;
+ tempDsp->processStep1(&privateInitializers);
}
}
if constexpr (std::is_same_v<Op, mlir::omp::ParallelOp>) {
threadPrivatizeVars(converter, eval);
+
if (clauses) {
firOpBuilder.setInsertionPoint(marker);
ClauseProcessor(converter, *clauses).processCopyin();
@@ -2361,6 +2411,7 @@ static void createBodyOfOp(
if (exits.size() == 1)
return exits[0];
mlir::Block *exit = firOpBuilder.createBlock(®ion);
+
for (mlir::Block *b : exits) {
firOpBuilder.setInsertionPointToEnd(b);
firOpBuilder.create<mlir::cf::BranchOp>(loc, exit);
@@ -2382,8 +2433,9 @@ static void createBodyOfOp(
assert(tempDsp.has_value());
tempDsp->processStep2(op, isLoop);
} else {
- if (isLoop && args.size() > 0)
+ if (isLoop && args.size() > 0) {
dsp->setLoopIV(converter.getSymbolAddress(*args[0]));
+ }
dsp->processStep2(op, isLoop);
}
}
diff --git a/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90 b/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90
new file mode 100644
index 000000000000000..7b4d9135c6e070c
--- /dev/null
+++ b/flang/test/Lower/OpenMP/FIR/delayed_privatization.f90
@@ -0,0 +1,42 @@
+subroutine private_clause_allocatable()
+ integer :: xxx
+ integer :: yyy
+
+!$OMP PARALLEL FIRSTPRIVATE(xxx, yyy)
+!$OMP END PARALLEL
+
+end subroutine
+
+! This is what flang emits with the PoC:
+! --------------------------------------
+!
+!func.func @_QPprivate_clause_allocatable() {
+! %0 = fir.alloca i32 {bindc_name = "xxx", uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %1 = fir.alloca i32 {bindc_name = "yyy", uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! omp.parallel {
+! %2 = fir.alloca i32 {bindc_name = "xxx", pinned, uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %3 = fir.load %0 : !fir.ref<i32>
+! fir.store %3 to %2 : !fir.ref<i32>
+! %4 = fir.alloca i32 {bindc_name = "yyy", pinned, uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! %5 = fir.load %1 : !fir.ref<i32>
+! fir.store %5 to %4 : !fir.ref<i32>
+! omp.terminator
+! }
+! return
+!}
+!
+!"omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "xxx.privatizer"}> ({
+!^bb0(%arg0: !fir.ref<i32>):
+! %0 = fir.alloca i32 {bindc_name = "xxx", pinned, uniq_name = "_QFprivate_clause_allocatableExxx"}
+! %1 = fir.load %arg0 : !fir.ref<i32>
+! fir.store %1 to %0 : !fir.ref<i32>
+! omp.yield(%0 : !fir.ref<i32>)
+!}) : () -> ()
+!
+!"omp.private"() <{function_type = (!fir.ref<i32>) -> !fir.ref<i32>, sym_name = "yyy.privatizer"}> ({
+!^bb0(%arg0: !fir.ref<i32>):
+! %0 = fir.alloca i32 {bindc_name = "yyy", pinned, uniq_name = "_QFprivate_clause_allocatableEyyy"}
+! %1 = fir.load %arg0 : !fir.ref<i32>
+! fir.store %1 to %0 : !fir.ref<i32>
+! omp.yield(%0 : !fir.ref<i32>)
+!}) : () -> ()
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 96c15e775a3024b..cc1bc97faaebd4d 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -16,6 +16,7 @@
include "mlir/IR/EnumAttr.td"
include "mlir/IR/OpBase.td"
+include "mlir/Interfaces/FunctionInterfaces.td"
include "mlir/Interfaces/SideEffectInterfaces.td"
include "mlir/Interfaces/ControlFlowInterfaces.td"
include "mlir/IR/SymbolInterfaces.td"
@@ -621,7 +622,7 @@ def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
def YieldOp : OpenMP_Op<"yield",
[Pure, ReturnLike, Terminator,
ParentOneOf<["WsLoopOp", "ReductionDeclareOp",
- "AtomicUpdateOp", "SimdLoopOp"]>]> {
+ "AtomicUpdateOp", "SimdLoopOp", "PrivateClauseOp"]>]> {
let summary = "loop yield and termination operation";
let description = [{
"omp.yield" yields SSA values from the OpenMP dialect op region and
@@ -1478,6 +1479,38 @@ def Target_UpdateDataOp: OpenMP_Op<"target_update_data",
//===----------------------------------------------------------------------===//
// 2.14.5 target construct
//===----------------------------------------------------------------------===//
+def PrivateClauseOp : OpenMP_Op<"private", [
+ IsolatedFromAbove, FunctionOpInterface
+ ]> {
+ let summary = "xxxx";
+ ...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
4def3b5
to
9ca8c49
Compare
852ff4e
to
113dd3b
Compare
113dd3b
to
0b423d3
Compare
0b423d3
to
ef847ab
Compare
@clementval @kiranchandramohan thanks a lot for taking a look. I think this WIP is getting closer and closer to having a complete end-to-end implementation. If I not missing anything, what remains is to actually inline the the privatizer in the op before converting to LLVM IR which I am looking into now. Of course, I still need to support more complex symbols (e.g. allocatables) but I am focusing on simple variables for now to get something working end-to-end. |
Thanks for the work here. Forgot to mention that the OpenMPIRBuilder has a privatization call back that is designed to implement privatization. You can consider using that.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, your approach to implement delayed privatization looks good to me,
requiring only a minor modification to AbstractConverter/Bridge.
I just have a couple comments.
ef847ab
to
0d05d4a
Compare
8e5d75f
to
b1b1bc1
Compare
@kiranchandramohan @clementval @luporl can you 🙏 have a look at the proposed plan above and let me know if you agree with it? Apologies for the noise but I just want to make sure we are aligned since this touches a lot of places in the pipeline already. I am specially interested in your opinion about the "feature flag" thing to enable/disable delayed privatization both in |
b1b1bc1
to
0f72db6
Compare
Thanks @ergawy for pushing this forward. This will add more capabilities to the dialect and make lowering easier. Your approach of gradually rolling out the feature makes a lot of sense. We should also ensure that applications that are enabled now continues to work (spec-2017 speed, subset of specomp2012, subset of NPB, SNAP) and that there is no regression in the gfortran testsuite. When introducing the changes to OpenMP dialect, could you add a reply to https://discourse.llvm.org/t/rfc-privatisation-in-openmp-dialect/3526 saying that we are adding support for privatisation clauses in the dialect. We see benefits for this in: I think it is fine to add a single flag to the file that can enable |
0f72db6
to
89ab3cf
Compare
This PR outlines the arguments of the open CodeGen functions into 2 separate structs. This was, in part, motivated by the delayed privatization WIP llvm#79862 where we had to extend the signatures of both functions containing quite a bit of default values (`nullptr`, `false`). This PR does not add any new arguments yet though, just outlines the existing ones.
This PR outlines the arguments of the open CodeGen functions into 2 separate structs. This was, in part, motivated by the delayed privatization WIP llvm#79862 where we had to extend the signatures of both functions containing quite a bit of default values (`nullptr`, `false`). This PR does not add any new arguments yet though, just outlines the existing ones.
…80817) This PR outlines the arguments of the open CodeGen functions into 2 separate structs. This was, in part, motivated by the delayed privatization WIP #79862 where we had to extend the signatures of both functions containing quite a bit of default values (`nullptr`, `false`). This PR does not add any new arguments yet though, just outlines the existing ones.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This is a PoC for delayed privatization in OpenMP. Instead of directly emitting privatization code in the frontend, we add a new op to outline the privatization logic for a symbol and call-like mapping that maps from the host symbol to an outlined function-like privatizer op. Later, we would inline the delayed privatizer function-like op in the OpenMP region to basically get the same code generated directly by the fronend at the moment.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
d6c453f
to
717b57a
Compare
@@ -3505,6 +3505,18 @@ struct ZeroOpConversion : public FIROpConversion<fir::ZeroOp> { | |||
} | |||
}; | |||
|
|||
class DeclareOpConversion : public FIROpConversion<fir::DeclareOp> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While experimenting with delayed privation for hlfir
(so far I have been experimenting with fir
), I came across a small issue which is that fir.declare
is not support when converting from fir
to mlir
dialect. That's why I had to copy this conversion pattern from PreCGRewrite.cpp
. Since DeclareOpConversion
does not convert to op in the fir::cg
dialect, I think it would be nice to move it here to make conversion from fir
to llvm
dialect more complete.
Do you folks have any input on this?
If you agree to share the pass, I will open a separate PR with proper testing of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised by this. In a quick experiment flang-new -mmlir -mlir-pass-statistics -c [...]
gives me the same output both with and without -fopenmp
. Why isn't CodeGenRewrite
doing the job for you? This pass should run earlier than FIRToLLVMLowering
(CodeGen.cpp
).
In @jeanPerier's commit adding fir::DeclareOp
to PreCGRewrite.cpp
, he mentions that it was placed here to avoid adding support for fir.shape
to codegen. See c852174
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is fir::createOpenMPFIRPassPipeline
, so I expect that I should be seeing some extra passes for OpenMP. It is weird that I didn't.
Anyway that runs immediately after lowering so I don't think it should effect codegen for fir.declare
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tblah for taking a look. I was referring to lowering using fir-opt --fir-to-llvm-ir
for testing purposes. I expected that the fir-opt --fir-to-llvm-ir
should provide a more-or-less complete conversion pipeline from fir
to llvm
. This is mostly to make testing and experimenting easier.
I am a bit of newbie to the project so I might not be using the different tools and their flags properly yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No fir-oppt --fir-to-llvm-ir
is not sufficient. The FIR pass pipeline is quite long (see https://github.com/llvm/llvm-project/blob/main/flang/test/Driver/mlir-pass-pipeline.f90). A few of these are optimizations (e.g. SimplifyIntrinsics, CSE, AliasTags), but several different passes are needed to convert FIR into LLVM (even more for HLFIR).
Roughly, to get from FIR to LLVM you need the passes in createDefaultFIRCodeGenPassPipeline
:
inline void createDefaultFIRCodeGenPassPipeline( |
From the command line you can pipe FIR into tco
and it will output LLVM IR. fir-opt
is more for running individual passes. The help text is a bit hard to follow, you can find tco
specific options at the top of this file https://github.com/llvm/llvm-project/blob/main/flang/tools/tco/tco.cpp#L41
For example, bbc -emit-hlfir -o - file.f90 | tco
is a bit like running flang-new -fc1 -O2 -emit-llvm -o - file.f90
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC wich can be found in llvm#79862.
This PR adds a new op to the OpenMP dialect: `PrivateClauseOp`. This op will be later used to model `[first]private` clauses for differnt OpenMP directives. This is part of productizing the "delayed privatization" PoC which can be found in #79862.
Abandoning since I am working on productizing the whole thing in separate PRs (see description of this PR for a detailed plan and tracking of that plan). I will keep updating the PR description to keep notes of the progress. |
This is a PoC for delayed privatization in OpenMP. Instead of directly
emitting privatization code in the frontend, we add a new op to outline
the privatization logic for a symbol and call-like mapping that maps
from the host symbol to an outlined function-like privatizer op.
Later, we would inline the delayed privatizer function-like op in the
OpenMP region to basically get the same code generated directly by the
fronend at the moment.
Note: This is still a work-in-progress but giving comments on the taken approach or any advice is more than welcome :).
This is my plan to productize this WIP (✅ means the item is already done, ⌛ means it is in progress):
genOpWithBody
introduced here. This will introduce thegenRegionEntryCB
callback and use it for worksharing loop ops. ([flang][OpenMP][NFC] OutlinegenOpWithBody
&createBodyOfOp
args #80817 [flang][OpenMP][NFC] Further refactoring forgenOpWithBody
& #80839) ✅omp.private
op along with a roundtripping lit test to test parsing, printing, and verification. ([MLIR][OpenMP] Addomp.private
op #80955) ✅omp.parallel
op (i.e. adding theprivate
clause) along with a roundtripping lit test as well. ( [MLIR][OpenMP] Addprivate
clause toomp.parallel
#81452) ✅OpenMPToLLVM.cpp
conversion pattern:RegionOpConversion
. ([MLIR][OpenMP] Supportllvm
conversion foromp.private
regions #81414) ✅OpenMPToLLVMIRTranslation.cpp
: thePrivCB
andprepareOmpParallel
. ( [MLIR][OpenMP] Support basic materialization foromp.private
ops #81715) ✅omp.private
ops and modifying the creation of theomp.parallel
op to use the new clause. These are most of the changes inDataSharingProcessor
andgenParallelOp
. ( [flang][OpenMP][MLIR] Basic support for delayed privatization code-gen #81833) ✅target
). ([MLIR][OpenMP] Extendomp.private
materialization support:firstprivate
#83761 [flang][OpenMP] Only use HLFIR base in privatization logic #84123 [flang][OpenMP] Add%flang_fc1
RUN
to delayed privatization tests #84296 [flang][MLIR][OpenMP] Extend delayed privatization for scalar allocatables and pointers #84740 [flang][MLIR][OpenMP] Extend delayed privatization for arrays and characters #85023) ⌛Please let me know if you disagree with anything.