Skip to content

[flang] Insert stacksave/stackrestore when alloca are present in loops #95173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
//
//===----------------------------------------------------------------------===//

#include "flang/Optimizer/Builder/FIRBuilder.h"
#include "flang/Optimizer/Builder/LowLevelIntrinsics.h"
#include "flang/Optimizer/Dialect/FIRDialect.h"
#include "flang/Optimizer/Dialect/FIROps.h"
#include "flang/Optimizer/Dialect/FIROpsSupport.h"
Expand Down Expand Up @@ -51,6 +53,7 @@ class CfgLoopConv : public mlir::OpRewritePattern<fir::DoLoopOp> {
matchAndRewrite(DoLoopOp loop,
mlir::PatternRewriter &rewriter) const override {
auto loc = loop.getLoc();
bool hasAllocas = !loop.getBody()->getOps<fir::AllocaOp>().empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also note that @VijayKandiah is going to make changes in the code-gen to hoist constant-sized allocas to an entry block, so stacksave/stackrestore might be avoided in such cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This should be refined to count only dyn-sized alloca.

mlir::arith::IntegerOverflowFlags flags{};
if (setNSW)
flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
Expand All @@ -72,6 +75,22 @@ class CfgLoopConv : public mlir::OpRewritePattern<fir::DoLoopOp> {
rewriter.splitBlock(conditionalBlock, conditionalBlock->begin());
auto *lastBlock = &loop.getRegion().back();

// Insert stacksave/stackrestore if there is fir.alloca operation in the
// loop.
if (hasAllocas) {
auto mod = loop.getOperation()->getParentOfType<mlir::ModuleOp>();
fir::FirOpBuilder builder(rewriter, mod);
builder.setInsertionPointToStart(firstBlock);
mlir::func::FuncOp stackSave = fir::factory::getLlvmStackSave(builder);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to require making the pass a module pass. This is quite unfortunate.

We can try to use LLVM dialect stacksave/stackrestore intrinsics operations, but they do not look too reliable to me, e.g. they do not have any side effects that prevent moving them around fir.alloca. At the same time, fir.call seems to have the same problem.

Would that be possible to insert stacksave/stackrestore calls during hlfir.concat bufferization? It might be inefficient, but we will get correct code at least. The redundant stacksave/stackrestore might be optimized after that.

Copy link
Contributor Author

@clementval clementval Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to require making the pass a module pass. This is quite unfortunate.

Yeah if we can avoid that it would probably be better.

Would that be possible to insert stacksave/stackrestore calls during hlfir.concat bufferization? It might be inefficient, but we will get correct code at least. The redundant stacksave/stackrestore might be optimized after that.

Yes we can be then we would miss other case (if any) of fir.alloca with dynamic size in loops.

Copy link
Contributor

@jeanPerier jeanPerier Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they do not have any side effects that prevent moving them around fir.alloca

Are you sure? It seems the ops do not define any side effect interface, which means they can have any side effects and cannot be moved by a generic pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, Jean.

mlir::func::FuncOp stackRestore =
fir::factory::getLlvmStackRestore(builder);
mlir::Value stackPtr =
builder.create<fir::CallOp>(loc, stackSave).getResult(0);
auto *terminator = lastBlock->getTerminator();
builder.setInsertionPoint(terminator);
builder.create<fir::CallOp>(loc, stackRestore, stackPtr);
}

// Move the blocks from the DoLoopOp between initBlock and endBlock
rewriter.inlineRegionBefore(loop.getRegion(), endBlock);

Expand Down
19 changes: 19 additions & 0 deletions flang/test/Fir/loop01.fir
Original file line number Diff line number Diff line change
Expand Up @@ -542,3 +542,22 @@ func.func @y5(%lo : index, %up : index) -> index {
// NSW: fir.call @f3(%[[VAL_7]]) : (i16) -> ()
// NSW: return %[[VAL_5]] : index
// NSW: }

// -----

func.func @alloca_in_loop(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
fir.do_loop %iv = %lb to %ub step %step unordered {
%0 = fir.alloca !fir.box<!fir.heap<!fir.char<1,?>>>
}
return
}

// CHECK-LABEL: func.func @alloca_in_loop
// CHECK: ^bb1
// CHECK: ^bb2
// CHECK: %[[STACKPTR:.*]] = fir.call @llvm.stacksave.p0() : () -> !fir.ref<i8>
// CHECK: fir.alloca !fir.box<!fir.heap<!fir.char<1,?>>>
// CHECK: fir.call @llvm.stackrestore.p0(%[[STACKPTR]]) : (!fir.ref<i8>) -> ()
// CHECK: cf.br ^bb1
// CHECK: ^bb3:
// CHECK: return
Loading