[RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets #93481
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-risc-v

Author: Roger Ferrer Ibáñez (rofirrim)

Changes

We are working on Fortran applications on RISC-V Linux and some of them rely on passing a pointer to an internal (aka nested) function that uses the enclosing context (e.g., a variable of the enclosing function). Flang uses trampolines, which rely on `llvm.init.trampoline` and `llvm.adjust.trampoline`. Those in turn rely on having an executable stack.

This change is a preliminary step to support trampolines on RISC-V. In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache` (https://www.gnu.org/software/libc/manual/html_node/RISC_002dV.html), which is what GCC is currently doing (https://www.godbolt.org/z/qsd1P3fYT).

I could not find where it is specified that RISC-V on glibc targets must call `__riscv_flush_icache`, so before landing this we may want to address this issue.

Full diff: https://github.com/llvm/llvm-project/pull/93481.diff

5 Files Affected:
- llvm/include/llvm/CodeGen/TargetLowering.h
- llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
- llvm/lib/Target/RISCV/RISCVISelLowering.cpp
- llvm/lib/Target/RISCV/RISCVISelLowering.h
- llvm/test/CodeGen/RISCV/clear-cache.ll
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 50a8c7eb75af5..3f052449b0a6f 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4764,8 +4764,15 @@ class TargetLowering : public TargetLoweringBase {
return false;
}
+ /// Returns true if the target needs to lower __builtin___clear_cache in a
+ /// specific way that is incompatible with the clear_cache
+ /// signature. When returning false, the lowering will invoke
+ /// getClearCacheBuiltinName.
+ virtual bool isClearCacheBuiltinTargetSpecific() const { return false; }
+
/// Return the builtin name for the __builtin___clear_cache intrinsic
- /// Default is to invoke the clear cache library call
+ /// This is only used if isClearCacheBuiltinTargetSpecific returns false.
+ /// If nullptr is returned, the builtin is lowered to no code.
virtual const char * getClearCacheBuiltinName() const {
return "__clear_cache";
}
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index ca352da5d36eb..320ec0cdfffb6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -7518,8 +7518,13 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
return;
case Intrinsic::clear_cache:
/// FunctionName may be null.
- if (const char *FunctionName = TLI.getClearCacheBuiltinName())
- lowerCallToExternalSymbol(I, FunctionName);
+ if (!TLI.isClearCacheBuiltinTargetSpecific()) {
+ if (const char *FunctionName = TLI.getClearCacheBuiltinName())
+ lowerCallToExternalSymbol(I, FunctionName);
+ } else {
+ // Turn this into a target intrinsic node.
+ visitTargetIntrinsic(I, Intrinsic);
+ }
return;
case Intrinsic::donothing:
case Intrinsic::seh_try_begin:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f0e5a7d393b6c..0e591034698b6 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -662,6 +662,12 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
setBooleanContents(ZeroOrOneBooleanContent);
+ if (Subtarget.getTargetTriple().isOSGlibc()) {
+ // Custom lowering of llvm.clear_cache
+ setOperationAction({ISD::INTRINSIC_VOID, ISD::INTRINSIC_VOID}, MVT::Other,
+ Custom);
+ }
+
if (Subtarget.hasVInstructions()) {
setBooleanVectorContents(ZeroOrOneBooleanContent);
@@ -7120,6 +7126,39 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
}
}
+SDValue RISCVTargetLowering::emitFlushICache(SelectionDAG &DAG, SDValue InChain,
+ SDValue Start, SDValue End,
+ SDValue Flags, SDLoc DL) const {
+ TargetLowering::ArgListTy Args;
+ TargetLowering::ArgListEntry Entry;
+
+ // start
+ Entry.Node = Start;
+ Entry.Ty = PointerType::getUnqual(*DAG.getContext());
+ Args.push_back(Entry);
+
+ // end
+ Entry.Node = End;
+ Entry.Ty = PointerType::getUnqual(*DAG.getContext());
+ Args.push_back(Entry);
+
+ // flags
+ Entry.Node = Flags;
+ Entry.Ty = Type::getIntNTy(*DAG.getContext(), Subtarget.getXLen());
+ Args.push_back(Entry);
+
+ TargetLowering::CallLoweringInfo CLI(DAG);
+ EVT Ty = getPointerTy(DAG.getDataLayout());
+ CLI.setDebugLoc(DL).setChain(InChain).setLibCallee(
+ CallingConv::C, Type::getVoidTy(*DAG.getContext()),
+ DAG.getExternalSymbol("__riscv_flush_icache", Ty), std::move(Args));
+
+ std::pair<SDValue, SDValue> CallResult = LowerCallTo(CLI);
+
+ // This function returns void so only the out chain matters.
+ return CallResult.second;
+}
+
static SDValue getTargetNode(GlobalAddressSDNode *N, const SDLoc &DL, EVT Ty,
SelectionDAG &DAG, unsigned Flags) {
return DAG.getTargetGlobalAddress(N->getGlobal(), DL, Ty, 0, Flags);
@@ -9497,6 +9536,15 @@ SDValue RISCVTargetLowering::LowerINTRINSIC_VOID(SDValue Op,
return getVCIXISDNodeVOID(Op, DAG, RISCVISD::SF_VC_VVW_SE);
case Intrinsic::riscv_sf_vc_fvw_se:
return getVCIXISDNodeVOID(Op, DAG, RISCVISD::SF_VC_FVW_SE);
+ case Intrinsic::clear_cache: {
+ if (Subtarget.getTargetTriple().isOSGlibc()) {
+ SDLoc DL(Op);
+ SDValue Flags = DAG.getConstant(0, DL, Subtarget.getXLenVT());
+ return emitFlushICache(DAG, Op.getOperand(0), Op.getOperand(2),
+ Op.getOperand(3), Flags, DL);
+ }
+ break;
+ }
}
return lowerVectorIntrinsicScalars(Op, DAG, Subtarget);
@@ -21684,6 +21732,14 @@ SDValue RISCVTargetLowering::expandIndirectJTBranch(const SDLoc &dl,
return TargetLowering::expandIndirectJTBranch(dl, Value, Addr, JTI, DAG);
}
+bool RISCVTargetLowering::isClearCacheBuiltinTargetSpecific() const {
+ // We do a manual lowering for glibc-based targets to call
+ // __riscv_flush_icache instead.
+ if (Subtarget.getTargetTriple().isOSGlibc())
+ return true;
+ return TargetLowering::isClearCacheBuiltinTargetSpecific();
+}
+
namespace llvm::RISCVVIntrinsicsTable {
#define GET_RISCVVIntrinsicsTable_IMPL
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 856ce06ba1c4f..a5301a997bf6b 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -897,6 +897,8 @@ class RISCVTargetLowering : public TargetLowering {
const RISCVTargetLowering &TLI,
RVVArgDispatcher &RVVDispatcher);
+ bool isClearCacheBuiltinTargetSpecific() const override;
+
private:
void analyzeInputArgs(MachineFunction &MF, CCState &CCInfo,
const SmallVectorImpl<ISD::InputArg> &Ins, bool IsRet,
@@ -1033,6 +1035,9 @@ class RISCVTargetLowering : public TargetLowering {
const APInt &AndMask) const override;
unsigned getMinimumJumpTableEntries() const override;
+
+ SDValue emitFlushICache(SelectionDAG &DAG, SDValue InChain, SDValue Start,
+ SDValue End, SDValue Flags, SDLoc DL) const;
};
/// As per the spec, the rules for passing vector arguments are as follows:
diff --git a/llvm/test/CodeGen/RISCV/clear-cache.ll b/llvm/test/CodeGen/RISCV/clear-cache.ll
new file mode 100644
index 0000000000000..84db1eb0d3bda
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/clear-cache.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 < %s | FileCheck --check-prefix=RV32 %s
+; RUN: llc -mtriple=riscv64 < %s | FileCheck --check-prefix=RV64 %s
+; RUN: llc -mtriple=riscv32-unknown-linux-gnu < %s | FileCheck --check-prefix=RV32-GLIBC %s
+; RUN: llc -mtriple=riscv64-unknown-linux-gnu < %s | FileCheck --check-prefix=RV64-GLIBC %s
+
+declare void @llvm.clear_cache(ptr, ptr)
+
+define void @foo(ptr %a, ptr %b) nounwind {
+; RV32-LABEL: foo:
+; RV32: # %bb.0:
+; RV32-NEXT: addi sp, sp, -16
+; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-NEXT: call __clear_cache
+; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-NEXT: addi sp, sp, 16
+; RV32-NEXT: ret
+;
+; RV64-LABEL: foo:
+; RV64: # %bb.0:
+; RV64-NEXT: addi sp, sp, -16
+; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-NEXT: call __clear_cache
+; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-NEXT: addi sp, sp, 16
+; RV64-NEXT: ret
+;
+; RV32-GLIBC-LABEL: foo:
+; RV32-GLIBC: # %bb.0:
+; RV32-GLIBC-NEXT: addi sp, sp, -16
+; RV32-GLIBC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-GLIBC-NEXT: li a2, 0
+; RV32-GLIBC-NEXT: call __riscv_flush_icache
+; RV32-GLIBC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-GLIBC-NEXT: addi sp, sp, 16
+; RV32-GLIBC-NEXT: ret
+;
+; RV64-GLIBC-LABEL: foo:
+; RV64-GLIBC: # %bb.0:
+; RV64-GLIBC-NEXT: addi sp, sp, -16
+; RV64-GLIBC-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-GLIBC-NEXT: li a2, 0
+; RV64-GLIBC-NEXT: call __riscv_flush_icache
+; RV64-GLIBC-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-GLIBC-NEXT: addi sp, sp, 16
+; RV64-GLIBC-NEXT: ret
+ call void @llvm.clear_cache(ptr %a, ptr %b)
+ ret void
+}
Just FYI, we have implemented __clear_cache for RISC-V in compiler-rt (llvm-project/compiler-rt/lib/builtins/clear_cache.c, lines 184 to 194 in 5c7c1f6), so we can link it by specifying --rtlib=compiler-rt. For libgcc, __clear_cache does nothing. glibc's __riscv_flush_icache implementation just calls __NR_riscv_flush_icache (the same as compiler-rt's __clear_cache implementation), and there may be vDSO acceleration.
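For context, a minimal sketch of that underlying path, assuming a Linux RISC-V toolchain whose <sys/syscall.h> defines SYS_riscv_flush_icache (the helper name is illustrative and error handling is omitted):

// Both compiler-rt's __clear_cache and glibc's __riscv_flush_icache end up in
// the riscv_flush_icache system call mentioned above (glibc may additionally
// go through the vDSO).
#include <unistd.h>
#include <sys/syscall.h>

static void flush_icache_range(void *start, void *end) {
  // flags == 0 requests a flush visible to all threads of the process.
  syscall(SYS_riscv_flush_icache, start, end, 0UL);
}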
Force-pushed from 2a51c76 to 0661f8c.
Thanks for the update. Do you use …?

No, I don't use …
We can know it in Clang via the … Maybe another way to fix this issue is implementing …
Typically __clear_cache is used, but on these targets __riscv_flush_icache is used instead. Because it also has a different signature, we need custom lowering.
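For reference, the two entry points have different C signatures; the declarations below are a sketch based on the compiler-rt implementation and the glibc manual, and the flags value of 0 is what this patch passes:

// Generic lowering target: a two-pointer range, no flags.
extern "C" void __clear_cache(void *begin, void *end);

// glibc's RISC-V entry point: an extra flags argument; 0 (as passed by this
// patch) requests a flush visible to all threads of the process.
extern "C" int __riscv_flush_icache(void *begin, void *end, unsigned long flags);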
This allows us to make the code much simpler.
Force-pushed from 07c1fd2 to ceba0c1.
@kito-cheng @wangpc-pp I currently gated this on glibc. However, investigating further, this seems to be a Linux ABI-specific thing (https://docs.kernel.org/arch/riscv/cmodx.html) supported by both glibc and musl (https://git.musl-libc.org/cgit/musl/tree/src/linux/cache.c#n39). It looks to me like it makes more sense to gate this on just Linux rather than glibc (which would also include things like Hurd). Do you agree?
Yeah, makes sense to me. Maybe we should write it down in the psABI or elsewhere? @kito-cheng
I agree that this should be gated on Linux rather than glibc, but I am inclined not to include OS-specific stuff within the psABI if possible (although we already have documented …).
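For concreteness, a hypothetical follow-up along those lines (a sketch only, not part of this patch) would swap the triple check in RISCVISelLowering.cpp:

// Hypothetical: gate the custom lowering on Linux rather than on glibc, so
// musl-based Linux targets take the same __riscv_flush_icache path.
if (Subtarget.getTargetTriple().isOSLinux()) {
  // Custom lowering of llvm.clear_cache
  setOperationAction(ISD::INTRINSIC_VOID, MVT::Other, Custom);
}

The isOSGlibc() checks in LowerINTRINSIC_VOID and isClearCacheBuiltinTargetSpecific would change in the same way.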
LGTM :)
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/101/builds/349

Here is the relevant piece of the build log for reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/54/builds/153

Here is the relevant piece of the build log for reference:
Our Windows build also seems to be seeing failures similar to the ones above.

I'll take a look. Thanks for the heads up.

@dyung, if you can revert the change, that'd be helpful so I can investigate. Thanks!

Oddly, it no longer seems to be failing on the bot, and I don't think there was a revert. Perhaps it was some unrelated change? I'll try to verify today whether or not it was your change that caused the test failures.

Oh, good to know. Thanks a lot @dyung!
[RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets (llvm#93481)

This change is a preliminary step to support trampolines on RISC-V. Trampolines are used by flang to implement obtaining the address of an internal program (i.e., a nested function in Fortran parlance). In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache`, which is what GCC is currently doing for Linux targets.
We are working on Fortran applications on RISC-V Linux and some of them rely on passing a pointer to an internal (aka nested) function that uses the enclosing context (e.g., a variable of the enclosing function). Flang uses trampolines which rely on llvm.init.trampoline and llvm.adjust.trampoline. Those rely on having an executable stack.
This change is a preliminary step to support trampolines on RISC-V.
In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache` (https://www.gnu.org/software/libc/manual/html_node/RISC_002dV.html), which is what GCC is currently doing (https://www.godbolt.org/z/qsd1P3fYT).

I could not find where it is specified that RISC-V on glibc targets must call `__riscv_flush_icache`, so before landing this we may want to address this issue.
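For illustration, a minimal sketch of code that exercises this lowering; the function and buffer names are placeholders, not part of the patch:

// After patching code into an executable buffer, __builtin___clear_cache
// becomes the llvm.clear_cache intrinsic in IR. With this change, on a
// riscv*-unknown-linux-gnu (glibc) target it is lowered to a call to
// __riscv_flush_icache(buf, buf + size, 0) instead of __clear_cache.
#include <cstddef>

void make_code_visible(char *buf, std::size_t size) {
  __builtin___clear_cache(buf, buf + size);
}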