[RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets #93481
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-risc-v

Author: Roger Ferrer Ibáñez (rofirrim)

Changes

We are working on Fortran applications on RISC-V Linux and some of them rely on passing a pointer to an internal (aka nested) function that uses the enclosing context (e.g., a variable of the enclosing function). Flang uses trampolines, which rely on `llvm.init.trampoline` and `llvm.adjust.trampoline`. Those in turn rely on having an executable stack.

This change is a preliminary step to support trampolines on RISC-V. In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache` (https://www.gnu.org/software/libc/manual/html_node/RISC_002dV.html), which is what GCC is currently doing (https://www.godbolt.org/z/qsd1P3fYT).

I could not find where it is specified that RISC-V on glibc targets must call `__riscv_flush_icache`, so before landing this we may want to address this issue.

Full diff: https://github.com/llvm/llvm-project/pull/93481.diff

5 Files Affected:
- llvm/include/llvm/CodeGen/TargetLowering.h
- llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
- llvm/lib/Target/RISCV/RISCVISelLowering.cpp
- llvm/lib/Target/RISCV/RISCVISelLowering.h
- llvm/test/CodeGen/RISCV/clear-cache.ll
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 50a8c7eb75af5..3f052449b0a6f 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4764,8 +4764,15 @@ class TargetLowering : public TargetLoweringBase {
return false;
}
+ /// Returns true if the target needs to lower __builtin___clear_cache in a
+ /// specific way that is incompatible with the clear_cache
+ /// signature. When returning false, the lowering will invoke
+ /// getClearCacheBuiltinName.
+ virtual bool isClearCacheBuiltinTargetSpecific() const { return false; }
+
/// Return the builtin name for the __builtin___clear_cache intrinsic
- /// Default is to invoke the clear cache library call
+ /// This is only used if isClearCacheBuiltinTargetSpecific returns false.
+ /// If nullptr is returned, the builtin is lowered to no code.
virtual const char * getClearCacheBuiltinName() const {
return "__clear_cache";
}
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index ca352da5d36eb..320ec0cdfffb6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -7518,8 +7518,13 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
return;
case Intrinsic::clear_cache:
/// FunctionName may be null.
- if (const char *FunctionName = TLI.getClearCacheBuiltinName())
- lowerCallToExternalSymbol(I, FunctionName);
+ if (!TLI.isClearCacheBuiltinTargetSpecific()) {
+ if (const char *FunctionName = TLI.getClearCacheBuiltinName())
+ lowerCallToExternalSymbol(I, FunctionName);
+ } else {
+ // Turn this into a target intrinsic node.
+ visitTargetIntrinsic(I, Intrinsic);
+ }
return;
case Intrinsic::donothing:
case Intrinsic::seh_try_begin:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f0e5a7d393b6c..0e591034698b6 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -662,6 +662,12 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
setBooleanContents(ZeroOrOneBooleanContent);
+ if (Subtarget.getTargetTriple().isOSGlibc()) {
+ // Custom lowering of llvm.clear_cache
+ setOperationAction({ISD::INTRINSIC_VOID, ISD::INTRINSIC_VOID}, MVT::Other,
+ Custom);
+ }
+
if (Subtarget.hasVInstructions()) {
setBooleanVectorContents(ZeroOrOneBooleanContent);
@@ -7120,6 +7126,39 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
}
}
+SDValue RISCVTargetLowering::emitFlushICache(SelectionDAG &DAG, SDValue InChain,
+ SDValue Start, SDValue End,
+ SDValue Flags, SDLoc DL) const {
+ TargetLowering::ArgListTy Args;
+ TargetLowering::ArgListEntry Entry;
+
+ // start
+ Entry.Node = Start;
+ Entry.Ty = PointerType::getUnqual(*DAG.getContext());
+ Args.push_back(Entry);
+
+ // end
+ Entry.Node = End;
+ Entry.Ty = PointerType::getUnqual(*DAG.getContext());
+ Args.push_back(Entry);
+
+ // flags
+ Entry.Node = Flags;
+ Entry.Ty = Type::getIntNTy(*DAG.getContext(), Subtarget.getXLen());
+ Args.push_back(Entry);
+
+ TargetLowering::CallLoweringInfo CLI(DAG);
+ EVT Ty = getPointerTy(DAG.getDataLayout());
+ CLI.setDebugLoc(DL).setChain(InChain).setLibCallee(
+ CallingConv::C, Type::getVoidTy(*DAG.getContext()),
+ DAG.getExternalSymbol("__riscv_flush_icache", Ty), std::move(Args));
+
+ std::pair<SDValue, SDValue> CallResult = LowerCallTo(CLI);
+
+ // This function returns void so only the out chain matters.
+ return CallResult.second;
+}
+
static SDValue getTargetNode(GlobalAddressSDNode *N, const SDLoc &DL, EVT Ty,
SelectionDAG &DAG, unsigned Flags) {
return DAG.getTargetGlobalAddress(N->getGlobal(), DL, Ty, 0, Flags);
@@ -9497,6 +9536,15 @@ SDValue RISCVTargetLowering::LowerINTRINSIC_VOID(SDValue Op,
return getVCIXISDNodeVOID(Op, DAG, RISCVISD::SF_VC_VVW_SE);
case Intrinsic::riscv_sf_vc_fvw_se:
return getVCIXISDNodeVOID(Op, DAG, RISCVISD::SF_VC_FVW_SE);
+ case Intrinsic::clear_cache: {
+ if (Subtarget.getTargetTriple().isOSGlibc()) {
+ SDLoc DL(Op);
+ SDValue Flags = DAG.getConstant(0, DL, Subtarget.getXLenVT());
+ return emitFlushICache(DAG, Op.getOperand(0), Op.getOperand(2),
+ Op.getOperand(3), Flags, DL);
+ }
+ break;
+ }
}
return lowerVectorIntrinsicScalars(Op, DAG, Subtarget);
@@ -21684,6 +21732,14 @@ SDValue RISCVTargetLowering::expandIndirectJTBranch(const SDLoc &dl,
return TargetLowering::expandIndirectJTBranch(dl, Value, Addr, JTI, DAG);
}
+bool RISCVTargetLowering::isClearCacheBuiltinTargetSpecific() const {
+ // We do a manual lowering for glibc-based targets to call
+ // __riscv_flush_icache instead.
+ if (Subtarget.getTargetTriple().isOSGlibc())
+ return true;
+ return TargetLowering::isClearCacheBuiltinTargetSpecific();
+}
+
namespace llvm::RISCVVIntrinsicsTable {
#define GET_RISCVVIntrinsicsTable_IMPL
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 856ce06ba1c4f..a5301a997bf6b 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -897,6 +897,8 @@ class RISCVTargetLowering : public TargetLowering {
const RISCVTargetLowering &TLI,
RVVArgDispatcher &RVVDispatcher);
+ bool isClearCacheBuiltinTargetSpecific() const override;
+
private:
void analyzeInputArgs(MachineFunction &MF, CCState &CCInfo,
const SmallVectorImpl<ISD::InputArg> &Ins, bool IsRet,
@@ -1033,6 +1035,9 @@ class RISCVTargetLowering : public TargetLowering {
const APInt &AndMask) const override;
unsigned getMinimumJumpTableEntries() const override;
+
+ SDValue emitFlushICache(SelectionDAG &DAG, SDValue InChain, SDValue Start,
+ SDValue End, SDValue Flags, SDLoc DL) const;
};
/// As per the spec, the rules for passing vector arguments are as follows:
diff --git a/llvm/test/CodeGen/RISCV/clear-cache.ll b/llvm/test/CodeGen/RISCV/clear-cache.ll
new file mode 100644
index 0000000000000..84db1eb0d3bda
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/clear-cache.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 < %s | FileCheck --check-prefix=RV32 %s
+; RUN: llc -mtriple=riscv64 < %s | FileCheck --check-prefix=RV64 %s
+; RUN: llc -mtriple=riscv32-unknown-linux-gnu < %s | FileCheck --check-prefix=RV32-GLIBC %s
+; RUN: llc -mtriple=riscv64-unknown-linux-gnu < %s | FileCheck --check-prefix=RV64-GLIBC %s
+
+declare void @llvm.clear_cache(ptr, ptr)
+
+define void @foo(ptr %a, ptr %b) nounwind {
+; RV32-LABEL: foo:
+; RV32: # %bb.0:
+; RV32-NEXT: addi sp, sp, -16
+; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-NEXT: call __clear_cache
+; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-NEXT: addi sp, sp, 16
+; RV32-NEXT: ret
+;
+; RV64-LABEL: foo:
+; RV64: # %bb.0:
+; RV64-NEXT: addi sp, sp, -16
+; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-NEXT: call __clear_cache
+; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-NEXT: addi sp, sp, 16
+; RV64-NEXT: ret
+;
+; RV32-GLIBC-LABEL: foo:
+; RV32-GLIBC: # %bb.0:
+; RV32-GLIBC-NEXT: addi sp, sp, -16
+; RV32-GLIBC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32-GLIBC-NEXT: li a2, 0
+; RV32-GLIBC-NEXT: call __riscv_flush_icache
+; RV32-GLIBC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32-GLIBC-NEXT: addi sp, sp, 16
+; RV32-GLIBC-NEXT: ret
+;
+; RV64-GLIBC-LABEL: foo:
+; RV64-GLIBC: # %bb.0:
+; RV64-GLIBC-NEXT: addi sp, sp, -16
+; RV64-GLIBC-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64-GLIBC-NEXT: li a2, 0
+; RV64-GLIBC-NEXT: call __riscv_flush_icache
+; RV64-GLIBC-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64-GLIBC-NEXT: addi sp, sp, 16
+; RV64-GLIBC-NEXT: ret
+ call void @llvm.clear_cache(ptr %a, ptr %b)
+ ret void
+}
Just FYI, we have implemented __clear_cache for RISC-V in compiler-rt (llvm-project/compiler-rt/lib/builtins/clear_cache.c, lines 184 to 194 in 5c7c1f6), so we can link it by specifying --rtlib=compiler-rt. For libgcc, __clear_cache does nothing. glibc's __riscv_flush_icache implementation just calls __NR_riscv_flush_icache (the same as compiler-rt's __clear_cache implementation), and there may be vDSO acceleration.
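For context, a minimal sketch of that underlying path, assuming a Linux RISC-V toolchain whose <sys/syscall.h> defines SYS_riscv_flush_icache (the helper name is illustrative and error handling is omitted):

// Both compiler-rt's __clear_cache and glibc's __riscv_flush_icache end up in
// the riscv_flush_icache system call mentioned above (glibc may additionally
// go through the vDSO).
#include <unistd.h>
#include <sys/syscall.h>

static void flush_icache_range(void *start, void *end) {
  // flags == 0 requests a flush visible to all threads of the process.
  syscall(SYS_riscv_flush_icache, start, end, 0UL);
}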
Force-pushed from 2a51c76 to 0661f8c.
Thanks for the update. Do you use …?

No, I don't use …
We can know it in Clang via the … Maybe another way to fix this issue is implementing …
Typically __clear_cache is used, but on these targets __riscv_flush_icache is used instead. Because it also has a different signature, we need custom lowering.
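For reference, the two entry points have different C signatures; the declarations below are a sketch based on the compiler-rt implementation and the glibc manual, and the flags value of 0 is what this patch passes:

// Generic lowering target: a two-pointer range, no flags.
extern "C" void __clear_cache(void *begin, void *end);

// glibc's RISC-V entry point: an extra flags argument; 0 (as passed by this
// patch) requests a flush visible to all threads of the process.
extern "C" int __riscv_flush_icache(void *begin, void *end, unsigned long flags);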
This allows us to make the code much simpler.
Force-pushed from 07c1fd2 to ceba0c1.
@kito-cheng @wangpc-pp I currently gated this on glibc. However, investigating further, this seems to be a Linux ABI-specific thing (https://docs.kernel.org/arch/riscv/cmodx.html) supported by both glibc and musl (https://git.musl-libc.org/cgit/musl/tree/src/linux/cache.c#n39). It looks to me like it makes more sense to gate this on just Linux rather than glibc (which would also include things like Hurd). Do you agree?
Yeah, makes sense to me. Maybe we should write it down in the psABI or elsewhere? @kito-cheng
I agree that this should be gated on Linux rather than glibc, but I am inclined not to include OS-specific stuff within the psABI if possible (although we already have documented …).
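For concreteness, a hypothetical follow-up along those lines (a sketch only, not part of this patch) would swap the triple check in RISCVISelLowering.cpp:

// Hypothetical: gate the custom lowering on Linux rather than on glibc, so
// musl-based Linux targets take the same __riscv_flush_icache path.
if (Subtarget.getTargetTriple().isOSLinux()) {
  // Custom lowering of llvm.clear_cache
  setOperationAction(ISD::INTRINSIC_VOID, MVT::Other, Custom);
}

The isOSGlibc() checks in LowerINTRINSIC_VOID and isClearCacheBuiltinTargetSpecific would change in the same way.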
LGTM :)
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/101/builds/349

Here is the relevant piece of the build log for reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/54/builds/153

Here is the relevant piece of the build log for reference:
Our Windows build also seems to be seeing failures similar to the ones above.

I'll take a look. Thanks for the heads up.

@dyung, if you can revert the change, that'd be helpful so I can investigate. Thanks!

Oddly, it no longer seems to be failing on the bot, and I don't think there was a revert. Perhaps it was some unrelated change? I'll try to verify today whether or not it was your change that caused the test failures.

Oh, good to know. Thanks a lot @dyung!
[RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets (llvm#93481)

This change is a preliminary step to support trampolines on RISC-V. Trampolines are used by flang to implement obtaining the address of an internal program (i.e., a nested function in Fortran parlance). In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache`, which is what GCC is currently doing for Linux targets.
We are working on Fortran applications on RISC-V Linux and some of them rely on passing a pointer to an internal (aka nested) function that uses the enclosing context (e.g., a variable of the enclosing function). Flang uses trampolines which rely on llvm.init.trampoline and llvm.adjust.trampoline. Those rely on having an executable stack.
This change is a preliminary step to support trampolines on RISC-V.
In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache` (https://www.gnu.org/software/libc/manual/html_node/RISC_002dV.html), which is what GCC is currently doing (https://www.godbolt.org/z/qsd1P3fYT).

I could not find where it is specified that RISC-V on glibc targets must call `__riscv_flush_icache`, so before landing this we may want to address this issue.
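For illustration, a minimal sketch of code that exercises this lowering; the function and buffer names are placeholders, not part of the patch:

// After patching code into an executable buffer, __builtin___clear_cache
// becomes the llvm.clear_cache intrinsic in IR. With this change, on a
// riscv*-unknown-linux-gnu (glibc) target it is lowered to a call to
// __riscv_flush_icache(buf, buf + size, 0) instead of __clear_cache.
#include <cstddef>

void make_code_visible(char *buf, std::size_t size) {
  __builtin___clear_cache(buf, buf + size);
}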