
[lld][LoongArch] Relax R_LARCH_{PCALA,GOT_PC}_{HI20,LO12} #123566

Merged (7 commits) on Feb 15, 2025

ylzsx (Contributor) commented Jan 20, 2025

Support relaxation optimization for two types of code sequences.

From:
   pcalau12i $a0, %pc_hi20(sym)
       R_LARCH_PCALA_HI20, R_LARCH_RELAX
   addi.w/d $a0, $a0, %pc_lo12(sym)
       R_LARCH_PCALA_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %pc_lo12(sym)
       R_LARCH_PCREL20_S2
    
From:
   pcalau12i $a0, %got_pc_hi20(sym_got)
       R_LARCH_GOT_PC_HI20, R_LARCH_RELAX
   ld.w/d $a0, $a0, %got_pc_lo12(sym_got)
       R_LARCH_GOT_PC_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %got_pc_hi20(sym_got)
       R_LARCH_PCREL20_S2

Others:

  • loongarch-relax-pc-hi20-lo12-got-symbols.s is inspired by aarch64-adrp-ldr-got-symbols.s.

Co-authored-by: Xin Wang [email protected]
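
A two-instruction sequence can only collapse into one `pcaddi` when the target is reachable: `pcaddi` adds a sign-extended 20-bit immediate shifted left by 2 to the PC, so the byte distance must be 4-byte aligned and fit in a signed 22-bit value. The following is a minimal standalone sketch of that condition and of the resulting encoding, not the code in the patch. The helper names `pcaddiImm` and `encodePcaddi` are hypothetical. The opcode value is the `PCADDI` constant the patch adds, and the field layout (rd in bits 4:0, immediate in bits 24:5) is assumed from the LoongArch instruction format.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of lld: returns the pcaddi immediate (a word
// offset) if `target` is reachable from `pc`, or std::nullopt otherwise.
std::optional<int32_t> pcaddiImm(uint64_t pc, uint64_t target) {
  const int64_t distance = static_cast<int64_t>(target - pc);
  // The byte distance must be 4-byte aligned and fit in a signed 22-bit value.
  const int64_t lo = -(int64_t{1} << 21);
  const int64_t hi = (int64_t{1} << 21) - 1;
  if ((distance & 0x3) != 0 || distance < lo || distance > hi)
    return std::nullopt;
  return static_cast<int32_t>(distance / 4);
}

// Hypothetical helper: builds a pcaddi word with rd in bits 4:0 and the
// 20-bit immediate in bits 24:5 (layout assumed from the ISA manual).
uint32_t encodePcaddi(uint32_t rd, int32_t imm20) {
  constexpr uint32_t PCADDI = 0x18000000; // same value as the patch's Op enum
  return PCADDI | ((static_cast<uint32_t>(imm20) & 0xfffff) << 5) | (rd & 0x1f);
}
```

For example, a reference from address 0x10020 back to 0x10000 has a distance of -32 bytes, which gives an immediate of -8; a target 2 MiB or more ahead of the PC is rejected and the original pair is kept.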

ylzsx added 5 commits January 20, 2025 10:08
Support relaxation optimization for two types of code sequences.
Similar to aarch64-adrp-ldr-got-symbols.s.
Dependency on #123017
llvmbot (Member) commented Jan 20, 2025

@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Zhaoxin Yang (ylzsx)

Patch is 22.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123566.diff

5 Files Affected:

  • (modified) lld/ELF/Arch/LoongArch.cpp (+112-2)
  • (modified) lld/test/ELF/loongarch-relax-align.s (+82-47)
  • (modified) lld/test/ELF/loongarch-relax-emit-relocs.s (+46-20)
  • (added) lld/test/ELF/loongarch-relax-pc-hi20-lo12-got-symbols.s (+90)
  • (added) lld/test/ELF/loongarch-relax-pc-hi20-lo12.s (+62)
diff --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp
index 3280c34cb6ed05..69d14a6e996dae 100644
--- a/lld/ELF/Arch/LoongArch.cpp
+++ b/lld/ELF/Arch/LoongArch.cpp
@@ -53,6 +53,7 @@ enum Op {
   ADDI_W = 0x02800000,
   ADDI_D = 0x02c00000,
   ANDI = 0x03400000,
+  PCADDI = 0x18000000,
   PCADDU12I = 0x1c000000,
   LD_W = 0x28800000,
   LD_D = 0x28c00000,
@@ -131,6 +132,8 @@ static uint32_t extractBits(uint64_t v, uint32_t begin, uint32_t end) {
   return begin == 63 ? v >> end : (v & ((1ULL << (begin + 1)) - 1)) >> end;
 }
 
+static uint32_t getD5(uint64_t v) { return extractBits(v, 4, 0); }
+
 static uint32_t setD5k16(uint32_t insn, uint32_t imm) {
   uint32_t immLo = extractBits(imm, 15, 0);
   uint32_t immHi = extractBits(imm, 20, 16);
@@ -743,6 +746,88 @@ void LoongArch::relocate(uint8_t *loc, const Relocation &rel,
   }
 }
 
+static bool relaxable(ArrayRef<Relocation> relocs, size_t i) {
+  return i + 1 < relocs.size() && relocs[i + 1].type == R_LARCH_RELAX;
+}
+
+static bool isPairRelaxable(ArrayRef<Relocation> relocs, size_t i) {
+  return relaxable(relocs, i) && relaxable(relocs, i + 2) &&
+         relocs[i].offset + 4 == relocs[i + 2].offset;
+}
+
+// Relax code sequence.
+// From:
+//   pcalau12i $a0, %pc_hi20(sym)
+//   addi.w/d $a0, $a0, %pc_lo12(sym)
+// To:
+//   pcaddi $a0, %pc_lo12(sym)
+//
+// From:
+//   pcalau12i $a0, %got_pc_hi20(sym_got)
+//   ld.w/d $a0, $a0, %got_pc_lo12(sym_got)
+// To:
+//   pcaddi $a0, %got_pc_hi20(sym_got)
+static void relaxPCHi20Lo12(Ctx &ctx, const InputSection &sec, size_t i,
+                            uint64_t loc, Relocation &rHi20, Relocation &rLo12,
+                            uint32_t &remove) {
+  // check if the relocations are relaxable sequences.
+  if (!((rHi20.type == R_LARCH_PCALA_HI20 &&
+         rLo12.type == R_LARCH_PCALA_LO12) ||
+        (rHi20.type == R_LARCH_GOT_PC_HI20 &&
+         rLo12.type == R_LARCH_GOT_PC_LO12)))
+    return;
+
+  // GOT references to absolute symbols can't be relaxed to use pcaddi in
+  // position-independent code, because these instructions produce a relative
+  // address.
+  // Meanwhile skip undefined, preemptible and STT_GNU_IFUNC symbols, because
+  // these symbols may be resolved at runtime.
+  if (rHi20.type == R_LARCH_GOT_PC_HI20 &&
+      (!rHi20.sym->isDefined() || rHi20.sym->isPreemptible ||
+       rHi20.sym->isGnuIFunc() ||
+       (ctx.arg.isPic && !cast<Defined>(*rHi20.sym).section)))
+    return;
+
+  uint64_t symBase = 0;
+  if (rHi20.expr == RE_LOONGARCH_PLT_PAGE_PC)
+    symBase = rHi20.sym->getPltVA(ctx);
+  else if (rHi20.expr == RE_LOONGARCH_PAGE_PC ||
+           rHi20.expr == RE_LOONGARCH_GOT_PAGE_PC)
+    symBase = rHi20.sym->getVA(ctx);
+  else {
+    Err(ctx) << getErrorLoc(ctx, (const uint8_t *)loc) << "unknown expr ("
+             << rHi20.expr << ") against symbol " << rHi20.sym
+             << " in relaxPCHi20Lo12";
+    return;
+  }
+  const uint64_t symLocal = symBase + rHi20.addend;
+
+  const int64_t distance = symLocal - loc;
+  // Check if the distance aligns 4 bytes or exceeds the range of pcaddi.
+  if ((distance & 0x3) != 0 || !isInt<22>(distance))
+    return;
+
+  // Note: If we can ensure that the .o files generated by LLVM only contain
+  // relaxable instruction sequences with R_LARCH_RELAX, then we do not need to
+  // decode instructions. The relaxable instruction sequences imply the
+  // following constraints:
+  // * For relocation pairs related to got_pc, the opcodes of instructions
+  // must be pcalau12i + ld.w/d. In other cases, the opcodes must be pcalau12i +
+  // addi.w/d.
+  // * The destination register of pcalau12i is guaranteed to be used only by
+  // the immediately following instruction.
+  const uint32_t currInsn = read32le(sec.content().data() + rHi20.offset);
+  const uint32_t nextInsn = read32le(sec.content().data() + rLo12.offset);
+  // Check that both instructions use the same register.
+  if (getD5(currInsn) != getJ5(nextInsn) || getJ5(nextInsn) != getD5(nextInsn))
+    return;
+
+  sec.relaxAux->relocTypes[i] = R_LARCH_RELAX;
+  sec.relaxAux->relocTypes[i + 2] = R_LARCH_PCREL20_S2;
+  sec.relaxAux->writes.push_back(insn(PCADDI, getD5(nextInsn), 0, 0));
+  remove = 4;
+}
+
 static bool relax(Ctx &ctx, InputSection &sec) {
   const uint64_t secAddr = sec.getVA();
   const MutableArrayRef<Relocation> relocs = sec.relocs();
@@ -781,6 +866,12 @@ static bool relax(Ctx &ctx, InputSection &sec) {
       }
       break;
     }
+    case R_LARCH_PCALA_HI20:
+    case R_LARCH_GOT_PC_HI20:
+      // The overflow check for i+2 will be carried out in isPairRelaxable.
+      if (isPairRelaxable(relocs, i))
+        relaxPCHi20Lo12(ctx, sec, i, loc, r, relocs[i + 2], remove);
+      break;
     }
 
     // For all anchors whose offsets are <= r.offset, they are preceded by
@@ -851,6 +942,7 @@ void LoongArch::finalizeRelax(int passes) const {
       MutableArrayRef<Relocation> rels = sec->relocs();
       ArrayRef<uint8_t> old = sec->content();
       size_t newSize = old.size() - aux.relocDeltas[rels.size() - 1];
+      size_t writesIdx = 0;
       uint8_t *p = ctx.bAlloc.Allocate<uint8_t>(newSize);
       uint64_t offset = 0;
       int64_t delta = 0;
@@ -867,11 +959,29 @@ void LoongArch::finalizeRelax(int passes) const {
           continue;
 
         // Copy from last location to the current relocated location.
-        const Relocation &r = rels[i];
+        Relocation &r = rels[i];
         uint64_t size = r.offset - offset;
         memcpy(p, old.data() + offset, size);
         p += size;
-        offset = r.offset + remove;
+
+        int64_t skip = 0;
+        if (RelType newType = aux.relocTypes[i]) {
+          switch (newType) {
+          case R_LARCH_RELAX:
+            break;
+          case R_LARCH_PCREL20_S2:
+            skip = 4;
+            write32le(p, aux.writes[writesIdx++]);
+            // RelExpr is needed for relocating.
+            r.expr = r.sym->hasFlag(NEEDS_PLT) ? R_PLT_PC : R_PC;
+            break;
+          default:
+            llvm_unreachable("unsupported type");
+          }
+        }
+
+        p += skip;
+        offset = r.offset + skip + remove;
       }
       memcpy(p, old.data() + offset, old.size() - offset);
 
diff --git a/lld/test/ELF/loongarch-relax-align.s b/lld/test/ELF/loongarch-relax-align.s
index ab61e15d5caca2..66a8ed3abf71e0 100644
--- a/lld/test/ELF/loongarch-relax-align.s
+++ b/lld/test/ELF/loongarch-relax-align.s
@@ -2,60 +2,95 @@
 
 # RUN: llvm-mc --filetype=obj --triple=loongarch32 --mattr=+relax %s -o %t.32.o
 # RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.64.o
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.32.o -o %t.32
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.64.o -o %t.64
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.32.o --no-relax -o %t.32n
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.64.o --no-relax -o %t.64n
-# RUN: llvm-objdump -td --no-show-raw-insn %t.32 | FileCheck %s
-# RUN: llvm-objdump -td --no-show-raw-insn %t.64 | FileCheck %s
-# RUN: llvm-objdump -td --no-show-raw-insn %t.32n | FileCheck %s
-# RUN: llvm-objdump -td --no-show-raw-insn %t.64n | FileCheck %s
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --relax %t.32.o -o %t.32
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --relax %t.64.o -o %t.64
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --no-relax %t.32.o -o %t.32n
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --no-relax %t.64.o -o %t.64n
+# RUN: llvm-objdump -td --no-show-raw-insn %t.32 | FileCheck --check-prefixes=RELAX,NOOLD %s
+# RUN: llvm-objdump -td --no-show-raw-insn %t.64 | FileCheck --check-prefixes=RELAX,NOOLD %s
+# RUN: llvm-objdump -td --no-show-raw-insn %t.32n | FileCheck --check-prefixes=NORELAX,NOOLD %s
+# RUN: llvm-objdump -td --no-show-raw-insn %t.64n | FileCheck --check-prefixes=NORELAX,NOOLD %s
 
 ## Test the R_LARCH_ALIGN without symbol index.
 # RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.o64.o --defsym=old=1
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.o64.o -o %t.o64
-# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 %t.o64.o --no-relax -o %t.o64n
-# RUN: llvm-objdump -td --no-show-raw-insn %t.o64 | FileCheck %s
-# RUN: llvm-objdump -td --no-show-raw-insn %t.o64n | FileCheck %s
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --relax %t.o64.o -o %t.o64
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.text2=0x20000 -e 0 --no-relax %t.o64.o -o %t.o64n
+# RUN: llvm-objdump -td --no-show-raw-insn %t.o64 | FileCheck --check-prefixes=RELAX,OLD %s
+# RUN: llvm-objdump -td --no-show-raw-insn %t.o64n | FileCheck --check-prefixes=NORELAX,OLD %s
 
 ## -r keeps section contents unchanged.
-# RUN: ld.lld -r %t.64.o -o %t.64.r
+# RUN: ld.lld -r --relax %t.64.o -o %t.64.r
 # RUN: llvm-objdump -dr --no-show-raw-insn %t.64.r | FileCheck %s --check-prefix=CHECKR
 
-# CHECK-DAG: {{0*}}10000 l .text  {{0*}}44 .Ltext_start
-# CHECK-DAG: {{0*}}10038 l .text  {{0*}}0c .L1
-# CHECK-DAG: {{0*}}10040 l .text  {{0*}}04 .L2
-# CHECK-DAG: {{0*}}20000 l .text2 {{0*}}14 .Ltext2_start
-
-# CHECK:      <.Ltext_start>:
-# CHECK-NEXT:   break 1
-# CHECK-NEXT:   break 2
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   break 3
-# CHECK-NEXT:   break 4
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   pcalau12i     $a0, 0
-# CHECK-NEXT:   addi.{{[dw]}} $a0, $a0, 0
-# CHECK-NEXT:   pcalau12i     $a0, 0
-# CHECK-NEXT:   addi.{{[dw]}} $a0, $a0, 56
-# CHECK-NEXT:   pcalau12i     $a0, 0
-# CHECK-NEXT:   addi.{{[dw]}} $a0, $a0, 64
-# CHECK-EMPTY:
-# CHECK-NEXT: <.L1>:
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   nop
-# CHECK-EMPTY:
-# CHECK-NEXT: <.L2>:
-# CHECK-NEXT:   break 5
-
-# CHECK:      <.Ltext2_start>:
-# CHECK-NEXT:   pcalau12i     $a0, 0
-# CHECK-NEXT:   addi.{{[dw]}} $a0, $a0, 0
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   break 6
+# NOOLD: {{0*}}10000 l .text  {{0*}}00 .Lalign_symbol
+# OLD: {{0*}}00001 l *ABS*  {{0*}}00 old
+
+# NORELAX-DAG: {{0*}}10000 l .text  {{0*}}44 .Ltext_start
+# NORELAX-DAG: {{0*}}10038 l .text  {{0*}}0c .L1
+# NORELAX-DAG: {{0*}}10040 l .text  {{0*}}04 .L2
+# NORELAX-DAG: {{0*}}20000 l .text2 {{0*}}14 .Ltext2_start
+
+# NORELAX:      <.Ltext_start>:
+# NORELAX-NEXT:   break 1
+# NORELAX-NEXT:   break 2
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   break 3
+# NORELAX-NEXT:   break 4
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   pcalau12i     $a0, 0
+# NORELAX-NEXT:   addi.{{[dw]}} $a0, $a0, 0
+# NORELAX-NEXT:   pcalau12i     $a0, 0
+# NORELAX-NEXT:   addi.{{[dw]}} $a0, $a0, 56
+# NORELAX-NEXT:   pcalau12i     $a0, 0
+# NORELAX-NEXT:   addi.{{[dw]}} $a0, $a0, 64
+# NORELAX-EMPTY:
+# NORELAX-NEXT: <.L1>:
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   nop
+# NORELAX-EMPTY:
+# NORELAX-NEXT: <.L2>:
+# NORELAX-NEXT:   break 5
+
+# NORELAX:      <.Ltext2_start>:
+# NORELAX-NEXT:   pcalau12i     $a0, 0
+# NORELAX-NEXT:   addi.{{[dw]}} $a0, $a0, 0
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   nop
+# NORELAX-NEXT:   break 6
+
+
+# RELAX-DAG: {{0*}}10000 l .text  {{0*}}34 .Ltext_start
+# RELAX-DAG: {{0*}}1002c l .text  {{0*}}08 .L1
+# RELAX-DAG: {{0*}}10030 l .text  {{0*}}04 .L2
+# RELAX-DAG: {{0*}}20000 l .text2 {{0*}}14 .Ltext2_start
+
+# RELAX:      <.Ltext_start>:
+# RELAX-NEXT:   break 1
+# RELAX-NEXT:   break 2
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   break 3
+# RELAX-NEXT:   break 4
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   pcaddi     $a0, -8
+# RELAX-NEXT:   pcaddi     $a0, 2
+# RELAX-NEXT:   pcaddi     $a0, 2
+# RELAX-EMPTY:
+# RELAX-NEXT: <.L1>:
+# RELAX-NEXT:   nop
+# RELAX-EMPTY:
+# RELAX-NEXT: <.L2>:
+# RELAX-NEXT:   break 5
+
+# RELAX:      <.Ltext2_start>:
+# RELAX-NEXT:   pcaddi     $a0, 0
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   break 6
 
 # CHECKR:      <.Ltext2_start>:
 # CHECKR-NEXT:   pcalau12i $a0, 0
diff --git a/lld/test/ELF/loongarch-relax-emit-relocs.s b/lld/test/ELF/loongarch-relax-emit-relocs.s
index ba414e8c93f0fb..a02cd272aba5bf 100644
--- a/lld/test/ELF/loongarch-relax-emit-relocs.s
+++ b/lld/test/ELF/loongarch-relax-emit-relocs.s
@@ -3,39 +3,64 @@
 
 # RUN: llvm-mc --filetype=obj --triple=loongarch32 --mattr=+relax %s -o %t.32.o
 # RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.64.o
-# RUN: ld.lld -Ttext=0x10000 --emit-relocs %t.32.o -o %t.32
-# RUN: ld.lld -Ttext=0x10000 --emit-relocs %t.64.o -o %t.64
-# RUN: llvm-objdump -dr %t.32 | FileCheck %s
-# RUN: llvm-objdump -dr %t.64 | FileCheck %s
+# RUN: ld.lld -Ttext=0x10000 -section-start=.got=0x20000 --emit-relocs --relax %t.32.o -o %t.32
+# RUN: ld.lld -Ttext=0x10000 -section-start=.got=0x20000 --emit-relocs --relax %t.64.o -o %t.64
+# RUN: llvm-objdump -dr %t.32 | FileCheck %s --check-prefix=RELAX
+# RUN: llvm-objdump -dr %t.64 | FileCheck %s --check-prefix=RELAX
 
 ## -r should keep original relocations.
-# RUN: ld.lld -r %t.64.o -o %t.64.r
+# RUN: ld.lld --relax -r %t.64.o -o %t.64.r
 # RUN: llvm-objdump -dr %t.64.r | FileCheck %s --check-prefix=CHECKR
 
 ## --no-relax should keep original relocations.
-## TODO Due to R_LARCH_RELAX is not relaxed, it plays same as --relax now.
-# RUN: ld.lld -Ttext=0x10000 --emit-relocs --no-relax %t.64.o -o %t.64.norelax
-# RUN: llvm-objdump -dr %t.64.norelax | FileCheck %s
+# RUN: ld.lld -Ttext=0x10000 -section-start=.got=0x20000 --emit-relocs --no-relax %t.64.o -o %t.64.norelax
+# RUN: llvm-objdump -dr %t.64.norelax | FileCheck %s --check-prefix=NORELAX
 
-# CHECK:      00010000 <_start>:
-# CHECK-NEXT:   pcalau12i $a0, 0
-# CHECK-NEXT:     R_LARCH_PCALA_HI20 _start
-# CHECK-NEXT:     R_LARCH_RELAX *ABS*
-# CHECK-NEXT:   addi.{{[dw]}} $a0, $a0, 0
-# CHECK-NEXT:     R_LARCH_PCALA_LO12 _start
-# CHECK-NEXT:     R_LARCH_RELAX *ABS*
-# CHECK-NEXT:   nop
-# CHECK-NEXT:     R_LARCH_ALIGN *ABS*+0xc
-# CHECK-NEXT:   nop
-# CHECK-NEXT:   ret
+# RELAX:      00010000 <_start>:
+# RELAX-NEXT:   pcaddi $a0, 0
+# RELAX-NEXT:     R_LARCH_RELAX _start
+# RELAX-NEXT:     R_LARCH_RELAX *ABS*
+# RELAX-NEXT:     R_LARCH_PCREL20_S2 _start
+# RELAX-NEXT:     R_LARCH_RELAX *ABS*
+# RELAX-NEXT:   pcaddi $a0, -1
+# RELAX-NEXT:     R_LARCH_RELAX _start
+# RELAX-NEXT:     R_LARCH_RELAX *ABS*
+# RELAX-NEXT:     R_LARCH_PCREL20_S2 _start
+# RELAX-NEXT:     R_LARCH_RELAX *ABS*
+# RELAX-NEXT:   nop
+# RELAX-NEXT:     R_LARCH_ALIGN *ABS*+0xc
+# RELAX-NEXT:   nop
+# RELAX-NEXT:   ret
+
+# NORELAX:      <_start>:
+# NORELAX-NEXT:   pcalau12i $a0, 0
+# NORELAX-NEXT:     R_LARCH_PCALA_HI20 _start
+# NORELAX-NEXT:     R_LARCH_RELAX *ABS*
+# NORELAX-NEXT:   addi.d    $a0, $a0, 0
+# NORELAX-NEXT:     R_LARCH_PCALA_LO12 _start
+# NORELAX-NEXT:     R_LARCH_RELAX *ABS*
+# NORELAX-NEXT:   pcalau12i $a0, 16
+# NORELAX-NEXT:     R_LARCH_GOT_PC_HI20 _start
+# NORELAX-NEXT:     R_LARCH_RELAX *ABS*
+# NORELAX-NEXT:   ld.d      $a0, $a0, 0
+# NORELAX-NEXT:     R_LARCH_GOT_PC_LO12 _start
+# NORELAX-NEXT:     R_LARCH_RELAX *ABS*
+# NORELAX-NEXT:   ret
+# NORELAX-NEXT:     R_LARCH_ALIGN *ABS*+0xc
 
 # CHECKR:      <_start>:
 # CHECKR-NEXT:   pcalau12i $a0, 0
 # CHECKR-NEXT:     R_LARCH_PCALA_HI20 _start
 # CHECKR-NEXT:     R_LARCH_RELAX *ABS*
-# CHECKR-NEXT:   addi.d $a0, $a0, 0
+# CHECKR-NEXT:   addi.d    $a0, $a0, 0
 # CHECKR-NEXT:     R_LARCH_PCALA_LO12 _start
 # CHECKR-NEXT:     R_LARCH_RELAX *ABS*
+# CHECKR-NEXT:   pcalau12i $a0, 0
+# CHECKR-NEXT:     R_LARCH_GOT_PC_HI20 _start
+# CHECKR-NEXT:     R_LARCH_RELAX *ABS*
+# CHECKR-NEXT:   ld.d      $a0, $a0, 0
+# CHECKR-NEXT:     R_LARCH_GOT_PC_LO12 _start
+# CHECKR-NEXT:     R_LARCH_RELAX *ABS*
 # CHECKR-NEXT:   nop
 # CHECKR-NEXT:     R_LARCH_ALIGN *ABS*+0xc
 # CHECKR-NEXT:   nop
@@ -45,5 +70,6 @@
 .global _start
 _start:
   la.pcrel $a0, _start
+  la.got   $a0, _start
   .p2align 4
   ret
diff --git a/lld/test/ELF/loongarch-relax-pc-hi20-lo12-got-symbols.s b/lld/test/ELF/loongarch-relax-pc-hi20-lo12-got-symbols.s
new file mode 100644
index 00000000000000..0a75d2289209c2
--- /dev/null
+++ b/lld/test/ELF/loongarch-relax-pc-hi20-lo12-got-symbols.s
@@ -0,0 +1,90 @@
+## This test verifies that the pair pcalau12i + ld.w/d is relaxed/not relaxed
+## depending on the target symbol properties.
+
+# REQUIRES: loongarch
+# RUN: rm -rf %t && split-file %s %t && cd %t
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 -mattr=+relax symbols.s -o symbols.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 -mattr=+relax symbols.s -o symbols.64.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 -mattr=+relax abs.s -o abs.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 -mattr=+relax abs.s -o abs.64.o
+
+# RUN: ld.lld --shared --relax -Tlinker.t symbols.32.o abs.32.o -o symbols.32.so
+# RUN: ld.lld --shared --relax -Tlinker.t symbols.64.o abs.64.o -o symbols.64.so
+# RUN: llvm-objdump -d --no-show-raw-insn symbols.32.so | FileCheck --check-prefixes=LIB %s
+# RUN: llvm-objdump -d --no-show-raw-insn symbols.64.so | FileCheck --check-prefixes=LIB %s
+
+# RUN: ld.lld --relax -Tlinker.t -z undefs symbols.32.o abs.32.o -o symbols.32
+# RUN: ld.lld --relax -Tlinker.t -z undefs symbols.64.o abs.64.o -o symbols.64
+# RUN: llvm-objdump -d --no-show-raw-insn symbols.32 | FileCheck --check-prefixes=EXE %s
+# RUN: llvm-objdump -d --no-show-raw-insn symbols.64 | FileCheck --check-prefixes=EXE %s
+
+
+## Symbol 'hidden_sym' is nonpreemptible, the relaxation should be applied.
+LIB:      pcaddi      $a0, {{[0-9]+}}
+## Symbol 'global_sym' is preemptible, no relaxations should be applied.
+LIB-NEXT: pcalau12i   $a1, 4
+LIB-NEXT: ld.{{[wd]}} $a1, $a1, {{[0-9]+}}
+## Symbol 'undefined_sym' is undefined, no relaxations should be applied.
+LIB-NEXT: pcalau12i   $a2, 4
+LIB-NEXT: ld.{{[wd]}} $a2, $a2, {{[0-9]+}}
+## Symbol 'ifunc_sym' is STT_GNU_IFUNC, no relaxations should be applied.
+LIB-NEXT: pcalau12i   $a3, 4
+LIB-NEXT: ld.{{[wd]}} $a3, $a3, {{[0-9]+}}
+## Symbol 'abs_sym' is absolute, no relaxations should be applied.
+LIB-NEXT: pcalau12i   $a4, 4
+LIB-NEXT: ld.{{[wd]}} $a4, $a4, {{[0-9]+}}
+
+
+## Symbol 'hidden_sym' is nonpreemptible, the relaxation should be applied.
+EXE:      pcaddi      $a0, {{[0-9]+}}
+## Symbol 'global_sym' is nonpreemptible, the relaxation should be applied.
+EXE-NEXT: pcaddi      $a1, {{[0-9]+}}
+## Symbol 'undefined_sym' is undefined, no relaxations should be applied.
+EXE-NEXT: pcalau12i   $a2, 4
+EXE-NEXT: ld.{{[wd]}} $a2, $a2, {{[0-9]+}}
+## Symbol 'ifunc_sym' is STT_GNU_IFUNC, no relaxations should be applied.
+EXE-NEXT: pcalau12i   $a3, 4
+EXE-NEXT: ld.{{[wd]}} $a3, $a3, {{[0-9]+}}
+## Symbol 'abs_sym' is absolute, relaxations may be applied in -no-pie mode.
+EXE-NEXT: pcaddi      $a4, -{{[0-9]+}}
+
+
+## The linker script ensures that .rodata and .text are near (>4M) so that
+## the pcalau12i+ld.w/d pair can be relaxed to pcaddi.
+#--- linker.t
+SECTIONS {
+ .text   0x10000: { *(.text) }
+ .rodata 0x14000: { *(.rodata) }
+}
+
+# This symbol is defined in a separate file to prevent the definition from
+# being folded into the instructions that reference it.
+#--- abs.s
+.global abs_sym
+.hidden abs_sym
+abs_sym = 0x1000
+
+#--- symbols.s
+.rodata
+.hidden hidden_sym
+hidden_sym:
+.word 10
+
+.global global_sym
+global_sym:
+.word 10
+
+.text
+.type ifunc_sym STT_GNU_IFUNC
+.hidden ifunc_sym
+ifunc_sym:
+  nop
+
+.global _start
+_start:
+  la.got    $a0, hidden_sym
+  la.got    $a1, global_sym
+  la.got    $a2, undefined_sym
+  la.got    $a3, ifunc_sym
+  la.got    $a4, abs_sym
diff --git a/lld/test/ELF/loongarch-relax-pc-hi20-lo12.s b/lld/test/ELF/loongarch-relax-pc-hi20-lo12.s
new file mode 100644
index 00000000000000..760fe77d774e30
--- /dev/null
+++ b/lld/test/ELF/loongarch-relax-pc-hi20-lo12.s
@@ -0,0 +1,62 @@
+# REQUIRES: loongarch
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 -mattr=+relax %s -o %t.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 -mattr=+relax %s -o %t.64.o
+
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.data=0x14000 --relax %t.32.o -o %t.32
+# RUN: ld.lld --section-start=.text=0x10000 --section-start=.data=0x14000 --relax %t.64.o -o %t....
[truncated]

wangleiat (Contributor) left a comment

LGTM, thanks.

MaskRay (Member) commented Feb 13, 2025

Since PCHi20Lo12 does not name a relocation (R_LARCH_PCHi20Lo12), just omit it from the title.

`Relax R_LARCH_{PCALA,GOT_PC}_{HI20,LO12}` is clear. If you want to make this more searchable (`git log --grep R_LARCH_PCALA_HI20`), you can mention the literal names of these relocations in the PR description.

ylzsx changed the title from "[lld][LoongArch] Relax PCHi20Lo12: R_LARCH_{PCALA,GOT_PC}_{HI20,LO12}" to "[lld][LoongArch] Relax R_LARCH_{PCALA,GOT_PC}_{HI20,LO12}" on Feb 14, 2025
ylzsx merged commit 6c54ab5 into main on Feb 15, 2025
8 checks passed
ylzsx deleted the users/ylzsx/r-pchi20lo12 branch on February 15, 2025 01:19
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025