Skip to content

[SDAG] Ensure load is included in output chain of sincos expansion #140525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

MacDue
Copy link
Member

@MacDue MacDue commented May 19, 2025

The load not being included in the chain meant that it could materialize after a @llvm.lifetime.end annotation on the pointer. This could result in miscompiles if the stack slot is reused for another value.

Fixes #140491

@MacDue MacDue requested a review from sdesmalen-arm May 19, 2025 10:30
@MacDue MacDue requested review from arsenm and RKSimon May 19, 2025 10:30
@llvmbot
Copy link
Member

llvmbot commented May 19, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-testing-tools

Author: Benjamin Maxwell (MacDue)

Changes

The load not being included in the chain meant that it could materialize after a @<!-- -->llvm.lifetime.end annotation on the pointer. This could result in miscompiles if the stack slot is reused for another value.

Fixes #140491


Full diff: https://github.com/llvm/llvm-project/pull/140525.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+6-3)
  • (added) llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll (+63)
  • (modified) llvm/utils/UpdateTestChecks/asm.py (+3-3)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 5d640c39a56d5..29808f15c0603 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -2625,16 +2625,19 @@ bool SelectionDAG::expandMultipleResultFPLibCall(
       continue;
     }
     MachinePointerInfo PtrInfo;
+    SDValue LoadResult =
+    getLoad(Node->getValueType(ResNo), DL, CallChain, ResultPtr, PtrInfo);
+    SDValue OutChain = LoadResult.getValue(1);
+
     if (StoreSDNode *ST = ResultStores[ResNo]) {
       // Replace store with the library call.
-      ReplaceAllUsesOfValueWith(SDValue(ST, 0), CallChain);
+      ReplaceAllUsesOfValueWith(SDValue(ST, 0), OutChain);
       PtrInfo = ST->getPointerInfo();
     } else {
       PtrInfo = MachinePointerInfo::getFixedStack(
           getMachineFunction(), cast<FrameIndexSDNode>(ResultPtr)->getIndex());
     }
-    SDValue LoadResult =
-        getLoad(Node->getValueType(ResNo), DL, CallChain, ResultPtr, PtrInfo);
+
     Results.push_back(LoadResult);
   }
 
diff --git a/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll b/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll
new file mode 100644
index 0000000000000..bb42f176987e2
--- /dev/null
+++ b/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll
@@ -0,0 +1,63 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --no_x86_scrub_sp --no_x86_scrub_mem_shuffle --version 5
+; RUN: llc < %s | FileCheck %s
+
+target triple = "x86_64-sie-ps5"
+
+declare hidden void @use_ptr(ptr readonly) local_unnamed_addr
+
+define hidden noundef i32 @sincos_stack_slot_with_lifetime(float %in) local_unnamed_addr {
+; CHECK-LABEL: sincos_stack_slot_with_lifetime:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    pushq %rbx
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    subq $32, %rsp
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    .cfi_offset %rbx, -16
+; CHECK-NEXT:    leaq 12(%rsp), %rdi
+; CHECK-NEXT:    leaq 8(%rsp), %rbx
+; CHECK-NEXT:    movq %rbx, %rsi
+; CHECK-NEXT:    callq sincosf@PLT
+; CHECK-NEXT:    movss 8(%rsp), %xmm0 # xmm0 = mem[0],zero,zero,zero
+; CHECK-NEXT:    movaps %xmm0, 16(%rsp) # 16-byte Spill
+; CHECK-NEXT:    movq %rbx, %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    movss 12(%rsp), %xmm0 # xmm0 = mem[0],zero,zero,zero
+; CHECK-NEXT:    xorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT:    movss %xmm0, 8(%rsp)
+; CHECK-NEXT:    leaq 8(%rsp), %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    movaps 16(%rsp), %xmm0 # 16-byte Reload
+; CHECK-NEXT:    xorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT:    movss %xmm0, 8(%rsp)
+; CHECK-NEXT:    leaq 8(%rsp), %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    xorl %eax, %eax
+; CHECK-NEXT:    addq $32, %rsp
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    popq %rbx
+; CHECK-NEXT:    .cfi_def_cfa_offset 8
+; CHECK-NEXT:    retq
+entry:
+  %computed = alloca float, align 4
+  %computed1 = alloca float, align 4
+  %computed3 = alloca float, align 4
+  %0 = tail call { float, float } @llvm.sincos.f32(float %in)
+  %1 = extractvalue { float, float } %0, 0
+  %2 = extractvalue { float, float } %0, 1
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed)
+  store float %2, ptr %computed, align 4
+  call void @use_ptr(ptr nonnull %computed)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed)
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed1)
+  %fneg = fneg float %1
+  store float %fneg, ptr %computed1, align 4
+  call void @use_ptr(ptr nonnull %computed1)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed1)
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed3)
+  %fneg4 = fneg float %2
+  store float %fneg4, ptr %computed3, align 4
+  call void @use_ptr(ptr nonnull %computed3)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed3)
+  ret i32 0
+}
+
diff --git a/llvm/utils/UpdateTestChecks/asm.py b/llvm/utils/UpdateTestChecks/asm.py
index 7d4fb7d8e1504..e7097fdf7c9d1 100644
--- a/llvm/utils/UpdateTestChecks/asm.py
+++ b/llvm/utils/UpdateTestChecks/asm.py
@@ -294,10 +294,10 @@ def scrub_asm_x86(asm, args):
     else:
         asm = SCRUB_X86_SHUFFLES_RE.sub(r"\1 {{.*#+}} \2", asm)
 
-    # Detect stack spills and reloads and hide their exact offset and whether
-    # they used the stack pointer or frame pointer.
-    asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r"{{[-0-9]+}}(%\1{{[sb]}}p)\2", asm)
     if getattr(args, "x86_scrub_sp", True):
+        # Detect stack spills and reloads and hide their exact offset and whether
+        # they used the stack pointer or frame pointer.
+        asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r"{{[-0-9]+}}(%\1{{[sb]}}p)\2", asm)
         # Generically match the stack offset of a memory operand.
         asm = SCRUB_X86_SP_RE.sub(r"{{[0-9]+}}(%\1)", asm)
     if getattr(args, "x86_scrub_rip", False):

@llvmbot
Copy link
Member

llvmbot commented May 19, 2025

@llvm/pr-subscribers-backend-x86

Author: Benjamin Maxwell (MacDue)

Changes

The load not being included in the chain meant that it could materialize after a @<!-- -->llvm.lifetime.end annotation on the pointer. This could result in miscompiles if the stack slot is reused for another value.

Fixes #140491


Full diff: https://github.com/llvm/llvm-project/pull/140525.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+6-3)
  • (added) llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll (+63)
  • (modified) llvm/utils/UpdateTestChecks/asm.py (+3-3)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 5d640c39a56d5..29808f15c0603 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -2625,16 +2625,19 @@ bool SelectionDAG::expandMultipleResultFPLibCall(
       continue;
     }
     MachinePointerInfo PtrInfo;
+    SDValue LoadResult =
+    getLoad(Node->getValueType(ResNo), DL, CallChain, ResultPtr, PtrInfo);
+    SDValue OutChain = LoadResult.getValue(1);
+
     if (StoreSDNode *ST = ResultStores[ResNo]) {
       // Replace store with the library call.
-      ReplaceAllUsesOfValueWith(SDValue(ST, 0), CallChain);
+      ReplaceAllUsesOfValueWith(SDValue(ST, 0), OutChain);
       PtrInfo = ST->getPointerInfo();
     } else {
       PtrInfo = MachinePointerInfo::getFixedStack(
           getMachineFunction(), cast<FrameIndexSDNode>(ResultPtr)->getIndex());
     }
-    SDValue LoadResult =
-        getLoad(Node->getValueType(ResNo), DL, CallChain, ResultPtr, PtrInfo);
+
     Results.push_back(LoadResult);
   }
 
diff --git a/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll b/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll
new file mode 100644
index 0000000000000..bb42f176987e2
--- /dev/null
+++ b/llvm/test/CodeGen/X86/sincos-lifetimes-issue-140491.ll
@@ -0,0 +1,63 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --no_x86_scrub_sp --no_x86_scrub_mem_shuffle --version 5
+; RUN: llc < %s | FileCheck %s
+
+target triple = "x86_64-sie-ps5"
+
+declare hidden void @use_ptr(ptr readonly) local_unnamed_addr
+
+define hidden noundef i32 @sincos_stack_slot_with_lifetime(float %in) local_unnamed_addr {
+; CHECK-LABEL: sincos_stack_slot_with_lifetime:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    pushq %rbx
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    subq $32, %rsp
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    .cfi_offset %rbx, -16
+; CHECK-NEXT:    leaq 12(%rsp), %rdi
+; CHECK-NEXT:    leaq 8(%rsp), %rbx
+; CHECK-NEXT:    movq %rbx, %rsi
+; CHECK-NEXT:    callq sincosf@PLT
+; CHECK-NEXT:    movss 8(%rsp), %xmm0 # xmm0 = mem[0],zero,zero,zero
+; CHECK-NEXT:    movaps %xmm0, 16(%rsp) # 16-byte Spill
+; CHECK-NEXT:    movq %rbx, %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    movss 12(%rsp), %xmm0 # xmm0 = mem[0],zero,zero,zero
+; CHECK-NEXT:    xorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT:    movss %xmm0, 8(%rsp)
+; CHECK-NEXT:    leaq 8(%rsp), %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    movaps 16(%rsp), %xmm0 # 16-byte Reload
+; CHECK-NEXT:    xorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT:    movss %xmm0, 8(%rsp)
+; CHECK-NEXT:    leaq 8(%rsp), %rdi
+; CHECK-NEXT:    callq use_ptr
+; CHECK-NEXT:    xorl %eax, %eax
+; CHECK-NEXT:    addq $32, %rsp
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    popq %rbx
+; CHECK-NEXT:    .cfi_def_cfa_offset 8
+; CHECK-NEXT:    retq
+entry:
+  %computed = alloca float, align 4
+  %computed1 = alloca float, align 4
+  %computed3 = alloca float, align 4
+  %0 = tail call { float, float } @llvm.sincos.f32(float %in)
+  %1 = extractvalue { float, float } %0, 0
+  %2 = extractvalue { float, float } %0, 1
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed)
+  store float %2, ptr %computed, align 4
+  call void @use_ptr(ptr nonnull %computed)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed)
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed1)
+  %fneg = fneg float %1
+  store float %fneg, ptr %computed1, align 4
+  call void @use_ptr(ptr nonnull %computed1)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed1)
+  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %computed3)
+  %fneg4 = fneg float %2
+  store float %fneg4, ptr %computed3, align 4
+  call void @use_ptr(ptr nonnull %computed3)
+  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %computed3)
+  ret i32 0
+}
+
diff --git a/llvm/utils/UpdateTestChecks/asm.py b/llvm/utils/UpdateTestChecks/asm.py
index 7d4fb7d8e1504..e7097fdf7c9d1 100644
--- a/llvm/utils/UpdateTestChecks/asm.py
+++ b/llvm/utils/UpdateTestChecks/asm.py
@@ -294,10 +294,10 @@ def scrub_asm_x86(asm, args):
     else:
         asm = SCRUB_X86_SHUFFLES_RE.sub(r"\1 {{.*#+}} \2", asm)
 
-    # Detect stack spills and reloads and hide their exact offset and whether
-    # they used the stack pointer or frame pointer.
-    asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r"{{[-0-9]+}}(%\1{{[sb]}}p)\2", asm)
     if getattr(args, "x86_scrub_sp", True):
+        # Detect stack spills and reloads and hide their exact offset and whether
+        # they used the stack pointer or frame pointer.
+        asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r"{{[-0-9]+}}(%\1{{[sb]}}p)\2", asm)
         # Generically match the stack offset of a memory operand.
         asm = SCRUB_X86_SP_RE.sub(r"{{[0-9]+}}(%\1)", asm)
     if getattr(args, "x86_scrub_rip", False):

Copy link

github-actions bot commented May 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

The load not being included in the chain meant that it could materialize
after a `@llvm.lifetime.end` annotation on the pointer. This could
result in miscompiles if the stack slot is reused for another value.

Fixes llvm#140491
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on @wjristow to confirm this fixes the downstream issue

@MacDue
Copy link
Member Author

MacDue commented May 19, 2025

(CI issue unrelated -- something about the llvm-config.h precompiled header)

@wjristow
Copy link
Collaborator

Waiting on @wjristow to confirm this fixes the downstream issue

I confirm it fixes our downstream issue.

Thanks for the quick response @MacDue!

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - cheers

@@ -0,0 +1,62 @@
; RUN: llc < %s | FileCheck %s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment what this shows

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect code-gen with stack-slot sharing for sincos results
5 participants