Skip to content

release/20.x: [X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109) #134331

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

llvmbot
Copy link
Member

@llvmbot llvmbot commented Apr 4, 2025

Backport 9e0ca57

Requested by: @phoebewang

@llvmbot
Copy link
Member Author

llvmbot commented Apr 4, 2025

@phoebewang What do you think about merging this PR to the release branch?

@llvmbot
Copy link
Member Author

llvmbot commented Apr 4, 2025

@llvm/pr-subscribers-backend-x86

Author: None (llvmbot)

Changes

Backport 9e0ca57

Requested by: @phoebewang


Full diff: https://github.com/llvm/llvm-project/pull/134331.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ExpandPseudo.cpp (+12-2)
  • (modified) llvm/test/CodeGen/X86/base-pointer-and-cmpxchg.ll (+34)
diff --git a/llvm/lib/Target/X86/X86ExpandPseudo.cpp b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
index 78db8413e62c9..813f0f2542fa8 100644
--- a/llvm/lib/Target/X86/X86ExpandPseudo.cpp
+++ b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
@@ -439,8 +439,18 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
     TII->copyPhysReg(MBB, MBBI, DL, X86::RBX, InArg.getReg(), false);
     // Create the actual instruction.
     MachineInstr *NewInstr = BuildMI(MBB, MBBI, DL, TII->get(X86::LCMPXCHG16B));
-    // Copy the operands related to the address.
-    for (unsigned Idx = 1; Idx < 6; ++Idx)
+    // Copy the operands related to the address. If we access a frame variable,
+    // we need to replace the RBX base with SaveRbx, as RBX has another value.
+    const MachineOperand &Base = MBBI->getOperand(1);
+    if (Base.getReg() == X86::RBX || Base.getReg() == X86::EBX)
+      NewInstr->addOperand(MachineOperand::CreateReg(
+          Base.getReg() == X86::RBX
+              ? SaveRbx
+              : Register(TRI->getSubReg(SaveRbx, X86::sub_32bit)),
+          /*IsDef=*/false));
+    else
+      NewInstr->addOperand(Base);
+    for (unsigned Idx = 1 + 1; Idx < 1 + X86::AddrNumOperands; ++Idx)
       NewInstr->addOperand(MBBI->getOperand(Idx));
     // Finally, restore the value of RBX.
     TII->copyPhysReg(MBB, MBBI, DL, X86::RBX, SaveRbx,
diff --git a/llvm/test/CodeGen/X86/base-pointer-and-cmpxchg.ll b/llvm/test/CodeGen/X86/base-pointer-and-cmpxchg.ll
index 498be7c9e1144..5e8da5818fe97 100644
--- a/llvm/test/CodeGen/X86/base-pointer-and-cmpxchg.ll
+++ b/llvm/test/CodeGen/X86/base-pointer-and-cmpxchg.ll
@@ -49,5 +49,39 @@ tail call void asm sideeffect "nop", "~{rax},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},
   store i32 %n, ptr %idx
   ret i1 %res
 }
+
+; If we compare-and-exchange a frame variable, we additionally need to rewrite
+; the memory operand to use the SAVE_rbx instead of rbx, which already contains
+; the input operand.
+;
+; CHECK-LABEL: cmp_and_swap16_frame:
+; Check that we actually use rbx.
+; gnux32 use the 32bit variant of the registers.
+; USE_BASE_64: movq %rsp, %rbx
+; USE_BASE_32: movl %esp, %ebx
+; Here we drop the inline assembly because the frame pointer is used anyway. So
+; rbx is not spilled to the stack but goes into a (hopefully numbered) register.
+; USE_BASE: movq %rbx, [[SAVE_rbx:%r[0-9]+]]
+;
+; USE_BASE: movq {{[^ ]+}}, %rbx
+; The use of the frame variable expands to N(%rbx) or N(%ebx). But we've just
+; overwritten that with the input operand. We need to use SAVE_rbx instead.
+; USE_BASE_64-NEXT: cmpxchg16b {{[0-9]*}}([[SAVE_rbx]])
+; USE_BASE_32-NEXT: cmpxchg16b {{[0-9]*}}([[SAVE_rbx]]d)
+; USE_BASE-NEXT: movq [[SAVE_rbx]], %rbx
+;
+; DONT_USE_BASE-NOT: movq %rsp, %rbx
+; DONT_USE_BASE-NOT: movl %esp, %ebx
+; DONT_USE_BASE: cmpxchg
+define i1 @cmp_and_swap16_frame(i128 %a, i128 %b, i32 %n) {
+  %local = alloca i128, align 16
+  %dummy = alloca i32, i32 %n
+  %cmp = cmpxchg ptr %local, i128 %a, i128 %b seq_cst seq_cst
+  %res = extractvalue { i128, i1 } %cmp, 1
+  %idx = getelementptr i32, ptr %dummy, i32 5
+  store i32 %n, ptr %idx
+  ret i1 %res
+}
+
 !llvm.module.flags = !{!0}
 !0 = !{i32 2, !"override-stack-alignment", i32 32}

@phoebewang
Copy link
Contributor

@phoebewang What do you think about merging this PR to the release branch?

I think it's ok to merge.

@aaronpuchert
Copy link
Member

You might have to (formally) approve the changes.

@github-project-automation github-project-automation bot moved this from Needs Triage to Needs Merge in LLVM Release Status Apr 5, 2025
…vm#134109)

The pseudo-instruction LCMPXCHG16B_SAVE_RBX is used when RBX serves as
frame base pointer. At a very late stage it is then translated into a
regular LCMPXCHG16B, preceded by copying the actual argument into RBX,
and followed by restoring the register to the base pointer.

However, in case the `cmpxchg` operates on a local variable, RBX might
also be used as a base for the memory operand in frame finalization, and
we've overwritten RBX with the input operand for `cmpxchg16b`. So we
have to rewrite the memory operand base to use the saved value of RBX.

Fixes llvm#119959.

(cherry picked from commit 9e0ca57)
@tstellar tstellar merged commit a0c8959 into llvm:release/20.x Apr 11, 2025
7 of 10 checks passed
@github-project-automation github-project-automation bot moved this from Needs Merge to Done in LLVM Release Status Apr 11, 2025
Copy link

@phoebewang (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR.

@aaronpuchert
Copy link
Member

Release note: Fixed a miscompilation on X86 when using 16-byte atomic compare-and-swap on a frame variable when RBX is used as frame pointer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

4 participants