[GVN] Look through select/phi when determining underlying object #99509

nikic · 2024-07-18T15:02:49Z

This addresses an optimization regression in Rust we have observed after #82458. We now only perform pointer replacement if they have the same underlying object. However, getUnderlyingObject() by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving involving pointer induction variables.

This patch fixes this by introducing a new helper getUnderlyingObjectAggressive() which basically does what getUnderlyingObjects() does, just specialized to the case where we must arrive at a single underlying object in the end, and with a limit on the number of inspected values.

Doing this more expensive underlying object check has no measurable compile-time impact on CTMark.

llvmbot · 2024-07-18T15:03:22Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

This addresses an optimization regression in Rust we have observed after #82458. We now only perform pointer replacement if they have the same underlying object. However, getUnderlyingObject() by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving involving pointer induction variables.

This patch fixes this by introducing a new helper
getUnderlyingObjectThroughPhisAndSelects() (let me know if you have a better name...) which basically does what getUnderlyingObjects() does, just specialized to the case where we must arrive at a single underlying object in the end, and with a limit on the number of inspected values.

Doing this more expensive underlying object check has no measurable compile-time impact on CTMark.

Full diff: https://github.com/llvm/llvm-project/pull/99509.diff

4 Files Affected:

(modified) llvm/include/llvm/Analysis/ValueTracking.h (+5)
(modified) llvm/lib/Analysis/Loads.cpp (+2-3)
(modified) llvm/lib/Analysis/ValueTracking.cpp (+42)
(modified) llvm/test/Transforms/GVN/condprop.ll (+3-3)

diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index 2c2f965a3cd6f..c689fd4851f12 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -736,6 +736,11 @@ inline Value *getUnderlyingObject(Value *V, unsigned MaxLookup = 6) {
   return const_cast<Value *>(getUnderlyingObject(VConst, MaxLookup));
 }
 
+/// Like getUnderlyingObject(), but will also look through phi/select nodes
+/// to establish a single underlying object.
+const Value *getUnderlyingObjectThroughPhisAndSelects(const Value *V,
+                                                      unsigned MaxLookup = 6);
+
 /// This method is similar to getUnderlyingObject except that it can
 /// look through phi and select instructions and return multiple objects.
 ///
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 1d54a66705a2a..cdb226a769074 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -743,9 +743,8 @@ static bool isPointerAlwaysReplaceable(const Value *From, const Value *To,
   if (isa<Constant>(To) &&
       isDereferenceablePointer(To, Type::getInt8Ty(To->getContext()), DL))
     return true;
-  if (getUnderlyingObject(From) == getUnderlyingObject(To))
-    return true;
-  return false;
+  return getUnderlyingObjectThroughPhisAndSelects(From) ==
+         getUnderlyingObjectThroughPhisAndSelects(To);
 }
 
 bool llvm::canReplacePointersInUseIfEqual(const Use &U, const Value *To,
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 535a248a5f1a2..369c821dd7335 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -6609,6 +6609,48 @@ void llvm::getUnderlyingObjects(const Value *V,
   } while (!Worklist.empty());
 }
 
+const Value *
+llvm::getUnderlyingObjectThroughPhisAndSelects(const Value *V,
+                                               unsigned MaxLookup) {
+  const unsigned MaxVisited = 8;
+
+  SmallPtrSet<const Value *, 8> Visited;
+  SmallVector<const Value *, 8> Worklist;
+  Worklist.push_back(V);
+  const Value *Object = nullptr, *FirstObject = nullptr;
+  do {
+    const Value *P = Worklist.pop_back_val();
+    P = getUnderlyingObject(P, MaxLookup);
+
+    if (!FirstObject)
+      FirstObject = P;
+
+    if (!Visited.insert(P).second)
+      continue;
+
+    if (Visited.size() == MaxVisited)
+      return FirstObject;
+
+    if (auto *SI = dyn_cast<SelectInst>(P)) {
+      Worklist.push_back(SI->getTrueValue());
+      Worklist.push_back(SI->getFalseValue());
+      continue;
+    }
+
+    if (auto *PN = dyn_cast<PHINode>(P)) {
+      append_range(Worklist, PN->incoming_values());
+      continue;
+    }
+
+    if (!Object)
+      Object = P;
+    else if (Object != P)
+      return FirstObject;
+  } while (!Worklist.empty());
+
+  return Object;
+}
+
 /// This is the function that does the work of looking through basic
 /// ptrtoint+arithmetic+inttoptr sequences.
 static const Value *getUnderlyingObjectFromInt(const Value *V) {
diff --git a/llvm/test/Transforms/GVN/condprop.ll b/llvm/test/Transforms/GVN/condprop.ll
index 02f4bb97d7ebd..3babdd8173a21 100644
--- a/llvm/test/Transforms/GVN/condprop.ll
+++ b/llvm/test/Transforms/GVN/condprop.ll
@@ -813,7 +813,7 @@ define void @select_same_obj(i1 %c, ptr %p, i64 %x) {
 ; CHECK-NEXT:    br i1 [[CMP]], label [[IF:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if:
 ; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
-; CHECK-NEXT:    call void @use_ptr(ptr [[P3]])
+; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
 ; CHECK-NEXT:    ret void
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret void
@@ -902,7 +902,7 @@ define void @phi_same_obj(i1 %c, ptr %p, i64 %x) {
 ; CHECK-NEXT:    br i1 [[CMP]], label [[IF2:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if2:
 ; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
-; CHECK-NEXT:    call void @use_ptr(ptr [[P3]])
+; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
 ; CHECK-NEXT:    ret void
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret void
@@ -975,7 +975,7 @@ define void @phi_same_obj_cycle(i1 %c, ptr %p, i64 %x) {
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[P_IV]], [[P]]
 ; CHECK-NEXT:    br i1 [[CMP]], label [[IF:%.*]], label [[LOOP_LATCH]]
 ; CHECK:       if:
-; CHECK-NEXT:    call void @use_ptr(ptr [[P_IV]])
+; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
 ; CHECK-NEXT:    call void @use_ptr(ptr [[P]])
 ; CHECK-NEXT:    br label [[LOOP_LATCH]]
 ; CHECK:       loop.latch:

goldsteinn · 2024-07-18T15:43:44Z

getUnderlyingObjectThroughPhisAndSelects() (let me know if you have a better name...)

maybe getUnderlyingObjectAggressive to leave more room for future extensions :)

goldsteinn · 2024-07-18T15:52:44Z

llvm/lib/Analysis/ValueTracking.cpp

+    }
+
+    if (auto *PN = dyn_cast<PHINode>(P)) {
+      append_range(Worklist, PN->incoming_values());


maybe there should be a check that PN->getNumIncominig() < MaxVisisted. Seems a bit inefficient to essentially wait until you have analyzed MaxVisited edges to give up.

As implemented, a phi with more than MaxVisited could be accepted as long as some of the arguments reduce to the same object (as returned by getUnderlyingObject).

Though I'd be open to make this a more strict limit on the number of getUnderlyingObject calls, regardless of what they return.

I mainly want to make sure we don't run into pathological cases. The cost of this analysis seems to be negligible for the average case.

fhahn · 2024-07-18T15:57:04Z

llvm/lib/Analysis/ValueTracking.cpp

+      continue;
+    }
+
+    if (auto *PN = dyn_cast<PHINode>(P)) {


For the loop case, there's also a variant of llvm::getUnderlyingObjects( that tries to identify PHIs that track the same underlying object I think (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/ValueTracking.cpp#L6589)

Would that help/Should this be unified?

I believe that we don't need that isSameUnderlyingObjectInLoop() logic as long as we're reducing down to a single underlying object. What this handles is the case where you have underlying objects from different loop iterations that have the same SSA value -- this is the same basic issue as "value equal in potential cycles" in BasicAA.

However, if we're reducing to a single object, then this can't happen, because we have both a contribution from the pre-header and the loop. The loop contribution has to reduce back to the loop phi to end up with a single object. If it reduces to a loop variant value, then we'd have at least two underlying objects.

This addresses an optimization regression in Rust we have observed after llvm#82458. We now only perform pointer replacement if they have the same underlying object. However, getUnderlyingObject() by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving involving pointer induction variables. This patch fixes this by introducing a new helper getUnderlyingObjectThroughPhisAndSelects() (let me know if you have a better name...) which basically does what getUnderlyingObjects() does, just specialized to the case where we must arrive at a single underlying object in the end, and with a limit on the number of inspected values. Doing this more expensive underlying object check has no measurable compile-time impact on CTMark.

nikic · 2024-07-18T16:18:57Z

getUnderlyingObjectThroughPhisAndSelects() (let me know if you have a better name...)

maybe getUnderlyingObjectAggressive to leave more room for future extensions :)

Sounds reasonable, done!

PR Link: llvm/llvm-project#99509

llvm/include/llvm/Analysis/ValueTracking.h

tschuett · 2024-07-18T17:17:05Z

Could you please update getUnderlyingObjectThroughPhisAndSelects() in the header.

tschuett · 2024-07-18T17:19:15Z

llvm/lib/Analysis/ValueTracking.cpp

+  const Value *Object = nullptr, *FirstObject = nullptr;
+  do {
+    const Value *P = Worklist.pop_back_val();
+    P = getUnderlyingObject(P, MaxLookup);


Do you need a budget for the second parameter? We could end up with MaxVisited * MaxLookup .

dtcxzyw · 2024-07-18T19:09:19Z

Compile-time impact:

Top 5 improvements:
  faiss/HNSW.cpp.ll 3445921402 3333911686 -3.25%
  php/jis0208.ll 308288236 299253584 -2.93%
  spike/regnames.ll 488153333 475353793 -2.62%
  faiss/IndexAdditiveQuantizerFastScan.cpp.ll 434431627 423208324 -2.58%
  lightgbm/boosting.cpp.ll 2999781046 2927390287 -2.41%
Top 5 regressions:
  hermes/StackPromotion.cpp.ll 5074724986 5817835860 +14.64%
  openjdk/ad_x86_gen.ll 18909114243 20121837367 +6.41%
  postgres/strftime.ll 404414332 421153688 +4.14%
  lightgbm/linear_tree_learner.cpp.ll 9902392573 10299377803 +4.01%
  meshlab/filter_screened_poisson.cpp.ll 26246454218 27101073771 +3.26%
Overall: 0.00777903

nikic · 2024-07-18T19:59:27Z

Top 5 regressions:
  hermes/StackPromotion.cpp.ll 5074724986 5817835860 +14.64%
  openjdk/ad_x86_gen.ll 18909114243 20121837367 +6.41%
  postgres/strftime.ll 404414332 421153688 +4.14%
  lightgbm/linear_tree_learner.cpp.ll 9902392573 10299377803 +4.01%
  meshlab/filter_screened_poisson.cpp.ll 26246454218 27101073771 +3.26%

I checked all of these, and time spent in getUnderlyingObjectsAggressive() was always tiny (think sub-0.01% level). So any regressions here are due to second-order effects (i.e. pointer replacement enabling other optimizations that take more time etc).

tschuett · 2024-07-19T03:44:21Z

llvm/include/llvm/Analysis/ValueTracking.h

@@ -736,6 +736,10 @@ inline Value *getUnderlyingObject(Value *V, unsigned MaxLookup = 6) {
  return const_cast<Value *>(getUnderlyingObject(VConst, MaxLookup));
 }

+/// Like getUnderlyingObject(), but will try harder to find a single underlying
+/// object. In particular, this function also looks through selects and phis.
+const Value *getUnderlyingObjectAggressive(const Value *V);


Could you still expose a parameter to control the aggressiveness? Aggressive without a parameter is kind of scary. E.g. the size of the visited set?

No, we should not expose unnecessary options. If we have reason to have different levels of aggressiveness for different callers in the future, we will add the option at the time. Until then, cutoffs should be centrally controlled.

nikic · 2024-07-22T08:00:18Z

Ping, I'd like to merge this one before the release branches.

fhahn · 2024-07-22T08:43:08Z

llvm/lib/Analysis/ValueTracking.cpp

+    P = getUnderlyingObject(P);
+
+    if (!FirstObject)
+      FirstObject = P;


Should we pull this out of the loop if possible?

This would make it clear that we shouldn't end up in a situation where e.g. getUnderlyingObject returns nullptr for the initial pointer, and then we initialize FirstObject with the underlying object for one of the arms of a select and may end up returning an underlying object that just matches one side of the select.

Generally agree, but I'm not sure how to structure the code to do this without either repeating the getUnderlyingObject() call or duplicating a bunch of logic. Any ideas?

(Note that getUnderlyingObject cannot return null.)

Not the nicest but:

auto SetNextP = [&]() -> bool { if(Worklist.empty()) return false; P = getUnderlyingObject(Worklist.pop_back_val()); return true; }; FirstObject = P = getUnderlyingObject(V); do { if (!Visited.insert(P).second) ... } while(SetNextP());

I feel like that's harder to follow than the current logic :(

I pushed a possible variant at f160e9c, which does repeat the getUnderlyingObject() call, but only in the case where FirstObject is actually needed. Maybe this is fine?

Yeah, the capturing lambda isn't nice.
I definetly do not prefer f160e9c, that really obfuscates the whole "cant be null" aspect.

Maybe just moving the flag? I.e

// Cant be null FirstObject = getUnderlying(V); // Instead of a seperate `First` bool, you could also give `P` scope outside of the loop and do `P == nullptr` as `First`. bool First= true; do { P = First ? FirstObject : getUnderlying(Worklist.pop_back_val()); First = false; }

Its basically same special case as you had but moved around to make the non-null condition clearer.

So something like cac98c1?

Thanks, latest version looks good.

if (!FirstObject) FirstObject = P;

can be removed from the loop?

Oh yeah, that wasn't supposed to be there anymore... Dropped in 44714f0.

This reverts commit f160e9c.

fhahn

LGTM, thanks!

…gressive` (#100102) Thanks to #99509, we can fix rust-lang/rust#119573 too.

) Summary: This addresses an optimization regression in Rust we have observed after #82458. We now only perform pointer replacement if they have the same underlying object. However, getUnderlyingObject() by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving involving pointer induction variables. This patch fixes this by introducing a new helper getUnderlyingObjectAggressive() which basically does what getUnderlyingObjects() does, just specialized to the case where we must arrive at a single underlying object in the end, and with a limit on the number of inspected values. Doing this more expensive underlying object check has no measurable compile-time impact on CTMark. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251674

…gressive` (#100102) Summary: Thanks to #99509, we can fix rust-lang/rust#119573 too. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251024

nikic requested review from UsmanNadeem, dtcxzyw and efriedma-quic July 18, 2024 15:02

nikic mentioned this pull request Jul 18, 2024

Optimization for types implementing drop seems broken on LLVM 19 rust-lang/rust#125373

Closed

llvmbot added llvm:analysis llvm:transforms labels Jul 18, 2024

goldsteinn reviewed Jul 18, 2024

View reviewed changes

fhahn reviewed Jul 18, 2024

View reviewed changes

nikic added 2 commits July 18, 2024 18:18

Rename to getUnderlyingObjectAggressive

291ffd2

nikic force-pushed the gvn-pointer-repl branch from bbbfa04 to 291ffd2 Compare July 18, 2024 16:18

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jul 18, 2024

pre-commit: test PR99509

ebc85ab

PR Link: llvm/llvm-project#99509

dtcxzyw mentioned this pull request Jul 18, 2024

pre-commit: test PR99509 dtcxzyw/llvm-opt-benchmark#995

Closed

tschuett reviewed Jul 18, 2024

View reviewed changes

llvm/include/llvm/Analysis/ValueTracking.h Show resolved Hide resolved

tschuett reviewed Jul 18, 2024

View reviewed changes

Remove unused MaxLookup parameter

25857e8

tschuett reviewed Jul 19, 2024

View reviewed changes

fhahn reviewed Jul 22, 2024

View reviewed changes

nikic added 4 commits July 22, 2024 11:08

Refactor FirstObject logic

f160e9c

Revert "Refactor FirstObject logic"

8e5f7b8

This reverts commit f160e9c.

Use bool First instead

cac98c1

Remove leftover code

44714f0

fhahn approved these changes Jul 22, 2024

View reviewed changes

nikic merged commit 32cd189 into llvm:main Jul 22, 2024
5 of 7 checks passed

nikic deleted the gvn-pointer-repl branch July 22, 2024 14:22

dianqk mentioned this pull request Jul 23, 2024

[FunctionAttrs] Determine underlying object by getUnderlyingObjectAggressive #100102

Merged

dianqk added a commit that referenced this pull request Jul 23, 2024

[FunctionAttrs] Determine underlying object by `getUnderlyingObjectAg…

a213edd

…gressive` (#100102) Thanks to #99509, we can fix rust-lang/rust#119573 too.

[GVN] Look through select/phi when determining underlying object #99509

[GVN] Look through select/phi when determining underlying object #99509

Uh oh!

Conversation

nikic commented Jul 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jul 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

goldsteinn commented Jul 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikic commented Jul 18, 2024

Uh oh!

Uh oh!

tschuett commented Jul 18, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dtcxzyw commented Jul 18, 2024

Uh oh!

nikic commented Jul 18, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikic commented Jul 22, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

nikic commented Jul 18, 2024 •

edited

Loading

llvmbot commented Jul 18, 2024 •

edited

Loading

goldsteinn commented Jul 18, 2024 •

edited

Loading