Skip to content

[SCEV] Introduce SCEVUse, use it instead of const SCEV * (NFCI) (WIP). #91961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented May 13, 2024

This patch introduces SCEVUse, which is a tagged pointer containing the
used const SCEV *, plus extra bits to store NUW/NSW flags that are only
valid at the specific use.

This was suggested by @nikic as an alternative
to #90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse
instead of const SCEV *. It does not introduce any code that makes use
of the use-specific flags yet which I'll share as follow-ups.

Note that this should be NFC, but currently there's at least one case
where it is not (turn-to-invariant.ll), which I'll investigate once we
agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV
clients to use const SCEV * instead of const auto *, to prepare for
this patch. This reduces the number of changes needed, as SCEVUse will
automatically convert to const SCEV *. This is a safe default, as it
just drops the use-specific flags for the expression (it will not drop
any use-specific flags for any of its operands though).

SCEVUse could probably also be used to address mis-compiles due to
equivalent AddRecs modulo flags result in an AddRec with incorrect flags
for some uses of some phis, e.g. the one
#80430 attempted to fix

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@llvmbot
Copy link
Member

llvmbot commented May 13, 2024

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

This patch introduces SCEVUse, which is a tagged pointer containing the
used const SCEV *, plus extra bits to store NUW/NSW flags that are only
valid at the specific use.

This was suggested by @nikic as an alternative
to #90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse
instead of const SCEV *. It does not introduce any code that makes use
of the use-specific flags yet which I'll share as follow-ups.

Note that this should be NFC, but currently there's at least one case
where it is not (turn-to-invariant.ll), which I'll investigate once we
agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV
clients to use const SCEV * instead of const auto *, to prepare for
this patch. This reduces the number of changes needed, as SCEVUse will
automatically convert to const SCEV *. This is a safe default, as it
just drops the use-specific flags for the expression (it will not drop
any use-specific flags for any of its operands though).

This probably
SCEVUse could probably also be used to address mis-compiles due to
equivalent AddRecs modulo flags result in an AddRec with incorrect flags
for some uses of some phis, e.g. the one
#80430 attempted to fix

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u


Patch is 475.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91961.diff

24 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ScalarEvolution.h (+441-363)
  • (modified) llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h (+154-133)
  • (modified) llvm/lib/Analysis/Delinearization.cpp (+1-1)
  • (modified) llvm/lib/Analysis/DependenceAnalysis.cpp (+12-8)
  • (modified) llvm/lib/Analysis/IVDescriptors.cpp (+1-1)
  • (modified) llvm/lib/Analysis/IVUsers.cpp (+2-2)
  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+4-4)
  • (modified) llvm/lib/Analysis/LoopCacheAnalysis.cpp (+2-1)
  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+1169-1190)
  • (modified) llvm/lib/Target/ARM/MVETailPredication.cpp (+9-7)
  • (modified) llvm/lib/Transforms/Scalar/IndVarSimplify.cpp (+5-5)
  • (modified) llvm/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Scalar/LoopDeletion.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (+5-5)
  • (modified) llvm/lib/Transforms/Scalar/LoopPredication.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp (+5-4)
  • (modified) llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Utils/SimplifyIndVar.cpp (+5-5)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (+5-4)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+2-2)
  • (modified) llvm/test/Transforms/IndVarSimplify/turn-to-invariant.ll (+4-4)
  • (modified) llvm/unittests/Analysis/ScalarEvolutionTest.cpp (+71-67)
  • (modified) llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp (+2-2)
  • (modified) llvm/unittests/Transforms/Utils/ScalarEvolutionExpanderTest.cpp (+5-5)
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 5828cc156cc78..2859df9964555 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -69,6 +69,97 @@ enum SCEVTypes : unsigned short;
 
 extern bool VerifySCEV;
 
+class SCEV;
+
+class SCEVUse : public PointerIntPair<const SCEV *, 2> {
+public:
+  SCEVUse() : PointerIntPair(nullptr, 0) {}
+  SCEVUse(const SCEV *S) : PointerIntPair(S, 0) {}
+  SCEVUse(const SCEV *S, int Flags) : PointerIntPair(S, Flags) {}
+
+  operator const SCEV *() const { return getPointer(); }
+  const SCEV *operator->() const { return getPointer(); }
+  const SCEV *operator->() { return getPointer(); }
+
+  /// Print out the internal representation of this scalar to the specified
+  /// stream.  This should really only be used for debugging purposes.
+  void print(raw_ostream &OS) const;
+
+  /// This method is used for debugging.
+  void dump() const;
+};
+
+template <> struct PointerLikeTypeTraits<SCEVUse> {
+  static inline void *getAsVoidPointer(SCEVUse U) { return U.getOpaqueValue(); }
+  static inline SCEVUse getFromVoidPointer(void *P) {
+    SCEVUse U;
+    U.setFromOpaqueValue(P);
+    return U;
+  }
+
+  /// Note, we assume here that void* is related to raw malloc'ed memory and
+  /// that malloc returns objects at least 4-byte aligned. However, this may be
+  /// wrong, or pointers may be from something other than malloc. In this case,
+  /// you should specify a real typed pointer or avoid this template.
+  ///
+  /// All clients should use assertions to do a run-time check to ensure that
+  /// this is actually true.
+  static constexpr int NumLowBitsAvailable = 0;
+};
+
+template <> struct DenseMapInfo<SCEVUse> {
+  // The following should hold, but it would require T to be complete:
+  // static_assert(alignof(T) <= (1 << Log2MaxAlign),
+  //               "DenseMap does not support pointer keys requiring more than "
+  //               "Log2MaxAlign bits of alignment");
+  static constexpr uintptr_t Log2MaxAlign = 12;
+
+  static inline SCEVUse getEmptyKey() {
+    uintptr_t Val = static_cast<uintptr_t>(-1);
+    Val <<= Log2MaxAlign;
+    return PointerLikeTypeTraits<SCEVUse>::getFromVoidPointer((void *)Val);
+  }
+
+  static inline SCEVUse getTombstoneKey() {
+    uintptr_t Val = static_cast<uintptr_t>(-2);
+    Val <<= Log2MaxAlign;
+    return PointerLikeTypeTraits<SCEVUse>::getFromVoidPointer((void *)Val);
+  }
+
+  static unsigned getHashValue(SCEVUse U) {
+    void *PtrVal = PointerLikeTypeTraits<SCEVUse>::getAsVoidPointer(U);
+    return (unsigned((uintptr_t)PtrVal) >> 4) ^
+           (unsigned((uintptr_t)PtrVal) >> 9);
+  }
+
+  static bool isEqual(const SCEVUse LHS, const SCEVUse RHS) {
+    return LHS == RHS;
+  }
+};
+
+template <typename To> [[nodiscard]] inline decltype(auto) dyn_cast(SCEVUse U) {
+  assert(detail::isPresent(U.getPointer()) &&
+         "dyn_cast on a non-existent value");
+  return CastInfo<To, const SCEV *>::doCastIfPossible(U.getPointer());
+}
+
+template <typename To> [[nodiscard]] inline decltype(auto) cast(SCEVUse U) {
+  assert(detail::isPresent(U.getPointer()) &&
+         "dyn_cast on a non-existent value");
+  return CastInfo<To, const SCEV *>::doCast(U.getPointer());
+}
+
+template <typename To> [[nodiscard]] inline bool isa(SCEVUse U) {
+  return CastInfo<To, const SCEV *>::isPossible(U.getPointer());
+}
+
+template <class X> auto dyn_cast_or_null(SCEVUse U) {
+  const SCEV *Val = U.getPointer();
+  if (!detail::isPresent(Val))
+    return CastInfo<X, const SCEV *>::castFailed();
+  return CastInfo<X, const SCEV *>::doCastIfPossible(detail::unwrapValue(Val));
+}
+
 /// This class represents an analyzed expression in the program.  These are
 /// opaque objects that the client is not allowed to do much with directly.
 ///
@@ -147,7 +238,7 @@ class SCEV : public FoldingSetNode {
   Type *getType() const;
 
   /// Return operands of this SCEV expression.
-  ArrayRef<const SCEV *> operands() const;
+  ArrayRef<SCEVUse> operands() const;
 
   /// Return true if the expression is a constant zero.
   bool isZero() const;
@@ -202,6 +293,11 @@ inline raw_ostream &operator<<(raw_ostream &OS, const SCEV &S) {
   return OS;
 }
 
+inline raw_ostream &operator<<(raw_ostream &OS, const SCEVUse &S) {
+  S.print(OS);
+  return OS;
+}
+
 /// An object of this class is returned by queries that could not be answered.
 /// For example, if you ask for the number of iterations of a linked-list
 /// traversal loop, you will get one of these.  None of the standard SCEV
@@ -211,6 +307,7 @@ struct SCEVCouldNotCompute : public SCEV {
 
   /// Methods for support type inquiry through isa, cast, and dyn_cast:
   static bool classof(const SCEV *S);
+  static bool classof(const SCEVUse *U) { return classof(U->getPointer()); }
 };
 
 /// This class represents an assumption made using SCEV expressions which can
@@ -281,13 +378,13 @@ struct FoldingSetTrait<SCEVPredicate> : DefaultFoldingSetTrait<SCEVPredicate> {
 class SCEVComparePredicate final : public SCEVPredicate {
   /// We assume that LHS Pred RHS is true.
   const ICmpInst::Predicate Pred;
-  const SCEV *LHS;
-  const SCEV *RHS;
+  SCEVUse LHS;
+  SCEVUse RHS;
 
 public:
   SCEVComparePredicate(const FoldingSetNodeIDRef ID,
-                       const ICmpInst::Predicate Pred,
-                       const SCEV *LHS, const SCEV *RHS);
+                       const ICmpInst::Predicate Pred, SCEVUse LHS,
+                       SCEVUse RHS);
 
   /// Implementation of the SCEVPredicate interface
   bool implies(const SCEVPredicate *N) const override;
@@ -297,10 +394,10 @@ class SCEVComparePredicate final : public SCEVPredicate {
   ICmpInst::Predicate getPredicate() const { return Pred; }
 
   /// Returns the left hand side of the predicate.
-  const SCEV *getLHS() const { return LHS; }
+  SCEVUse getLHS() const { return LHS; }
 
   /// Returns the right hand side of the predicate.
-  const SCEV *getRHS() const { return RHS; }
+  SCEVUse getRHS() const { return RHS; }
 
   /// Methods for support type inquiry through isa, cast, and dyn_cast:
   static bool classof(const SCEVPredicate *P) {
@@ -415,8 +512,7 @@ class SCEVWrapPredicate final : public SCEVPredicate {
 /// ScalarEvolution::Preds folding set.  This is why the \c add function is sound.
 class SCEVUnionPredicate final : public SCEVPredicate {
 private:
-  using PredicateMap =
-      DenseMap<const SCEV *, SmallVector<const SCEVPredicate *, 4>>;
+  using PredicateMap = DenseMap<SCEVUse, SmallVector<const SCEVPredicate *, 4>>;
 
   /// Vector with references to all predicates in this union.
   SmallVector<const SCEVPredicate *, 16> Preds;
@@ -525,18 +621,17 @@ class ScalarEvolution {
   ///     loop { v2 = load @global2; }
   /// }
   /// No SCEV with operand V1, and v2 can exist in this program.
-  bool instructionCouldExistWithOperands(const SCEV *A, const SCEV *B);
+  bool instructionCouldExistWithOperands(SCEVUse A, SCEVUse B);
 
   /// Return true if the SCEV is a scAddRecExpr or it contains
   /// scAddRecExpr. The result will be cached in HasRecMap.
-  bool containsAddRecurrence(const SCEV *S);
+  bool containsAddRecurrence(SCEVUse S);
 
   /// Is operation \p BinOp between \p LHS and \p RHS provably does not have
   /// a signed/unsigned overflow (\p Signed)? If \p CtxI is specified, the
   /// no-overflow fact should be true in the context of this instruction.
-  bool willNotOverflow(Instruction::BinaryOps BinOp, bool Signed,
-                       const SCEV *LHS, const SCEV *RHS,
-                       const Instruction *CtxI = nullptr);
+  bool willNotOverflow(Instruction::BinaryOps BinOp, bool Signed, SCEVUse LHS,
+                       SCEVUse RHS, const Instruction *CtxI = nullptr);
 
   /// Parse NSW/NUW flags from add/sub/mul IR binary operation \p Op into
   /// SCEV no-wrap flags, and deduce flag[s] that aren't known yet.
@@ -547,78 +642,84 @@ class ScalarEvolution {
   getStrengthenedNoWrapFlagsFromBinOp(const OverflowingBinaryOperator *OBO);
 
   /// Notify this ScalarEvolution that \p User directly uses SCEVs in \p Ops.
-  void registerUser(const SCEV *User, ArrayRef<const SCEV *> Ops);
+  void registerUser(SCEVUse User, ArrayRef<SCEVUse> Ops);
 
   /// Return true if the SCEV expression contains an undef value.
-  bool containsUndefs(const SCEV *S) const;
+  bool containsUndefs(SCEVUse S) const;
 
   /// Return true if the SCEV expression contains a Value that has been
   /// optimised out and is now a nullptr.
-  bool containsErasedValue(const SCEV *S) const;
+  bool containsErasedValue(SCEVUse S) const;
 
   /// Return a SCEV expression for the full generality of the specified
   /// expression.
-  const SCEV *getSCEV(Value *V);
+  SCEVUse getSCEV(Value *V);
 
   /// Return an existing SCEV for V if there is one, otherwise return nullptr.
-  const SCEV *getExistingSCEV(Value *V);
-
-  const SCEV *getConstant(ConstantInt *V);
-  const SCEV *getConstant(const APInt &Val);
-  const SCEV *getConstant(Type *Ty, uint64_t V, bool isSigned = false);
-  const SCEV *getLosslessPtrToIntExpr(const SCEV *Op, unsigned Depth = 0);
-  const SCEV *getPtrToIntExpr(const SCEV *Op, Type *Ty);
-  const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
-  const SCEV *getVScale(Type *Ty);
-  const SCEV *getElementCount(Type *Ty, ElementCount EC);
-  const SCEV *getZeroExtendExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
-  const SCEV *getZeroExtendExprImpl(const SCEV *Op, Type *Ty,
-                                    unsigned Depth = 0);
-  const SCEV *getSignExtendExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
-  const SCEV *getSignExtendExprImpl(const SCEV *Op, Type *Ty,
-                                    unsigned Depth = 0);
-  const SCEV *getCastExpr(SCEVTypes Kind, const SCEV *Op, Type *Ty);
-  const SCEV *getAnyExtendExpr(const SCEV *Op, Type *Ty);
-  const SCEV *getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0);
-  const SCEV *getAddExpr(const SCEV *LHS, const SCEV *RHS,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0) {
-    SmallVector<const SCEV *, 2> Ops = {LHS, RHS};
+  SCEVUse getExistingSCEV(Value *V);
+
+  SCEVUse getConstant(ConstantInt *V);
+  SCEVUse getConstant(const APInt &Val);
+  SCEVUse getConstant(Type *Ty, uint64_t V, bool isSigned = false);
+  SCEVUse getLosslessPtrToIntExpr(SCEVUse Op, unsigned Depth = 0);
+  SCEVUse getPtrToIntExpr(SCEVUse Op, Type *Ty);
+  SCEVUse getTruncateExpr(SCEVUse Op, Type *Ty, unsigned Depth = 0);
+  SCEVUse getVScale(Type *Ty);
+  SCEVUse getElementCount(Type *Ty, ElementCount EC);
+  SCEVUse getZeroExtendExpr(SCEVUse Op, Type *Ty, unsigned Depth = 0);
+  SCEVUse getZeroExtendExprImpl(SCEVUse Op, Type *Ty, unsigned Depth = 0);
+  SCEVUse getSignExtendExpr(SCEVUse Op, Type *Ty, unsigned Depth = 0);
+  SCEVUse getSignExtendExprImpl(SCEVUse Op, Type *Ty, unsigned Depth = 0);
+  SCEVUse getCastExpr(SCEVTypes Kind, SCEVUse Op, Type *Ty);
+  SCEVUse getAnyExtendExpr(SCEVUse Op, Type *Ty);
+  SCEVUse getAddExpr(ArrayRef<const SCEV *> Ops,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0);
+  SCEVUse getAddExpr(SmallVectorImpl<SCEVUse> &Ops,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0);
+  SCEVUse getAddExpr(SCEVUse LHS, SCEVUse RHS,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0) {
+    SmallVector<SCEVUse, 2> Ops = {LHS, RHS};
     return getAddExpr(Ops, Flags, Depth);
   }
-  const SCEV *getAddExpr(const SCEV *Op0, const SCEV *Op1, const SCEV *Op2,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0) {
-    SmallVector<const SCEV *, 3> Ops = {Op0, Op1, Op2};
+  SCEVUse getAddExpr(SCEVUse Op0, SCEVUse Op1, SCEVUse Op2,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0) {
+    SmallVector<SCEVUse, 3> Ops = {Op0, Op1, Op2};
     return getAddExpr(Ops, Flags, Depth);
   }
-  const SCEV *getMulExpr(SmallVectorImpl<const SCEV *> &Ops,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0);
-  const SCEV *getMulExpr(const SCEV *LHS, const SCEV *RHS,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0) {
-    SmallVector<const SCEV *, 2> Ops = {LHS, RHS};
+  SCEVUse getMulExpr(ArrayRef<const SCEV *> Ops,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0);
+  SCEVUse getMulExpr(SmallVectorImpl<SCEVUse> &Ops,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0);
+  SCEVUse getMulExpr(SCEVUse LHS, SCEVUse RHS,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0) {
+    SmallVector<SCEVUse, 2> Ops = {LHS, RHS};
     return getMulExpr(Ops, Flags, Depth);
   }
-  const SCEV *getMulExpr(const SCEV *Op0, const SCEV *Op1, const SCEV *Op2,
-                         SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
-                         unsigned Depth = 0) {
-    SmallVector<const SCEV *, 3> Ops = {Op0, Op1, Op2};
+  SCEVUse getMulExpr(SCEVUse Op0, SCEVUse Op1, SCEVUse Op2,
+                     SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap,
+                     unsigned Depth = 0) {
+    SmallVector<SCEVUse, 3> Ops = {Op0, Op1, Op2};
     return getMulExpr(Ops, Flags, Depth);
   }
-  const SCEV *getUDivExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getUDivExactExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getURemExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getAddRecExpr(const SCEV *Start, const SCEV *Step, const Loop *L,
-                            SCEV::NoWrapFlags Flags);
-  const SCEV *getAddRecExpr(SmallVectorImpl<const SCEV *> &Operands,
-                            const Loop *L, SCEV::NoWrapFlags Flags);
-  const SCEV *getAddRecExpr(const SmallVectorImpl<const SCEV *> &Operands,
-                            const Loop *L, SCEV::NoWrapFlags Flags) {
-    SmallVector<const SCEV *, 4> NewOp(Operands.begin(), Operands.end());
+  SCEVUse getUDivExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getUDivExactExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getURemExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getAddRecExpr(SCEVUse Start, SCEVUse Step, const Loop *L,
+                        SCEV::NoWrapFlags Flags);
+  SCEVUse getAddRecExpr(ArrayRef<const SCEV *> Operands, const Loop *L,
+                        SCEV::NoWrapFlags Flags);
+  SCEVUse getAddRecExpr(SmallVectorImpl<SCEVUse> &Operands, const Loop *L,
+                        SCEV::NoWrapFlags Flags);
+  SCEVUse getAddRecExpr(const SmallVectorImpl<SCEVUse> &Operands, const Loop *L,
+                        SCEV::NoWrapFlags Flags) {
+    SmallVector<SCEVUse, 4> NewOp(Operands.begin(), Operands.end());
     return getAddRecExpr(NewOp, L, Flags);
   }
 
@@ -626,7 +727,7 @@ class ScalarEvolution {
   /// Predicates. If successful return these <AddRecExpr, Predicates>;
   /// The function is intended to be called from PSCEV (the caller will decide
   /// whether to actually add the predicates and carry out the rewrites).
-  std::optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
+  std::optional<std::pair<SCEVUse, SmallVector<const SCEVPredicate *, 3>>>
   createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI);
 
   /// Returns an expression for a GEP
@@ -634,61 +735,61 @@ class ScalarEvolution {
   /// \p GEP The GEP. The indices contained in the GEP itself are ignored,
   /// instead we use IndexExprs.
   /// \p IndexExprs The expressions for the indices.
-  const SCEV *getGEPExpr(GEPOperator *GEP,
-                         const SmallVectorImpl<const SCEV *> &IndexExprs);
-  const SCEV *getAbsExpr(const SCEV *Op, bool IsNSW);
-  const SCEV *getMinMaxExpr(SCEVTypes Kind,
-                            SmallVectorImpl<const SCEV *> &Operands);
-  const SCEV *getSequentialMinMaxExpr(SCEVTypes Kind,
-                                      SmallVectorImpl<const SCEV *> &Operands);
-  const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getSMaxExpr(SmallVectorImpl<const SCEV *> &Operands);
-  const SCEV *getUMaxExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getUMaxExpr(SmallVectorImpl<const SCEV *> &Operands);
-  const SCEV *getSMinExpr(const SCEV *LHS, const SCEV *RHS);
-  const SCEV *getSMinExpr(SmallVectorImpl<const SCEV *> &Operands);
-  const SCEV *getUMinExpr(const SCEV *LHS, const SCEV *RHS,
-                          bool Sequential = false);
-  const SCEV *getUMinExpr(SmallVectorImpl<const SCEV *> &Operands,
-                          bool Sequential = false);
-  const SCEV *getUnknown(Value *V);
-  const SCEV *getCouldNotCompute();
+  SCEVUse getGEPExpr(GEPOperator *GEP, ArrayRef<const SCEV *> IndexExprs);
+  SCEVUse getGEPExpr(GEPOperator *GEP,
+                     const SmallVectorImpl<SCEVUse> &IndexExprs);
+  SCEVUse getAbsExpr(SCEVUse Op, bool IsNSW);
+  SCEVUse getMinMaxExpr(SCEVTypes Kind, ArrayRef<const SCEV *> Operands);
+  SCEVUse getMinMaxExpr(SCEVTypes Kind, SmallVectorImpl<SCEVUse> &Operands);
+  SCEVUse getSequentialMinMaxExpr(SCEVTypes Kind,
+                                  SmallVectorImpl<SCEVUse> &Operands);
+  SCEVUse getSMaxExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getSMaxExpr(SmallVectorImpl<SCEVUse> &Operands);
+  SCEVUse getUMaxExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getUMaxExpr(SmallVectorImpl<SCEVUse> &Operands);
+  SCEVUse getSMinExpr(SCEVUse LHS, SCEVUse RHS);
+  SCEVUse getSMinExpr(SmallVectorImpl<SCEVUse> &Operands);
+  SCEVUse getUMinExpr(SCEVUse LHS, SCEVUse RHS, bool Sequential = false);
+  SCEVUse getUMinExpr(SmallVectorImpl<SCEVUse> &Operands,
+                      bool Sequential = false);
+  SCEVUse getUnknown(Value *V);
+  SCEVUse getCouldNotCompute();
 
   /// Return a SCEV for the constant 0 of a specific type.
-  const SCEV *getZero(Type *Ty) { return getConstant(Ty, 0); }
+  SCEVUse getZero(Type *Ty) { return getConstant(Ty, 0); }
 
   /// Return a SCEV for the constant 1 of a specific type.
-  const SCEV *getOne(Type *Ty) { return getConstant(Ty, 1); }
+  SCEVUse getOne(Type *Ty) { return getConstant(Ty, 1); }
 
   /// Return a SCEV for the constant \p Power of two.
-  const SCEV *getPowerOfTwo(Type *Ty, unsigned Power) {
+  SCEVUse getPowerOfTwo(Type *Ty, unsigned Power) {
     assert(Power < getTypeSizeInBits(Ty) && "Power out of range");
     return getConstant(APInt::getOneBitSet(getTypeSizeInBits(Ty), Power));
   }
 
   /// Return a SCEV for the constant -1 of a specific type.
-  const SCEV *getMinusOne(Type *Ty) {
+  SCEVUse getMinusOne(Type *Ty) {
     return getConstant(Ty, -1, /*isSigned=*/true);
   }
 
   /// Return an expression for a TypeSize.
-  const SCEV *getSizeOfExpr(Type *IntTy, TypeSize Size);
+  SCEVUse getSizeOfExpr(Type *IntTy, TypeSize Size);
 
   /// Return an expression for the alloc size of AllocTy that is type IntTy
-  const SCEV *getSizeOfExpr(Type *IntTy, Type *AllocTy);
+  SCEVUse getSizeOfExpr(Type *IntTy, Type *AllocTy);
 
   /// Return an expression for the store size of StoreTy that is type IntTy
-  const SCEV *getStoreSizeOfExpr(Type *IntTy, Type *StoreTy);
+  SCEVUse getStoreSizeOfExpr(Type *IntTy, Type *StoreTy);
 
   /// Return an expression for offsetof on the given field with type IntTy
-  const SCEV *getOffsetOfExpr(Type *IntTy, StructType *STy, unsigned FieldNo);
+  SCEVUse getOffsetOfExpr(Type *IntTy, StructType *STy, unsigned FieldNo);
 
   /// Return the SCEV object corresponding to -V.
-  const SCEV *getNegativeSCEV(const SCEV *V,
-                              SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);
+  SCEVUse getNegativeSCEV(SCEVUse V,
+                          SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);
 
   /// Return the SCEV object corresponding to ~V.
-  const SCEV *getNotSCEV(const SCEV *V);
+  SCEVUse getNotSCEV(SCEVUse V);
 
   /// Return LHS-RHS.  Minus is represented in SCEV as A+B*-1.
   ///
@@ -697,9 +798,9 @@ class ScalarEvolution {...
[truncated]

Copy link

github-actions bot commented May 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

fhahn added a commit to fhahn/llvm-project that referenced this pull request May 13, 2024
Use SCEVUse from llvm#91961 to
return a SCEVUse with use-specific no-wrap flags for GEP expr, when
demanded.

Clients need to opt-in, as the use-specific flags may not be valid in
some contexts (e.g. backedge taken counts).
@fhahn
Copy link
Contributor Author

fhahn commented May 13, 2024

Example uses:

@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 8999663 to 3a3232d Compare May 22, 2024 14:22
fhahn added a commit to fhahn/llvm-project that referenced this pull request May 22, 2024
Use SCEVUse from llvm#91961 to
return a SCEVUse with use-specific no-wrap flags for GEP expr, when
demanded.

Clients need to opt-in, as the use-specific flags may not be valid in
some contexts (e.g. backedge taken counts).
@fhahn
Copy link
Contributor Author

fhahn commented May 22, 2024

ping :)

@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 07989af to 7b4f33b Compare May 22, 2024 15:29
@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 7b4f33b to 6104124 Compare May 31, 2024 13:39
@fhahn
Copy link
Contributor Author

fhahn commented May 31, 2024

rebase & ping

@fhahn
Copy link
Contributor Author

fhahn commented Jun 25, 2024

Ping. Any thoughts about the overall direction? There are a number of places where context-specific SCEV flags could be helpful, with my main interest being on LAA at the moment.

@fhahn
Copy link
Contributor Author

fhahn commented Jul 9, 2024

Ping

@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 4b0c62b to 352daca Compare August 1, 2024 19:47
@fhahn
Copy link
Contributor Author

fhahn commented Aug 1, 2024

Ping :)

Any thoughts on moving forward with this now we branched for the release?

@nikic
Copy link
Contributor

nikic commented Aug 2, 2024

The first commit looks good, can you please land it?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that this patch is supposed to just mechanical changes for SCEVUse introduction -- otherwise this would need more adjustments, e.g. the poison analysis code would need to account for these flags, and these are proper poison flags rather than UB flags.


I think my main high level question here is: What about SCEV unification? We get this straightforwardly for the top-level use flags because we can just compare the underlying pointers. But what about nested expressions?

If we have ((%a * %b)(u nuw) + %c) and ((%a * %b) + %c), I assume we're actually going to treat these as two distinct, non-unified expressions. We currently rely on equality for SCEVs to be pointer equality, and this would somewhat break that. Do we need to have each SCEV also store a pointer to the unified SCEV without flags or something?

void dump() const;
};

template <> struct PointerLikeTypeTraits<SCEVUse> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need PointerLikeTypeTraits on SCEVUse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed so it can be used in SmallPtrSet (and a few others I think). Added a comment.

/// you should specify a real typed pointer or avoid this template.
///
/// All clients should use assertions to do a run-time check to ensure that
/// this is actually true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copied comment doesn't make a lot of sense with a value of 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted, thanks!

}
};

template <typename To> [[nodiscard]] inline decltype(auto) dyn_cast(SCEVUse U) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is directly overloading dyn_cast the correct way to do this? Rather than providing a CastInfo implementation or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I've not managed to get CastInfo to work as needed, but I'll look into it more

C, detail::combineHashValue(reinterpret_cast<uintptr_t>(Op),
reinterpret_cast<uintptr_t>(Ty)));
C,
detail::combineHashValue(reinterpret_cast<uintptr_t>(Op.getPointer()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be getRawPointer()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

fhahn added a commit that referenced this pull request Aug 3, 2024
Use const SCEV * explicitly in more places to prepare for
#91961. Split off as suggested.
@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 352daca to 45b5e62 Compare October 4, 2024 14:56
@fhahn
Copy link
Contributor Author

fhahn commented Oct 4, 2024

I assume that this patch is supposed to just mechanical changes for SCEVUse introduction -- otherwise this would need more adjustments, e.g. the poison analysis code would need to account for these flags, and these are proper poison flags rather than UB flags.

Yes exactly. The first real use would be in #91964

I think my main high level question here is: What about SCEV unification? We get this straightforwardly for the top-level use flags because we can just compare the underlying pointers. But what about nested expressions?

If we have ((%a * %b)(u nuw) + %c) and ((%a * %b) + %c), I assume we're actually going to treat these as two distinct, non-unified expressions. We currently rely on equality for SCEVs to be pointer equality, and this would somewhat break that. Do we need to have each SCEV also store a pointer to the unified SCEV without flags or something?

I've update the patch to make a distinction between canonical (no flags in any subexpression) and non-canonical. Only canonical SCEVs can be compared with ==, with an assertion checking that. A canonical SCEV can be constructed by rebuilding the SCEV expression with all flags dropped. One of the bits in SCEVUse is used to indicate that the wrapped SCEV is canonical or not. So far it is not complete yet, and only supports expressions with flags in Add and Mul expressions.

This unfortunately about doubles the compile-time impact, worst increase is +0.18% for stage2-O3 ( https://llvm-compile-time-tracker.com/compare.php?from=9e831d50a0d67c6779048742657edb437818b681&to=d1a824d7f47b90faa4d708b543b78d28b761b18b&stat=instructions:u)

I didn't spend any time on tuning yet, but there are likely more places that will need updating as we make use of more SCEVUses.

I added some basic tests for comparing SCEVs with and without flags here https://github.com/llvm/llvm-project/pull/91961/files#diff-218646cbfc7bf44fcc471db9ceb892a5a69a89edfa871a459cedddc991eb67a8

Adding a pointer to each SCEV would be an alternative, I originally thought tracking whether a SCEV is canonical using a bit in the pointer may be less overhead for code that doesn't use any flags, but I'll prototype it to see how it compares

Copy link
Contributor Author

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also put together a version that stores the canonical SCEV via a pointer in each SCEV expression: 180f8be

Compile-time wise this seems to fare slightly better in most cases, with stage2-O3 being +0.17%
https://llvm-compile-time-tracker.com/compare.php?from=67c0846357bcd6faca713315380f9981a805a6e5&to=180f8be49b3d4db744704af11ad463a3f199d25a&stat=instructions:u

void dump() const;
};

template <> struct PointerLikeTypeTraits<SCEVUse> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed so it can be used in SmallPtrSet (and a few others I think). Added a comment.

/// you should specify a real typed pointer or avoid this template.
///
/// All clients should use assertions to do a run-time check to ensure that
/// this is actually true.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted, thanks!

C, detail::combineHashValue(reinterpret_cast<uintptr_t>(Op),
reinterpret_cast<uintptr_t>(Ty)));
C,
detail::combineHashValue(reinterpret_cast<uintptr_t>(Op.getPointer()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

}
};

template <typename To> [[nodiscard]] inline decltype(auto) dyn_cast(SCEVUse U) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I've not managed to get CastInfo to work as needed, but I'll look into it more

@fhahn fhahn force-pushed the users/fhahn/scevuse branch from 45b5e62 to a777a0e Compare October 9, 2024 13:35
fhahn added a commit to fhahn/llvm-project that referenced this pull request Oct 10, 2024
Use SCEVUse from llvm#91961 to
return a SCEVUse with use-specific no-wrap flags for GEP expr, when
demanded.

Clients need to opt-in, as the use-specific flags may not be valid in
some contexts (e.g. backedge taken counts).
@fhahn
Copy link
Contributor Author

fhahn commented Oct 15, 2024

Ping :)

I also put together a version that stores the canonical SCEV via a pointer in each SCEV expression: 180f8be

Compile-time wise this seems to fare slightly better in most cases, with stage2-O3 being +0.17% https://llvm-compile-time-tracker.com/compare.php?from=67c0846357bcd6faca713315380f9981a805a6e5&to=180f8be49b3d4db744704af11ad463a3f199d25a&stat=instructions:u

WDYT would be the next steps to move forward with this, given the compile-time impact?

Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very interesting patch! I can definitely see the value of having non-global SCEV flags, and I hope I was able to leave some comments that can help improve it.

Comment on lines +172 to +188
template <class X> auto dyn_cast_or_null(SCEVUse U) {
const SCEV *Val = U.getPointer();
if (!detail::isPresent(Val))
return CastInfo<X, const SCEV *>::castFailed();
return CastInfo<X, const SCEV *>::doCastIfPossible(detail::unwrapValue(Val));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CastInfo definitely needs to be worked out, otherwise dyn_cast_if_present, cast_if_present etc. won't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will do that once we are happy with the rest of the patch


/// Provide PointerLikeTypeTraits for SCEVUse, so it can be used with
/// SmallPtrSet, among others.
template <> struct PointerLikeTypeTraits<SCEVUse> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a way to escape PointerLikeTraits, as SmallPtrSet needs to work.

const SCEV *getCanonical(ScalarEvolution &SE) {
if (isCanonical())
return getPointer();
return computeCanonical(SE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by why you can't computeCanonical at the callsites, and keep the canonical representation. Alternatively, why can't you computeCanonical in the constructor? Calling computeCanonical every time has caused a huge compile-time regression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem with constructing the canonical form is that it needs access to a ScalarEvolution object, which currently isn't available in the constructors unfortunately, which makes this a bit cumbersome.

The latest version stores a pointer to the canonical version as suggested by @nikic , which avoids this

@@ -65,6 +65,117 @@ enum SCEVTypes : unsigned short;

extern bool VerifySCEV;

class SCEV;

class SCEVUse : public PointerIntPair<const SCEV *, 3> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need two bits: one for NUW and one for NSW. The canonical bit can be dropped, if you can canonicalize on construction.

@fhahn
Copy link
Contributor Author

fhahn commented Nov 26, 2024

@nikic I updated the current version to include a pointer to the canonical SCEV. WDYT would be the next steps to make progress with the patch and the follow-ups? How much compile-time would we be willing to spend on this?

@fhahn fhahn force-pushed the users/fhahn/scevuse branch from a15daba to f3905cd Compare December 12, 2024 22:05
fhahn added a commit that referenced this pull request Dec 13, 2024
Use const SCEV * explicitly in more places to prepare for
#91961.
This patch introduces SCEVUse, which is a tagged pointer containing the
used const SCEV *, plus extra bits to store NUW/NSW flags that are only
valid at the specific use.

This was suggested by @nikic as an alternative
to #90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse
instead of const SCEV *. It does not introduce any code that makes use
of the use-specific flags yet which I'll share as follow-ups.

Note that this should be NFC, but currently there's at least one case
where it is not (turn-to-invariant.ll), which I'll investigate once we
agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV
clients to use `const SCEV *` instead of `const auto *`, to prepare for
this patch. This reduces the number of changes needed, as SCEVUse will
automatically convert to `const SCEV *`. This is a safe default, as it
just drops the use-specific flags for the expression (it will not drop
any use-specific flags for any of its operands though).

This probably
SCEVUse could probably also be used to address mis-compiles due to
equivalent AddRecs modulo flags result in an AddRec with incorrect flags
for some uses of some phis, e.g. the one
#80430 attempted to fix

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

!fixup use raw pointer (const SCEV * + lower bits) for AddPointer.

!fix formatting

!fixup fix build failures after rebase, address comments

!fixup isCanonical/getCanonical

Simp
@fhahn fhahn force-pushed the users/fhahn/scevuse branch from f3905cd to 6eeb7f7 Compare December 13, 2024 11:40
@fhahn
Copy link
Contributor Author

fhahn commented Dec 13, 2024

Ping :) rebased after landing pattern matching and fixed the polly build failures.

Current CT impact: https://llvm-compile-time-tracker.com/compare.php?from=4e828f8d741ff61317bb1e0b67f22e274632b07a&to=f3905cdaf342e506e1e40dbef4d23b031dd83bd9&stat=instructions:u

@artagnon
Copy link
Contributor

artagnon commented Feb 12, 2025

Reverse ping. Do we have any viable alternative to resolve issues pertaining to the global wrap flags in SCEV? Several of them are quite complicated and exotic, but they result in real miscompiles. I think the compile-time regression is the cost to pay for a fundamental oversight in SCEV's design, and I think this patch will pay for itself in terms of codegen improvements (although there's a long tail of follow-up patches required). Consider putting up an RFC for this patch, so that we get consensus on the compile-time cost, and get volunteers to help out with follow-ups if it lands? I think you can motivate it by scraping issues for long-standing miscompiles that it would eventually fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants