Skip to content

[LLVM] Add option to store Parent-pointer in ilist_node_base #94224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 18, 2024

Conversation

SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Jun 3, 2024

This patch adds a new option for ilist, ilist_parent<ParentTy>, that enables storing an additional pointer in the ilist_node_base type to a specified "parent" type, and uses that option for Instruction.

This is distinct from the ilist_node_with_parent class, despite their apparent similarities. The ilist_node_with_parent class is a subclass of ilist_node that requires its own subclasses to define a getParent method, which is then used by the owning ilist for some of its operations; it is purely an interface declaration. The ilist_parent option on the other hand is concerned with data; it is declared as a property of the list itself, and causes every node of that list to own a pointer to the list owner; it has no effect on node access and adds no new methods. Not all ilist_node_with_parents should be in lists with ilist_parent; all lists with ilist_parent currently contain ilist_node_with_parents, but there's no real advantage to enforcing this relationship at compile time.

Currently, we can use BasicBlock::iterator to insert instructions, except when either the iterator is invalid (NodePtr=0x0), or when the iterator points to a sentinel value (BasicBlock::end()). The former can be easily verified from the iterator object itself, but the latter is currently not detectable (as we do not enable explicit sentinel tracking), and even if we know it we have no clean way to get the correct BasicBlock* value from the sentinel node (technically it should exist at a fixed offset from the start of the owning BasicBlock object, but using that fact is tricky without reinterpret_cast and funky pointer arithmetic). Moving the BasicBlock *Parent field from Instruction to ilist_node_base means that we can fetch the owning BB from the end() iterator, which allows us to use iterators for all insertions, without needing to store or pass an extra BasicBlock *BB argument alongside it.

The motivation for this is to move smoothly towards the proposed rewrite of the instruction insertion interface, which itself is needed to reduce the potential for errors to sneak in to the compiler from the use of insertion patterns that behave incorrectly with respect to debug records. This also allows us to greatly deduplicate the code for creating new instructions; in a followup commit (#94226), this behaviour is used to define a simple InsertPosition class that replaces the "insertBefore/insertAtEnd" arguments for all instruction creation/constructor methods, resulting in a significant reduction in the size of that file with almost neutral performance impact.

The performance impact of this patch can be seen on the compile time tracker here. The results are basically neutral, except for a very slight increase in compile time (0.04%) and file size (0.01%) of stage 2 clang; the followup commit that exploits this change causes a larger reduction in both of these statistics.

@SLTozer SLTozer self-assigned this Jun 3, 2024
@llvmbot llvmbot added the llvm:ir label Jun 3, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 3, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-adt

Author: Stephen Tozer (SLTozer)

Changes

This patch adds a new option for ilist, ilist_parent&lt;ParentTy&gt;, that enables storing an additional pointer in the ilist_node_base type to a specified "parent" type, and uses that option for Instruction.

This is distinct from the ilist_node_with_parent class, despite their apparent similarities. The ilist_node_with_parent class is a subclass of ilist_node that requires its own subclasses to define a getParent method, which is then used by the owning ilist for some of its operations; it is purely an interface declaration. The ilist_parent option on the other hand is concerned with data; it is declared as a property of the list itself, and causes every node of that list to own a pointer to the list owner; it has no effect on node access and adds no new methods. Not all ilist_node_with_parents should be in lists with ilist_parent; all lists with ilist_parent currently contain ilist_node_with_parents, but there's no real advantage to enforcing this relationship at compile time.

Currently, we can use BasicBlock::iterator to insert instructions, except when either the iterator is invalid (NodePtr=0x0), or when the iterator points to a sentinel value (BasicBlock::end()). The former can be easily verified from the iterator object itself, but the latter is currently not detectable (as we do not enable explicit sentinel tracking), and even if we know it we have no clean way to get the correct BasicBlock* value from the sentinel node (technically it should exist at a fixed offset from the start of the owning BasicBlock object, but using that fact is tricky without reinterpret_cast and funky pointer arithmetic). Moving the BasicBlock *Parent field from Instruction to ilist_node_base means that we can fetch the owning BB from the end() iterator, which allows us to use iterators for all insertions, without needing to store or pass an extra BasicBlock *BB argument alongside it.

The motivation for this is to move smoothly towards the proposed rewrite of the instruction insertion interface, which itself is needed to reduce the potential for errors to sneak in to the compiler from the use of insertion patterns that behave incorrectly with respect to debug records. This also allows us to greatly deduplicate the code for creating new instructions; in a followup commit (opening shortly), this behaviour is used to define a simple InsertPosition class that replaces the "insertBefore/insertAtEnd" arguments for all instruction creation/constructor methods, resulting in a significant reduction in the size of that file with almost neutral performance impact.

The performance impact of this patch can be seen on the compile time tracker here. The results are basically neutral, except for a very slight increase in compile time (0.04%) and file size (0.01%) of stage 2 clang; the followup commit that exploits this change causes a larger reduction in both of these statistics.


Patch is 20.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94224.diff

10 Files Affected:

  • (modified) llvm/include/llvm/ADT/ilist_base.h (+2-2)
  • (modified) llvm/include/llvm/ADT/ilist_iterator.h (+10)
  • (modified) llvm/include/llvm/ADT/ilist_node.h (+11)
  • (modified) llvm/include/llvm/ADT/ilist_node_base.h (+48-3)
  • (modified) llvm/include/llvm/ADT/ilist_node_options.h (+37-6)
  • (modified) llvm/include/llvm/IR/BasicBlock.h (+6-4)
  • (modified) llvm/include/llvm/IR/Instruction.h (+9-6)
  • (modified) llvm/include/llvm/IR/ValueSymbolTable.h (+3-1)
  • (modified) llvm/lib/IR/BasicBlock.cpp (+3-2)
  • (modified) llvm/lib/IR/Instruction.cpp (+11-14)
diff --git a/llvm/include/llvm/ADT/ilist_base.h b/llvm/include/llvm/ADT/ilist_base.h
index b8c098b951ade..000253a26012b 100644
--- a/llvm/include/llvm/ADT/ilist_base.h
+++ b/llvm/include/llvm/ADT/ilist_base.h
@@ -15,9 +15,9 @@
 namespace llvm {
 
 /// Implementations of list algorithms using ilist_node_base.
-template <bool EnableSentinelTracking> class ilist_base {
+template <bool EnableSentinelTracking, class ParentPtrTy> class ilist_base {
 public:
-  using node_base_type = ilist_node_base<EnableSentinelTracking>;
+  using node_base_type = ilist_node_base<EnableSentinelTracking, ParentPtrTy>;
 
   static void insertBeforeImpl(node_base_type &Next, node_base_type &N) {
     node_base_type &Prev = *Next.getPrev();
diff --git a/llvm/include/llvm/ADT/ilist_iterator.h b/llvm/include/llvm/ADT/ilist_iterator.h
index 2393c4d2c403c..635e050e0d09a 100644
--- a/llvm/include/llvm/ADT/ilist_iterator.h
+++ b/llvm/include/llvm/ADT/ilist_iterator.h
@@ -70,6 +70,7 @@ class ilist_iterator : ilist_detail::SpecificNodeAccess<OptionsT> {
   using iterator_category = std::bidirectional_iterator_tag;
   using const_pointer = typename OptionsT::const_pointer;
   using const_reference = typename OptionsT::const_reference;
+  using parent_ptr_ty = typename OptionsT::parent_ptr_ty;
 
 private:
   using node_pointer = typename Traits::node_pointer;
@@ -101,6 +102,8 @@ class ilist_iterator : ilist_detail::SpecificNodeAccess<OptionsT> {
     return *this;
   }
 
+  parent_ptr_ty getNodeParent() { return NodePtr->getNodeBaseParent(); }
+
   /// Explicit conversion between forward/reverse iterators.
   ///
   /// Translate between forward and reverse iterators without changing range
@@ -168,6 +171,8 @@ class ilist_iterator : ilist_detail::SpecificNodeAccess<OptionsT> {
     return tmp;
   }
 
+  bool isValid() const { return NodePtr; }
+
   /// Get the underlying ilist_node.
   node_pointer getNodePtr() const { return static_cast<node_pointer>(NodePtr); }
 
@@ -195,6 +200,7 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
   using iterator_category = std::bidirectional_iterator_tag;
   using const_pointer = typename OptionsT::const_pointer;
   using const_reference = typename OptionsT::const_reference;
+  using parent_ptr_ty = typename OptionsT::parent_ptr_ty;
 
 private:
   using node_pointer = typename Traits::node_pointer;
@@ -319,6 +325,10 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
     return tmp;
   }
 
+  parent_ptr_ty getNodeParent() { return NodePtr->getNodeBaseParent(); }
+
+  bool isValid() const { return NodePtr; }
+
   /// Get the underlying ilist_node.
   node_pointer getNodePtr() const { return static_cast<node_pointer>(NodePtr); }
 
diff --git a/llvm/include/llvm/ADT/ilist_node.h b/llvm/include/llvm/ADT/ilist_node.h
index 3b6f0dcc7b5e9..ec615497abee1 100644
--- a/llvm/include/llvm/ADT/ilist_node.h
+++ b/llvm/include/llvm/ADT/ilist_node.h
@@ -118,7 +118,9 @@ template <class OptionsT> class ilist_node_impl : OptionsT::node_base_type {
   }
 
   // Under-approximation, but always available for assertions.
+  using node_base_type::getNodeBaseParent;
   using node_base_type::isKnownSentinel;
+  using node_base_type::setNodeBaseParent;
 
   /// Check whether this is the sentinel node.
   ///
@@ -171,6 +173,15 @@ template <class OptionsT> class ilist_node_impl : OptionsT::node_base_type {
 /// }
 /// \endexample
 ///
+/// When the \a ilist_parent<ParentTy> option is passed to an ilist_node and the
+/// owning ilist, each node contains a pointer to the ilist's owner. This
+/// pointer does not have any automatic behaviour; set it manually, including
+/// for the sentinel node when the list is created. The primary benefit of this
+/// over declaring and using this pointer in the final node class is that the
+/// pointer will be added in the sentinel, meaning that you can safely use \a
+/// ilist_iterator::getNodeParent() to get the node parent from any valid (i.e.
+/// non-null) iterator, even a sentinel value.
+///
 /// See \a is_valid_option for steps on adding a new option.
 template <class T, class... Options>
 class ilist_node
diff --git a/llvm/include/llvm/ADT/ilist_node_base.h b/llvm/include/llvm/ADT/ilist_node_base.h
index f6c518e6eed7a..e396bb22fca02 100644
--- a/llvm/include/llvm/ADT/ilist_node_base.h
+++ b/llvm/include/llvm/ADT/ilist_node_base.h
@@ -16,9 +16,9 @@ namespace llvm {
 /// Base class for ilist nodes.
 ///
 /// Optionally tracks whether this node is the sentinel.
-template <bool EnableSentinelTracking> class ilist_node_base;
+template <bool EnableSentinelTracking, class ParentPtrTy> class ilist_node_base;
 
-template <> class ilist_node_base<false> {
+template <> class ilist_node_base<false, void> {
   ilist_node_base *Prev = nullptr;
   ilist_node_base *Next = nullptr;
 
@@ -28,11 +28,14 @@ template <> class ilist_node_base<false> {
   ilist_node_base *getPrev() const { return Prev; }
   ilist_node_base *getNext() const { return Next; }
 
+  void setNodeBaseParent(void) {}
+  inline void getNodeBaseParent() const {}
+
   bool isKnownSentinel() const { return false; }
   void initializeSentinel() {}
 };
 
-template <> class ilist_node_base<true> {
+template <> class ilist_node_base<true, void> {
   PointerIntPair<ilist_node_base *, 1> PrevAndSentinel;
   ilist_node_base *Next = nullptr;
 
@@ -42,6 +45,48 @@ template <> class ilist_node_base<true> {
   ilist_node_base *getPrev() const { return PrevAndSentinel.getPointer(); }
   ilist_node_base *getNext() const { return Next; }
 
+  void setNodeBaseParent(void) {}
+  inline void getNodeBaseParent() const {}
+
+  bool isSentinel() const { return PrevAndSentinel.getInt(); }
+  bool isKnownSentinel() const { return isSentinel(); }
+  void initializeSentinel() { PrevAndSentinel.setInt(true); }
+};
+
+template <class ParentPtrTy> class ilist_node_base<false, ParentPtrTy> {
+  ilist_node_base *Prev = nullptr;
+  ilist_node_base *Next = nullptr;
+  ParentPtrTy Parent = nullptr;
+
+public:
+  void setPrev(ilist_node_base *Prev) { this->Prev = Prev; }
+  void setNext(ilist_node_base *Next) { this->Next = Next; }
+  ilist_node_base *getPrev() const { return Prev; }
+  ilist_node_base *getNext() const { return Next; }
+
+  void setNodeBaseParent(ParentPtrTy Parent) { this->Parent = Parent; }
+  inline const ParentPtrTy getNodeBaseParent() const { return Parent; }
+  inline ParentPtrTy getNodeBaseParent() { return Parent; }
+
+  bool isKnownSentinel() const { return false; }
+  void initializeSentinel() {}
+};
+
+template <class ParentPtrTy> class ilist_node_base<true, ParentPtrTy> {
+  PointerIntPair<ilist_node_base *, 1> PrevAndSentinel;
+  ilist_node_base *Next = nullptr;
+  ParentPtrTy Parent = nullptr;
+
+public:
+  void setPrev(ilist_node_base *Prev) { PrevAndSentinel.setPointer(Prev); }
+  void setNext(ilist_node_base *Next) { this->Next = Next; }
+  ilist_node_base *getPrev() const { return PrevAndSentinel.getPointer(); }
+  ilist_node_base *getNext() const { return Next; }
+
+  void setNodeBaseParent(ParentPtrTy Parent) { this->Parent = Parent; }
+  inline const ParentPtrTy getNodeBaseParent() const { return Parent; }
+  inline ParentPtrTy getNodeBaseParent() { return Parent; }
+
   bool isSentinel() const { return PrevAndSentinel.getInt(); }
   bool isKnownSentinel() const { return isSentinel(); }
   void initializeSentinel() { PrevAndSentinel.setInt(true); }
diff --git a/llvm/include/llvm/ADT/ilist_node_options.h b/llvm/include/llvm/ADT/ilist_node_options.h
index e6e1068953e36..aa32162cd51e4 100644
--- a/llvm/include/llvm/ADT/ilist_node_options.h
+++ b/llvm/include/llvm/ADT/ilist_node_options.h
@@ -15,8 +15,8 @@
 
 namespace llvm {
 
-template <bool EnableSentinelTracking> class ilist_node_base;
-template <bool EnableSentinelTracking> class ilist_base;
+template <bool EnableSentinelTracking, class ParentPtrTy> class ilist_node_base;
+template <bool EnableSentinelTracking, class ParentPtrTy> class ilist_base;
 
 /// Option to choose whether to track sentinels.
 ///
@@ -39,6 +39,19 @@ template <class Tag> struct ilist_tag {};
 /// iterator class to store that information.
 template <bool ExtraIteratorBits> struct ilist_iterator_bits {};
 
+/// Option to add a pointer to this list's owner in every node.
+///
+/// This option causes the \a ilist_base_node for this list to contain a pointer
+/// ParentTy *Parent, returned by \a ilist_base_node::getNodeBaseParent() and
+/// set by \a ilist_base_node::setNodeBaseParent(ParentTy *Parent). The parent
+/// value is not set automatically; the ilist owner should set itself as the
+/// parent of the list sentinel, and the parent should be set on each node
+/// inserted into the list. This value is also not used by
+/// \a ilist_node_with_parent::getNodeParent(), but is used by \a
+/// ilist_iterator::getNodeParent(), which allows the parent to be fetched from
+/// any valid (non-null) iterator to this list, including the sentinel.
+template <class ParentTy> struct ilist_parent {};
+
 namespace ilist_detail {
 
 /// Helper trait for recording whether an option is specified explicitly.
@@ -114,6 +127,21 @@ template <> struct extract_iterator_bits<> : std::false_type, is_implicit {};
 template <bool IteratorBits>
 struct is_valid_option<ilist_iterator_bits<IteratorBits>> : std::true_type {};
 
+/// Extract node parent option.
+///
+/// Look through \p Options for the \a ilist_parent option, pulling out the
+/// custom parent type, using void as a default.
+template <class... Options> struct extract_parent;
+template <class ParentTy, class... Options>
+struct extract_parent<ilist_parent<ParentTy>, Options...> {
+  typedef ParentTy *type;
+};
+template <class Option1, class... Options>
+struct extract_parent<Option1, Options...> : extract_parent<Options...> {};
+template <> struct extract_parent<> { typedef void type; };
+template <class ParentTy>
+struct is_valid_option<ilist_parent<ParentTy>> : std::true_type {};
+
 /// Check whether options are valid.
 ///
 /// The conjunction of \a is_valid_option on each individual option.
@@ -128,7 +156,7 @@ struct check_options<Option1, Options...>
 ///
 /// This is usually computed via \a compute_node_options.
 template <class T, bool EnableSentinelTracking, bool IsSentinelTrackingExplicit,
-          class TagT, bool HasIteratorBits>
+          class TagT, bool HasIteratorBits, class ParentPtrTy>
 struct node_options {
   typedef T value_type;
   typedef T *pointer;
@@ -140,15 +168,18 @@ struct node_options {
   static const bool is_sentinel_tracking_explicit = IsSentinelTrackingExplicit;
   static const bool has_iterator_bits = HasIteratorBits;
   typedef TagT tag;
-  typedef ilist_node_base<enable_sentinel_tracking> node_base_type;
-  typedef ilist_base<enable_sentinel_tracking> list_base_type;
+  typedef ParentPtrTy parent_ptr_ty;
+  typedef ilist_node_base<enable_sentinel_tracking, parent_ptr_ty>
+      node_base_type;
+  typedef ilist_base<enable_sentinel_tracking, parent_ptr_ty> list_base_type;
 };
 
 template <class T, class... Options> struct compute_node_options {
   typedef node_options<T, extract_sentinel_tracking<Options...>::value,
                        extract_sentinel_tracking<Options...>::is_explicit,
                        typename extract_tag<Options...>::type,
-                       extract_iterator_bits<Options...>::value>
+                       extract_iterator_bits<Options...>::value,
+                       typename extract_parent<Options...>::type>
       type;
 };
 
diff --git a/llvm/include/llvm/IR/BasicBlock.h b/llvm/include/llvm/IR/BasicBlock.h
index e1220966e7e6e..80067f2652a2b 100644
--- a/llvm/include/llvm/IR/BasicBlock.h
+++ b/llvm/include/llvm/IR/BasicBlock.h
@@ -59,7 +59,8 @@ class DbgMarker;
 class BasicBlock final : public Value, // Basic blocks are data objects also
                          public ilist_node_with_parent<BasicBlock, Function> {
 public:
-  using InstListType = SymbolTableList<Instruction, ilist_iterator_bits<true>>;
+  using InstListType = SymbolTableList<Instruction, ilist_iterator_bits<true>,
+                                       ilist_parent<BasicBlock>>;
   /// Flag recording whether or not this block stores debug-info in the form
   /// of intrinsic instructions (false) or non-instruction records (true).
   bool IsNewDbgInfoFormat;
@@ -172,10 +173,11 @@ class BasicBlock final : public Value, // Basic blocks are data objects also
   friend BasicBlock::iterator Instruction::eraseFromParent();
   friend BasicBlock::iterator Instruction::insertInto(BasicBlock *BB,
                                                       BasicBlock::iterator It);
-  friend class llvm::SymbolTableListTraits<llvm::Instruction,
-                                           ilist_iterator_bits<true>>;
+  friend class llvm::SymbolTableListTraits<
+      llvm::Instruction, ilist_iterator_bits<true>, ilist_parent<BasicBlock>>;
   friend class llvm::ilist_node_with_parent<llvm::Instruction, llvm::BasicBlock,
-                                            ilist_iterator_bits<true>>;
+                                            ilist_iterator_bits<true>,
+                                            ilist_parent<BasicBlock>>;
 
   // Friendly methods that need to access us for the maintenence of
   // debug-info attachments.
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 249be2799fa93..e1cb362ad20d0 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -46,11 +46,13 @@ getDbgRecordRange(DbgMarker *);
 
 class Instruction : public User,
                     public ilist_node_with_parent<Instruction, BasicBlock,
-                                                  ilist_iterator_bits<true>> {
+                                                  ilist_iterator_bits<true>,
+                                                  ilist_parent<BasicBlock>> {
 public:
-  using InstListType = SymbolTableList<Instruction, ilist_iterator_bits<true>>;
+  using InstListType = SymbolTableList<Instruction, ilist_iterator_bits<true>,
+                                       ilist_parent<BasicBlock>>;
+
 private:
-  BasicBlock *Parent;
   DebugLoc DbgLoc;                         // 'dbg' Metadata cache.
 
   /// Relative order of this instruction in its parent basic block. Used for
@@ -149,8 +151,8 @@ class Instruction : public User,
   Instruction       *user_back()       { return cast<Instruction>(*user_begin());}
   const Instruction *user_back() const { return cast<Instruction>(*user_begin());}
 
-  inline const BasicBlock *getParent() const { return Parent; }
-  inline       BasicBlock *getParent()       { return Parent; }
+  inline const BasicBlock *getParent() const { return getNodeBaseParent(); }
+  inline BasicBlock *getParent() { return getNodeBaseParent(); }
 
   /// Return the module owning the function this instruction belongs to
   /// or nullptr it the function does not have a module.
@@ -980,7 +982,8 @@ class Instruction : public User,
   };
 
 private:
-  friend class SymbolTableListTraits<Instruction, ilist_iterator_bits<true>>;
+  friend class SymbolTableListTraits<Instruction, ilist_iterator_bits<true>,
+                                     ilist_parent<BasicBlock>>;
   friend class BasicBlock; // For renumbering.
 
   // Shadow Value::setValueSubclassData with a private forwarding method so that
diff --git a/llvm/include/llvm/IR/ValueSymbolTable.h b/llvm/include/llvm/IR/ValueSymbolTable.h
index 6350f6a2435e4..cd1dbbe1688a1 100644
--- a/llvm/include/llvm/IR/ValueSymbolTable.h
+++ b/llvm/include/llvm/IR/ValueSymbolTable.h
@@ -28,6 +28,7 @@ class GlobalIFunc;
 class GlobalVariable;
 class Instruction;
 template <bool ExtraIteratorBits> struct ilist_iterator_bits;
+template <class ParentTy> struct ilist_parent;
 template <unsigned InternalLen> class SmallString;
 template <typename ValueSubClass, typename ... Args> class SymbolTableListTraits;
 
@@ -42,7 +43,8 @@ class ValueSymbolTable {
   friend class SymbolTableListTraits<GlobalAlias>;
   friend class SymbolTableListTraits<GlobalIFunc>;
   friend class SymbolTableListTraits<GlobalVariable>;
-  friend class SymbolTableListTraits<Instruction, ilist_iterator_bits<true>>;
+  friend class SymbolTableListTraits<Instruction, ilist_iterator_bits<true>,
+                                     ilist_parent<BasicBlock>>;
   friend class Value;
 
 /// @name Types
diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index 29f2cbf611fa3..a5199f36c1cb5 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -175,8 +175,8 @@ template <> void llvm::invalidateParentIListOrdering(BasicBlock *BB) {
 
 // Explicit instantiation of SymbolTableListTraits since some of the methods
 // are not in the public header file...
-template class llvm::SymbolTableListTraits<Instruction,
-                                           ilist_iterator_bits<true>>;
+template class llvm::SymbolTableListTraits<
+    Instruction, ilist_iterator_bits<true>, ilist_parent<BasicBlock>>;
 
 BasicBlock::BasicBlock(LLVMContext &C, const Twine &Name, Function *NewParent,
                        BasicBlock *InsertBefore)
@@ -189,6 +189,7 @@ BasicBlock::BasicBlock(LLVMContext &C, const Twine &Name, Function *NewParent,
     assert(!InsertBefore &&
            "Cannot insert block before another block with no function!");
 
+  end().getNodePtr()->setNodeBaseParent(this);
   setName(Name);
   if (NewParent)
     setIsNewDbgInfoFormat(NewParent->IsNewDbgInfoFormat);
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 29272e627a1d1..8e8131113a230 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -27,8 +27,7 @@ using namespace llvm;
 
 Instruction::Instruction(Type *ty, unsigned it, Use *Ops, unsigned NumOps,
                          InstListType::iterator InsertBefore)
-    : User(ty, Value::InstructionVal + it, Ops, NumOps), Parent(nullptr) {
-
+    : User(ty, Value::InstructionVal + it, Ops, NumOps) {
   // When called with an iterator, there must be a block to insert into.
   BasicBlock *BB = InsertBefore->getParent();
   assert(BB && "Instruction to insert before is not in a basic block!");
@@ -37,7 +36,7 @@ Instruction::Instruction(Type *ty, unsigned it, Use *Ops, unsigned NumOps,
 
 Instruction::Instruction(Type *ty, unsigned it, Use *Ops, unsigned NumOps,
                          Instruction *InsertBefore)
-  : User(ty, Value::InstructionVal + it, Ops, NumOps), Parent(nullptr) {
+    : User(ty, Value::InstructionVal + it, Ops, NumOps) {
 
   // If requested, insert this instruction into a basic block...
   if (InsertBefore) {
@@ -49,15 +48,14 @@ Instruction::Instruction(Type *ty, unsigned it, Use *Ops, unsigned NumOps,
 
 Instruction::Instruction(Type *ty, unsigned it, Use *Ops, unsigned NumOps,
                          BasicBlock *InsertAtEnd)
-    : User(ty, Value::InstructionVal + it, Ops, NumOps), Parent(nullptr) {
-
+    : User(ty, Value::InstructionVal + it, Ops, NumOps) {
   // If requested, append this instruction into the basic block.
   if (InsertAtEnd)
     insertInto(InsertAtEnd, InsertAtEnd->end());
 }
 
 Instruction::~Instruction() {
-  assert(!Parent && "Instruction still linked in the program!");
+  assert(!getParent() && "Instruction still linked in the program!");
 
   // Replace any extant metadata uses of this instruction with undef to
   // preserve debug info accuracy. Some alternatives include:
@@ -76,9 +74,7 @@ Instruction::~Instruction() {
   setMetadata(LLVMContext::MD_DIAssignID, nullptr);
 }
 
-void Instruction::setParent(BasicBlock *P) {
-  Parent = P;
-}
+void Instruction::setParent(BasicBlock *P) { setNodeBaseParent(P); }
 
 const Module *Instruction::getModule() const {
   return getParent()->getModule();
@@ -96,7 +92,7 @@ void Instruction::removeFromParent() {
 }
 
 void Instruction::handleMarkerRemoval() {
-  if (!Parent->IsNewDbgInfoFormat || !DebugMarker)
+  if (!getParent()->IsNewDbgInfoFormat || !DebugMarker)
     return;
 
   DebugMarker->removeMarker();
@@ -329,11 +325,12 @@ void Instruction::dropOneDbgRecord(DbgRecord *DVR) {
 }
 
 bool Instruction::comesBefore(const Instruction *Other) const {
-  assert(Parent && Other->Parent &&
+  assert(getParent() && Other->getParent() &&
          "instructions without BB parents have no order");
-  assert(Parent == Other->Parent && "cross-BB instruction order comparison");
-  if (!Parent->isInstrOrderValid())
-    Parent->renumberI...
[truncated]

Copy link

github-actions bot commented Jun 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@dexonsmith dexonsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time for a detailed review (travelling for the next few months)... but the plan in principle SGTM! Seems like a great way to detect the end iterator. Thanks for working on this.

void initializeSentinel() { PrevAndSentinel.setInt(true); }
};

template <class ParentPtrTy> class ilist_node_base<false, ParentPtrTy> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this inherit from <false, void> to reduce code duplication? Or if that's confusing, extract a common base class for the previous/next logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not directly, but the common base class approach could work!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really familiar with all the infrastructure here, but conceptually this looks good to me. I've always disliked how a proper insertion point requires both an iterator and a BasicBlock, and this solves the problem quite elegantly.

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 4, 2024

Alright, I've made some modifications - using some mixin classes to 1) reduce a bit of duplication in ilist_node_base, and 2) make the parent access a bit "smarter": ilist_node_impl will now have public getParent()/setParent(ParentTy*) methods iff the list uses ilist_parent, and similarly the ilist_iterator only provides getNodeParent() if the node type uses ilist_parent. This reduces the potential for misuse (unknowingly trying to getNodeParent() from an iterator that doesn't actually support it), and means that Instruction doesn't need to declare getParent()/setParent(BasicBlock*) at all.

I think this is getting close to "too much template logic"; I also experimented making changes to SymbolTableList that would set the parent on the sentinel node automatically (as it does for other nodes), but it ended up at 30 lines of garbled template nonsense in order to avoid one call to end().getNodePtr()->setParent(this) in the BasicBlock constructor, so I decided not to add that to the patch.

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 18, 2024

Ping

@nikic
Copy link
Contributor

nikic commented Jun 18, 2024

@SLTozer There is a compilation failure in LLVM unit tests (IListBaseTest.cpp).

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 18, 2024

@SLTozer There is a compilation failure in LLVM unit tests (IListBaseTest.cpp).

Ah, I missed the unit tests! Fixed the compilation errors and added some tests for the new ilist_node_base behaviour; I'll also add some unit tests for the other IList classes that got functional changes in this patch.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming this has no undue impact on build times.

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 18, 2024

LGTM assuming this has no undue impact on build times.

The first test I ran had no significant impact on compile times (results), other than a slight increase in the time to build clang; this increase in time-to-build-clang is more than compensated for by the followup patch for InsertPosition (results). I'll submit a quick check to the tracker to verify that the recent commits and rebase hasn't affected compile times, but most likely it will be fine. Thank you for the review!

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 18, 2024

Latest update, there's no major impact on compile times (results); the only cost is the time to build clang, which as noted will be fixed by the next patch along.

@nikic
Copy link
Contributor

nikic commented Jun 18, 2024

Latest update, there's no major impact on compile times (results); the only cost is the time to build clang, which as noted will be fixed by the next patch along.

Thanks, those numbers look good to me.

@SLTozer SLTozer merged commit 0863bd8 into llvm:main Jun 18, 2024
7 checks passed
@dklimkin
Copy link
Member

H Stephen,

We ran into some issues with getParent() and getNodeBaseParent() returning a const pointer rather than a pointer to a const, resulting in iterators returned also not being const. I'll send a PR with a workaround but adding a note here in case you have a better solution in mind.

@SLTozer
Copy link
Contributor Author

SLTozer commented Jun 19, 2024

Ouch, that's an oversight on my part - I think the correct solution probably involves adding a const_parent_ptr_ty to node_options, though I think that either requires us to pass both around (making the template specialization logic a bit messier/nastier), or to pass around the parent type directly, which is probably the better choice - the only cost is that we can't set a void* parent pointer, but we probably don't need that!

If you upload your workaround soon I'll happily take a look at it, but otherwise I can fix it up quickly enough.

@dklimkin
Copy link
Member

Sent you #96053 but I'd prefer passing around the parent type directly. Adding to the options makes things messy, I did try that. Thanks!

SLTozer added a commit to SLTozer/llvm-project that referenced this pull request Jun 19, 2024
Fixes issue reported in: llvm#94224

The recent commit above added an ilist_parent<ParentTy> option, which
added a parent pointer to the ilist_node_base type for the list. The
const methods for returning that parent pointer however were incorrectly
implemented, returning `const ParentPtrTy`, which is equivalent to
`ParentTy * const` rather than `const ParentTy *`. This patch fixes this
by passing around `ParentTy` in ilist's internal logic rather than
`ParentPtrTy`, removing the ability to have a `void*` parent pointer but
cleanly fixing this error.
SLTozer added a commit that referenced this pull request Jun 19, 2024
Fixes issue reported in: #94224

The recent commit above added an ilist_parent<ParentTy> option, which
added a parent pointer to the ilist_node_base type for the list. The
const methods for returning that parent pointer however were incorrectly
implemented, returning `const ParentPtrTy`, which is equivalent to
`ParentTy * const` rather than `const ParentTy *`. This patch fixes this
by passing around `ParentTy` in ilist's internal logic rather than
`ParentPtrTy`, removing the ability to have a `void*` parent pointer but
cleanly fixing this error.
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
)

This patch adds a new option for `ilist`, `ilist_parent<ParentTy>`, that
enables storing an additional pointer in the `ilist_node_base` type to a
specified "parent" type, and uses that option for `Instruction`.

This is distinct from the `ilist_node_with_parent` class, despite their
apparent similarities. The `ilist_node_with_parent` class is a subclass
of `ilist_node` that requires its own subclasses to define a `getParent`
method, which is then used by the owning `ilist` for some of its
operations; it is purely an interface declaration. The `ilist_parent`
option on the other hand is concerned with data, adding a parent field
to the `ilist_node_base` class.

Currently, we can use `BasicBlock::iterator` to insert instructions,
_except_ when either the iterator is invalid (`NodePtr=0x0`), or when
the iterator points to a sentinel value (`BasicBlock::end()`). This patch
results in the sentinel value also having a valid pointer to its owning
basic block, which allows us to use iterators for all insertions,
without needing to store or pass an extra `BasicBlock *BB` argument
alongside it.
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
Fixes issue reported in: llvm#94224

The recent commit above added an ilist_parent<ParentTy> option, which
added a parent pointer to the ilist_node_base type for the list. The
const methods for returning that parent pointer however were incorrectly
implemented, returning `const ParentPtrTy`, which is equivalent to
`ParentTy * const` rather than `const ParentTy *`. This patch fixes this
by passing around `ParentTy` in ilist's internal logic rather than
`ParentPtrTy`, removing the ability to have a `void*` parent pointer but
cleanly fixing this error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants