Skip to content

[DenseMap] Optimize find/erase #100517

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 25, 2024
Merged

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Jul 25, 2024

LookupBucketFor is used for find, insert, erase, and their
variants. While tombstone comparison isn't needed by find/erase,
LookupBucketFor calls getTombstoneKey regardless, which might have
an opaque implementation or just not optimized out, leading to
unnecessary overhead for find and erase.

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Member

llvmbot commented Jul 25, 2024

@llvm/pr-subscribers-llvm-adt

Author: Fangrui Song (MaskRay)

Changes

LookupBucketFor is used for both find and insert.
LookupBucketFor calls getTombstoneKey, which might be opaque or just
not optimized out, leading to unnecessary overhead for find.


Full diff: https://github.com/llvm/llvm-project/pull/100517.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/DenseMap.h (+47-29)
diff --git a/llvm/include/llvm/ADT/DenseMap.h b/llvm/include/llvm/ADT/DenseMap.h
index 7ccc9445c0a7b..e130861307ac9 100644
--- a/llvm/include/llvm/ADT/DenseMap.h
+++ b/llvm/include/llvm/ADT/DenseMap.h
@@ -143,8 +143,7 @@ class DenseMapBase : public DebugEpochBase {
 
   /// Return true if the specified key is in the map, false otherwise.
   bool contains(const_arg_type_t<KeyT> Val) const {
-    const BucketT *TheBucket;
-    return LookupBucketFor(Val, TheBucket);
+    return doFind(Val) != nullptr;
   }
 
   /// Return 1 if the specified key is in the map, 0 otherwise.
@@ -153,21 +152,17 @@ class DenseMapBase : public DebugEpochBase {
   }
 
   iterator find(const_arg_type_t<KeyT> Val) {
-    BucketT *TheBucket;
-    if (LookupBucketFor(Val, TheBucket))
-      return makeIterator(TheBucket,
-                          shouldReverseIterate<KeyT>() ? getBuckets()
-                                                       : getBucketsEnd(),
-                          *this, true);
+    if (BucketT *Bucket = doFind(Val))
+      return makeIterator(
+          Bucket, shouldReverseIterate<KeyT>() ? getBuckets() : getBucketsEnd(),
+          *this, true);
     return end();
   }
   const_iterator find(const_arg_type_t<KeyT> Val) const {
-    const BucketT *TheBucket;
-    if (LookupBucketFor(Val, TheBucket))
-      return makeConstIterator(TheBucket,
-                               shouldReverseIterate<KeyT>() ? getBuckets()
-                                                            : getBucketsEnd(),
-                               *this, true);
+    if (const BucketT *Bucket = doFind(Val))
+      return makeConstIterator(
+          Bucket, shouldReverseIterate<KeyT>() ? getBuckets() : getBucketsEnd(),
+          *this, true);
     return end();
   }
 
@@ -178,31 +173,26 @@ class DenseMapBase : public DebugEpochBase {
   /// type used.
   template<class LookupKeyT>
   iterator find_as(const LookupKeyT &Val) {
-    BucketT *TheBucket;
-    if (LookupBucketFor(Val, TheBucket))
-      return makeIterator(TheBucket,
-                          shouldReverseIterate<KeyT>() ? getBuckets()
-                                                       : getBucketsEnd(),
-                          *this, true);
+    if (BucketT *Bucket = doFind(Val))
+      return makeIterator(
+          Bucket, shouldReverseIterate<KeyT>() ? getBuckets() : getBucketsEnd(),
+          *this, true);
     return end();
   }
   template<class LookupKeyT>
   const_iterator find_as(const LookupKeyT &Val) const {
-    const BucketT *TheBucket;
-    if (LookupBucketFor(Val, TheBucket))
-      return makeConstIterator(TheBucket,
-                               shouldReverseIterate<KeyT>() ? getBuckets()
-                                                            : getBucketsEnd(),
-                               *this, true);
+    if (const BucketT *Bucket = doFind(Val))
+      return makeConstIterator(
+          Bucket, shouldReverseIterate<KeyT>() ? getBuckets() : getBucketsEnd(),
+          *this, true);
     return end();
   }
 
   /// lookup - Return the entry for the specified key, or a default
   /// constructed value if no such entry exists.
   ValueT lookup(const_arg_type_t<KeyT> Val) const {
-    const BucketT *TheBucket;
-    if (LookupBucketFor(Val, TheBucket))
-      return TheBucket->getSecond();
+    if (const BucketT *Bucket = doFind(Val))
+      return Bucket->getSecond();
     return ValueT();
   }
 
@@ -643,6 +633,34 @@ class DenseMapBase : public DebugEpochBase {
     return TheBucket;
   }
 
+  template <typename LookupKeyT> BucketT *doFind(const LookupKeyT &Val) {
+    BucketT *BucketsPtr = getBuckets();
+    const unsigned NumBuckets = getNumBuckets();
+    if (NumBuckets == 0)
+      return nullptr;
+
+    const KeyT EmptyKey = getEmptyKey();
+    unsigned BucketNo = getHashValue(Val) & (NumBuckets - 1);
+    unsigned ProbeAmt = 1;
+    while (true) {
+      BucketT *Bucket = BucketsPtr + BucketNo;
+      if (LLVM_LIKELY(KeyInfoT::isEqual(Val, Bucket->getFirst())))
+        return Bucket;
+      if (LLVM_LIKELY(KeyInfoT::isEqual(Bucket->getFirst(), EmptyKey)))
+        return nullptr;
+
+      // Otherwise, it's a hash collision or a tombstone, continue quadratic
+      // probing.
+      BucketNo += ProbeAmt++;
+      BucketNo &= NumBuckets - 1;
+    }
+  }
+
+  template <typename LookupKeyT>
+  const BucketT *doFind(const LookupKeyT &Val) const {
+    return const_cast<DenseMapBase *>(this)->doFind(Val); // NOLINT
+  }
+
   /// LookupBucketFor - Lookup the appropriate bucket for Val, returning it in
   /// FoundBucket.  If the bucket contains the key and a value, this returns
   /// true, otherwise it returns a bucket with an empty marker or tombstone and

@MaskRay
Copy link
Member Author

MaskRay commented Jul 25, 2024

https://llvm-compile-time-tracker.com/compare.php?from=8608cc1c89640bd3d8120f24c964af21310253b6&to=c955a66eb0afec50bb24e6cf486513fe6bef4570&stat=instructions:u

stage2-O3:

Benchmark Old New
kimwitu++ 38637M 38591M (-0.12%)
sqlite3 34850M 34782M (-0.19%)
consumer-typeset 31718M 31694M (-0.08%)
Bullet 92469M 92316M (-0.16%)
tramp3d-v4 77740M 77612M (-0.16%)
mafft 32553M 32541M (-0.04%)
ClamAV 50093M 50055M (-0.08%)
lencod 60602M 60537M (-0.11%)
SPASS 42667M 42591M (-0.18%)
7zip 190575M 190549M (-0.01%)
geomean 54855M 54793M (-0.11%)

@aengelke
Copy link
Contributor

lgtm, nice improvement. I think erase could also use doFind.

Created using spr 1.3.5-bogner
@MaskRay MaskRay changed the title [DenseMap] Optimize find [DenseMap] Optimize find/erase Jul 25, 2024
@MaskRay
Copy link
Member Author

MaskRay commented Jul 25, 2024

lgtm, nice improvement. I think erase could also use doFind.

Nice catch! Added support for erase

@MaskRay MaskRay requested a review from aengelke July 25, 2024 19:07
Copy link
Contributor

@kazutakahirata kazutakahirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@MaskRay MaskRay merged commit c9e5af3 into main Jul 25, 2024
7 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/densemap-optimize-find branch July 25, 2024 21:07
@aganea
Copy link
Member

aganea commented Aug 2, 2024

Hello @MaskRay! Somehow this commit (and after) is failing one clang test when building the Windows LLVM package. git bisect lead me here.

C:\src\git\llvm-project>git checkout c9e5af3944e85c5f1272c48522b4e9eda398b462
Updating files: 100% (3558/3558), done.
Note: switching to 'c9e5af3944e85c5f1272c48522b4e9eda398b462'.
...
C:\src\git\llvm-project>llvm\utils\release\build_llvm_release.bat --x64 --version 19.0.0 --skip-checkout --local-python
...
-- Testing: 20961 tests, 60 workers --
Testing:
FAIL: Clang :: CodeGenObjC/gnustep2-direct-method.m (1 of 20961)
******************** TEST 'Clang :: CodeGenObjC/gnustep2-direct-method.m' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\clang.exe -cc1 -internal-isystem C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\lib\clang\20\include -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m | c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m
# executed command: 'c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\clang.exe' -cc1 -internal-isystem 'C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\lib\clang\20\include' -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - 'C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m'
# .---command stderr------------
# | C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m:3:12: warning: class 'X' defined without specifying a base class [-Wobjc-root-class]
# |     3 | @interface X
# |       |            ^
# | C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m:3:13: note: add a super class to fix this problem
# |     3 | @interface X
# |       |             ^
# | huge alignment values are unsupported
# |   %2 = load i64, ptr %1, align 9223372036854775808
# | fatal error: error in backend: Broken module found, compilation aborted!
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
# | Stack dump:
# | 0.  Program arguments: c:\\src\\git\\llvm-project\\llvm_package_19.0.0\\build64_stage0\\bin\\clang.exe -cc1 -internal-isystem C:\\src\\git\\llvm-project\\llvm_package_19.0.0\\build64_stage0\\lib\\clang\\20\\include -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - C:\\src\\git\\llvm-project\\clang\\test\\CodeGenObjC\\gnustep2-direct-method.m
# | 1.  <eof> parser at end of file
# | 2.  Optimizer
# | 3.  Running pass "verify" on module "C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m"
# `-----------------------------
# error: command failed with exit status: 70
# executed command: 'c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe' 'C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m'
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m
# `-----------------------------
# error: command failed with exit status: 2

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (1):
  Clang :: CodeGenObjC/gnustep2-direct-method.m


Testing Time: 192.61s

Total Discovered Tests: 37881
  Skipped          :     8 (0.02%)
  Unsupported      :   300 (0.79%)
  Passed           : 37539 (99.10%)
  Expectedly Failed:    33 (0.09%)
  Failed           :     1 (0.00%)

FAILED: tools/clang/test/CMakeFiles/check-clang C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/tools/clang/test/CMakeFiles/check-clang
C:\Windows\system32\cmd.exe /C "cd /D C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\tools\clang\test && C:\Users\aganea_havenstudios\AppData\Local\Programs\Python\Python312\python3.exe C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/./bin/llvm-lit.py -sv --param USE_Z3_SOLVER=0 C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/tools/clang/test"
ninja: build stopped: subcommand failed.

The problem does not occur on the previous commit 4aa4ee909029cd7cd85d67b41d488a6edb802dce.

@MaskRay
Copy link
Member Author

MaskRay commented Aug 3, 2024

Hello @MaskRay! Somehow this commit (and after) is failing one clang test when building the Windows LLVM package. git bisect lead me here.

C:\src\git\llvm-project>git checkout c9e5af3944e85c5f1272c48522b4e9eda398b462
Updating files: 100% (3558/3558), done.
Note: switching to 'c9e5af3944e85c5f1272c48522b4e9eda398b462'.
...
C:\src\git\llvm-project>llvm\utils\release\build_llvm_release.bat --x64 --version 19.0.0 --skip-checkout --local-python
...
-- Testing: 20961 tests, 60 workers --
Testing:
FAIL: Clang :: CodeGenObjC/gnustep2-direct-method.m (1 of 20961)
******************** TEST 'Clang :: CodeGenObjC/gnustep2-direct-method.m' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\clang.exe -cc1 -internal-isystem C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\lib\clang\20\include -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m | c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m
# executed command: 'c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\clang.exe' -cc1 -internal-isystem 'C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\lib\clang\20\include' -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - 'C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m'
# .---command stderr------------
# | C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m:3:12: warning: class 'X' defined without specifying a base class [-Wobjc-root-class]
# |     3 | @interface X
# |       |            ^
# | C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m:3:13: note: add a super class to fix this problem
# |     3 | @interface X
# |       |             ^
# | huge alignment values are unsupported
# |   %2 = load i64, ptr %1, align 9223372036854775808
# | fatal error: error in backend: Broken module found, compilation aborted!
# | PLEASE submit a bug report to [https://github.com/llvm/llvm-project/issues/](https://github.com/llvm/llvm-project/issues/?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen) and include the crash backtrace, preprocessed source, and associated run script.
# | Stack dump:
# | 0.  Program arguments: c:\\src\\git\\llvm-project\\llvm_package_19.0.0\\build64_stage0\\bin\\clang.exe -cc1 -internal-isystem C:\\src\\git\\llvm-project\\llvm_package_19.0.0\\build64_stage0\\lib\\clang\\20\\include -nostdsysteminc -triple x86_64-unknown-freebsd -emit-llvm -fobjc-runtime=gnustep-2.2 -o - C:\\src\\git\\llvm-project\\clang\\test\\CodeGenObjC\\gnustep2-direct-method.m
# | 1.  <eof> parser at end of file
# | 2.  Optimizer
# | 3.  Running pass "verify" on module "C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m"
# `-----------------------------
# error: command failed with exit status: 70
# executed command: 'c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe' 'C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m'
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  c:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\bin\filecheck.exe C:\src\git\llvm-project\clang\test\CodeGenObjC\gnustep2-direct-method.m
# `-----------------------------
# error: command failed with exit status: 2

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (1):
  Clang :: CodeGenObjC/gnustep2-direct-method.m


Testing Time: 192.61s

Total Discovered Tests: 37881
  Skipped          :     8 (0.02%)
  Unsupported      :   300 (0.79%)
  Passed           : 37539 (99.10%)
  Expectedly Failed:    33 (0.09%)
  Failed           :     1 (0.00%)

FAILED: tools/clang/test/CMakeFiles/check-clang C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/tools/clang/test/CMakeFiles/check-clang
C:\Windows\system32\cmd.exe /C "cd /D C:\src\git\llvm-project\llvm_package_19.0.0\build64_stage0\tools\clang\test && C:\Users\aganea_havenstudios\AppData\Local\Programs\Python\Python312\python3.exe C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/./bin/llvm-lit.py -sv --param USE_Z3_SOLVER=0 C:/src/git/llvm-project/llvm_package_19.0.0/build64_stage0/tools/clang/test"
ninja: build stopped: subcommand failed.

The problem does not occur on the previous commit 4aa4ee909029cd7cd85d67b41d488a6edb802dce.

Hi! The failure looks weird. (a) Quite a few Windows bots are happy with this change. (b) If there were an issue, I suspect that quite a few tests would fail as well.

I think figuring this out might require the assistance of a debugger? Do you have luck with that ? :)

@aganea
Copy link
Member

aganea commented Aug 3, 2024

Yes I will debug it. This is with latest MSVC, it could be very well a UB triggered by this test.

@dmpots
Copy link
Contributor

dmpots commented Aug 3, 2024

@MaskRay @aganea We are seeing the same test failure on our windows build of LLVM. Only that one test is failing, and a git bisect lead back to this PR. It only fails when running with a release build. A debug build does not hit the failure.

We are building with Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33812 for x64

MaskRay added a commit that referenced this pull request Aug 8, 2024
Port the ADT optimization #100517. In addition, add `contains`
(https://reviews.llvm.org/D145895) while changing `count`.

Pull Request: #101785
@aganea
Copy link
Member

aganea commented Aug 9, 2024

This is caused by bad codegen in latest MSVC. I'm using latest version available: Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33813 for x64.

As seen in the previous message, the symptom is that a bad alignement is generated in one of the load instructions:

huge alignment values are unsupported
  %2 = load i64, ptr %1, align 9223372036854775808

When Builder.CreateLoad is called here, somehow this call to Addr.getAlignment().getAsAlign() returns a bad alignement. The problem occurs at the highlighted line (sub bh,cl):

Screenshot 2024-08-09 154835

The code on the right is translated to the assembly on the left. llvm::count_zero returns a proper value (as seen in RCX), however sub bh, cl uses a bad constant in bh (it is not 63 as expected). I think the optimizer meant to use dil not bh. A few lines below it does mov byte ptr [rsp + 40h], dil. If I manually set 6 in RDI after sub is executed, as it should have been, the test passes.

It's a tad scary, but just shuffling around the lines of code in CGObjCGNU.cpp fixes the issue. I'll send out a PR and will fill a bug report with Microsoft.

aganea added a commit that referenced this pull request Aug 10, 2024
…#102681)

Before this PR, when using the latest MSVC `Microsoft (R) C/C++
Optimizing Compiler Version 19.40.33813 for x64` one of the Clang unit
test used to fail: `CodeGenObjC/gnustep2-direct-method.m`, see full
failure log:
[here](#100517 (comment)).

This PR temporarily shuffles around the code to make the MSVC inliner/
optimizer happy and avoid the bug.

MSVC bug report:
https://developercommunity.visualstudio.com/t/Bad-code-generation-when-building-LLVM-w/10719589?port=1025&fsid=e572244a-cde7-4d75-a73d-9b8cd94204dd
MaskRay added a commit that referenced this pull request Aug 19, 2024
Port #100517 for DenseMap.

Pull Request: #104740
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants