-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[compiler-rt] Also consider SIGPROF as a synchronous signal #85188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler-rt] Also consider SIGPROF as a synchronous signal #85188
Conversation
@llvm/pr-subscribers-compiler-rt-sanitizer Author: None (serge-sans-paille) ChangesBlocking that signal causes inter-blocking for profilers that monitor threads through that signal. Full diff: https://github.com/llvm/llvm-project/pull/85188.diff 1 Files Affected:
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 8ffc703b05eace..80dda58aacbf2d 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -2168,7 +2168,8 @@ static bool is_sync_signal(ThreadSignalContext *sctx, int sig,
return false;
#endif
return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP ||
- sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS;
+ sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS ||
+ sig == SIGPROF;
}
void sighandler(int sig, __sanitizer_siginfo *info, void *ctx) {
|
Blocking that signal causes inter-blocking for profilers that monitor threads through that signal. Fix llvm#83844 and llvm#83561
028e755
to
da8180f
Compare
Awesome, this indeed does fix our problem with the Firefox Profiler. Thanks! |
thanks @melver for the fast review \o |
Hello there! I believe this change may have caused the following build bot to start failing, are you able to take a look? |
Hi. I think this change is causing some tsan tests to fail on our builders (https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8753487020222109921/overview):
Could you take a look and send out a fix or revert? Thanks. |
…lvm#85188)" This reverts commit 6f3f659. This change is causing TSAN failures on a bot: https://lab.llvm.org/buildbot/#/builders/247/builds/15279
…85188)" (#85296) This reverts commit 6f3f659. This change is causing TSAN failures on a bot as reported in #85188: https://lab.llvm.org/buildbot/#/builders/247/builds/15279
Go tit, some tests where using SIGPROF as a way to test some signals, switched to another signal and that works correctly |
Something special about powerpc, can you please take a look? |
hell, I'll have a look on monday |
It's still broken https://lab.llvm.org/buildbot/#/builders/18/builds/15884 |
yeah, reproducing the issue on a power machine is taking me some time. |
You may revert the change until then, and then recommit with the fix. This is our usual practice if it takes too long to come up with the fix. |
ok. I have a fix that works locally, but we're never sure. I'll sumit it and if it doesn't fixes the test case on our buildbot I'll revert. |
…IRTUAL, ...) Followup to #85188.
…IRTUAL, ...) Followup to #85188
It seems to be back to normal \o/ |
Thank you! |
…t builds r=sergesanspaille,glandium Due to SIGPROF being async, it was hanging on some cases because some functions were incorrectly marked non-blocking. This patch is merge to LLVM in: llvm/llvm-project#85188 But we want to patch our clang here to start benefiting from that quickly. We are also patching our rustc here. Even though they are not used by default during our normal builds, this custom rustc is needed for building and running TSan already: https://firefox-source-docs.mozilla.org/tools/sanitizer/tsan.html#llvm-clang-rust Differential Revision: https://phabricator.services.mozilla.com/D204631
The recommit (ddcbab3) causes the compiler-rt/test/tsan/signal_sync2.cpp test to be flaky on x86-64 linux (~25 failures in 1000 runs). |
I'll fix it. |
Otherwise it may crash too early. This is followup to #85188
…IRTUAL, ...) Followup to llvm#85188.
…IRTUAL, ...) Followup to llvm#85188
This looks wrong. SIGPROF is generally not synchronous. Handling it synchronously will cause memory corruptions. |
@dvyukov Thanks for confirming that. I will revert this and related patches. |
Marking functions as blocking with |
I am sorry for your experience. |
For raw syscalls sanitizers provide these hooks, but they need to be inserted manually into code. Potentially tsan can be made to delay signals only for the duration of a single runtime callback, but that's a large endeavor and it's not completely clear how to do it. |
@dvyukov Thanks for the answer!
Oh, do you mean that they need to be inserted by the program that's calling the syscall (Rust stdlib internals in our case) or be inserted somewhere in the compiler-rt? I actually tried adding these but couldn't make them work. I can give it another try.
Yeah, tbh even though this PR wasn't "technically correct", I was happy to have it because current async signal implementation in TSan looks pretty flaky. We have encountered these 2 functions that are marked incorrectly while testing the Firefox Profiler, but there is no guarantee that we would not encounter another one in the future (or when these are solved). Currently nearly all the Firefox Profiler tests are disabled on TSan builds in CI because of hangs and timeouts. And I wanted to enable them to give us more confidence but that proved to be a big challenge. I'm happy to help to make it better but indeed this suggested mechanism sounds like a big change. |
Yes, unfortunately, in the program code. Currently sanitizers don't have any other means to intercept syscalls. |
I would say signal handling is the single major source of problems and fixes for tsan. |
Probably "first time commiters" can't pick reviewers anyway. I guess we suppose to see some notifications, I remember @MaskRay set ed up something. I assumes it will auto-assign at some point? This patch is in https://github.com/orgs/llvm/teams/pr-subscribers-compiler-rt-sanitizer and I guess at least ~12 people have it in mailbox, but didn't act on that yet. |
https://llvm.org/docs/Contributing.html#how-to-submit-a-patch provides some information about reviewer selection. llvm-project notifications are of huge volume and I think nobody can keep track of everything... I personally use quite a few filters and often miss "Team mention" ones (lot of times "Mention" as well; a lot of work is volunteer work).
|
…tsan (#86537) Fixes #83844. This PR adds callbacks to mark futex syscalls as blocking. Unfortunately we didn't have a mechanism before to mark syscalls as a blocking call, so I had to implement it, but it mostly reuses the `BlockingCall` implementation [here](https://github.com/llvm/llvm-project/blob/96819daa3d095cf9f662e0229dc82eaaa25480e8/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L362-L380). The issue includes some information but this issue was discovered because Rust uses futexes directly. So most likely we need to update Rust as well to use these callbacks. Also see the latest comments in #85188 for some context. I also sent another PR #84162 to mark `pthread_*_lock` calls as blocking.
Blocking that signal causes inter-blocking for profilers that monitor threads through that signal.
Fix #83844 and #83561