-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MachineVerifier] Report errors from one thread at a time #111605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bd7f0ea
to
985d7c3
Compare
985d7c3
to
4750222
Compare
# RUN: not --crash llc -mtriple=aarch64 -o /dev/null -run-pass=none -verify-machineinstrs %s 2>&1 | FileCheck %s --implicit-check-not="Bad machine code" | ||
# REQUIRES: aarch64-registered-target | ||
|
||
# Check that errors are reported from only one function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior is really annoying if it's fixable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even before this patch, AbortOnError
defaults to true which means we will abort after reporting the first error.
llvm-project/llvm/lib/CodeGen/MachineVerifier.cpp
Lines 393 to 394 in a075e78
if (AbortOnErrors && FoundErrors) | |
report_fatal_error("Found "+Twine(FoundErrors)+" machine code errors."); |
The problem I encountered is that if the report is super long, it's possible that the verifier would find another error and start print it before the program aborts. The output would be unreadable since two separate errors were printing at the same time.
This patch aims to fix that scenario by ensuring a single error gets reported at a time. Since we abort after the first error, this test checks that we only print errors from one function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the wording to be a bit more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably fine, but I don't think we really need the verifier to be thread safe. Step 1 of debugging any issue would be to get a standalone reproducer which fails
We also should probably stop dumping the entire function on failure.
I did find a reproducer, but because it included multiple failing modules, the error was scrambled, so I couldn't minimize the reproducer easily. I couldn't even tell which modules were failing until I introduced a lock in the reporting code.
Makes sense to me, but that should probably be a separate PR. |
Create the `ReportedErrors` class to track the number of reported errors during verification. The class will block reporting errors if some other thread is currently reporting an error. I've encountered a case where there were many different verifications reporting errors at the same time on different threads. This ensures that we don't start printing the error from one case until we are completely done printing errors from other cases. Most of the time `AbortOnError = true` so we usually abort after reporting the first error. Depends on llvm#111602.
Create the
ReportedErrors
class to track the number of reported errors during verification. The class will block reporting errors if some other thread is currently reporting an error.I've encountered a case where there were many different verifications reporting errors at the same time on different threads. This ensures that we don't start printing the error from one case until we are completely done printing errors from other cases. Most of the time
AbortOnError = true
so we usually abort after reporting the first error.Depends on #111602.