Description
Rust's LLVM InstrProf-based source code coverage implementation instruments Rust code via the MIR pass `InstrumentCoverage`. Most criteria for identifying coverage regions and counter locations are very general, based on Control Flow Graph (CFG) analysis of the MIR, and a fairly straightforward mapping of MIR `Statement`s and `Terminator`s to their source code regions (`Span`s).
`TerminatorKind::Goto`s are an exception, requiring special handling.
This issue highlights some of the unique requirements and issues addressed in the current coverage implementation, in case someone has ideas for reducing the reliance on the `Goto`-specific logic, either by improving `InstrumentCoverage` (if something was overlooked) or by improving the `Goto` representation (such as refining its `Span` representation, or providing additional context that `InstrumentCoverage` might leverage).
Current State
One of the first steps in the `InstrumentCoverage` process is to extract relevant code `Span`s from the MIR `Statement`s and `Terminator`s. (These `Span`s are later combined into sets of sequential statements with contiguous source code regions that can be counted via a single counter; i.e., if any statement in the set was executed, all statements in the same set would also have been executed.)
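To make the idea of combining spans concrete, here is a minimal sketch. It assumes spans are merged when they belong to the same block and touch or overlap; the real pass applies more involved criteria. `BlockSpan` and `merge_contiguous` are hypothetical names, not part of the compiler.

```rust
/// Hypothetical, simplified span: a half-open byte range `[lo, hi)` tagged
/// with the index of the block that contributed it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockSpan {
    block: usize,
    lo: u32,
    hi: u32,
}

/// Merges spans from the same block that touch or overlap, so each merged
/// span can be counted by a single counter. This is an assumed rule for
/// illustration only.
fn merge_contiguous(mut spans: Vec<BlockSpan>) -> Vec<BlockSpan> {
    // Process spans in source order.
    spans.sort_by_key(|s| (s.lo, s.hi));
    let mut merged: Vec<BlockSpan> = Vec::new();
    for span in spans {
        if let Some(prev) = merged.last_mut() {
            if prev.block == span.block && span.lo <= prev.hi {
                // Same block and no gap: extend the previous span.
                prev.hi = prev.hi.max(span.hi);
                continue;
            }
        }
        // Different block, or a gap in the source: start a new span.
        merged.push(span);
    }
    merged
}
```

With this rule, two adjacent spans from the same block collapse into one region, while a span from another block, or one separated by a gap, stays separate.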
`bcb_to_initial_coverage_spans()` iterates through the `BasicBlock`s of the `CoverageGraph` (a subset of the MIR, essentially skipping panic/unwind paths), and their `Statement`s and `Terminator`s. Some `Statement`s and `Terminator`s are relevant to `Coverage`, and others are not. The `Statement` and `Terminator` filtering is handled by `filtered_statement_span()` and `filtered_terminator_span()`, respectively.
In almost all cases, if not filtered out, the initial coverage `Span` contributed by either a `Statement` or a `Terminator` is the `source_info.span` (within the function body) of the `Statement` or `Terminator`, because, in most cases, the source code span carried forward from the parsed source to its MIR representation is a fairly accurate mapping from intent to execution.
For example, `filtered_terminator_span()` uses the entire `source_info.span` for the following `TerminatorKind`s:

    fn filtered_terminator_span(terminator: &'a Terminator<'tcx>, body_span: Span) -> Option<Span> {
        match terminator.kind {
            ...
            // Retain spans from all other terminators
            TerminatorKind::Resume
            | TerminatorKind::Abort
            | TerminatorKind::Return
            | TerminatorKind::Call { .. }
            | TerminatorKind::Yield { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::FalseUnwind { .. }
            | TerminatorKind::InlineAsm { .. } => {
                Some(function_source_span(terminator.source_info.span, body_span))
            }
All other `TerminatorKind`s are filtered out, except for `Goto`.
`Goto` terminators play an important role in the control flow, so they cannot be filtered out, but their `source_info.span` typically includes the `Span`s of the statements that precede them, making the `Span` redundant in most cases.
One example: `Goto`s are often the targets of `SwitchInt` branches, and certain important optimizations to replace some `Counter`s with `Expression`s require a separate `BasicCoverageBlock` for each branch, to support the `Counter`, when needed.
Since a `Goto`-based `CoverageSpan` still needs a span to indicate if a region of actual source code was executed or not, the span returned from `filtered_terminator_span()`, for `Goto`s, is an empty span, positioned at the `Goto` span's last byte position:

    TerminatorKind::Goto { .. } => {
        Some(function_source_span(terminator.source_info.span.shrink_to_hi(), body_span))
    }
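The three outcomes of terminator filtering (keep the whole span, keep only an empty span at the end, or drop the span) can be modeled outside the compiler. The sketch below is a simplified stand-in: `ByteSpan`, `TermGroup`, and `filtered_span_sketch` are hypothetical names, and the clamping to the function body done by `function_source_span()` is left out.

```rust
/// Simplified stand-in for a source span: the half-open byte range `[lo, hi)`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ByteSpan {
    lo: u32,
    hi: u32,
}

impl ByteSpan {
    /// Returns an empty span positioned at this span's end, the same idea
    /// as `shrink_to_hi()` in the excerpt above.
    fn shrink_to_hi(self) -> ByteSpan {
        ByteSpan { lo: self.hi, hi: self.hi }
    }

    /// True when the span covers no bytes.
    fn is_empty(self) -> bool {
        self.lo == self.hi
    }
}

/// Hypothetical three-way grouping of terminator kinds; not the real enum.
#[derive(Clone, Copy, Debug)]
enum TermGroup {
    /// Kinds whose full span is kept (for example `Return` or `Call`).
    Retained,
    /// `Goto`: kept, but reduced to an empty span at its last byte position.
    Goto,
    /// Every other kind: contributes no span.
    FilteredOut,
}

/// Sketch of the filtering decision described in the text.
fn filtered_span_sketch(group: TermGroup, span: ByteSpan) -> Option<ByteSpan> {
    match group {
        TermGroup::Retained => Some(span),
        TermGroup::Goto => Some(span.shrink_to_hi()),
        TermGroup::FilteredOut => None,
    }
}
```

In this model a `Goto` span of `[10, 42)` becomes the empty span `[42, 42)`, which carries a position but covers no source text.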
This byte position can, in most cases, be leveraged to contribute to a `CoverageSpan` for certain execution branches.
For example, an `if` block without an `else` shows the block was executed if the condition was `true`, but there would be no way to indicate coverage (or lack thereof) of the `false` branch without using the associated `Goto`'s `hi()` byte position (which is expanded by one character to the left, for a non-empty `CoverageSpan`).
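The expansion by one character to the left can be sketched on plain source text. This assumes byte offsets into a UTF-8 string; `widen_left_one_char` is a hypothetical helper, not the compiler's own routine.

```rust
use std::ops::Range;

/// Widens the empty span at byte offset `pos` so that it covers the single
/// character immediately to its left. Returns `None` when there is no such
/// character (`pos` is 0), or when `pos` is out of bounds or not on a
/// character boundary.
fn widen_left_one_char(src: &str, pos: usize) -> Option<Range<usize>> {
    // `get` checks both the bounds and the character boundary.
    let prefix = src.get(..pos)?;
    // The last character of the prefix is the one to include.
    let prev = prefix.chars().next_back()?;
    Some(pos - prev.len_utf8()..pos)
}
```

For an `if` block ending at offset `pos`, the widened range typically covers the closing `}`, which gives the otherwise invisible branch a one-character region.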
However, in other cases, a visible `CoverageSpan` is not wanted, but the `Goto` block must still be counted (for example, to contribute its count to an `Expression` that reports the execution count for some other block). In these cases, the code region is set to `None`.
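A small model may help show why a block can need a count without a visible region. In the sketch below, all names (`CountSource`, `CountedBlock`, `resolve_count`, `visible_regions`) are hypothetical, and an expression is assumed to be a simple difference of two other counts.

```rust
/// Hypothetical model of where a block's execution count comes from.
#[derive(Debug)]
enum CountSource {
    /// A value recorded by a physical counter at run time.
    Counter(u64),
    /// A derived value: the count of block `lhs` minus the count of block `rhs`.
    Difference { lhs: usize, rhs: usize },
}

#[derive(Debug)]
struct CountedBlock {
    source: CountSource,
    /// `None` means the block is counted but contributes no visible region.
    code_region: Option<(u32, u32)>,
}

/// Resolves the count for `blocks[idx]`. Assumes expressions form no cycles.
fn resolve_count(blocks: &[CountedBlock], idx: usize) -> u64 {
    match &blocks[idx].source {
        CountSource::Counter(n) => *n,
        CountSource::Difference { lhs, rhs } => {
            resolve_count(blocks, *lhs).saturating_sub(resolve_count(blocks, *rhs))
        }
    }
}

/// Collects only the regions that would appear in a report, with their counts.
fn visible_regions(blocks: &[CountedBlock]) -> Vec<((u32, u32), u64)> {
    blocks
        .iter()
        .enumerate()
        .filter_map(|(i, b)| b.code_region.map(|r| (r, resolve_count(blocks, i))))
        .collect()
}
```

Here a `Goto` block would be a `Counter` entry whose `code_region` is `None`: it never shows up in `visible_regions`, yet another block's `Difference` expression can still depend on its count.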
This decision (whether to include a one-character coverage span for a `Goto` or to count a `Goto` block without a code region) is handled in `inject_coverage_span_counters()`, beginning with the call to `is_code_region_redundant()`, which encapsulates the decision on how to handle these special cases.
At the time of this writing, the decision criteria look only for `Goto` terminators with spans that end at the last byte position in the file, because these `Goto` spans, if present, are redundant with the spans from every function's final `Return` terminator. When they are present, they can cause the function's last line to appear to have been executed twice, when it was only executed once.
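The criterion described above amounts to a two-part check, sketched here with plain values. `TermKind`, `is_region_redundant_sketch`, and the `end_pos` parameter are assumptions made for illustration; the real `is_code_region_redundant()` works on compiler-internal types.

```rust
/// Hypothetical, reduced set of terminator kinds for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TermKind {
    Goto,
    Return,
    Other,
}

/// Returns true when a block's code region should be dropped: the block
/// ends in a `Goto`, and its span ends at `end_pos`, the final byte
/// position that the `Return` terminator's span already covers.
fn is_region_redundant_sketch(kind: TermKind, span_hi: u32, end_pos: u32) -> bool {
    kind == TermKind::Goto && span_hi == end_pos
}
```

Only the `Goto` that ends at the final position is treated as redundant; a `Return` at the same position, or a `Goto` elsewhere, keeps its region.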