refactor lint handling to be more "eager"

Lint handling at present works like this:

- During most of compilation, when we see something we might want to lint about, we insert this potential lint into a [mutable table in the session](https://github.com/rust-lang/rust/blob/f062832b208e94f2f0f26ed7fb5c48c172069fbe/src/librustc/session/mod.rs#L79). This comes along with a `NodeId` that identifies the AST/HIR node that the lint is attached to, along with the lint name, message, etc.
- We then have a dedicated pass that walks over the HIR. At each point, it tracks which lints are "allowed", "denied", etc. At the same time, it extracts entries from this table and reports them. It also invokes "lint visitors" that may generate additional lints.
    - In particular, some lints are just a custom pass, but others are generated as a side-effect of other things (like type-checking). It just depends on when the information that is needed is most readily available.

This is not a very good setup. Among other things:

- It requires us to generate a big table of "pending" lints, even in the case where a lint is `#[allow]`. Wasteful.
- It means that lints all get displayed fairly late in compilation -- this limits their value, since often a lint is detected early which might be useful in solving a compilation error, but we can't report the lint.
- It's not very incremental friendly because of the Big Mutable Table.

@eddyb proposed an alternative architecture that I think is better.

- In HIR lowering, we would compute which lints are in scope at each point. We would store this information in the HIR in some way (some thoughts on this below). 
- When we detect (in some phase) a lintable scenario, we would check *immediately* whether the lint is "allow", "warn", etc, and -- if needed -- issue the appropriate diagnostic right then.
- We can still run the custom lint code, it would simply query the HIR settings in the same way.

The best way to store this lint table is not 100% clear. Something sparse seems obviously like a good idea, since most nodes in the HIR do not have lint-related attributes on them (and hence just inherit the settings from their parent). I think maybe having each item store a link to the "current" settings for each lint makes sense; this can be easily shared between items. Then, *within* an item, if we need to compute the settings for some particular node, we would walk up the parents in the HIR map, looking for lint-related attributes and -- when we reach the item -- just using the settings stored on the item itself. 

Alternatively, we could just cache nothing and compute *everything* this way, at least to start. If more caching is desired, maybe have passes keep a local cache that would then be populated within the things within a particular item.

I'm available to mentor this. It is a bit of a "design needed" sort of job, so I'm not sure what the immediate instructions would be. The first step is obviously familiarizing yourself with the current setup. After that, I would probably try to make a commit that adds a method to the HIR map that will compute the innermost applicable lint setting for a particular node (just by walking up the parents). We could then use this and remove all the existing lint code for tracking. That should work, but be slow (O(n^2) lookups). Presumably we can then adopt some sort of caching strategy.

@eddyb also mentioned that for bonus points, we should probably "decode" attributes during HIR lowering more generally, so that we are not walking the general attribute data structures, but rather something specific to the stuff the compiler cares about. 



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor lint handling to be more "eager" #42511

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

refactor lint handling to be more "eager" #42511

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions