Skip to content

make enum size not depend on the order of variants #131684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

adwinwhite
Copy link
Contributor

Fixes #117238.

I fixed the off-by-one error by allocating discriminant starting from the variant after the untagged variant and wrapping around to the first variant when it reaches the last variant.

enum {
    A,           // 1
    B,           // 2
    C(bool),     // untagged_variant, no discriminant
    D,           // has a discriminant of 0
}

Check the detailed algorithm in the definition of TagEncoding::Niche.
This is only applied when the untagged_variant is contained in the niche_variants wasting one value from the niche range.

Undoubtedly this complicates the codegen, but it should not have a large performance penalty.
It adds one arithmetic instruction in the general case and no additional branch instruction.

I'm not totally sure if the complexity is justified, considering the use case is rare.

r? @the8472

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 14, 2024
@rustbot
Copy link
Collaborator

rustbot commented Oct 14, 2024

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

@rust-log-analyzer

This comment has been minimized.

@adwinwhite adwinwhite force-pushed the niche-layout-not-depend-on-ordering branch from 1af0330 to 7af5de3 Compare October 14, 2024 13:44
@rust-log-analyzer

This comment has been minimized.

@adwinwhite adwinwhite marked this pull request as draft October 14, 2024 14:57
@adwinwhite adwinwhite force-pushed the niche-layout-not-depend-on-ordering branch from 7af5de3 to 7db99b3 Compare October 15, 2024 01:40
@rust-log-analyzer

This comment has been minimized.

@adwinwhite adwinwhite force-pushed the niche-layout-not-depend-on-ordering branch from 7db99b3 to a420caa Compare October 15, 2024 02:25
@rust-log-analyzer

This comment has been minimized.

@adwinwhite adwinwhite force-pushed the niche-layout-not-depend-on-ordering branch from a420caa to 93eb02e Compare October 15, 2024 04:17
@rust-log-analyzer

This comment has been minimized.

@adwinwhite adwinwhite force-pushed the niche-layout-not-depend-on-ordering branch from 93eb02e to 26cc036 Compare October 15, 2024 05:44
} else {
untagged_variant
if variant_index_relative < u128::from(discr_len) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is some duplication of logic here? The if variant_index_relative < u128::from(discr_len) { ... } else { untagged_variant } occurs twice. Can that be cleaned up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have different conditions. I deleted a subtraction expression by mistake.
This reinforces the need for more thorough test coverage.

// We need to use machine arithmetic when taking into account `niche_start`:
// tag_val = variant_index_relative + niche_start_val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this comment? The code below still does the same math as before, doesn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably didn't notice that this is a comment rather than outdated code to be deleted.

Copy link
Member

@RalfJung RalfJung Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any test that actually runs the new get/set discriminant logic? Seems like this test only checks that the size is what it should be.

You added some new if in a critical codepath in codegen and the interpreter. Both of them need thorough test coverage to ensure all branches are tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I definitely should add more tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/ui/enum-discriminant/get_discr.rs should be the right place, it already covers a bunch of dense enum encodings.

@adwinwhite adwinwhite marked this pull request as ready for review October 16, 2024 10:08
@rustbot
Copy link
Collaborator

rustbot commented Oct 16, 2024

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

@bjorn3
Copy link
Member

bjorn3 commented Oct 16, 2024

For reference: It took me a long time to implement the discriminant handling code in cg_clif with all corner cases like discriminants bigger than 64bits correctly handled.

@adwinwhite
Copy link
Contributor Author

For reference: It took me a long time to implement the discriminant handling code in cg_clif with all corner cases like discriminants bigger than 64bits correctly handled.

I'll take a closer look. Is there any specific test to check this? Maybe I can add more test coverage.

@bjorn3
Copy link
Member

bjorn3 commented Oct 16, 2024

It's been a while, so I don't remember all the problems I've had. It was meant more of a cautionary tale for making the discriminant handling code even more complex.

@adwinwhite
Copy link
Contributor Author

I am also kind of hesitant to make it more complex. It's totally fine to close this PR.

Would you mind sharing your thoughts on this? @the8472 @RalfJung

@workingjubilee workingjubilee added A-layout Area: Memory layout of types A-ABI Area: Concerning the application binary interface (ABI) labels Oct 21, 2024
Comment on lines +1513 to +1518
/// enum SomeEnum {
/// A, // 1
/// B, // 2
/// C(bool), // untagged_variant, no discriminant
/// D, // has a discriminant of 0
/// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the tag/payload values here too, not just the discriminants?

}
ty => fx.bcx.ins().iconst(ty, discr_len as i64),
};
let tagged_discr = fx.bcx.ins().urem(tagged_discr, discr_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discr_len is not a power of two, right? This doesn't look good for performance.

In the motivating issue (#117238) the full discriminant range is essentially packed 2 values for the bool, 254 values for the remaining variants. In that case we can just use wrapping arithmetic of the right type size without any remainder.

So for smaller cases like the A-D case above, can we rearrange the discriminants so that they also fit the natural tag wrap-around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize modulo operation has such bad performance. Now the foundation of this PR is mostly gone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also from this comment it sounds like it should be possible to fix this issue without changing codegen at all. IIUC, we don't need to change anything about TagEncoding::Niche, we just need to be more clever in how we construct enums with that encoding. Or did I misunderstand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically implemented what that comment suggested, like allocating discriminant starting right after the untagged variant and wrapping around. This changes how we convert variant_index <--> tag, thus codegen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codegen needs to compute the tag from the discriminant. The mapping between variant index and tag only exists during compilation and can be adjusted without codegen impact. Currently, enums with Niche encoding make the tag equal to the variant index, but it doesn't have to be that way.

Anyway maybe @the8472 could clarify what they meant in that comment.

/// ```rust,ignore (pseudo-code)
/// adjusted_len = niche_variants.end - niche_variants.start + 1
/// d = tag_value.wrapping_sub(niche_start)
/// variant_index = if niche_variants.contains(untagged_variant) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code mixes compile-time and runtime behavior. I think that's unnecessarily confusing. It could be split into separate sections (one for interior untagged variants, one without) which each have their own code samples.

Comment on lines +1513 to +1518
/// enum SomeEnum {
/// A, // 1
/// B, // 2
/// C(bool), // untagged_variant, no discriminant
/// D, // has a discriminant of 0
/// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the given enum already has a size of 1. I think it's the edge-case in #117238 where we need the wraparound. If there's enough room the layout logic already shifts up the entire tag range, including a gap where the untagged variant would otherwise be. It's only when the tag range is full that that becomes a problem.

Well, maybe there also are cases with range-restricted payloads other than bool. Something using rustc_layout_scalar_valid_range_start and rustc_layout_scalar_valid_range_end might also benefit from wraparound.

But generally those should be exceptions, not the rule. So we shouldn't take a more expensive code-path for the normal case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can still wrap around when the tag range is full to avoid the off-by-1 bug. But taking a step back, a clippy warning might be a better approach for this rare case than complicating codegen?

@bors
Copy link
Collaborator

bors commented Oct 28, 2024

☔ The latest upstream changes (presumably #132262) made this pull request unmergeable. Please resolve the merge conflicts.

@alex-semenyuk alex-semenyuk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 6, 2024
@alex-semenyuk
Copy link
Member

@adwinwhite
From wg-triage. Could you please take it seems not all comments were addressed? Beside that can you please rebase due to merge conflicts

@adwinwhite
Copy link
Contributor Author

I think my wrapping around approach is not viable currently.

Let's say we only consider wrapping around when the tag range is full and untagged_variant is contained in niche_variants.
The divisor used by modulo operation is,

discr_len = niche_variants.end - niche_variants.start + 1

The problem is that discr_len is not a power of 2 when the tag range is full. In the specific case of #117238, the discr_len is 255.
Thus we can't simplify the modulo operation into wrapping arithmetic.
And modulo operation has bad performance when the divisor is not a power of 2.

@adwinwhite adwinwhite closed this Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-layout Area: Memory layout of types S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enum size depends on ordering of members
9 participants