Description
I have seen BytePos's operations showing up in profiles. In particular, common operations like to_usize() and from_usize() aren't getting inlined. Presumably this is because BytePos is in the libsyntax_pos crate but most of its uses are in other crates and it's not marked as #[inline]
.
Just as an experiment, I tried changing BytePos to just be a typedef of u32
. Results were surprisingly good, with lots of improvements in the 1--4% range.
coercions-check
avg: -2.5% min: -4.5% max: -1.1%
helloworld-check
avg: -4.0% min: -4.2% max: -3.7%
unify-linearly-check
avg: -3.2% min: -3.8% max: -2.6%
deeply-nested-check
avg: -2.8% min: -3.6% max: -2.2%
issue-46449-check
avg: -3.0% min: -3.3% max: -2.8%
parser-check
avg: -2.8% min: -3.3% max: -2.2%
deeply-nested
avg: -1.9% min: -3.1% max: -1.2%
deeply-nested-opt
avg: -1.6% min: -3.1% max: -0.7%
coercions-opt
avg: -1.8% min: -3.1% max: -1.1%
coercions
avg: -1.1% min: -3.0% max: 0.2%
issue-46449
avg: -1.5% min: -2.9% max: -1.2%
issue-46449-opt
avg: -1.1% min: -2.8% max: -0.6%
regression-31157-check
avg: -1.8% min: -2.5% max: -1.1%
tokio-webpush-simple-check
avg: -1.7% min: -2.2% max: -1.3%
tuple-stress
avg: -1.0% min: -2.0% max: -0.5%
unused-warnings-check
avg: -1.4% min: -1.8% max: -1.3%
tuple-stress-check
avg: -0.9% min: -1.8% max: -0.6%
encoding-check
avg: -1.3% min: -1.7% max: -0.9%
encoding-opt
avg: -0.8% min: -1.7% max: -0.1%
tuple-stress-opt
avg: -0.9% min: -1.7% max: -0.5%
encoding
avg: -0.9% min: -1.5% max: -0.0%
regression-31157-opt
avg: -0.6% min: -1.4% max: -0.0%
unify-linearly-opt
avg: -1.2% min: -1.4% max: -1.0%
futures-check
avg: -1.0% min: -1.4% max: -0.5%
parser-opt
avg: -1.3% min: -1.4% max: -1.1%
parser
avg: -1.3% min: -1.4% max: -1.1%
unused-warnings
avg: -1.2% min: -1.4% max: -1.1%
futures
avg: -0.9% min: -1.4% max: -0.4%
unused-warnings-opt
avg: -1.2% min: -1.4% max: -1.1%
futures-opt
avg: -0.8% min: -1.4% max: -0.3%
helloworld
avg: -1.3% min: -1.4% max: -1.3%
helloworld-opt
avg: -1.3% min: -1.4% max: -1.3%
unify-linearly
avg: -1.2% min: -1.4% max: -1.1%
regression-31157
avg: -0.8% min: -1.3% max: -0.4%
syn-check
avg: -0.8% min: -1.3% max: -0.3%
syn-opt
avg: -0.5% min: -1.2% max: -0.0%
regex
avg: -0.7% min: -1.2% max: -0.3%
regex-check
avg: -1.0% min: -1.2% max: -0.6%
regex-opt
avg: -0.5% min: -1.2% max: 0.0%
syn
avg: -0.6% min: -1.2% max: -0.3%
piston-image-check
avg: -0.6% min: -1.1% max: -0.4%
inflate-opt
avg: -0.2% min: -1.1% max: 0.4%
hyper
avg: -0.7% min: -1.1% max: -0.4%
hyper-opt
avg: -0.5% min: -1.0% max: -0.1%
hyper-check
avg: -0.8% min: -1.0% max: -0.5%
html5ever-opt
avg: -0.5% min: -1.0% max: -0.2%
tokio-webpush-simple-opt
avg: -0.3% min: -1.0% max: 0.1%
inflate
avg: -0.4% min: -1.0% max: -0.1%
It definitely seems worth doing something here. Though converting to u32
is not good, because it's having BytePos be a newtype is definitely good for code readability.
Would simply adding #[inline]
or #[inline(always)]
to the relevant methods work? Does that have other consequences, such as increasing compile time of rustc itself?