## Description

To solve #248 we are moving away from `#[rustc_args_required_const]` and switching to the recently stabilized const generics instead. The goal is to massively reduce the binary size of libcore.rlib, which is currently bloated by the large `match` statements needed to properly handle constant arguments for intrinsics.
## Instructions

- Move constant arguments to const generics.
- Change the `#[rustc_args_required_const(N)]` attribute to `#[rustc_legacy_const_generics(N)]`. The `N` argument to the attribute remains the same. This attribute allows legacy code to continue calling the function by passing constants as parameters. The legacy support will likely be deprecated in Rust 2021.
- Use the `static_assert!` macro to perform bounds checking on the value. This is necessary because passing an invalid value to an intrinsic tends to trigger assertion failures in LLVM if it reaches codegen. Simply using `assert!` or `match` is not enough, since they don't prevent the intrinsic from reaching codegen.
- Eliminate unneeded `constify_*`/`shuffle` macros and pass the constant directly to the underlying LLVM intrinsic.
- Update tests to call the function using const generics. Note that `#[rustc_legacy_const_generics]` does not work for functions defined in the same crate.
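The compile-time bounds check described above can be approximated on stable Rust without compiler-internal machinery. This is a hypothetical stand-in for stdarch's `static_assert!` (the names `MaskCheck` and `shuffle_mask_value` are illustrative, not part of stdarch): forcing const evaluation of an `assert!` turns an out-of-range const generic into a compile error instead of letting it reach codegen.

```rust
// Hypothetical stand-in for stdarch's static_assert! macro.
// Referencing the associated const forces it to be evaluated at
// monomorphization time, so the assert! fires during compilation.
struct MaskCheck<const MASK: i32>;

impl<const MASK: i32> MaskCheck<MASK> {
    const VALID: () = assert!(MASK >= 0 && MASK <= 255, "mask must fit in 8 bits");
}

// Illustrative function taking the mask as a const generic, like the
// rewritten intrinsics in this issue.
fn shuffle_mask_value<const MASK: i32>() -> i32 {
    let _ = MaskCheck::<MASK>::VALID; // compile-time bounds check
    MASK
}

fn main() {
    // In range: compiles and evaluates normally.
    assert_eq!(shuffle_mask_value::<0xFF>(), 255);
    // shuffle_mask_value::<256>() would fail to compile instead of
    // producing an LLVM assertion failure.
    println!("{}", shuffle_mask_value::<0xFF>());
}
```

A plain `assert!` inside the function body would only fail at runtime, which is exactly why the instructions call for a static check: the invalid constant must never reach LLVM.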
## Example

### Before

```rust
#[inline]
#[target_feature(enable = "sse")]
#[cfg_attr(test, assert_instr(shufps, mask = 3))]
#[rustc_args_required_const(2)]
#[stable(feature = "simd_x86", since = "1.27.0")]
pub unsafe fn _mm_shuffle_ps(a: __m128, b: __m128, mask: i32) -> __m128 {
    let mask = (mask & 0xFF) as u8;
    macro_rules! shuffle_done {
        ($x01:expr, $x23:expr, $x45:expr, $x67:expr) => {
            simd_shuffle4(a, b, [$x01, $x23, $x45, $x67])
        };
    }
    macro_rules! shuffle_x67 {
        ($x01:expr, $x23:expr, $x45:expr) => {
            match (mask >> 6) & 0b11 {
                0b00 => shuffle_done!($x01, $x23, $x45, 4),
                0b01 => shuffle_done!($x01, $x23, $x45, 5),
                0b10 => shuffle_done!($x01, $x23, $x45, 6),
                _ => shuffle_done!($x01, $x23, $x45, 7),
            }
        };
    }
    macro_rules! shuffle_x45 {
        ($x01:expr, $x23:expr) => {
            match (mask >> 4) & 0b11 {
                0b00 => shuffle_x67!($x01, $x23, 4),
                0b01 => shuffle_x67!($x01, $x23, 5),
                0b10 => shuffle_x67!($x01, $x23, 6),
                _ => shuffle_x67!($x01, $x23, 7),
            }
        };
    }
    macro_rules! shuffle_x23 {
        ($x01:expr) => {
            match (mask >> 2) & 0b11 {
                0b00 => shuffle_x45!($x01, 0),
                0b01 => shuffle_x45!($x01, 1),
                0b10 => shuffle_x45!($x01, 2),
                _ => shuffle_x45!($x01, 3),
            }
        };
    }
    match mask & 0b11 {
        0b00 => shuffle_x23!(0),
        0b01 => shuffle_x23!(1),
        0b10 => shuffle_x23!(2),
        _ => shuffle_x23!(3),
    }
}
```
### After

```rust
#[inline]
#[target_feature(enable = "sse")]
#[cfg_attr(test, assert_instr(shufps, mask = 3))]
#[rustc_legacy_const_generics(2)]
#[stable(feature = "simd_x86", since = "1.27.0")]
pub unsafe fn _mm_shuffle_ps<const mask: i32>(a: __m128, b: __m128) -> __m128 {
    static_assert!(mask: i32 where mask >= 0 && mask <= 255);
    simd_shuffle4(
        a,
        b,
        [
            mask as u32 & 0b11,
            (mask as u32 >> 2) & 0b11,
            ((mask as u32 >> 4) & 0b11) + 4,
            ((mask as u32 >> 6) & 0b11) + 4,
        ],
    )
}
```