Closed
Description
I tried this code:
#![feature(avx512_target_feature)]
#![feature(stdsimd)]

#[target_feature(enable = "lzcnt,avx512f")]
pub unsafe fn test(a: &mut [u64]) {
    let mut f = move || {
        std::arch::x86_64::_mm512_mask_loadu_epi64(
            std::arch::x86_64::_mm512_setzero_si512(),
            0,
            a.as_mut_ptr() as *mut i64,
        );
    };
    f()
}
I expected to see this happen: all the avx512f functions get inlined into test
Instead, this happened: _mm512_mask_loadu_epi64 did not get inlined into test
codegen: https://godbolt.org/z/PbvexEWd8
here's a repro that mostly doesn't depend on std
#![feature(avx512_target_feature)]
#![feature(stdsimd)]
#![feature(repr_simd)]

#[repr(simd)]
#[allow(non_camel_case_types)]
#[derive(Copy, Clone)]
pub struct __m512i(i64, i64, i64, i64, i64, i64, i64, i64);

#[inline]
#[target_feature(enable = "avx512f")]
pub unsafe fn asm_fn(_: __m512i) -> __m512i {
    let mut dst = __m512i(0, 0, 0, 0, 0, 0, 0, 0);
    core::arch::asm!(
        "/* {dst} */",
        dst = inout(zmm_reg) dst,
    );
    dst
}

#[inline]
#[target_feature(enable = "avx512f")]
unsafe fn do_nothing() {}

#[target_feature(enable = "lzcnt,avx512f")]
pub unsafe fn test() {
    ({
        #[inline(always)]
        move || {
            do_nothing();
            asm_fn(__m512i(0, 0, 0, 0, 0, 0, 0, 0));
        }
    })();
}
codegen: https://godbolt.org/z/joG9xxq4c
there seem to be quite a few moving parts here.
- if i remove the lzcnt feature (or replace it with a feature that is implied by avx512f),
- or if i remove the inout parameter in the asm in asm_fn,
- or if i remove the call to do_nothing,
- or if i take the code outside the closure,
the function gets inlined as expected, so i'm not sure which is the culprit here
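For anyone who wants to compare the "inside the closure" and "outside the closure" variants side by side without a nightly toolchain, here is a minimal sketch. It is not the original repro: avx2 is used as a stand-in for avx512f so that it builds on stable, the function names copy_via_closure, copy_direct and run_both are made up for illustration, and whether the inlining difference reproduces with these stand-in features has not been verified. It only shows the shape of the two variants so their codegen can be diffed.

```rust
// Stand-in sketch: avx2 replaces the nightly-only avx512f from the report.
// All function names here are hypothetical, chosen for illustration only.
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m256i, _mm256_loadu_si256, _mm256_storeu_si256};

/// Same shape as `test` in the report: the intrinsics are called from a closure.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "lzcnt,avx2")]
pub unsafe fn copy_via_closure(src: &[u64; 4]) -> [u64; 4] {
    let f = move || {
        let mut out = [0u64; 4];
        unsafe {
            // Unaligned 256-bit load followed by an unaligned store.
            let v = _mm256_loadu_si256(src.as_ptr() as *const __m256i);
            _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, v);
        }
        out
    };
    f()
}

/// The "code outside the closure" variant: identical body, no closure.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "lzcnt,avx2")]
pub unsafe fn copy_direct(src: &[u64; 4]) -> [u64; 4] {
    let mut out = [0u64; 4];
    unsafe {
        let v = _mm256_loadu_si256(src.as_ptr() as *const __m256i);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, v);
    }
    out
}

/// Safe wrapper: runs both variants only if the CPU supports the features.
/// Returns `None` on other architectures or when a feature is missing.
pub fn run_both(src: &[u64; 4]) -> Option<([u64; 4], [u64; 4])> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("lzcnt") {
            // SAFETY: both required features were just detected at runtime.
            return Some(unsafe { (copy_via_closure(src), copy_direct(src)) });
        }
    }
    let _ = src; // avoids an unused-variable warning on non-x86_64 targets
    None
}
```

Both functions should return a copy of the input; the interesting part is the generated assembly for each (for example on godbolt with optimizations enabled), not the result.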
Meta
rustc --version --verbose:
rustc 1.69.0-nightly (5243ea5c2 2023-02-20)
binary: rustc
commit-hash: 5243ea5c29b136137c36bd773e5baa663790e097
commit-date: 2023-02-20
host: x86_64-unknown-linux-gnu
release: 1.69.0-nightly
LLVM version: 15.0.7