Combining inline and target_feature attributes with recursive functions causing incorrect results. #53117

Closed
@jackmott

Description

add_stuff is a function that uses AVX2 SIMD intrinsics and is marked #[inline(always)].
add_stuff_helper is a function marked #[target_feature(enable = "avx2")] that calls add_stuff.

If you run this in a release build, you get an incorrect result: the return values from the recursive calls do not add up. I believe this is because code is being generated around the recursion without the avx2 target_feature applied.

If you make add_stuff non-recursive, this all works fine.

The code below obviously does not make sense to use by itself. It works, for instance, if you put the target_feature attribute on the add_stuff function directly. However, this technique is useful for doing some nice SIMD metaprogramming with traits, and this bug prevents that from working.
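For comparison, here is a minimal sketch of the variant described above as working, with target_feature placed directly on the recursive function instead of relying on #[inline(always)]. The names add_stuff_direct, add_stuff_lanes and run_direct are hypothetical and not part of the original report. The runtime AVX2 check and the copy into a plain array are additions made only so the sketch can run safely on any machine.

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

// Same recursion as in the report, but the AVX2 feature is enabled on the
// recursive function itself (hypothetical name: add_stuff_direct).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn add_stuff_direct(a: f32, count: i32) -> __m256 {
    let b = _mm256_set1_ps(2.0);
    let a2 = _mm256_set1_ps(a);
    if count < 4 {
        _mm256_add_ps(_mm256_add_ps(a2, b), add_stuff_direct(a, count + 1))
    } else {
        _mm256_add_ps(a2, b)
    }
}

// Helper with the same feature enabled. It copies the eight lanes into a
// plain array so that no __m256 value leaves AVX2-enabled code.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn add_stuff_lanes(a: f32, count: i32) -> [f32; 8] {
    let r = add_stuff_direct(a, count);
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), r);
    lanes
}

// Safe wrapper: returns None when the CPU (or the target) has no AVX2.
fn run_direct(a: f32, count: i32) -> Option<[f32; 8]> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound because AVX2 support was just checked at runtime.
            return Some(unsafe { add_stuff_lanes(a, count) });
        }
    }
    None
}

fn main() {
    match run_direct(2.0, 1) {
        Some(lanes) => println!("lanes: {:?}", lanes),
        None => println!("AVX2 not available; nothing to run"),
    }
}
```

With the same arguments as the reproduction (2.0 and 1), every lane should be 16.0 when AVX2 is available. This sidesteps the problem rather than fixing it, since it gives up the generic #[inline(always)] layering that the trait-based approach depends on.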

Is this a bug that can be fixed, or an innate limitation of combining inlining, target_feature, and recursion? Is there a workaround?

#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;
use std::fmt::Debug;

#[inline(always)]
unsafe fn add_stuff(a: f32, count: i32) -> __m256 {
    let b = _mm256_set1_ps(2.0);
    let a2 = _mm256_set1_ps(a);
    if count < 4 {
        println!("count:{}",count);
        let r = _mm256_add_ps(_mm256_add_ps(a2, b), add_stuff(a, count + 1));
        println!("r:{:?}",r);
        r
    } else {
        _mm256_add_ps(a2, b)
    }
}

#[target_feature(enable = "avx2")]
unsafe fn add_stuff_helper() {
    let r = add_stuff(2.0,1);
    println!("raw avx {:?}",r);
}

fn main() {
    unsafe {
        add_stuff_helper();
    }
}
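
To make "do not add up" concrete, the following scalar sketch models a single lane of add_stuff with the same control flow and no SIMD. The name add_stuff_scalar is hypothetical and only serves as a reference for what each lane of the vector result should contain.

```rust
// Scalar model of one lane of add_stuff: same recursion, no intrinsics.
fn add_stuff_scalar(a: f32, count: i32) -> f32 {
    let b = 2.0_f32;
    if count < 4 {
        // Mirrors _mm256_add_ps(_mm256_add_ps(a2, b), add_stuff(a, count + 1)).
        (a + b) + add_stuff_scalar(a, count + 1)
    } else {
        // Base case: mirrors _mm256_add_ps(a2, b).
        a + b
    }
}

fn main() {
    // Same arguments as add_stuff_helper passes: a = 2.0, count = 1.
    let expected = add_stuff_scalar(2.0, 1);
    assert_eq!(expected, 16.0);
    println!("expected value in every lane: {:?}", expected);
}
```

The calls with count = 1, 2, 3 and 4 each contribute 2.0 + 2.0 = 4.0, so a correct build should print 16.0 in all eight lanes of the raw avx output.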

Environment:

This happens with rustc 1.27 stable through 1.31.0 nightly (at least).
All tested on Linux, on a CPU that supports AVX2 instructions (Intel Core i7 6700).


    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    A-SIMD: Area: SIMD (Single Instruction Multiple Data)
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
