Skip to content

Failure to remove before/after reverse shuffles from loop with decrementing iterator #48960

Open
@RKSimon

Description

@RKSimon
Bugzilla Link 49616
Version trunk
OS Windows NT
CC @anton-afanasyev,@fhahn,@LebedevRI,@rotateright

Extended Description

https://godbolt.org/z/dWTd9e

void loop_dec(unsigned short *p, int max, int n) {
    for (int i = 0; i < (n & 31); i++) {
        unsigned m = *--p;
        *p = (unsigned short)(m >= max ? m-max : 0);
    }
}

By decrementing the pointer in the loop we end up with loop bodies like this:

52:
  %53 = phi i64 [ %41, %40 ], [ %64, %52 ]
  %54 = sub i64 0, %53
  %55 = getelementptr inbounds i16, i16* %51, i64 %54
  %56 = bitcast i16* %55 to <4 x i16>*
  %57 = load <4 x i16>, <4 x i16>* %56, align 2, !tbaa !2
  %58 = shufflevector <4 x i16> %57, <4 x i16> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %59 = zext <4 x i16> %58 to <4 x i32>
  %60 = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %59, <4 x i32> %50)
  %61 = trunc <4 x i32> %60 to <4 x i16>
  %62 = shufflevector <4 x i16> %61, <4 x i16> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %63 = bitcast i16* %55 to <4 x i16>*
  store <4 x i16> %62, <4 x i16>* %63, align 2, !tbaa !2
  %64 = add i64 %53, 4
  %65 = icmp eq i64 %64, %45
  br i1 %65, label %66, label %52, !llvm.loop !9 

AFAICT we should be able to remove both these 'reverse' shufflevectors.

I'm not sure if we should be trying to fix this in InstCombine/VectorCombine or inside the LoopVectorizer.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions