s390x: `__builtin_reduce_and` does not optimize well #129434

Given this C code (https://godbolt.org/z/WvfG8TTxf):

```c
#include <vecintrin.h>
#include <stdbool.h>

bool vectors_equal_builtin(vector int a, vector int b) {
    return vec_all_eq(a, b);
}

typedef int vec4i __attribute__((vector_size(16)));

bool vectors_equal_manual(vec4i a, vec4i b) {
    return __builtin_reduce_and(a == b);
}
``` 

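The manual implementation does not optimize to the same code as the builtin one. Instead of a single `vceqfs` that sets the condition code, it extracts each lane with `vlgvf`, packs the bits into a general-purpose register, and tests the result there.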

```asm
vectors_equal_builtin:
        vceqfs  %v0, %v24, %v26
        lghi    %r2, 0
        locghie %r2, 1
        br      %r14

vectors_equal_manual:
        aghi    %r15, -168
        vceqf   %v0, %v24, %v26
        vno     %v0, %v0, %v0
        vlgvf   %r1, %v0, 0
        vlgvf   %r0, %v0, 1
        sll     %r1, 3
        rosbg   %r1, %r0, 61, 61, 2
        vlgvf   %r0, %v0, 2
        rosbg   %r1, %r0, 62, 62, 1
        vlgvf   %r0, %v0, 3
        rosbg   %r1, %r0, 63, 63, 0
        tmll    %r1, 15
        lghi    %r2, 0
        locghie %r2, 1
        aghi    %r15, 168
        br      %r14
```

The LLVM IR for `vectors_equal_manual`:

```llvm
define dso_local noundef zeroext i1 @vectors_equal_manual(<4 x i32> noundef %a, <4 x i32> noundef %b) local_unnamed_addr {
entry:
  %0 = icmp ne <4 x i32> %a, %b
  %1 = bitcast <4 x i1> %0 to i4
  %2 = icmp eq i4 %1, 0
  ret i1 %2
}
``` 
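
To make the IR easier to read, here is a rough scalar model of it in C. It is only a sketch: `ne_mask` and `vectors_equal_model` are hypothetical names introduced for illustration, and the lane-to-bit order is picked arbitrarily, since only the comparison against zero matters.

```c
#include <stdbool.h>

typedef int vec4i __attribute__((vector_size(16)));

/* Hypothetical helper: scalar model of `icmp ne` followed by
 * `bitcast <4 x i1> to i4`. Lane i is placed in bit i here; the
 * real bit order depends on endianness, but it does not affect
 * the comparison against zero below. */
static unsigned ne_mask(vec4i a, vec4i b) {
    unsigned mask = 0;
    for (int i = 0; i < 4; i++)
        mask |= (unsigned)(a[i] != b[i]) << i;
    return mask;
}

/* Models `icmp eq i4 %1, 0`: equal iff no lane differs. */
static bool vectors_equal_model(vec4i a, vec4i b) {
    return ne_mask(a, b) == 0;
}
```

The assembly above does essentially this lane by lane, whereas `vceqfs` produces the same answer directly in the condition code.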

There are many variations on `vec_all_eq` (see https://www.ibm.com/docs/en/zos/2.4.0?topic=functions-any-predicates), and it would be neat if those all optimized. It might be possible to simplify clang's `vecintrin.h` too.
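
As a sketch of what those variations might look like with generic vector extensions, assuming the same `vec4i` type as above: the function names below are hypothetical, and the `__has_builtin` fallback is only there so the snippet also builds on compilers without the clang reduction builtins. Whether each of these lowers to a single compare-and-set-CC instruction on s390x is exactly the open question.

```c
#include <stdbool.h>

typedef int vec4i __attribute__((vector_size(16)));

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* AND of all lanes; falls back to scalar code without the builtin. */
static inline int reduce_and(vec4i m) {
#if __has_builtin(__builtin_reduce_and)
    return __builtin_reduce_and(m);
#else
    return m[0] & m[1] & m[2] & m[3];
#endif
}

/* OR of all lanes; falls back to scalar code without the builtin. */
static inline int reduce_or(vec4i m) {
#if __has_builtin(__builtin_reduce_or)
    return __builtin_reduce_or(m);
#else
    return m[0] | m[1] | m[2] | m[3];
#endif
}

/* Vector comparisons yield -1 (all bits set) or 0 per lane, so the
 * reduction is nonzero exactly when the predicate holds.
 * Hypothetical counterparts of vec_all_eq, vec_any_eq, vec_all_ne
 * and vec_any_ne. */
bool all_eq(vec4i a, vec4i b) { return reduce_and(a == b) != 0; }
bool any_eq(vec4i a, vec4i b) { return reduce_or(a == b) != 0; }
bool all_ne(vec4i a, vec4i b) { return reduce_and(a != b) != 0; }
bool any_ne(vec4i a, vec4i b) { return reduce_or(a != b) != 0; }
```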

This came up while implementing `vec_all_eq` in the Rust standard library, where fewer custom intrinsics are better in every way.

cc @uweigand (posted here so it can be linked to)

