s390x: manual vec_subc_u128 is not recognized #129608

Closed
@folkertdev

Description

https://godbolt.org/z/EjoWhj8MM

I expect these to optimize to the same output, but they do not:

define noundef <16 x i8> @vec_subc_u128_intrinsic(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %_3 = tail call noundef i128 @llvm.s390.vscbiq(i128 noundef %0, i128 noundef %1) #3
  %2 = bitcast i128 %_3 to <16 x i8>
  ret <16 x i8> %2
}

define <16 x i8> @vec_subc_u128_manual(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %_8.1 = icmp uge i128 %0, %1
  %_5 = zext i1 %_8.1 to i128
  %2 = bitcast i128 %_5 to <16 x i8>
  ret <16 x i8> %2
}

declare i128 @llvm.s390.vscbiq(i128, i128) unnamed_addr #2

The equivalent with vec_addc_u128 does get optimized into just a vaccq instruction. For the subtraction here we get this:

vec_subc_u128_intrinsic:
        vscbiq  %v24, %v24, %v26
        br      %r14

.LCPI1_0:
        .quad   0
        .quad   1
vec_subc_u128_manual:
        veclg   %v24, %v26
        jlh     .LBB1_2
        vchlgs  %v0, %v26, %v24
.LBB1_2:
        ipm     %r0
        xilf    %r0, 268435456
        afi     %r0, 1879048192
        vlvgp   %v0, %r0, %r0
        larl    %r1, .LCPI1_0
        vl      %v1, 0(%r1), 3
        vrepib  %v2, 31
        vsrlb   %v0, %v0, %v2
        vsrl    %v0, %v0, %v2
        vn      %v24, %v0, %v1
        br      %r14

which is unfortunate.

In the addition case, the IR is:

define <16 x i8> @vec_addc_u128_manual(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %2 = tail call { i128, i1 } @llvm.uadd.with.overflow.i128(i128 %0, i128 %1)
  %_7.1 = extractvalue { i128, i1 } %2, 1
  %_5 = zext i1 %_7.1 to i128
  %3 = bitcast i128 %_5 to <16 x i8>
  ret <16 x i8> %3
}

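So here the call to @llvm.uadd.with.overflow.i128 is still explicitly present in the IR. That does not hold for the unsigned overflowing subtraction: the optimizer is too clever and folds it into a plain compare (the icmp uge in vec_subc_u128_manual), so no overflow intrinsic is left for the backend to recognize.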
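For reference, here is a minimal Rust sketch of the scalar logic behind the two manual versions. It is a reconstruction from the IR, not the actual source behind the godbolt link. The function names are hypothetical, and it assumes the borrow indication is 1 exactly when the subtraction does not wrap, which is what the icmp uge in the IR computes.

```rust
/// Carry indication of a 128-bit unsigned add: 1 if `a + b` wraps, else 0.
/// This is the shape that stays as `llvm.uadd.with.overflow.i128` in the IR.
pub fn addc_u128(a: u128, b: u128) -> u128 {
    let (_, carry) = a.overflowing_add(b);
    carry as u128
}

/// Borrow indication written with `overflowing_sub`: 1 if `a - b` does not
/// wrap, else 0 (assumed convention, matching the `icmp uge` in the IR).
pub fn subc_u128_overflowing(a: u128, b: u128) -> u128 {
    let (_, wrapped) = a.overflowing_sub(b);
    (!wrapped) as u128
}

/// The same value as a plain unsigned compare: `a - b` does not wrap
/// exactly when `a >= b`.
pub fn subc_u128_compare(a: u128, b: u128) -> u128 {
    (a >= b) as u128
}
```

The two subtraction variants return the same value for every input, which is why the optimizer can legally replace the overflow check with a compare before the backend sees it.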
