[AVX-512] llvm.experimental.vector.compress emits a vector-zeroing instruction instead of using {z} #113263

Closed
@Validark

Godbolt link

define dso_local <64 x i8> @compress(<64 x i8> %0, i64 %1) local_unnamed_addr {
Entry:
  %2 = bitcast i64 %1 to <64 x i1>
  %3 = tail call fastcc <64 x i8> @llvm.experimental.vector.compress.v64i8(<64 x i8> %0, <64 x i1> %2, <64 x i8> zeroinitializer)
  ret <64 x i8> %3
}

declare fastcc <64 x i8> @llvm.experimental.vector.compress.v64i8(<64 x i8>, <64 x i1>, <64 x i8>)

Compiling this for Zen 5 gives:

compress:
.Lcompress$local:
        kmovq   k1, rdi
        vpxor   xmm1, xmm1, xmm1
        vpcompressb     zmm1 {k1}, zmm0
        vmovdqa64       zmm0, zmm1
        ret

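The vpxor is unnecessary. Because the passthrough operand is zeroinitializer, the backend could use the zero-masking {z} form, vpcompressb zmm0 {k1} {z}, zmm0, which writes zeros to the unselected lanes itself. That would also make the trailing vmovdqa64 redundant, since the result would already be in zmm0.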
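
To illustrate why the two forms should be interchangeable when the passthrough is all zeros, here is a minimal scalar sketch in C. It models the byte-compress operation on plain arrays and does not use LLVM or any AVX-512 intrinsics. The function names (compress_merge, compress_zero, forms_agree) are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of vpcompressb with merge-masking ({k1} only):
   bytes of src whose mask bit is set are packed into the low bytes of dst;
   the remaining bytes of dst keep whatever they held before. */
static void compress_merge(uint8_t dst[64], const uint8_t src[64], uint64_t mask) {
    size_t out = 0;
    for (size_t i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            dst[out++] = src[i];
        }
    }
}

/* Scalar model of vpcompressb with zero-masking ({k1} {z}):
   same packing, but the remaining bytes of dst are set to zero. */
static void compress_zero(uint8_t dst[64], const uint8_t src[64], uint64_t mask) {
    size_t out = 0;
    for (size_t i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            dst[out++] = src[i];
        }
    }
    memset(dst + out, 0, 64 - out);
}

/* Returns 1 when "zero the destination, then merge-compress" produces the
   same 64 bytes as a single zero-masking compress. */
static int forms_agree(const uint8_t src[64], uint64_t mask) {
    uint8_t merged[64];
    uint8_t zeroed[64];

    memset(merged, 0, sizeof merged);      /* plays the role of the vpxor */
    compress_merge(merged, src, mask);

    memset(zeroed, 0xAA, sizeof zeroed);   /* junk: the {z} form ignores old dst */
    compress_zero(zeroed, src, mask);

    return memcmp(merged, zeroed, sizeof merged) == 0;
}
```

Calling forms_agree(src, mask) with any input and mask should return 1 under this model, which is the property the suggested codegen change relies on. When writing C intrinsics by hand, the zero-masking form corresponds to _mm512_maskz_compress_epi8.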
