Skip to content

Consider using of tzcnt instruction for better performance in foreach loop and iterator of bitsets #11418

Open
@plokhotnyuk

Description

@plokhotnyuk

This instruction has a low latency and throughput on most contemporary CPUs.

Also it is implemented as intrinsic and turned on by default in JDK 8:

$ java -XX:+PrintFlagsFinal -version | grep Trailing
     bool UseCountTrailingZerosInstruction          = true                                {ARCH product}
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)

Below is a snippet how it can be used:

  override def foreach[U](f: Int => U): Unit = {
    val n = nwords
    var i = 0
    while (i < n) {
      var w = word(i)
      while (w != 0) {
        f(java.lang.Long.numberOfTrailingZeros(w) + (i << 6))
        w &= w - 1
      }
      i += 1
    }
  }

Inspired by the Daniel Lemire's blog post.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions