
Commit 282510d

lntue, sibidanov authored and committed
[libc][math] Improve the performance of sqrtf128. (llvm#122578)
Use a combination of polynomial approximation and Newton-Raphson iterations in 64-bit and 128-bit integers to improve the performance of sqrtf128. The correct rounding is provided by squaring the result and comparing it with the argument.

Performance improvement using the newly added perf test:
- My function = the improved implementation from this PR
- Other function = current implementation using `libc/src/__support/FPUtil/generic/sqrt.h`

```
Performance tests with inputs in denormal range:
-- My function --
     Total time      : 1260765265 ns
     Average runtime : 125.951 ns/op
     Ops per second  : 7939623 op/s
-- Other function --
     Total time      : 7160726518 ns
     Average runtime : 715.357 ns/op
     Ops per second  : 1397902 op/s
-- Average runtime ratio --
     Mine / Other's  : 0.176067

Performance tests with inputs in normal range:
-- My function --
     Total time      : 373003808 ns
     Average runtime : 37.2631 ns/op
     Ops per second  : 26836189 op/s
-- Other function --
     Total time      : 7353398916 ns
     Average runtime : 734.605 ns/op
     Ops per second  : 1361275 op/s
-- Average runtime ratio --
     Mine / Other's  : 0.0507254
```

---------

Co-authored-by: Alexei Sibidanov <[email protected]>
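The approach in the commit message (refine an estimate with Newton-Raphson in integer arithmetic, then decide the rounding by squaring and comparing with the argument) can be illustrated on a much smaller problem. The sketch below is not the sqrtf128 code from this commit: it works on plain 64-bit integers, replaces the polynomial approximation with a crude power-of-two starting estimate, and the names `isqrt_floor` and `isqrt_nearest` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM libc): floor(sqrt(x)) via integer
// Newton-Raphson. Assumption: a power-of-two starting estimate stands in for
// the polynomial approximation used by the actual patch.
constexpr std::uint64_t isqrt_floor(std::uint64_t x) {
  if (x < 2)
    return x;
  // Count significant bits so that x < 2^bits.
  int bits = 0;
  for (std::uint64_t t = x; t != 0; t >>= 1)
    ++bits;
  // Starting estimate r = 2^ceil(bits / 2), which is >= sqrt(x).
  std::uint64_t r = std::uint64_t(1) << ((bits + 1) / 2);
  // Newton-Raphson step for f(r) = r^2 - x: r <- (r + x / r) / 2.
  // Starting from above, r decreases until it reaches floor(sqrt(x)).
  while (true) {
    std::uint64_t next = (r + x / r) / 2;
    if (next >= r)
      break;
    r = next;
  }
  return r;
}

// Hypothetical helper: nearest integer to sqrt(x). The rounding decision is
// made by squaring the candidate and comparing with the argument:
// sqrt(x) > r + 1/2  <=>  x > r^2 + r  <=>  x - r^2 > r.
constexpr std::uint64_t isqrt_nearest(std::uint64_t x) {
  std::uint64_t r = isqrt_floor(x);
  return (x - r * r > r) ? r + 1 : r;
}

static_assert(isqrt_floor(99) == 9, "floor(sqrt(99)) is 9");
static_assert(isqrt_nearest(7) == 3, "sqrt(7) ~ 2.65 rounds up to 3");
```

The comparison `x - r * r > r` never needs a floating-point value or a wider type: since `r` is the floor of the square root, `r * r` cannot exceed `x`, so the subtraction cannot wrap.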
1 parent 8ae85b2 commit 282510d

17 files changed: +648 -38 lines changed

libc/src/__support/big_int.h

Lines changed: 1 addition & 1 deletion
@@ -241,7 +241,7 @@ LIBC_INLINE constexpr void quick_mul_hi(cpp::array<word, N> &dst,
 }

 template <typename word, size_t N>
-LIBC_INLINE constexpr bool is_negative(cpp::array<word, N> &array) {
+LIBC_INLINE constexpr bool is_negative(const cpp::array<word, N> &array) {
   using signed_word = cpp::make_signed_t<word>;
   return cpp::bit_cast<signed_word>(array.back()) < 0;
 }
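The only change in this hunk is the `const` on the parameter. A minimal sketch of why that matters, assuming `std::array` as a stand-in for `cpp::array` and `static_cast` as a stand-in for `cpp::bit_cast` (the caller `top_bit_set` is hypothetical and does not appear in the patch):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Stand-in for the patched helper; std::array replaces cpp::array here.
template <typename word, std::size_t N>
constexpr bool is_negative(const std::array<word, N> &array) {
  using signed_word = std::make_signed_t<word>;
  // Assumption: static_cast replaces cpp::bit_cast; on two's-complement
  // targets both report whether the top bit of the last word is set.
  return static_cast<signed_word>(array.back()) < 0;
}

// Hypothetical caller that only has read-only access to the words.
// With the old signature (non-const reference) this call would not compile,
// because a const array cannot bind to a non-const reference parameter.
constexpr bool top_bit_set(const std::array<std::uint32_t, 2> &words) {
  return is_negative(words);
}
```

Taking the array by const reference also documents that the function only inspects its argument.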

libc/src/math/generic/CMakeLists.txt

Lines changed: 5 additions & 0 deletions
@@ -2978,6 +2978,11 @@ add_entrypoint_object(
   HDRS
     ../sqrtf128.h
   DEPENDS
+    libc.src.__support.CPP.bit
+    libc.src.__support.FPUtil.fenv_impl
+    libc.src.__support.FPUtil.fp_bits
+    libc.src.__support.FPUtil.rounding_mode
+    libc.src.__support.macros.optimization
     libc.src.__support.macros.properties.types
     libc.src.__support.FPUtil.sqrt
 )
