Skip to content

Fix GH-12143: Improved handling of adjusting result digits #12162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

SakiTakamachi
Copy link
Member

@SakiTakamachi SakiTakamachi commented Sep 9, 2023

This PR fixes #12143

Two types of problems were discovered in this issue, this will fix one of them.

During the "pre round" process that adjusts the number of digits of values ​​greater than 1e14 to 1e14, round() results may lead to incorrect because rounding is done.

Therefore, I have removed the process of rounding and simply converted them to integers using floor or ceil.

incorrect example:

<?php
var_dump(round(1.700000000000145, 13, PHP_ROUND_HALF_UP));
// float(1.7000000000002)

var_dump(round(-1.700000000000145, 13, PHP_ROUND_HALF_UP));
// float(-1.7000000000002)

If you want to figure out what's going on, this comment of @jorgsowa 's is easy to understand.
#12143 (comment)


This change causes the following tests to fail:
https://github.com/php/php-src/blob/PHP-8.1/ext/standard/tests/math/bug24142.phpt

In this test, it added 0.01 using a for loop and checked the result of round().
With this fix, the test was failing due to the following:

// $v === 0.084999999999999992
echo "round({$v}, 2) -> ".round($v, 2)."\n";

// float(0.08)

In the first place, it is impossible to perform accurate calculations with floating point numbers. If we add 0.01 at multiple times, the error will increase.
0.084999999999999992 is not 0.085.

It should not be necessary to absorb such "error accumulation" with round().
Therefore, I stopped using for in the test and rewrote all 9 lines.


(How does everyone do review requests? It seems I don't have that authority.)

@hormus
Copy link

hormus commented Sep 9, 2023

@SakiTakamachi

  1. Use var_dump and not string.
<?php
ini_set('precision', -1);
ini_set('serialize_precision', -1);
$v = 0.005;
for ($i = 0; $i < 9; $i++) {
    $round = round($v, 2);
    $v =  number_format($v, 3);
    $arr[$i] = 'round($v, 2) -> ' . $round; // string
    $v = (float) $v;
    $float[$i] = $v;
    $v = ($v) + (0.01);
}
var_dump($arr, $float); // var_dump use serialize_precision
echo $v; // special precision for zero conversion to string
?>

Expected result:

array(9) {
  [0]=>
  string(23) "round(0.005, 2) -> 0.01"
  [1]=>
  string(23) "round(0.015, 2) -> 0.02"
  [2]=>
  string(23) "round(0.025, 2) -> 0.03"
  [3]=>
  string(23) "round(0.035, 2) -> 0.04"
  [4]=>
  string(23) "round(0.045, 2) -> 0.05"
  [5]=>
  string(23) "round(0.055, 2) -> 0.06"
  [6]=>
  string(23) "round(0.065, 2) -> 0.07"
  [7]=>
  string(23) "round(0.075, 2) -> 0.08"
  [8]=>
  string(23) "round(0.085, 2) -> 0.09"
}
array(9) {
  [0]=>
  float(0.005)
  [1]=>
  float(0.015)
  [2]=>
  float(0.025)
  [3]=>
  float(0.035)
  [4]=>
  float(0.045)
  [5]=>
  float(0.055)
  [6]=>
  float(0.065)
  [7]=>
  float(0.075)
  [8]=>
  float(0.085)
}
0.095

This is a fix for the more precise float rfc which uses -1, it is not recommended to use number_format but it is recommended to write the code without loops instead.
Change -1 to 14 for old test script.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 10, 2023

@hormus

Thank you. However, based on the issue related this test, we should not test with changing the floating point precision.

This is a very convenient method when using php, but I don't think it's suitable for the purpose of this test.

I agree with using var_dump().

https://bugs.php.net/bug.php?id=24142

@SakiTakamachi
Copy link
Member Author

@bukka

Can I ask you for a review?

@bukka
Copy link
Member

bukka commented Sep 10, 2023

I will try to take a proper look to all of this on Thursday or Friday.

Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been looking through all of this, reading some FP resources the old RFC / email that introduced this and can't really see why that helper should be used in pre-rounding. I might need to do a bit more research but from what I gather it makes sense to do it this way.

Comment on lines +5 to +7
var_dump(round(1.700000000000145, 13, PHP_ROUND_HALF_UP));
var_dump(round(-1.700000000000145, 13, PHP_ROUND_HALF_UP));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you maybe add more tests to cover more large number test cases and different rounding modes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally at least the three numbers in the title of the original issue would explicitly be tested for all rounding modes:

#12143

@bukka
Copy link
Member

bukka commented Sep 15, 2023

Considering that this fixes more edge case and changing somewhat the algorithm I would prefer to target master only.

@SakiTakamachi SakiTakamachi changed the base branch from PHP-8.1 to master September 15, 2023 13:21
@SakiTakamachi SakiTakamachi force-pushed the fix/gh-12143-adjust-result-digits branch from 5cd31c6 to f80526f Compare September 15, 2023 13:23
@SakiTakamachi
Copy link
Member Author

@bukka
I set the target to master and added some test cases.
If we need more test cases, please let me know.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 16, 2023

I was just wondering, isn't the "pre round" process itself unnecessary in the first place?

It may be necessary after all since I failed the test. I take a closer look.


I found that if you don't do pre-round, or rather adjust the number of digits in advance, the helper will not be able to check 0.5, so behavior such as HALF DOWN will become strange.

@SakiTakamachi SakiTakamachi force-pushed the fix/gh-12143-adjust-result-digits branch from a8fc337 to bcf01cf Compare September 20, 2023 00:01
@SakiTakamachi SakiTakamachi requested a review from bukka September 20, 2023 00:17
@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 20, 2023

@TimWolla @bukka
I think this is a specification issue, but what do you think the following code should look like?

var_dump(round(1.700000000000556, 12, PHP_ROUND_HALF_DOWN));

// float(1.7) or float(1.700000000001) ?
// With the current implementation, it will be float(1.700000000001)

https://3v4l.org/HrlXq


It seems like the behavior is a little counterintuitive.

places = 13
Breakpoint 1, _php_math_round (value=1.700000000000556, places=13, mode=2) at /php-src/ext/standard/math.c:171
171	PHPAPI double _php_math_round(double value, int places, int mode) {
(gdb) n
176		if (!zend_finite(value) || value == 0.0) {
(gdb) 
180		places = places < INT_MIN+1 ? INT_MIN+1 : places;
(gdb) 
181		precision_places = 14 - php_intlog10abs(value);
(gdb) 
183		f1 = php_intpow10(abs(places));
(gdb) p precision_places
$1 = 14
(gdb) n
188		if (precision_places > places && precision_places - 15 < places) {
(gdb) 
189			int64_t use_precision = precision_places < INT_MIN+1 ? INT_MIN+1 : precision_places;
(gdb) 
191			f2 = php_intpow10(abs((int)use_precision));
(gdb) 
192			if (use_precision >= 0) {
(gdb) 
193				tmp_value = value * f2;
(gdb) 
199			tmp_value = (tmp_value >= 0.0) ? floor(tmp_value) : ceil(tmp_value);
(gdb) p tmp_value
$2 = 170000000000055.59
(gdb) n
201			use_precision = places - precision_places;
(gdb) p tmp_value
$3 = 170000000000055
(gdb) n
202			use_precision = use_precision < INT_MIN+1 ? INT_MIN+1 : use_precision;
(gdb) 
204			f2 = php_intpow10(abs((int)use_precision));
(gdb) 
206			tmp_value = tmp_value / f2;
(gdb) 
188		if (precision_places > places && precision_places - 15 < places) {
(gdb) p tmp_value
$4 = 17000000000005.5
(gdb) c
Continuing.
float(1.7000000000005)
places = 12
Breakpoint 1, _php_math_round (value=1.700000000000556, places=12, mode=2) at /php-src/ext/standard/math.c:171
171	PHPAPI double _php_math_round(double value, int places, int mode) {
(gdb) n
176		if (!zend_finite(value) || value == 0.0) {
(gdb) 
180		places = places < INT_MIN+1 ? INT_MIN+1 : places;
(gdb) 
181		precision_places = 14 - php_intlog10abs(value);
(gdb) 
183		f1 = php_intpow10(abs(places));
(gdb) p precision_places
$5 = 14
(gdb) n
188		if (precision_places > places && precision_places - 15 < places) {
(gdb) 
189			int64_t use_precision = precision_places < INT_MIN+1 ? INT_MIN+1 : precision_places;
(gdb) 
191			f2 = php_intpow10(abs((int)use_precision));
(gdb) 
192			if (use_precision >= 0) {
(gdb) 
193				tmp_value = value * f2;
(gdb) 
199			tmp_value = (tmp_value >= 0.0) ? floor(tmp_value) : ceil(tmp_value);
(gdb) p tmp_value
$6 = 170000000000055.59
(gdb) n
201			use_precision = places - precision_places;
(gdb) p tmp_value
$7 = 170000000000055
(gdb) n
202			use_precision = use_precision < INT_MIN+1 ? INT_MIN+1 : use_precision;
(gdb) 
204			f2 = php_intpow10(abs((int)use_precision));
(gdb) 
206			tmp_value = tmp_value / f2;
(gdb) 
188		if (precision_places > places && precision_places - 15 < places) {
(gdb) p tmp_value
$8 = 1700000000000.55
(gdb) c
Continuing.
float(1.700000000001)

others:

var_dump(round(1.50000000000001, 0, PHP_ROUND_HALF_DOWN));
// float(2)

var_dump(round(1.500000000000001, 0, PHP_ROUND_HALF_DOWN));
// float(1)

https://3v4l.org/OHhdT

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 20, 2023

I think it makes sense to change the pre-round behavior.

However, I seem to have discovered another problem.

Whether 0.51 becomes 0.0 or 1.0 in HALF_DOWN is simply a matter of specification, but ignoring numbers after 16 floating point digits only confuses users. Therefore, it seems that this also needs to be corrected.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 20, 2023

@TimWolla @bukka

Reading this document, it seems like modes like HALF_DOWN are for determining how edge cases are handled.
And the edge case seems to assume a value of exactly 0.5.

https://wiki.php.net/rfc/rounding

In other words, if you still follow this philosophy, it's behavior should be as follows.

var_dump(round(1.500000000000001, 0, PHP_ROUND_HALF_DOWN));
// to be float(2)

The changes may have a bigger impact than you expected.

@TimWolla TimWolla self-requested a review September 20, 2023 07:06
@SakiTakamachi
Copy link
Member Author

#define DBL_DIG 15
For IBM and others, this would be 16. This is a constant that represents the maximum precision that a floating point number can guarantee accuracy. This is the value referenced by PHP_FLOAT_DIG.

I think this is probably the basis for the "pre round" threshold of round().

If round() were to handle more precision than it currently does, allowing rounding 4503599627370495.5, for example, it would conflict with this specification.

@SakiTakamachi
Copy link
Member Author

It is also possible to implement an implementation that supports rounding with higher precision than DBL_DIG while leaving the digit adjustment based on DBL_DIG as is.

However, in that case, the following specifications will remain unchanged.

var_dump(round(1.50000000000001, 0, PHP_ROUND_HALF_DOWN));
// float(2)

var_dump(round(1.500000000000001, 0, PHP_ROUND_HALF_DOWN));
// float(1)

@SakiTakamachi
Copy link
Member Author

This is not a technical issue, but rather a specification issue, so if you have any opinions, I would love to hear them.

@TimWolla
Copy link
Member

Whether 0.51 becomes 0.0 or 1.0 in HALF_DOWN is simply a matter of specification, but ignoring numbers after 16 floating point digits only confuses users. Therefore, it seems that this also needs to be corrected.

Sorry, I don't understand that. Rounding 0.51 to 0 (i.e. the second parameter being 0) decimal digits should reliably result in 1 for any of the current rounding modes. Rounding it to precision = 1 should result in 0.5 and rounding it to precision = 2 should result in 0.51.

@TimWolla
Copy link
Member

In other words, if you still follow this philosophy, it's behavior should be as follows.

var_dump(round(1.500000000000001, 0, PHP_ROUND_HALF_DOWN));
// to be float(2)

The changes may have a bigger impact than you expected.

Yes, that should result in 2, because the input is clearly across the halfway-mark.

@TimWolla
Copy link
Member

For IBM and others, this would be 16. This is a constant that represents the maximum precision that a floating point number can guarantee accuracy. This is the value referenced by PHP_FLOAT_DIG.

My expectation as a user is that numbers are rounded to the nearest integer (with the halfway point being decided based on the rounding mode) based on actual bit pattern in memory. If I pass in a number that is not exactly representable it will implicitly be rounded. This is unavoidable. But when I then perform explicit rounding to a certain precision I expected it to behave accurately as far as possible. Specifically I expect rounding to behave correctly if I specify a precision of 0 (i.e. rounding to the nearest integer without needing to move around the decimal point).

@TimWolla
Copy link
Member

If round() were to handle more precision than it currently does, allowing rounding 4503599627370495.5, for example, it would conflict with this specification.

Rounding 4503599627370495.5 to the nearest integer (i.e. precision = 0) must either result in 4503599627370495 or 4503599627370496 (depending on the mode), because both results are exactly representable and thus no implicit rounding needs to happen.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 21, 2023

@TimWolla

Sorry, I don't understand that. Rounding 0.51 to 0 (i.e. the second parameter being 0) decimal digits should reliably result in 1 for any of the current rounding modes. Rounding it to precision = 1 should result in 0.5 and rounding it to precision = 2 should result in 0.51.

As mentioned above, the result will be:

var_dump(round(1.500000000000001, 0, PHP_ROUND_HALF_DOWN));
// float(1)

Based on the conversations we have had so far and what we have investigated, we have found that the above results are incorrect, so we are retracting this statement.


My expectation as a user is that numbers are rounded to the nearest integer (with the halfway point being decided based on the rounding mode) based on actual bit pattern in memory. If I pass in a number that is not exactly representable it will implicitly be rounded. This is unavoidable. But when I then perform explicit rounding to a certain precision I expected it to behave accurately as far as possible. Specifically I expect rounding to behave correctly if I specify a precision of 0 (i.e. rounding to the nearest integer without needing to move around the decimal point).

Rounding 4503599627370495.5 to the nearest integer (i.e. precision = 0) must either result in 4503599627370495 or 4503599627370496 (depending on the mode), because both results are exactly representable and thus no implicit rounding needs to happen.

For example, if you try to round 1.5, but internally it is 1.5000000000000001, this will work counterintuitively.

I think that the implementation of the existing code probably takes such phenomena into account.

But 1.5 isn't actually like that, and I haven't found any other numbers like that yet.

So I'm starting to think that my worries may be unfounded.

@SakiTakamachi
Copy link
Member Author

In the first place, "pre-round" seems to be done to widen the scope of application of edge cases.

Therefore, I think the correct answer is to eliminate the "pre-round" processing itself altogether.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 21, 2023

When I tried removing "pre-round "completely, a problem occurred.

var_dump (round (0.285, 2, PHP_ROUND_HALF_UP));
// float(0.28)
removed pre-round
Breakpoint 1, _php_math_round (value=0.28499999999999998, places=2, mode=1) at /php-src/ext/standard/math.c:127
127	PHPAPI double _php_math_round(double value, int places, int mode) {
(gdb) n
131		if (!zend_finite(value) || value == 0.0) {
(gdb) 
135		places = places < INT_MIN+1 ? INT_MIN+1 : places;
(gdb) 
137		f1 = php_intpow10(abs(places));
(gdb) 
140		if (places >= 0) {
(gdb) 
141			tmp_value = value * f1;
(gdb) 
146		if (fabs(tmp_value) >= 1e15) {
(gdb) p tmp_value
$1 = 28.499999999999996
use pre-round + my fix
Breakpoint 1, _php_math_round (value=0.28499999999999998, places=2, mode=1) at /php-src/ext/standard/math.c:171
171	PHPAPI double _php_math_round(double value, int places, int mode) {
(gdb) n
176		if (!zend_finite(value) || value == 0.0) {
(gdb) 
180		places = places < INT_MIN+1 ? INT_MIN+1 : places;
(gdb) 
181		precision_places = 14 - php_intlog10abs(value);
(gdb) 
183		f1 = php_intpow10(abs(places));
(gdb) 
188		if (precision_places > places && precision_places - 15 < places) {
(gdb) 
189			int64_t use_precision = precision_places < INT_MIN+1 ? INT_MIN+1 : precision_places;
(gdb) 
191			f2 = php_intpow10(abs((int)use_precision));
(gdb) 
192			if (use_precision >= 0) {
(gdb) 
193				tmp_value = value * f2;
(gdb) 
199			tmp_value = (tmp_value >= 0.0) ? floor(tmp_value) : ceil(tmp_value);
(gdb) p tmp_value
$1 = 285000000000000

0.285 seems to be internally 0.28499999999999998, so in this case, what do you think is the correct way to behave?


The last digit of the floating point number is really unreliable since something like this happens just by moving the decimal point position...
At first glance, 0.28499999999999998 seems to be correct, but just by moving the decimal point, it becomes 285000000000000, so I don't think there is an answer to "which is correct" in this case.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 21, 2023

I opened another PR.
#12260

Please check it.

This method didn't work. Too many edge cases.


var_dump(round(1.235, 2, PHP_ROUND_HALF_DOWN)); // float(1.24)
var_dump(round(1.234565, 5, PHP_ROUND_HALF_UP)); // float(1.23456)
var_dump(round(1.23456785, 7, PHP_ROUND_HALF_UP)); // float(1.2345678)

...etc

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 21, 2023

Pre-round was probably necessary because of this problem.
Apparently I need to think more deeply about these issues.


I am very confused whether to include 0.285 as an edge case or not.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Sep 21, 2023

There are at least two problems.

  1. Whether to consider the value of PHP_FLOAT_DIG (Whether to ignore digits 16 and 17)
  2. Whether to treat a value like 0.285 as an edge case or as 0.28499999999999998

@hormus
Copy link

hormus commented Sep 21, 2023

@SakiTakamachi yes it's the real number for 17 digit

<?php

ini_set('precision', -1);
ini_set('serialize_precision', -1); // -1 from php 7.1 or set 3 for all php version
$float = 0.285;
$resultfloat = round($float, 2);
var_dump('Uses zend_dtoa()\'s mode 0 precision:', $float, $resultfloat);
ini_set('precision', 18);
ini_set('serialize_precision', 18);
var_dump('Uses 18 and 17 precision:', $float, $resultfloat);
/*math.c _php_math_round example without pre_round
var_dump($float * 1e15 / 1e13 / 1e2);*/

?>
``
Expected result:
```code
string(36) "Uses zend_dtoa()'s mode 0 precision:"
float(0.285)
float(0.29)
string(25) "Uses 18 and 17 precision:"
float(0.284999999999999976)
float(0.28999999999999998)

If $float "display" with 17 digit is: 0.28499999999999998, with 18 digit is: 0.284999999999999976 and with 16 digit is: 0.285
if 0.8 "display" with 16 digit is: 0.7999999999999999 and 15 digit is: 0.8

@SakiTakamachi
Copy link
Member Author

I might have come up with a better way. However, I'm very sleepy so I'll continue tomorrow...

@SakiTakamachi
Copy link
Member Author

I've created a PR that completely fixes the issue.
This PR will be closed.

#12268

@SakiTakamachi SakiTakamachi deleted the fix/gh-12143-adjust-result-digits branch September 24, 2023 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect round($num, 0, PHP_ROUND_HALF_UP) result for $num = 1.4999999999999998 / 4503599627370495.5
4 participants