Skip to content

Fix #76093: Format strings w/o loss of precision w/ FORMAT_TYPE_DECIMAL #12432

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

lucaswerkmeister
Copy link

Resubmit of #9909, which was itself a resubmit of #7782. Rebased onto latest master. Request #76093 is classified as a feature/change request rather than a bug, which I think I agree with, so IIUC this is supposed to be based on master rather than PHP-8.0.

@lucaswerkmeister
Copy link
Author

Hm, there’s a CI error I’m not sure I understand :/

runtime error: 1e+19 is outside the range of representable values of type 'long' – UndefinedBehaviorSanitizer
========DIFF========
--
       ["currency"]=>
       string(29) "$9,999,999,999,999,999,999.00"
     }
029+ /__w/php-src/php-src/ext/intl/formatter/formatter_format.c:105:13: runtime error: 1e+19 is outside the range of representable values of type 'long'
030+ SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /__w/php-src/php-src/ext/intl/formatter/formatter_format.c:105:13 in 
     array(6) {
       ["input"]=>
       float(1.0E+19)
--
========DONE========
FAIL Bug #76093 (NumberFormatter::format loses precision) [ext/intl/tests/bug76093.phpt] 

If I understand correctly, the UB is in the if branch of the following part of the diff:

@@ -79 +103,7 @@ PHP_FUNCTION( numfmt_format )
-			int64_t value = (Z_TYPE_P(number) == IS_DOUBLE)?(int64_t)Z_DVAL_P(number):Z_LVAL_P(number);
+			int64_t value;
+			if (Z_TYPE_P(number) == IS_DOUBLE) {
+				value = (int64_t)Z_DVAL_P(number);
+			} else {
+				convert_to_long(number);
+				value = Z_LVAL_P(number);
+			}

But according to that diff, the cast already existed (in a ternary instead of if/else). So is the error only showing up because this previously wasn’t hit by a test?

@lucaswerkmeister
Copy link
Author

Hm, I think this comment in the test is referring to the same thing, actually:

# Also, casting from double to int64 when the int64 range
# is exceeded results in an implementation-defined value.

Comment on lines 107 to 106
convert_to_long(number);
value = Z_LVAL_P(number);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cast non-numeric strings to 0, which is not what should happen.

Same as the other cases using convert_to_long() or convert_to_double()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, what should happen instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or to put the question slightly differently: I’m not very familiar with the php-src codebase, and couldn’t even find documentation on what exactly convert_to_long() does – if there’s another paradigm for what the code should do with non-numeric strings, such as throwing an error, can you point me to another place in PHP that uses this paradigm, so I can see how it’s implemented there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the zval_try_get_long() function. Convert changes the zval in place and does a forced (int) cast.

This is a problem for non-numeric strings, and floats that have a fractional part.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched it to zval_try_get_long() (in a separate commit), but it seems to introduce a lot of deprecation warnings that are currently causing tests to fail. Should I add the deprecations to the expected test output, or should the deprecations not happen after all?

Also added a throw for cases where the INT32 type is used with numbers that exceed 32 bits of precision, but I’m not sure whether that’s what you had in mind or not. Happy to change this part.

convert_to_double() not converted, because I couldn’t find a zval_try_get_double() or equivalent function.

Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure adding a ZPP modifier for one (/two) case is totally worth it, as the logic inside of it doesn't even really utilize the benefit of it as values may still be cast around within the function body.

In any case, Stringable objects need to be handle properly.

Comment on lines 107 to 106
convert_to_long(number);
value = Z_LVAL_P(number);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the zval_try_get_long() function. Convert changes the zval in place and does a forced (int) cast.

This is a problem for non-numeric strings, and floats that have a fractional part.

}
INTL_METHOD_CHECK_STATUS( nfo, "Number formatting failed" );
break;

case FORMAT_TYPE_CURRENCY:
if (getThis()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside Nit: this could now be if (object)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, but it caused several test failures (seemingly object was still null when I wouldn’t have expected it to be).

Comment on lines +136 to +145
if (!try_convert_to_string(number)) {
RETURN_THROWS();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified as at this point the value should be int|float|string and integers and floats always have string representations.

To improve precision handling in cases where the number doesn’t fit into
a 32-bit or 64-bit integer. A float that doesn’t fit into a long will
produce a deprecation warning; for NumberFormatter::TYPE_INT32, a number
that doesn’t fit into 32 bits will additionally produce a ValueError.
(This is inconsistent, but I’m not sure how to do it better.)
@lucaswerkmeister
Copy link
Author

Rebased and updated – reverted the argument parsing to zend_parse_method_parameters() style (with z instead of n, and manual zend_argument_type_error() for unexpected argument type), and used zval_try_get_long(). Stringable handling still TBD.

@lucaswerkmeister
Copy link
Author

lucaswerkmeister commented Nov 27, 2023

(I’m pretty sure I force-with-lease-pushed a new version that squashed the WIP into two commits again and also added Stringable support, but it’s not showing up in GitHub yet. Hopefully that’ll resolve itself. It resolved itself.)

Passing the argument to NumberFormat::format() as a number loses
precision if the value can not be represented precisely as a
double or long integer.  The icu library provides a "decimal number"
type that avoids the loss of prevision when the value is passed
as a string.  Add a new FORMAT_TYPE_DECIMAL to explicitly request
the argument be converted to a string and then passed to icu
that way.

Co-authored-by: Gina Peter Banyard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants