Skip to content

MessageFormatter parse error improvements #6306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

pilif
Copy link
Contributor

@pilif pilif commented Oct 9, 2020

Looking into MessageFormatter and the underlying ICU pattern documentation, I was blown away by the functionality offered by the patterns that MessageFormatter understands. This is incredibly useful and I was totally going to look into making use of this in our main product which already relies on intl for number and date formatting, but uses a home-grown solution for message localization.

However, while playing around with this, I came across a usability-issue around MessageFormatter::formatMessage() and especially MessageFormatter::__construct():

As the formatting patterns can be pretty complex, it's likely that they contain parsing errors (especially in my case as I was just getting into this), so it would be helpful to get to details of such parsing errors.

Unfortunately, MessageFormatter::formatMessage() via intl_get_error_message() would just report a generic U_PATTERN_SYNTAX_ERROR and, even worse, MessageFormatter:__construct() would turn any possible internal error into a generic Constructor failed error message.

It doesn't have to be that way though: ICU is capable of reporting additional information about a parser error and indeed, the intl extension is already making use of this in other places.

So this PR contains two commits that

  1. fix the problem for MessageFormatter::formatMessage() and intl_get_error_message() to report the additional parsing error message (see updated tests)
  2. fix MessageFormatter::__construct() to report the actual error message rather than a generic "Constructor failed"

@pilif
Copy link
Contributor Author

pilif commented Oct 9, 2020

oh crap - you've just cut the PHP-8 branch.

If you think this could go into PHP 8.0, I'll rebase and re-target the PR

@Girgias
Copy link
Member

Girgias commented Oct 9, 2020

Test failure is unrelated and has been marked as XFAIL on master.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and I think it should go into PHP 7.4 (unless it's conflict-y). Doesn't change any behavior, just improves diagnostics.

smart_str_free( &parse_error_str );
intl_error_set_code( NULL, INTL_DATA_ERROR_CODE( mfo ) );

if( msg != NULL )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This null check is unnecessary, spprintf always produces a non-NULL string.

Copy link
Contributor Author

@pilif pilif Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a fixup (and rebase on top of 7.4), but FYI, this code is somewhat "heavily inspired" by

parse_error_str = intl_parse_error_to_string( &parse_error );
spprintf( &msg, 0, "transliterator_create_from_rules: unable to "
"create ICU transliterator from rules (%s)", parse_error_str.s? ZSTR_VAL(parse_error_str.s) : "" );
smart_str_free( &parse_error_str );
if( msg != NULL )
{
intl_errors_set_custom_msg( INTL_DATA_ERROR_P( to ), msg, 1 );
efree( msg );
}

I'll make a followup PR to get rid of the null check there too 😝

smart_str_free( &parse_error_str );
intl_error_set_code( NULL, INTL_DATA_ERROR_CODE( mfo ) );

if( msg != NULL )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this check is not needed.

pilif added 2 commits October 12, 2020 14:45
the message patterns can be pretty complex, so reporting a generic
U_PARSE_ERROR without any additional information makes it needlessly
hard to fix erroneous patterns.

This commit makes use of the additional UParseError* parameter to
umsg_open to retrieve more details about the parse error to report that
to the user via intl_get_error_message()
All possible failures when calling IntlMessage::__construct() would be
masked away with a generic "Constructor failed" message. This would
include invalid patterns.

This commit makes sure that the underlying error that caused the
constructor failure is reported as part of the IntlException error
message.

Additionally, similar to the previous commit, this also uses this
improved error reporting facility to also report details on parsing
errors while trying to parse the message pattern
@pilif pilif force-pushed the intl-message-parse-error-improvements branch from 2d792ce to 6aac147 Compare October 12, 2020 12:49
@pilif
Copy link
Contributor Author

pilif commented Oct 12, 2020

there will be a few conflicts on PHP-7.4 (mostly due to format string changes in tests). I'll make a second PR targeting 7.4 - then you can pick which one you'd be willing to merge.

@pilif
Copy link
Contributor Author

pilif commented Oct 12, 2020

rebased on top of current master (which means that the false-positive failure due to the timelib changes now went away) and implemented the suggestion by @nikic.

I have also made a secondary PR (#6325) on top of PHP-7.4 if you prefer to merge this into 7.4

@nikic
Copy link
Member

nikic commented Oct 12, 2020

Closing this as I merged #6325.

@nikic nikic closed this Oct 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants