
The behavior of mb_strcut in mbstring has been changed in PHP 8.1 #9535

Closed
@pakutoma

Description

When mb_strcut is used to cut an ISO-2022-JP encoded string, PHP 8.1 returns a different result than earlier versions for certain byte lengths.

The following code:

<?php
$input = 'あaいb';
$bytes_length = 10;
$encoding = "ISO-2022-JP";
$converted_str = mb_convert_encoding($input, $encoding, mb_internal_encoding());
$cut_str = mb_strcut($converted_str, 0, $bytes_length, $encoding);
$reconverted_str = mb_convert_encoding($cut_str, mb_internal_encoding(), $encoding);
var_dump($reconverted_str);

Resulted in this output:

string(5) "あa?"

But I expected this output instead:

string(4) "あa"

The behavior changed in PHP 8.1; PHP 8.0 and earlier return the expected output.
3v4l: https://3v4l.org/FaVWR

I believe the behavior up to PHP 8.0 is correct, because the output from PHP 8.1 and later contains a "?" that is not present in the input.
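
To see where the 10-byte limit lands in the encoded string, the following sketch dumps the ISO-2022-JP bytes in hex. It assumes the source file is saved as UTF-8 (so 'UTF-8' stands in for mb_internal_encoding()) and that the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext/mbstring is loaded.
$input = 'あaいb';
$converted = mb_convert_encoding($input, 'ISO-2022-JP', 'UTF-8');

// Hex dump and byte length of the full ISO-2022-JP sequence.
$hex = bin2hex($converted);
echo $hex, "\n";
echo strlen($converted), " bytes\n";

// Hex dump of what mb_strcut returns for a 10-byte limit.
// The result differs between PHP versions, so no expected output is given here.
$cut = mb_strcut($converted, 0, 10, 'ISO-2022-JP');
echo bin2hex($cut), "\n";
```

The full string should be 18 bytes. The first nine bytes encode あa (ESC $ B, two bytes for あ, ESC ( B, then a), so a 10-byte limit ends on the first byte of the escape sequence that precedes い. Comparing the last line of output across PHP versions shows which extra bytes PHP 8.1 keeps.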

PHP Version

PHP 8.1.10

Operating System

No response
