[libc++] codecvt<wchar_t, char, mbstate_t>::do_in returns ok but not partial when having incompletely parsed UTF-8 input #131107

Open
@debugee

Description

sample

debug config

launch.json

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "type": "lldb",
            "request": "launch",
            "name": "Launch",
            "program": "${workspaceFolder}/build/testcpp",
            "args": [],
            "cwd": "${workspaceFolder}",
            "initCommands": [
                "settings set target.process.thread.step-avoid-regexp \"\""
            ],
            "console": "integratedTerminal",
            "env": {
                "LANG":"en_US.UTF-8"
            }
        }
    ]
}

Please make sure the env block contains "LANG":"en_US.UTF-8", as shown above.

The bug is in locale.cpp, in
codecvt<wchar_t, char, mbstate_t>::do_in:

codecvt<wchar_t, char, mbstate_t>::result codecvt<wchar_t, char, mbstate_t>::do_in(

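This function has a bug when converting UTF-8 to wchar_t.

With the input 你123,

do_in is given an incomplete UTF-8 sequence whose first byte is 0xe4 (the lead byte of 你).
It returns ok, but the correct result is partial.

That wrong return value is what causes the failure.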
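
To make the expected behavior concrete, here is a minimal standalone sketch of a UTF-8 decoder that reports partial when the input ends in the middle of a multi-byte sequence. It is not libc++'s implementation: the names Result, utf8_length and decode_chunk are hypothetical, the output is a std::u32string rather than wchar_t, and overlong or surrogate encodings are not rejected. It only illustrates the return value and the from_next position one would expect for the 0xe4 case.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not libc++'s code.
enum class Result { ok, partial, error };

// Expected length of a UTF-8 sequence, judged from its lead byte.
// Returns 0 for a byte that cannot start a sequence.
inline std::size_t utf8_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

// Decodes as many complete sequences as possible from [from, from_end).
// from_next is set to the first byte that was NOT consumed.
inline Result decode_chunk(const char* from, const char* from_end,
                           const char*& from_next, std::u32string& out) {
    from_next = from;
    while (from_next != from_end) {
        unsigned char lead = static_cast<unsigned char>(*from_next);
        std::size_t len = utf8_length(lead);
        if (len == 0) return Result::error;
        // Truncated sequence: leave it unconsumed and report partial,
        // so the caller can retry once more bytes have arrived.
        if (static_cast<std::size_t>(from_end - from_next) < len)
            return Result::partial;
        // Payload bits of the lead byte: 7, 5, 4 or 3 bits.
        char32_t cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
        for (std::size_t i = 1; i < len; ++i) {
            unsigned char c = static_cast<unsigned char>(from_next[i]);
            if ((c & 0xC0) != 0x80) return Result::error;
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        from_next += len;
    }
    return Result::ok;
}
```

In this sketch, calling decode_chunk on only the first byte of 你123 (0xe4) gives Result::partial with from_next still pointing at that byte and nothing appended to out. Calling it on all six bytes gives Result::ok and four code points.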

Labels: c++, libc++ (C++ Standard Library; not GNU libstdc++, not libc++abi), locale (issues related to localization)