Check UTF-8 validity for all constant strings on compile time #10853

Open
@mvorisek

Description

Currently, the UTF-8 validity of a string is checked on demand, and the result is cached on the string only if the string is not interned.

However:

  • when a string is interned, the flag cannot be cached at runtime (at least not in thread-safe (TS) builds)
  • the flag is not stored back to the source script cache/opcache, i.e. the validity is checked at least once on every request
  • when a string is created from an unvalidated source before the check runs, the validation result cannot be cached
  • currently only the "valid UTF-8" flag is cached; when a string is "invalid UTF-8", nothing is cached at all

This is a feature request to:
a) check UTF-8 validity at compile time for every constant string
b) store a flag recording whether UTF-8 validity has been checked
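
To illustrate a) and b), here is a minimal standalone sketch in C. It is not php-src code: the `const_string` struct, the `utf8_state` enum, and the function names are hypothetical and stand in for whatever flag layout the engine would actually use. The point is only that a tri-state flag (unchecked / valid / invalid) lets both outcomes be cached, and that filling it in at compile time means no request has to rescan the bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tri-state cache; NOT the actual php-src flag layout. */
typedef enum {
    UTF8_UNCHECKED = 0, /* validity has never been computed */
    UTF8_VALID,         /* checked, well-formed UTF-8 */
    UTF8_INVALID        /* checked, not well-formed UTF-8 */
} utf8_state;

/* Hypothetical stand-in for a compiled constant string. */
typedef struct {
    const char *val;
    size_t      len;
    utf8_state  utf8;
} const_string;

/* Plain (unoptimized) validator: rejects stray or missing continuation
 * bytes, overlong encodings, surrogates, and values above U+10FFFF. */
static bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;       /* number of continuation bytes expected */
        uint32_t cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }
        if ((c & 0xE0) == 0xC0)      { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return false;           /* stray continuation or invalid lead */

        if (len - i - 1 < need) return false; /* truncated sequence */
        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += need + 1;
    }
    return true;
}

/* a) + b): validate once at "compile time" and record either outcome. */
static const_string compile_const_string(const char *val, size_t len)
{
    const_string s = { val, len, UTF8_UNCHECKED };
    s.utf8 = utf8_is_valid((const unsigned char *)val, len)
           ? UTF8_VALID : UTF8_INVALID;
    return s;
}

/* Runtime query: scans only if the flag was never filled in. */
static bool const_string_is_valid_utf8(const_string *s)
{
    if (s->utf8 == UTF8_UNCHECKED) {
        s->utf8 = utf8_is_valid((const unsigned char *)s->val, s->len)
                ? UTF8_VALID : UTF8_INVALID;
    }
    return s->utf8 == UTF8_VALID;
}
```

With this shape, a string produced by `compile_const_string` answers `const_string_is_valid_utf8` from the stored flag alone, whether the answer is "valid" or "invalid". A string that bypassed the compile-time check still works, at the cost of one scan on first use.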

Thanks to #10436, the UTF-8 validity check is very fast, so the compile-time impact should be minimal.
