Description
The following code:
<?php
var_dump(filter_var('https://sub_domain.example.com', FILTER_VALIDATE_URL));
var_dump(filter_var('https://ex_ample.com', FILTER_VALIDATE_URL));
Resulted in this output:
bool(false)
bool(false)
But I expected this output instead:
string(30) "https://sub_domain.example.com"
string(20) "https://ex_ample.com"
The underscore is a valid character according to RFC 2396, section 2.3:
Unreserved Characters
Data characters that are allowed in a URI but do not have a reserved
purpose are called unreserved. These include upper and lower case
letters, decimal digits, and a limited set of punctuation marks and
symbols.
unreserved = alphanum | mark
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
Unreserved characters can be escaped without changing the semantics
of the URI, but this should not be done unless the URI is being used
in a context that does not allow the unescaped character to appear.
However, FILTER_VALIDATE_URL fails if an underscore is present in the domain or subdomain portion of the URL.
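To narrow this down, here is a minimal sketch that reports, for each URL, whether its host contains an underscore and whether the filter accepts it. The helper name host_has_underscore and the third control URL are made up for illustration and are not part of the original report.

```php
<?php
// Hypothetical helper for illustration: true if the host part of $url
// contains an underscore.
function host_has_underscore(string $url): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && str_contains($host, '_');
}

$urls = [
    'https://sub_domain.example.com',
    'https://ex_ample.com',
    'https://subdomain.example.com', // control: no underscore (my addition)
];

foreach ($urls as $url) {
    // filter_var() returns the URL string on success and false on failure.
    $accepted = filter_var($url, FILTER_VALIDATE_URL) !== false;
    printf(
        "%s underscore_in_host=%s accepted=%s\n",
        $url,
        var_export(host_has_underscore($url), true),
        var_export($accepted, true)
    );
}
```

If the rejection follows the underscore_in_host column, the problem is in the host check specifically and not in the URL as a whole.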
RFC 2396 has been superseded by RFC 3986, but the underscore is still among the unreserved characters:
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
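As a quick illustration of that rule, the sketch below checks a single character against the RFC 3986 unreserved set. The function is_unreserved is a hypothetical helper written for this report, not something PHP provides, and it only covers the unreserved production, not the full host grammar.

```php
<?php
// Hypothetical helper: true if $char is exactly one character from the
// RFC 3986 "unreserved" set (ALPHA / DIGIT / "-" / "." / "_" / "~").
function is_unreserved(string $char): bool
{
    return preg_match('/\A[A-Za-z0-9\-._~]\z/', $char) === 1;
}

var_dump(is_unreserved('_')); // bool(true)
var_dump(is_unreserved('/')); // bool(false)
```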
PHP Version
PHP 8.4.4
Operating System
No response