Commit a8c8fb2

nielsdos authored and Girgias committed
Fix incorrect check in cs_8559_5 in map_from_unicode()
The condition `code == 0x0450 || code == 0x045D` is always false because of an incorrect range check on code. According to the BMP coverage in the encoding spec for ISO-8859-5 (https://encoding.spec.whatwg.org/iso-8859-5-bmp.html) the range of valid characters is 0x0401 - 0x045F (except for 0x040D, 0x0450, 0x045D). The current check has an upper bound of 0x044F instead of 0x045F. Fix this by changing the upper bound.

Closes GH-10399

Signed-off-by: George Peter Banyard <[email protected]>
1 parent b7a158a commit a8c8fb2

3 files changed: 16 additions and 15 deletions

NEWS

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ PHP NEWS
 - Standard:
   . Fixed bug GH-10292 (Made the default value of the first param of srand() and
     mt_srand() unknown). (kocsismate)
+  . Fix incorrect check in cs_8559_5 in map_from_unicode(). (nielsdos)
 
 02 Feb 2023, PHP 8.1.15
 

ext/standard/html.c

Lines changed: 1 addition & 1 deletion
@@ -477,7 +477,7 @@ static inline int map_from_unicode(unsigned code, enum entity_charset charset, u
 			*res = 0xF0; /* numero sign */
 		} else if (code == 0xA7) {
 			*res = 0xFD; /* section sign */
-		} else if (code >= 0x0401 && code <= 0x044F) {
+		} else if (code >= 0x0401 && code <= 0x045F) {
 			if (code == 0x040D || code == 0x0450 || code == 0x045D)
 				return FAILURE;
 			*res = code - 0x360;
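
To see the effect of the changed bound in isolation, here is a minimal standalone sketch of only the branches visible in this hunk. The function name `sketch_map_8859_5`, the `sketch_status` enum, and the `upper_bound` parameter are hypothetical and exist only for illustration; the real `map_from_unicode()` hard-codes the limit, handles other charsets, and has branches not shown in the diff.

```c
/* Hypothetical stand-ins for the SUCCESS/FAILURE codes used in html.c. */
enum sketch_status { SKETCH_SUCCESS = 0, SKETCH_FAILURE = -1 };

/*
 * Sketch of the ISO-8859-5 branches shown in the hunk above.
 * `upper_bound` is an illustration-only parameter so that the old limit
 * (0x044F) and the new limit (0x045F) can be compared side by side.
 * Branches of the real function that are outside the hunk are omitted,
 * so every other code point is simply rejected here.
 */
int sketch_map_8859_5(unsigned code, unsigned upper_bound, unsigned *res)
{
	if (code == 0x2116) {
		*res = 0xF0; /* numero sign */
	} else if (code == 0xA7) {
		*res = 0xFD; /* section sign */
	} else if (code >= 0x0401 && code <= upper_bound) {
		/* The three gaps in the block have no ISO-8859-5 byte. */
		if (code == 0x040D || code == 0x0450 || code == 0x045D)
			return SKETCH_FAILURE;
		*res = code - 0x360; /* e.g. 0x0451 - 0x360 = 0xF1 */
	} else {
		return SKETCH_FAILURE;
	}
	return SKETCH_SUCCESS;
}
```

With `upper_bound` set to 0x044F, a call for 0x0451 never enters the range branch and fails, and the comparisons against 0x0450 and 0x045D can never match. With 0x045F, 0x0451 maps to 0xF1 while 0x0450 and 0x045D are still rejected.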

ext/standard/tests/strings/html_entity_decode_iso8859-5.phpt

Lines changed: 14 additions & 14 deletions
@@ -358,47 +358,47 @@ CYRILLIC SMALL LETTER YA: &#x44F; => ef
 NUMERO SIGN: &#x2116; => f0
 &#xF0; => &#xF0;
 
-CYRILLIC SMALL LETTER IO: &#x451; => 2623783435313b
+CYRILLIC SMALL LETTER IO: &#x451; => f1
 &#xF1; => &#xF1;
 
-CYRILLIC SMALL LETTER DJE: &#x452; => 2623783435323b
+CYRILLIC SMALL LETTER DJE: &#x452; => f2
 &#xF2; => &#xF2;
 
-CYRILLIC SMALL LETTER GJE: &#x453; => 2623783435333b
+CYRILLIC SMALL LETTER GJE: &#x453; => f3
 &#xF3; => &#xF3;
 
-CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => 2623783435343b
+CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => f4
 &#xF4; => &#xF4;
 
-CYRILLIC SMALL LETTER DZE: &#x455; => 2623783435353b
+CYRILLIC SMALL LETTER DZE: &#x455; => f5
 &#xF5; => &#xF5;
 
-CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => 2623783435363b
+CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => f6
 &#xF6; => &#xF6;
 
-CYRILLIC SMALL LETTER YI: &#x457; => 2623783435373b
+CYRILLIC SMALL LETTER YI: &#x457; => f7
 &#xF7; => &#xF7;
 
-CYRILLIC SMALL LETTER JE: &#x458; => 2623783435383b
+CYRILLIC SMALL LETTER JE: &#x458; => f8
 &#xF8; => &#xF8;
 
-CYRILLIC SMALL LETTER LJE: &#x459; => 2623783435393b
+CYRILLIC SMALL LETTER LJE: &#x459; => f9
 &#xF9; => &#xF9;
 
-CYRILLIC SMALL LETTER NJE: &#x45A; => 2623783435413b
+CYRILLIC SMALL LETTER NJE: &#x45A; => fa
 &#xFA; => &#xFA;
 
-CYRILLIC SMALL LETTER TSHE: &#x45B; => 2623783435423b
+CYRILLIC SMALL LETTER TSHE: &#x45B; => fb
 &#xFB; => &#xFB;
 
-CYRILLIC SMALL LETTER KJE: &#x45C; => 2623783435433b
+CYRILLIC SMALL LETTER KJE: &#x45C; => fc
 &#xFC; => &#xFC;
 
 SECTION SIGN: &#xA7; => fd
 &#xFD; => &#xFD;
 
-CYRILLIC SMALL LETTER SHORT U: &#x45E; => 2623783435453b
+CYRILLIC SMALL LETTER SHORT U: &#x45E; => fe
 &#xFE; => &#xFE;
 
-CYRILLIC SMALL LETTER DZHE: &#x45F; => 2623783435463b
+CYRILLIC SMALL LETTER DZHE: &#x45F; => ff
 &#xFF; => &#xFF;
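
The old expected values such as `2623783435313b` look opaque, but they appear to be a hex dump of the entity text itself, which suggests the entity was left undecoded before the fix. The sketch below illustrates that reading; `sketch_bin2hex` is a hypothetical helper written for this example (similar in spirit to PHP's `bin2hex()`), not code from the test file.

```c
#include <string.h>

/*
 * Hypothetical helper: write the lowercase hex form of the NUL-terminated
 * string `in` into `out`. The caller must provide at least
 * 2 * strlen(in) + 1 bytes in `out`.
 */
void sketch_bin2hex(const char *in, char *out)
{
	static const char digits[] = "0123456789abcdef";
	size_t len = strlen(in);

	for (size_t i = 0; i < len; i++) {
		unsigned char byte = (unsigned char)in[i];
		out[2 * i]     = digits[byte >> 4];   /* high nibble */
		out[2 * i + 1] = digits[byte & 0x0F]; /* low nibble */
	}
	out[2 * len] = '\0';
}
```

Passing the seven characters `&#x451;` yields `2623783435313b`, matching the removed line, while the single byte 0xF1 yields `f1`, matching the added line.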
