Fixed bug #74371 strip_tags altering attributes #3570
--FILE--
<?php

echo strip_tags('<img src="example.jpg" alt=":> :<">', '<img>');
Create some additional cases and try to break things.
<img alt=< />
<img alt='foo\' :<' />
etc...
Your second example is interesting: the entire tag gets removed (also in the current PHP version) because the in_q logic seems to get confused by the \'. From what I can find, escaping a quote with a backslash is not allowed in XHTML/HTML (https://stackoverflow.com/a/34880976), but the documentation says that broken attributes are not removed. What do you think: should we leave the current behaviour intact, or try to fix that here as well?
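To make the quoting question concrete, here is a minimal sketch of a quote-aware tag scanner. It is not the code in PHP's string.c; find_tag_end is a hypothetical helper, and the only idea borrowed from the discussion is an in_q-style variable that remembers which quote character is open. It follows the HTML rule that a backslash does not escape a quote.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PHP.
 * Given s[start] == '<', return the index of the '>' that closes the tag,
 * ignoring any '<' or '>' that sits inside a quoted attribute value.
 * Returns len if the tag is never closed. */
size_t find_tag_end(const char *s, size_t len, size_t start)
{
    char in_q = 0; /* 0 outside quotes, otherwise the opening quote character */

    for (size_t i = start + 1; i < len; i++) {
        char c = s[i];
        if (in_q) {
            /* HTML has no backslash escapes: only the same quote closes the value. */
            if (c == in_q) {
                in_q = 0;
            }
        } else if (c == '"' || c == '\'') {
            in_q = c;
        } else if (c == '>') {
            return i;
        }
    }
    return len;
}
```

With this sketch, the tag from the test file ends at its final '>', because the ':>' inside alt is quoted. For <img alt='foo\' :<' />, the quote after the backslash closes the value, the next quote opens a new one, and the scanner runs off the end of the input without finding a closing '>'. That is one way a scanner can end up treating the whole tag as broken.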
@@ -5123,6 +5123,7 @@ PHPAPI size_t php_strip_tags_ex(char *rbuf, size_t len, uint8_t *stateptr, const
break;
case '<':
Would it be crazy to encode < and > rather than merely allow them? XHTML and HTML4/5 agree that &lt; and &gt; are acceptable forms of < and > in attributes.
I don't think that would be the responsibility of strip_tags. So I think if the documentation promises that attributes won't be altered, the function should just allow them.
As strip_tags is commonly misused as a security mechanism, I think it is best to err on the side of caution here and encode < and >. This will also limit the collateral damage if the attribute handling is in some way incorrect.
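As an illustration of what encoding would involve, the sketch below rewrites '<' and '>' as entities into a separate output buffer. encode_angle_brackets is a hypothetical name, not a PHP function, and the caller-supplied buffer is an assumption made to keep the example small.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PHP.
 * Copies src into dst, replacing '<' with "&lt;" and '>' with "&gt;".
 * dst is NUL-terminated. Returns the number of bytes written (excluding
 * the NUL), or (size_t)-1 if dst (capacity cap) is too small. */
size_t encode_angle_brackets(const char *src, size_t len, char *dst, size_t cap)
{
    size_t out = 0;

    if (cap == 0) {
        return (size_t)-1;
    }
    for (size_t i = 0; i < len; i++) {
        const char *rep = NULL;
        if (src[i] == '<') {
            rep = "&lt;";
        } else if (src[i] == '>') {
            rep = "&gt;";
        }

        size_t n = rep ? 4 : 1;
        if (cap - out < n + 1) { /* keep one byte for the terminating NUL */
            return (size_t)-1;
        }
        if (rep) {
            memcpy(dst + out, rep, 4);
        } else {
            dst[out] = src[i];
        }
        out += n;
    }
    dst[out] = '\0';
    return out;
}
```

Each encoded character grows from one byte to four, so the output can be up to four times as long as the input.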
Hmm, before I go crazy on this: the return value of php_strip_tags_ex is the new length, and the string is not expected to grow. So I'm a bit uncertain here. One option is to change the return value to a new string, which would mean touching all locations where strip_tags is used (in sanitizing_filters.c, file.c and filters.c). Or do you see an alternative way to approach this?
Okay, I'm not sure if the direction I went is the smartest solution, but I figured I'd pick one route and see what you think of it. My alternative solution was to change the rbuf parameter to char**.
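For readers weighing the char** option, here is a minimal sketch of how a function can keep returning the new length while still letting the buffer grow. It is not the patch from this PR. encode_grow is a hypothetical name, and it uses plain malloc/realloc where PHP code would use its own allocator.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not php_strip_tags_ex.
 * *buf must point to heap memory obtained from malloc/realloc.
 * Replaces '<' and '>' with "&lt;" and "&gt;", reallocating *buf if needed.
 * Returns the new length, or (size_t)-1 on allocation failure
 * (in which case *buf is left untouched). */
size_t encode_grow(char **buf, size_t len)
{
    size_t extra = 0;

    /* Each replaced character grows from 1 byte to 4. */
    for (size_t i = 0; i < len; i++) {
        if ((*buf)[i] == '<' || (*buf)[i] == '>') {
            extra += 3;
        }
    }
    if (extra == 0) {
        return len;
    }

    size_t new_len = len + extra;
    char *p = realloc(*buf, new_len + 1);
    if (p == NULL) {
        return (size_t)-1;
    }
    *buf = p; /* the caller's pointer may have moved */

    /* Fill from the end so that unread input is never overwritten. */
    size_t r = len, w = new_len;
    p[w] = '\0';
    while (r > 0) {
        char c = p[--r];
        if (c == '<' || c == '>') {
            w -= 4;
            memcpy(p + w, c == '<' ? "&lt;" : "&gt;", 4);
        } else {
            p[--w] = c;
        }
    }
    return new_len;
}
```

The trade-off is the one raised above: every caller must own a reallocatable buffer and must reread the pointer after the call, which is why a signature change like this touches all call sites.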
Comment on behalf of petk at php.net: Labelling
@htesligte can you resolve merge conflicts please?
What's the status here?
Apparently, this PR is abandoned, so I'm closing. If you're still interested in this PR, please fix the merge conflicts and re-open. Thanks for your work! :)
So, my first attempt at contributing something back to PHP :-)
Original bug: https://bugs.php.net/bug.php?id=74371
I'm not really sure whether this fix is the best way, or whether the documentation should be improved instead. I agree with the bug reporter that the behaviour is hard to understand, so I'm very curious about your opinions!
I'm not the most expert GitHub user, so I'm a bit uncertain about which version to target. If I need to modify this PR, please let me know.