Fixed bug #74371 strip_tags altering attributes #3570

Closed
wants to merge 3 commits into from

htesligte
So, my first attempt at contributing something back to PHP :-)

Original bug: https://bugs.php.net/bug.php?id=74371

I'm not really sure whether this fix is the best way, or if the documentation should be improved. I would agree with the bug reporter that the behaviour is hard to understand. So I'm very curious to hear your opinions!

I'm not the most expert GitHub user, so I'm a bit uncertain about which version branch to submit this to. If I need to modify this PR, please let me know.

--FILE--
<?php

echo strip_tags('<img src="example.jpg" alt=":> :<">', '<img>');
Member

@sgolemon sgolemon Oct 2, 2018

Create some additional cases and try to break things.
<img alt=< />
<img alt='foo\' :<' />
etc...

Author

Your second example is interesting: the entire tag gets removed (also in the current PHP version) because the in_q logic seems to get confused by the \'. From what I can find, escaping a quote with a backslash is not allowed in XHTML/HTML (https://stackoverflow.com/a/34880976), but the documentation says that broken attributes are not removed. What do you think: should we leave the current behaviour intact, or try to fix that here as well?
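To make the confusion concrete, here is a minimal sketch of quote tracking that follows HTML rules, where a backslash does not escape a quote. It is not the code in `php_strip_tags_ex`; `find_tag_end` is a hypothetical helper written only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, NOT PHP's implementation: returns the index of the
 * '>' that closes the tag starting at s[0] == '<', or -1 if none is found.
 * Quotes follow HTML rules: a backslash does not escape a quote. */
static long find_tag_end(const char *s)
{
    char in_q = 0;                  /* 0, or the quote character currently open */

    for (size_t i = 1; s[i] != '\0'; i++) {
        char c = s[i];
        if (c == '"' || c == '\'') {
            if (in_q == 0) {
                in_q = c;           /* opening quote */
            } else if (in_q == c) {
                in_q = 0;           /* matching closing quote */
            }
        } else if (c == '>' && in_q == 0) {
            return (long)i;         /* an unquoted '>' ends the tag */
        }
    }
    return -1;                      /* input ended before the tag closed */
}
```

Under these rules, in `<img alt='foo\' :<' />` the quote opened before `foo` closes at the `'` right after the backslash, the later `'` opens a second quote that never closes, and the final `>` counts as quoted, so no tag end is found.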

@@ -5123,6 +5123,7 @@ PHPAPI size_t php_strip_tags_ex(char *rbuf, size_t len, uint8_t *stateptr, const
break;
case '<':
Member

Would it be crazy to encode < and > rather than merely allow them? XHTML and HTML4/5 agree that &lt; &gt; are acceptable forms of <> in attributes.
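As a rough sketch of what "encode rather than allow" could mean, the helper below copies a byte range and writes the entity form of each angle bracket. The name `encode_angle_brackets` and its signature are assumptions made for this example; nothing like it is confirmed to exist in the PR.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `len` bytes from `in` to `out`, writing
 * "&lt;" for '<' and "&gt;" for '>'. Returns the number of bytes written,
 * or (size_t)-1 if `out_cap` is too small. Output is not NUL-terminated. */
static size_t encode_angle_brackets(const char *in, size_t len,
                                    char *out, size_t out_cap)
{
    size_t w = 0;                   /* bytes written so far, always <= out_cap */

    for (size_t i = 0; i < len; i++) {
        const char *rep = NULL;
        if (in[i] == '<') {
            rep = "&lt;";
        } else if (in[i] == '>') {
            rep = "&gt;";
        }

        if (rep != NULL) {
            if (out_cap - w < 4) return (size_t)-1;
            memcpy(out + w, rep, 4);    /* each entity is 4 bytes long */
            w += 4;
        } else {
            if (out_cap - w < 1) return (size_t)-1;
            out[w++] = in[i];
        }
    }
    return w;
}
```

The output needs its own buffer because every encoded character takes four bytes instead of one.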

Author

I don't think that would be the responsibility of strip_tags. So I think if the documentation promises that attributes won't be altered, the function should just allow them.

Member

As strip_tags is commonly misused as a security mechanism, I think it is best to err on the side of caution here and encode < and >. This will also limit the collateral damage if the attribute handling is in some way incorrect.

Author

Hmm, before I go crazy on this: the return value of php_strip_tags_ex is the new length, and the string is not expected to grow. So I'm a bit uncertain here. One option is to change the return value to a new string, which would mean touching all locations where strip_tags is used (in sanitizing_filters.c, file.c and filters.c). Or do you see an alternative way to approach this?
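The size problem can be made concrete with a small sketch. `encoded_len` is a hypothetical helper, not part of PHP; it only counts how many bytes the result would need if every `<` and `>` in a buffer were written as an entity.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes needed if every '<' and '>' in
 * buf[0..len) were written as "&lt;" / "&gt;" (4 bytes instead of 1). */
static size_t encoded_len(const char *buf, size_t len)
{
    size_t extra = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '<' || buf[i] == '>') {
            extra += 3;             /* one byte becomes four */
        }
    }
    return len + extra;
}
```

For the five-byte attribute value `:> :<` from the test, this gives 11, so a function that rewrites the buffer in place and returns only a length cannot hold the encoded result without some way to grow the buffer.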

Author

Okay, I'm not sure if the direction I went is the smartest solution, but I just figured let's go a certain route and see what you think of this. My alternative solution was to modify the rbuf parameter to char**.
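For comparison, the `char **` alternative could look roughly like the sketch below. It is an illustration only: it uses plain `realloc` rather than PHP's allocator, and `encode_in_place_grow` is a hypothetical name, not a function from the PR or from PHP.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the `char **` idea, using realloc instead of PHP's
 * allocator. *buf must point to a heap buffer holding `len` bytes.
 * Encodes '<' and '>' as entities, growing *buf when needed.
 * Returns the new length, or (size_t)-1 if realloc fails, in which case
 * *buf is left untouched. */
static size_t encode_in_place_grow(char **buf, size_t len)
{
    size_t extra = 0;

    for (size_t i = 0; i < len; i++) {
        if ((*buf)[i] == '<' || (*buf)[i] == '>') extra += 3;
    }
    if (extra == 0) return len;     /* nothing to encode, no reallocation */

    char *grown = realloc(*buf, len + extra);
    if (grown == NULL) return (size_t)-1;
    *buf = grown;                   /* caller's pointer may have moved */

    /* Fill from the end so unread input bytes are never overwritten. */
    size_t r = len, w = len + extra;
    while (r > 0) {
        char c = grown[--r];
        if (c == '<' || c == '>') {
            w -= 4;
            memcpy(grown + w, c == '<' ? "&lt;" : "&gt;", 4);
        } else {
            grown[--w] = c;
        }
    }
    return len + extra;
}
```

The trade-off is that every caller has to pass a pointer it owns and accept that the buffer may move, which is the same kind of call-site change that returning a new string would require.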

@php-pulls

Comment on behalf of petk at php.net:

Labelling

@php-pulls php-pulls added the Bug label Oct 8, 2018
@krakjoe
Member

krakjoe commented Oct 3, 2019

@htesligte can you resolve the merge conflicts, please?

@cmb69
Member

cmb69 commented Sep 21, 2021

What's the status here?

@cmb69
Member

cmb69 commented Dec 15, 2021

Apparently, this PR is abandoned, so I'm closing. If you're still interested in this PR, please fix the merge conflicts and re-open.

Thanks for your work! :)

@cmb69 cmb69 closed this Dec 15, 2021