Description
TL;DR https://play.golang.org/p/LPkPTRF7fC
The above code looks quite plain and obvious, except it does something completely different than you'd expect (feel free to run it). The obvious thing that should happen is that it counts the number of bits set in the given string. The non-obvious thing that happens is that the mask is actually 0
, not 0x01
.
So, what happened there? The abuse in the above code is around the invisible Unicode characters that mark following text to be right-to-left or left-to-right. Since Go permits arbitrary Unicode characters to be present in string literals, it also allows me to have a string of the form "bla bla blaabc". Since we're dealing with valid Unicode sequences here, any modern editor/website will actually interpret those special characters, causing the content in between the two special marks to be reversed to the end of the line (alas still part of the string literal).
In my playground code this is abused by having the following source code:
str, mask := "Hello, World!<rtl>10x<ltr>", 0
Which will be displayed by all modern editors/websites as:
str, mask := "Hello, World!", 0x01
The security aspect of this issue is social engineering attacks. The "display" line of my sample code is obvious beyond doubt, so noone will ever inspect such a thing; however it managed to flip one bit in a mask (imagine doing this for file permission umasks). The issue is that such code could easily get past reviews and into a final product.
The tricky part is how to avoid these issues. My only meaningful suggestion would be to disallow these two special characters in string literals. This does break Go 1.0 compatibility guarantees, however I think it's worth it (I can't really figure out a meaningful use case for it). Using it in a single line string will screw up the display of the source code, so it doesn't make sense to do it, and using it in a multi line string is questionable at best. I think requiring users to explicitly use \x
notation for adding these characters it a good compromise to protect source code sanity.