-
Notifications
You must be signed in to change notification settings - Fork 7.9k
fix: unsafe function casts #15013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: unsafe function casts #15013
Conversation
This is a weird one. The actual error message talks about array bounds: In file included from /usr/local/src/php/Zend/zend.h:32, from /usr/local/src/php/Zend/zend_API.c:22: In function 'zend_check_magic_method_implementation', inlined from 'zend_register_functions' at /usr/local/src/php/Zend/zend_API.c:3182:4: /usr/local/src/php/Zend/zend_API.c:2759:34: error: array subscript 'zend_function {aka const union _zend_function}[0]' is partly outside array bounds of 'unsigned char[160]' [-Werror=array-bounds] 2759 | if (ZSTR_VAL(fptr->common.function_name)[0] != '_' /usr/local/src/php/Zend/zend_string.h:68:26: note: in definition of macro 'ZSTR_VAL' 68 | #define ZSTR_VAL(zstr) (zstr)->val | ^~~~ /usr/local/src/php/Zend/zend_API.c: In function 'zend_register_functions': /usr/local/src/php/Zend/zend_API.c:3052:32: note: object of size 160 allocated by 'malloc' 3052 | reg_function = malloc(sizeof(zend_internal_function)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The core issue is that you cannot make a `zend_internal_function *`, and then cast it to a wider `zend_function *`. There are two ways to fix this: 1. Malloc `zend_function` instead and initialize the zend_internal_function member. 2. Cast `zend_internal_function *` to just the narrower common struct, `zend_function_common *`, and work with that. The first option wastes memory. The second option seems better and conceptually helpful to reason about. Note that if we needed to cast from `zend_function_common *` back out to `zend_internal_function *`, that would be safe.
How do you get this error message? (compiler/version/flags). |
I wasn't able to reproduce this either, with the same flags and same gcc version as Levi (we talked on R11). |
For context, from R11:
|
You need to build with # cat config.nice
#! /bin/sh
#
# Created by configure
CFLAGS='-O2 -g1' \
'./configure' \
'--enable-option-checking=fatal' \
'--enable-werror' \
'--prefix=/opt/php/extern-inline' \
'--with-config-file-path=/opt/php/extern-inline/etc/php.ini' \
'--with-config-file-scan-dir=/opt/php/extern-inline/etc/conf.d' \
'--enable-mbstring' \
'--enable-pcntl' \
'--enable-zend-test=static' \
'--disable-debug' \
'--disable-zts' \
'--with-openssl' \
'--with-pear' \
'--with-zip' \
'--with-zlib' \
"$@" |
I see. I can reproduce this with GCC-13.3.1 and |
After all, I think this is a false positive because of a problem in GCC. (CLANG works fine). diff --git a/Zend/zend_API.c b/Zend/zend_API.c
index 00a4df3de3f..daa98a89253 100644
--- a/Zend/zend_API.c
+++ b/Zend/zend_API.c
@@ -2751,8 +2751,8 @@ static void zend_check_magic_method_no_return_type(
ZEND_API void zend_check_magic_method_implementation(const zend_class_entry *ce, const zend_function *fptr, zend_string *lcname, int error_type) /* {{{ */
{
- if (ZSTR_VAL(fptr->common.function_name)[0] != '_'
- || ZSTR_VAL(fptr->common.function_name)[1] != '_') {
+ if (ZSTR_VAL(lcname)[0] != '_'
+ || ZSTR_VAL(lcname)[1] != '_') {
return;
} Do you see any problems? |
I think your patch is good, and probably even preferred for PHP 8.2 and 8.3. I think there is an issue here, though, and GCC is just not being very good about communicating it. So yeah, feel free to merge than in and unblock people building with The core issue here is that it's not valid to cast to a wider thing, unless it really is that thing. We malloc a Fixing this can cause quite a bit of code churn. Maybe it's best to do not anything further than your other PR until after PHP 8.4 is branched, to avoid destabilizing the PHP 8.4 release. |
There's some nuance here. It is defined behavior to cross-cast between two structs sharing a common initial sequence, and accessing this sequence iff the two structs are part of some union. See https://stackoverflow.com/a/57411728/1320374. Edit: And reading that paragraph again, I'm not even sure the interpretation of the StackOverflow answer is correct. Edit 2: Yeah, it doesn't look like it. See https://godbolt.org/z/jYhb79PTd. I suppose we could create some pseudo-union that contains a On a similar note, we commonly cast between other types, e.g. from Edit: Maybe it's §6.5 7?
The wording is really confusing. |
Yes, it's confusing. I could be wrong, however, note all the examples I can find use a narrowing of size. They narrow from some type to one of:
And in this case, it's neither. I also think some of the code around The more I think about it, though, any such changes around removing the But if this in fact bad behavior, the only other correct course of action is to always allocate |
I loaded up some typedefs and forward declares and such, and stuck this into Compiler Explorer. Source 1 defines a function The more I think about it, we don't need
|
Removing For example, with your suggestion: https://godbolt.org/z/j5oY5bPs6 typedef struct {
int x;
} zend_function_common;
typedef struct {
int x;
int y;
} zend_internal_function;
typedef struct {
int x;
int z;
} zend_op_array;
int test1(zend_function_common *f1, zend_op_array *f2) {
if (__builtin_expect(!(f1->x == 1), 0)) __builtin_unreachable();
f2->x = 2;
return f1->x;
} test1:
mov DWORD PTR [rsi], 2
mov eax, 1
ret As you can see, gcc assumes the two pointers cannot alias (i.e. point to the same storage) and hence assumes that https://godbolt.org/z/4dbajTrvv typedef struct {
int x;
} zend_function_common;
typedef struct {
zend_function_common common;
int y;
} zend_internal_function;
typedef struct {
zend_function_common common;
int z;
} zend_op_array;
int test1(zend_function_common *f1, zend_op_array *f2) {
if (__builtin_expect(!(f1->x == 1), 0)) __builtin_unreachable();
f2->common.x = 2;
return f1->x;
} test1:
mov DWORD PTR [rsi], 2
mov eax, DWORD PTR [rdi]
ret So, I do think we're relying on UB, but I'm not sure what the best way forward is. |
Right. Initially in this PR I was focused on fixing one specific mistake. I've been reading up C standard stuff since and it seems php-src has rampant abuse of unions and also pointer conversions. For instance, with unions you cannot set union common {
uint32_t quick_args;
struct {
uint8_t type;
uint8_t arg_flags[3];
};
}; This specific thing isn't hard to fix, just annoying. You use a memcpy, and have to be aware of endianness: union common {
uint8_t type;
// relying on no padding here.
uint8_t arg_flags[3];
// other items
};
zend_always_inline uint32_t quick_args(const union common *common) {
uint32_t quick_args;
memcpy(&quick_args, common, sizeof quick_args);
return quick_args;
} You can see it gets optimized away by the compiler (or at least usually is): https://godbolt.org/z/b7vKc4xvb. For pointer conversions, typedef struct _zend_internal_arg_info {
const char *name;
zend_type type;
const char *default_value;
} zend_internal_arg_info;
typedef struct _zend_arg_info {
zend_string *name;
zend_type type;
zend_string *default_value;
} zend_arg_info; It isn't legal to access one of these through a pointer to the other type, which is what's done with the |
For C, that's wrong. That's called type punning and is specifically supported by C, as long as access happens through the union and not through a pointer. Doing so through a pointer would violate strict aliasing rules. See https://stackoverflow.com/a/25672839/1320374. |
You may be right on type punning. One thing to note is the headers/macros and such also get used by C++, which doesn't have type punning like this. However, I'm not particularly worried about this because it's not really hard to fix (as shown with memcpy). I'm much more concerned about the functions and arginfos. |
This is a weird one. The actual error message talks about array bounds:
And the error message seems to me to be pretty bogus, as nothing actually seems to go out-of-bounds.
But there is an issue here regardless. You cannot malloc a
zend_internal_function *
(160 bytes on this platform), and then cast it to a widerzend_function *
(256 bytes on this platform). There are two ways to fix this:zend_function
instead and initialize the zend_internal_function member.zend_internal_function *
to just the narrower common struct,zend_function_common *
, and work with that.The first option wastes memory. The second option seems better and conceptually helpful to reason about. Note that if we needed to cast from
zend_function_common *
back out tozend_internal_function *
, that would be safe.To be really safe, we ought to make
zend_internal_function
andzend_op_array
have azend_function_common
as their first member, to ensure it's always in correct size + alignment. But since that's already not done today, and would involve a lot of code churn, I'm avoiding it for now.