Skip to content

fix: unsafe function casts #15013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

morrisonlevi
Copy link
Contributor

@morrisonlevi morrisonlevi commented Jul 18, 2024

This is a weird one. The actual error message talks about array bounds:

In file included from /usr/local/src/php/Zend/zend.h:32,
                 from /usr/local/src/php/Zend/zend_API.c:22:
In function 'zend_check_magic_method_implementation',
    inlined from 'zend_register_functions' at /usr/local/src/php/Zend/zend_API.c:3182:4:
/usr/local/src/php/Zend/zend_API.c:2759:34: error: array subscript 'zend_function {aka const union _zend_function}[0]' is partly outside array bounds of 'unsigned char[160]' [-Werror=array-bounds]
 2759 |         if (ZSTR_VAL(fptr->common.function_name)[0] != '_'
/usr/local/src/php/Zend/zend_string.h:68:26: note: in definition of macro 'ZSTR_VAL'
   68 | #define ZSTR_VAL(zstr)  (zstr)->val
      |                          ^~~~
/usr/local/src/php/Zend/zend_API.c: In function 'zend_register_functions':
/usr/local/src/php/Zend/zend_API.c:3052:32: note: object of size 160 allocated by 'malloc'
 3052 |                 reg_function = malloc(sizeof(zend_internal_function));
      |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

And the error message seems to me to be pretty bogus, as nothing actually seems to go out-of-bounds.

But there is an issue here regardless. You cannot malloc a zend_internal_function * (160 bytes on this platform), and then cast it to a wider zend_function * (256 bytes on this platform). There are two ways to fix this:

  1. Malloc zend_function instead and initialize the zend_internal_function member.
  2. Cast zend_internal_function * to just the narrower common struct, zend_function_common *, and work with that.

The first option wastes memory. The second option seems better and conceptually helpful to reason about. Note that if we needed to cast from zend_function_common * back out to zend_internal_function *, that would be safe.

To be really safe, we ought to make zend_internal_function and zend_op_array have a zend_function_common as their first member, to ensure it's always in correct size + alignment. But since that's already not done today, and would involve a lot of code churn, I'm avoiding it for now.

This is a weird one. The actual error message talks about array bounds:

In file included from /usr/local/src/php/Zend/zend.h:32,
                 from /usr/local/src/php/Zend/zend_API.c:22:
In function 'zend_check_magic_method_implementation',
    inlined from 'zend_register_functions' at /usr/local/src/php/Zend/zend_API.c:3182:4:
/usr/local/src/php/Zend/zend_API.c:2759:34: error: array subscript 'zend_function {aka const union _zend_function}[0]' is partly outside array bounds of 'unsigned char[160]' [-Werror=array-bounds]
 2759 |         if (ZSTR_VAL(fptr->common.function_name)[0] != '_'
/usr/local/src/php/Zend/zend_string.h:68:26: note: in definition of macro 'ZSTR_VAL'
   68 | #define ZSTR_VAL(zstr)  (zstr)->val
      |                          ^~~~
/usr/local/src/php/Zend/zend_API.c: In function 'zend_register_functions':
/usr/local/src/php/Zend/zend_API.c:3052:32: note: object of size 160 allocated by 'malloc'
 3052 |                 reg_function = malloc(sizeof(zend_internal_function));
      |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The core issue is that you cannot make a `zend_internal_function *`,
and then cast it to a wider `zend_function *`. There are two ways to
fix this:

 1. Malloc `zend_function` instead and initialize the
    zend_internal_function member.
 2. Cast `zend_internal_function *` to just the narrower common struct,
    `zend_function_common *`, and work with that.

The first option wastes memory. The second option seems better and
conceptually helpful to reason about. Note that if we needed to cast
from `zend_function_common *` back out to `zend_internal_function *`,
that would be safe.
@morrisonlevi morrisonlevi marked this pull request as ready for review July 18, 2024 18:14
@dstogov
Copy link
Member

dstogov commented Jul 22, 2024

How do you get this error message? (compiler/version/flags).

@iluuu1994
Copy link
Member

I wasn't able to reproduce this either, with the same flags and same gcc version as Levi (we talked on R11).

@iluuu1994
Copy link
Member

For context, from R11:

# cc --version
cc (GCC) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

#  cc -IZend/ -I/usr/local/src/php/Zend/ -I/usr/local/src/php/main -I/usr/local/src/php -I/usr/local/src/php/ext/date/lib -I/usr/include/libxml2 -I/usr/local/src/php/ext/mbstring/libmbfl -I/usr/local/src/php/ext/mbstring/libmbfl/mbfl -I/usr/local/src/php/TSRM -I/usr/local/src/php/Zend -D_GNU_SOURCE -fno-common -Wstrict-prototypes -Wformat-truncation -Wlogical-op -Wduplicated-cond -Wno-clobbered -Wall -Wextra -Wno-unused-parameter -Wno-sign-compare -O2 -g1 -ffp-contract=off -fvisibility=hidden -Wimplicit-fallthrough=1 -DZEND_SIGNALS -Werror -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1 -DSHADOW_STACK_SYSCALL=0 -c /usr/local/src/php/Zend/zend_API.c -MMD -MF Zend/zend_API.dep -MT Zend/zend_API.lo -o Zend/zend_API.o 
In file included from /usr/local/src/php/Zend/zend.h:32,
                 from /usr/local/src/php/Zend/zend_API.c:22:
In function 'zend_check_magic_method_implementation',
    inlined from 'zend_register_functions' at /usr/local/src/php/Zend/zend_API.c:3182:4:
/usr/local/src/php/Zend/zend_API.c:2759:34: error: array subscript 'zend_function {aka const union _zend_function}[0]' is partly outside array bounds of 'unsigned char[160]' [-Werror=array-bounds=]
 2759 |         if (ZSTR_VAL(fptr->common.function_name)[0] != '_'
/usr/local/src/php/Zend/zend_string.h:68:26: note: in definition of macro 'ZSTR_VAL'
   68 | #define ZSTR_VAL(zstr)  (zstr)->val
      |                          ^~~~
/usr/local/src/php/Zend/zend_API.c: In function 'zend_register_functions':
/usr/local/src/php/Zend/zend_API.c:3052:32: note: object of size 160 allocated by 'malloc'
 3052 |                 reg_function = malloc(sizeof(zend_internal_function));
      |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

@morrisonlevi
Copy link
Contributor Author

You need to build with --enable-werror to see this. Here's my complete config.nice. I originally hit this while working on something else and pivoted to master, so ignore the odd prefix/config dirs.

# cat config.nice 
#! /bin/sh
#
# Created by configure

CFLAGS='-O2 -g1' \
'./configure' \
'--enable-option-checking=fatal' \
'--enable-werror' \
'--prefix=/opt/php/extern-inline' \
'--with-config-file-path=/opt/php/extern-inline/etc/php.ini' \
'--with-config-file-scan-dir=/opt/php/extern-inline/etc/conf.d' \
'--enable-mbstring' \
'--enable-pcntl' \
'--enable-zend-test=static' \
'--disable-debug' \
'--disable-zts' \
'--with-openssl' \
'--with-pear' \
'--with-zip' \
'--with-zlib' \
"$@"

@dstogov
Copy link
Member

dstogov commented Jul 22, 2024

I see. I can reproduce this with GCC-13.3.1 and -O2 -g -Wall -Werror.
We usually add -Wall only for debug build with -O0 and miss this problem.
Let me think a bit...

@dstogov
Copy link
Member

dstogov commented Jul 23, 2024

After all, I think this is a false positive because of a problem in GCC. (CLANG works fine).
I wouldn't like to make significant source changes just for a workaround.
I propose a much simple solution.

diff --git a/Zend/zend_API.c b/Zend/zend_API.c
index 00a4df3de3f..daa98a89253 100644
--- a/Zend/zend_API.c
+++ b/Zend/zend_API.c
@@ -2751,8 +2751,8 @@ static void zend_check_magic_method_no_return_type(
 
 ZEND_API void zend_check_magic_method_implementation(const zend_class_entry *ce, const zend_function *fptr, zend_string *lcname, int error_type) /* {{{ */
 {
-	if (ZSTR_VAL(fptr->common.function_name)[0] != '_'
-	 || ZSTR_VAL(fptr->common.function_name)[1] != '_') {
+	if (ZSTR_VAL(lcname)[0] != '_'
+	 || ZSTR_VAL(lcname)[1] != '_') {
 		return;
 	}

Do you see any problems?
I'm going to target this to PHP-8.2.

@morrisonlevi
Copy link
Contributor Author

morrisonlevi commented Jul 23, 2024

I think your patch is good, and probably even preferred for PHP 8.2 and 8.3. I think there is an issue here, though, and GCC is just not being very good about communicating it. So yeah, feel free to merge than in and unblock people building with --enable-werror.

The core issue here is that it's not valid to cast to a wider thing, unless it really is that thing. We malloc a zend_internal_function (160 bytes) and cast the pointer to a zend_function (> 200 bytes). It's just not valid.

Fixing this can cause quite a bit of code churn. Maybe it's best to do not anything further than your other PR until after PHP 8.4 is branched, to avoid destabilizing the PHP 8.4 release.

@iluuu1994
Copy link
Member

iluuu1994 commented Jul 23, 2024

The core issue here is that it's not valid to cast to a wider thing, unless it really is that thing. We malloc a zend_internal_function (160 bytes) and cast the pointer to a zend_function (> 200 bytes). It's just not valid.

There's some nuance here. It is defined behavior to cross-cast between two structs sharing a common initial sequence, and accessing this sequence iff the two structs are part of some union. See https://stackoverflow.com/a/57411728/1320374. Edit: And reading that paragraph again, I'm not even sure the interpretation of the StackOverflow answer is correct. Edit 2: Yeah, it doesn't look like it. See https://godbolt.org/z/jYhb79PTd. test assumes the data pointed to by foo can't change.

I suppose we could create some pseudo-union that contains a zend_function, zend_internal_function and zend_op_array member to make the cast well defined. But I'm not sure if that actually helps with the error.

On a similar note, we commonly cast between other types, e.g. from zend_object * to some parent struct. I was unable to find a reference in the spec that defines this behavior. That said, we do also compile with -Wno-strict-aliasing Edit: First of all, this would only disable the warning (the correct flag is -fno-strict-aliasing) but it's also only set for some FFI files.

Edit: Maybe it's §6.5 7?

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:88)

  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

The wording is really confusing.

@morrisonlevi
Copy link
Contributor Author

morrisonlevi commented Jul 23, 2024

Yes, it's confusing. I could be wrong, however, note all the examples I can find use a narrowing of size. They narrow from some type to one of:

  1. The struct's members.
  2. A common initial sequence of the members of a union.

And in this case, it's neither.

I also think some of the code around zend_functions are suspect. For instance zend_fcc_addref operates on zend_function objects. It should operate on zend_op_arrays only. This code is incorrect but there are currently no issues in practice or else ASAN would have whined about that one.

The more I think about it, though, any such changes around removing the zend_function union or adding zend_function_common (or some other name for the same struct) to the initial sequence of both zend_internal_function and zend_op_array should be delayed to PHP 9.0. Changing this in a minor branch would be a headache for maintainers and extension developers. I mean, it'd be a headache either way, just more understandable for PHP 9.0.

But if this in fact bad behavior, the only other correct course of action is to always allocate zend_function instead of zend_op_array and zend_internal_function. It wastes bytes, but maybe that's more palatable.

@morrisonlevi
Copy link
Contributor Author

I loaded up some typedefs and forward declares and such, and stuck this into Compiler Explorer. Source 1 defines a function sketchy which mirrors what php-src currently does, and source 2 does not_sketchy and also_not_sketchy which shows two ways to avoid the issue. You can see that sketchy causes a compiler error (-Wall -Wextra -Werror).

The more I think about it, we don't need zend_function. I cannot think of a single case where the engine (or an extension) stores an internal function and replaces it with a user function, or vice versa. That leaves generic stuff:

  1. Stuff that doesn't care about the underlying storage can use zend_function_common * instead.
  2. Stuff that does care about the underlying storage needs to be updated. For instance, zend_call_known_fcc and zend_fcc_addref deal with trampolines and copy zend_function, but aren't these always zend_op_arrays?
  3. Temporaries like dummy frames. I suppose they ought to pick one of the representations, because a dummy must be one of them.

@iluuu1994
Copy link
Member

Removing zend_function would be a massive break. We should be very sure we're 1. solving a problem worth solving and 2. we're doing it in the best way. From my research, I believe that aliasing of zend_function_common and zend_internal_function/zend_op_array would only be allowed if zend_function_common is the first member of each, further exacerbating the BC break.

For example, with your suggestion:

https://godbolt.org/z/j5oY5bPs6

typedef struct {
    int x;
} zend_function_common;

typedef struct {
    int x;
    int y;
} zend_internal_function;

typedef struct {
    int x;
    int z;
} zend_op_array;

int test1(zend_function_common *f1, zend_op_array *f2) {
    if (__builtin_expect(!(f1->x == 1), 0)) __builtin_unreachable();
    f2->x = 2;
    return f1->x;
}
test1:
        mov     DWORD PTR [rsi], 2
        mov     eax, 1
        ret

As you can see, gcc assumes the two pointers cannot alias (i.e. point to the same storage) and hence assumes that f2->x = 2 cannot modify f1, according to strict aliasing rules. Two fix this, as mentioned, both structures need to share zend_function_common:

https://godbolt.org/z/4dbajTrvv

typedef struct {
    int x;
} zend_function_common;

typedef struct {
    zend_function_common common;
    int y;
} zend_internal_function;

typedef struct {
    zend_function_common common;
    int z;
} zend_op_array;

int test1(zend_function_common *f1, zend_op_array *f2) {
    if (__builtin_expect(!(f1->x == 1), 0)) __builtin_unreachable();
    f2->common.x = 2;
    return f1->x;
}
test1:
        mov     DWORD PTR [rsi], 2
        mov     eax, DWORD PTR [rdi]
        ret

So, I do think we're relying on UB, but I'm not sure what the best way forward is. zend_function is used ~830 in php-src, so any change would likely be massive.

@morrisonlevi
Copy link
Contributor Author

morrisonlevi commented Jul 30, 2024

Right. Initially in this PR I was focused on fixing one specific mistake. I've been reading up C standard stuff since and it seems php-src has rampant abuse of unions and also pointer conversions.

For instance, with unions you cannot set type/arg_flags and then access them through quick_args:

union common {
    uint32_t quick_args;
    struct {
        uint8_t type;
        uint8_t arg_flags[3];
    };
};

This specific thing isn't hard to fix, just annoying. You use a memcpy, and have to be aware of endianness:

union common {
    uint8_t type;
    // relying on no padding here.
    uint8_t arg_flags[3];

    // other items
};

zend_always_inline uint32_t quick_args(const union common *common) {
    uint32_t quick_args;
    memcpy(&quick_args, common, sizeof quick_args);
    return quick_args;
}

You can see it gets optimized away by the compiler (or at least usually is): https://godbolt.org/z/b7vKc4xvb.

For pointer conversions, zend_arg_info and zend_internal_arg_info are not compatible:

typedef struct _zend_internal_arg_info {
    const char *name;
    zend_type type;
    const char *default_value;
} zend_internal_arg_info;

typedef struct _zend_arg_info {
    zend_string *name;
    zend_type type;
    zend_string *default_value;
} zend_arg_info;

It isn't legal to access one of these through a pointer to the other type, which is what's done with the common part of zend_function. It's really easy to run afoul of standards violations here, because there isn't good tooling to strictly enforce/warn about it or even detect it. It could be solved with unions (properly, obviously there are footguns there too) or with properly re-interpreting bytes (with memcpy, not pointer casts).

@iluuu1994
Copy link
Member

For instance, with unions you cannot set type/arg_flags and then access them through quick_args:

For C, that's wrong. That's called type punning and is specifically supported by C, as long as access happens through the union and not through a pointer. Doing so through a pointer would violate strict aliasing rules. See https://stackoverflow.com/a/25672839/1320374.

@morrisonlevi
Copy link
Contributor Author

You may be right on type punning. One thing to note is the headers/macros and such also get used by C++, which doesn't have type punning like this. However, I'm not particularly worried about this because it's not really hard to fix (as shown with memcpy).

I'm much more concerned about the functions and arginfos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants