Skip to content

ZEND_INIT_FCALL is only produced when function exists at compile time #7728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 9, 2021

Conversation

Girgias
Copy link
Member

@Girgias Girgias commented Dec 6, 2021

Was a lot of staring and trying to come up with a test case for the undefined function case here while working on function autoloading, but I'm fairly certain this cannot ever happen from me staring at zend_compile.c

@cmb69
Copy link
Member

cmb69 commented Dec 6, 2021

Hmm, some extensions may not like this. @TysonAndre may have more to say about that.

@Girgias
Copy link
Member Author

Girgias commented Dec 6, 2021

I don't know if this is possible to change in extension land, but a possible fix if possible, is to change the compiler globals by adding the ZEND_COMPILE_IGNORE_INTERNAL_FUNCTIONS and ZEND_COMPILE_IGNORE_USER_FUNCTIONS flags to it, which would generate ZEND_INIT_FCALL_BY_NAME opcodes instead of ZEND_INIT_FCALL ones.

@TysonAndre
Copy link
Contributor

TysonAndre commented Dec 7, 2021

I don't know if this is possible to change in extension land, but a possible fix if possible, is to change the compiler globals by adding the ZEND_COMPILE_IGNORE_INTERNAL_FUNCTIONS and ZEND_COMPILE_IGNORE_USER_FUNCTIONS flags to it, which would generate ZEND_INIT_FCALL_BY_NAME opcodes instead of ZEND_INIT_FCALL ones.

Or to change the opcodes pointing to the function in-place in that extension - that extension already has issues when enabling opcache in the latest php versions.


Unrelatedly, opcache.file_cache is another possible concern - right now, (I'm pretty sure) the file cache used is the same regardless of what extensions were added to -d extension=something.so in individual runs, so someone disabling an extension may have applications crash/segfault when calling the function in question.

  • Upgrading or downgrading a shared library(PECL) in place would have the same possible issues, though less commonly
  • Not sure what environments use opcache.file_cache or what triggers clearing that in practice - ideally the file cache folder would be cleared with every php installation upgrade or ini settings change, or file_cache would be used with docker images/machine images that get replaced rather than have software upgraded in place.

function_exists('foo') is also optimized to true if a shared library is loaded with extension=... when the file cache entry is generated for the php file, so code such as if (function_exists('foo')) foo(); would now crash instead of throw with this change.


Ideally, there might be a list of module/zend_module dependencies in the cache generated by opcache (for file cache) to detect this, but I don't remember there being one? (e.g. if a function foo_xyz() was from PECL/php-src module "foo", add a dependency on foo and overwrite that file cache entry)

@cmb69
Copy link
Member

cmb69 commented Dec 8, 2021

Unrelatedly, opcache.file_cache is another possible concern - right now, (I'm pretty sure) the file cache used is the same regardless of what extensions were added to -d extension=something.so in individual runs, so someone disabling an extension may have applications crash/segfault when calling the function in question.

That should be no longer the case with recent PHP versions, since there is now a zend_system_id (see Zend/zend_system_id.c) which is basically a hash of all module names plus some further data, and that zend_system_id is used for the directory names of the file cache (see #5871 for further details).

@Girgias
Copy link
Member Author

Girgias commented Dec 8, 2021

Or to change the opcodes pointing to the function in-place in that extension - that extension already has issues when enabling opcache in the latest php versions.

Just to clarify, are you okay with this change? I'll add an entry to UPGRADING.INTERNALS in any case as it may break other extensions which try to do funky things.

@TysonAndre
Copy link
Contributor

Just to clarify, are you okay with this change? I'll add an entry to UPGRADING.INTERNALS in any case as it may break other extensions which try to do funky things.

It seems like I can get internal function manipulation to continue to work, so I'm fine with this change.


I looked into it some more. Adding notes in case I or other extension developers need them in the future.

  • ZEND_INIT_FCALL_BY_NAME, uses Z_STR_P(((zval*)RT_CONSTANT(opline, opline->op2))+1) but ZEND_INIT_FCALL uses Z_STR_P(((zval*)RT_CONSTANT(opline, opline->op2))) without adding 1. Even subtracting 1 from the zval didn't work because of subsequent changes to arg handling. (I'm not quite sure where the 1 comes from, but knowing doesn't really matter for functionality - the differences in arg handling )
  • There's differences between subsequent arg handling code in ZEND_INIT_FCALL and ZEND_INIT_FCALL_BY_NAME - the former allows specialized handlers for passing arguments to parameters known to be non-references, parameters known to be references, etc., because the function is known.
  • This was why I recalculated stack size instead of changing the opcode in https://github.com/runkit7/runkit7/pull/25/files

I don't know if this is possible to change in extension land, but a possible fix if possible, is to change the compiler globals by adding the ZEND_COMPILE_IGNORE_INTERNAL_FUNCTIONS and ZEND_COMPILE_IGNORE_USER_FUNCTIONS flags to it, which would generate ZEND_INIT_FCALL_BY_NAME opcodes instead of ZEND_INIT_FCALL ones.

I assume there'd be a few percent performance hit from needing to recalculate stack sizes and using less specialized arg handlers (and possibly worse opcache inference), but haven't benchmarked it.

That could probably be controlled by an ini directive such as runkit.allow_global_function_manipulation to avoid the performance hit on applications that didn't global function manipulation support.

CG(compiler_options) seems to be settable by extensions. I tried this and ZEND_INIT_FCALL_BY_NAME was emitted.

	CG(compiler_options) |= ZEND_COMPILE_IGNORE_USER_FUNCTIONS | ZEND_COMPILE_IGNORE_INTERNAL_FUNCTIONS;
  • If that did work, I could also set ZEND_COMPILE_NO_CONSTANT_SUBSTITUTION | ZEND_COMPILE_NO_PERSISTENT_CONSTANT_SUBSTITUTION to fix modifying constants at runtime. Maybe I had just assumed it wouldn't work or not been aware of the option being modifiable (and that persisting even after individual file compilations).

@Girgias
Copy link
Member Author

Girgias commented Dec 8, 2021

I looked into it some more. Adding notes in case I or other extension developers need them in the future.

ZEND_INIT_FCALL_BY_NAME, uses Z_STR_P(((zval*)RT_CONSTANT(opline, opline->op2))+1) but ZEND_INIT_FCALL uses Z_STR_P(((zval*)RT_CONSTANT(opline, opline->op2))) without adding 1. Even subtracting 1 from the zval didn't work because of subsequent changes to arg handling. (I'm not quite sure where the 1 comes from, but knowing doesn't really matter for functionality - the differences in arg handling )

I can explain where the +1 comes from, when a function call is compiled in zend_compile.c and the function name is a literal constant (so 99% of cases) the original name is stored in the first literal cache slot, but it also stores the lowercase name in the next literal cache slot, which is accessed by the +1. So it can do an immediate lookup in the symbol table (as functions are case insensitive). This however doesn't need to happen in the case of ZEND_INIT_FCALL, as for it to be emitted, it will check at compile time that the name exists in the symbol table already.

Similarly ZEND_INIT_NS_FCALL_BY_NAME does the same as ZEND_INIT_FCALL_BY_NAME, but also stores the lowercase function name without any namespace component in the third literal cache slot (accessed via the +2) so that it can fallback onto the global namespace to find a function.

@Girgias Girgias force-pushed the vm-fcall-init-assertion branch from 39e2a14 to 6b987f2 Compare December 9, 2021 09:52
@TysonAndre
Copy link
Contributor

I can explain where the +1 comes from, when a function call is compiled in zend_compile.c and the function name is a literal constant (so 99% of cases) the original name is stored in the first literal cache slot, but it also stores the lowercase name in the next literal cache slot, which is accessed by the +1. So it can do an immediate lookup in the symbol table (as functions are case insensitive). This however doesn't need to happen in the case of ZEND_INIT_FCALL, as for it to be emitted, it will check at compile time that the name exists in the symbol table already.

Similarly ZEND_INIT_NS_FCALL_BY_NAME does the same as ZEND_INIT_FCALL_BY_NAME, but also stores the lowercase function name without any namespace component in the third literal cache slot (accessed via the +2) so that it can fallback onto the global namespace to find a function.

Thanks, makes sense. And zend_undefined_function_helper uses RT_CONSTANT(opline, opline->op2) which is the original name

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants