-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Make internal run_time_cache a persistent allocation #15040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
522803f
to
818f6d5
Compare
Zend/zend_map_ptr.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of the uintptr_t
casts being moved to intptr_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Negative offsets with respect to CG(map_ptr_base). (Still positive compared to map_ptr_real_base).
But having negative offsets for the static ones allows allocating positive offsets for non-static map_ptrs during startup as well.
ZEND_API size_t zend_map_ptr_static_size; | ||
ZEND_API size_t zend_map_ptr_static_last; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason these are mutable globals as opposed to thread-locals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the point - they're supposed to be allocated fully during startup. And startup is the same across all threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any issues otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
As the internal runtime cache is never re-allocated after startup due to this change, it looks like that we could store a true pointer in ->run_time_cache
instead of an offset, at least in non-ZTS builds?
@arnaud-lb Yes, in NTS that's possible. |
is this change meant to improve performance? if so, can the diff be measured? |
07a4cce
to
477314f
Compare
a28a5f1
to
671d06d
Compare
I have benchmarked this patch on Symfony Demo with callgrind (with and without observer), using the following commands:
So the patch makes some performance improvement for observer (eliminates call to zend_init_internal_run_time_cache() on each request) and "as a side effect" also makes slight improvement for normal execution (keeps "static" part of map_ptr space unchanged between requests). It would be great to saw this explanation in PR comments... |
#if !ZTS | ||
ZEND_MAP_PTR(zend_ffi_new_fn.run_time_cache) = ZEND_MAP_PTR(((zend_internal_function *)zend_hash_str_find_ptr(&zend_ffi_ce->function_table, "new", sizeof("new")-1))->run_time_cache); | ||
ZEND_MAP_PTR(zend_ffi_cast_fn.run_time_cache) = ZEND_MAP_PTR(((zend_internal_function *)zend_hash_str_find_ptr(&zend_ffi_ce->function_table, "cast", sizeof("cast")-1))->run_time_cache); | ||
ZEND_MAP_PTR(zend_ffi_type_fn.run_time_cache) = ZEND_MAP_PTR(((zend_internal_function *)zend_hash_str_find_ptr(&zend_ffi_ce->function_table, "type", sizeof("type")-1))->run_time_cache); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
Why this is done only for non-ZTS build?
Will we have any problems with ZTS build?
Please use ZEND_MAP_PTR_INIT
instead of ZEND_MAP_PTR()
on left size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using MAP_PTR_INIT for everything where I assign a value, but in this case I want to copy whatever binary data is, thus I found using ZEND_MAP_PTR directly more fitting and symmetric with the right side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, only need it in NTS, as in ZTS this uses a proper map_ptr which is set when the function is created and thus existing here.
In NTS the pointer is initialized later in the startup sequence and thus needs to be copied back.
ZEND_MAP_PTR_NEW(internal_function->run_time_cache); | ||
#if ZTS | ||
ZEND_MAP_PTR_NEW_STATIC(internal_function->run_time_cache); | ||
#else | ||
ZEND_MAP_PTR_INIT(internal_function->run_time_cache, NULL); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why ZTS makes difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need per-thread pointers for ZTS (run-time cache is local to each request). But for NTS we can, as a slight optimization, just skip the indirect lookup via CG(map_ptr_base).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth adding a comment here in code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch looks more or less good (I may miss some edge cases in implementation)
Please, answer my questions and make minor clean up.
You don't see any problems for Windows, where different workers have different copies of internal functions?
@dstogov This is not problematic on Windows, as this is fully local to each process. There's zero interaction with shared memory here. |
We also add zend_map_ptr_static, so that we do not incur the overhead of constantly recreating the internal run_time_cache pointers on each request. This mechanism might be extended for mutable_data of internal classes too.
671d06d
to
87f038e
Compare
We also add zend_map_ptr_static, so that we do not incur the overhead of constantly recreating the internal run_time_cache pointers on each request when observers are enabled.
Additionally it saves a bit of zeroing of the map_ptrs on each request, independently of whether observers are actually enabled or not, given that they're below the zend_map_ptr_static_size threshold now.
Comparing instruction counts, it shows about 0.1% improvement without observers.
This mechanism might be extended for mutable_data of internal classes too. And possibly also for preloaded code.