Respect script execution time limit when invoking shutdown functions #5543

alexdowad · 2020-05-07T18:28:38Z

A time limit can be set on PHP script execution via set_time_limit() (or .ini file). When the time limit is reached, the OS will notify PHP and a timed_out flag is set. While this flag is regularly checked when executing PHP code, once the end of the script is reached, it is not checked while invoking shutdown functions (registered via register_shutdown_function).

Of course, if the shutdown functions are implemented in PHP, then the script time limit will be checked while they are running and will take effect. But if the shutdown functions are built-in (implemented in C), it will not.

Since the shutdown functions are invoked through call_user_function_ex, add a check of the timed_out flag there. Then, the script time limit will be respected when entering each shutdown function. The fact still remains that if a shutdown function is built-in and runs for a long time, script execution will not time out until it finishes and the interpreter tries to invoke the next one.

Still, the behavior of scripts with execution time limits will be more consistent after this patch. To make the execution time-out feature work even more precisely, it would be necessary to scrutinize all the built-in functions and add checks of the timed_out flag in any which can run for a long time. That might not be worth the effort, though.

Zend/zend_execute_API.c

ext/opcache/jit/zend_jit_x86.dasc

alexdowad · 2020-05-08T19:14:30Z

Grr. There is what looks like a spurious failure in one of the test jobs on Windows (the other one passed). Ran the same test case more than 10,000 times on my Linux box and still couldn't reproduce the failure. (Though the test case is not skipped on Linux.)

Zend/zend_execute_API.c

nikic · 2020-05-11T11:31:01Z

Diff for the AppVeyor failure was:

========DIFF========
004+ 
004- Shutdown
005+ Fatal error: Maximum execution time of 1 second exceeded in Unknown on line 0
========DONE========

So we have the time limit triggered twice. Something definitely not right there. The flag not getting reset somehow?

alexdowad · 2020-05-11T11:33:45Z

Diff for the AppVeyor failure was:
========DIFF========
004+ 
004- Shutdown
005+ Fatal error: Maximum execution time of 1 second exceeded in Unknown on line 0
========DONE========
So we have the time limit triggered twice. Something definitely not right there. The flag not getting reset somehow?

I'm just studying this one.

alexdowad · 2020-05-11T12:49:42Z

And we're green!

I adjusted the call to zend_timeout in zend_call_function so it is only triggered when invoking an internal function, not a user function.

nikic · 2020-05-11T13:00:04Z

The Windows situation here still looks fishy. Looking at the zend_timeout implementation:

php-src/Zend/zend_execute_API.c

Line 1107 in 091d53c

ZEND_API ZEND_NORETURN void ZEND_FASTCALL zend_timeout(int dummy) /* {{{ */

It's quite different on Windows, and indeed doesn't seem to reset timed_out.

alexdowad · 2020-05-11T13:18:08Z

The Windows situation here still looks fishy. Looking at the zend_timeout implementation:

php-src/Zend/zend_execute_API.c

Line 1107 in 091d53c

ZEND_API ZEND_NORETURN void ZEND_FASTCALL zend_timeout(int dummy) /* {{{ */

It's quite different on Windows, and indeed doesn't seem to reset timed_out.

Indeed. I have looked at that function many times but haven't completely figured out what is happening on Windows. The unfortunate thing is that since I don't have a Windows box, I can't run it in a debugger and see what it actually does.

nikic · 2020-05-11T13:31:14Z

Zend/zend_execute_API.c

+		 * So see whether we timed out while the function was running... */
+		if (EG(timed_out)) {
+			zend_timeout();
+		}


After thinking about this a bit, I think instead of checking timed_out specifically here, we should do a VM interrupt check as in

php-src/Zend/zend_vm_def.h

Line 9206 in 5d93eab

ZEND_VM_HELPER(zend_interrupt_helper, ANY, ANY)

. After all the core issue here applies not just to internal timeouts, but also pcntl signal handling, I think.

OK. I think I have figured out what you mean...

pcntl sets a signal handler which sets vm_interrupt to 1. That causes execution to bounce into pcntl's zend_interrupt_function next time the VM checks the interrupt flag. But that only happens when the VM is running PHP bytecode.

If you are running native shutdown functions, and pcntl is enabled, and a signal comes along, it will set the vm_interrupt flag but pcntl's zend_interrupt_function will never be called.

Do you think pcntl really should care about what is happening during shutdown, though? Maybe?

Do you think pcntl really should care about what is happening during shutdown, though? Maybe?

During shutdown probably isn't important, but there are other cases where we could run into this, at least in theory. For example, consider something like array_map('expensive_internal_function', $array). This would allow a signal handler to be executed in the middle of the array_map, rather than at the end.

Maybe it's possible to test this with a combination of InfiniteIterator and iterator_apply.

Aha. Wow, I never thought of that. OK.

I've added an extra test in 7f35610. I was hoping to show that this leaks the time_nanosleep return value (which should be an array in this case), but due to the way array_map is implemented this doesn't happen...

I think we should be repeating the EG(exception) check from a few lines above if the VM interrupt is executed, to make sure that code not freeing the retval on exception doesn't leak...

alexdowad · 2020-05-11T13:37:35Z

The Windows situation here still looks fishy. Looking at the zend_timeout implementation:

php-src/Zend/zend_execute_API.c

Line 1107 in 091d53c

ZEND_API ZEND_NORETURN void ZEND_FASTCALL zend_timeout(int dummy) /* {{{ */

It's quite different on Windows, and indeed doesn't seem to reset timed_out.

That code was added by @weltling in 2016... wonder if he can comment on why the timed_out flag is not reset on Windows?

cmb69 · 2020-05-11T15:57:29Z

It seems to me that EG(timed_out) is never used on Windows after zend_timeout() has been called (unless for the case where it is reset). Note that zend_timeout_handler() is POSIX only.

Anyhow, CreateTimerQueueTimer() is legacy (imported from threadpoollegacyapiset.h), and setitimer() is obsolescent, so we should consider to replace them. Also, tq_timer_cb() doesn't look thread-safe.

alexdowad · 2020-05-12T05:50:04Z

@nikic @cmb69 As usual, I appreciate the thorough review. I also appreciate your approach of ensuring that the issue at hand is thoroughly understood before merging any attempted solution, even if it seems to work.

It is a fact that I don't understand everything about the existing code. For example, I am puzzled as to why tq_timer_cb accesses executor globals with eq->global = value rather than the EG macro.

I may study it a bit more, do some refactoring, and if these are accepted, rebase this PR on the refactored master.

cmb69 · 2020-05-12T06:32:53Z

For example, I am puzzled as to why tq_timer_cb accesses executor globals with eq->global = value rather than the EG macro.

As I understand it, the callback can be run by an arbitrary thread (i.e. not necessarily the thread which scheduled the timer).

nikic · 2020-05-12T07:48:11Z

Anyhow, CreateTimerQueueTimer() is legacy (imported from threadpoollegacyapiset.h), and setitimer() is obsolescent, so we should consider to replace them. Also, tq_timer_cb() doesn't look thread-safe.

There's also https://bugs.php.net/bug.php?id=79464. On Linux we could switch from setitimer to timer_create(), but this functionality is not available on macos...

alexdowad · 2020-05-12T08:23:21Z

For example, I am puzzled as to why tq_timer_cb accesses executor globals with eq->global = value rather than the EG macro.

As I understand it, the callback can be run by an arbitrary thread (i.e. not necessarily the thread which scheduled the timer).

Wow, I never thought of that. It makes sense now.

I wonder if there should be some kind of memory barrier or something to make sure the changes are visible on the thread which "owns" the executor globals in question.

alexdowad · 2020-05-12T08:40:43Z

There's also https://bugs.php.net/bug.php?id=79464.

Interesting. I think I'll use the OP's C++ code when investigating the issue of script execution timeouts on ZTS builds...

cmb69 · 2020-05-12T12:33:03Z

I wonder if there should be some kind of memory barrier or something to make sure the changes are visible on the thread which "owns" the executor globals in question.

It seems to me this is not a problem (presuming that our TLS usage is correct). However, I think we need some form of synchronization to avoid potential race conditions regarding tq_timer_cb().

nikic · 2020-05-12T15:17:10Z

I went ahead and committed f096087 and f83a23e separately, as those are not really related to this change. I think for the purpose of this PR I'd mainly like to see the switch to a vm_interrupt rather than timed_out check, the other issues are only tangentially related.

nikic · 2020-05-12T16:56:05Z

Might be worthwhile to run PHP under thread sanitizer sometime to check for synchronization issues...

cmb69 · 2020-05-12T17:02:13Z

I'm afraid TSan is not available on Windows.

A time limit can be set on PHP script execution via `set_time_limit` (or .ini file). When the time limit is reached, the OS will notify PHP and `timed_out` and `vm_interrupt` flags are set. While these flags are regularly checked when executing PHP code, once the end of the script is reached, they are not checked while invoking shutdown functions (registered via `register_shutdown_function`). Of course, if the shutdown functions are implemented *in* PHP, then the interrupt flag will be checked while the VM is running PHP bytecode and the timeout will take effect. But if the shutdown functions are built-in (implemented in C), it will not. Since the shutdown functions are invoked through `zend_call_function`, add a check of the `vm_interrupt` flag there. Then, the script time limit will be respected when *entering* each shutdown function. The fact still remains that if a shutdown function is built-in and runs for a long time, script execution will not time out until it finishes and the interpreter tries to invoke the next one. Still, the behavior of scripts with execution time limits will be more consistent after this patch. To make the execution time-out feature work even more precisely, it would be necessary to scrutinize all the built-in functions and add checks of the `vm_interrupt` flag in any which can run for a long time. That might not be worth the effort, though. It should be mentioned that this patch does not solely affect shutdown functions, neither does it solely allow for interruption of running code due to script execution timeout. Anything else which causes `vm_interrupt` to be set, such as the PHP interpreter receiving a signal, will take effect when exiting from an internal function. And not just internal functions which are called because they were registered to run at shutdown; there are other cases where a series of internal functions might run in the midst of a script. In all such cases, it will be possible to interrupt the interpreter now.

alexdowad · 2020-05-12T20:50:55Z

I think for the purpose of this PR I'd mainly like to see the switch to a vm_interrupt rather than timed_out check, the other issues are only tangentially related.

OK, trying something along those lines...

alexdowad · 2020-05-13T05:36:16Z

Please have a look now.

alexdowad · 2020-05-13T08:58:00Z

Hmm. So both `zend_timeout` and `zend_interrupt_function` never return? I believe that that extensions can install their own `zend_interrupt_function`. As such, we may not be able to tell whether it will return or not. So is it not possible that the retval might still be needed? I think that `zend_timeout` never returns, so we could certainly free it in that case. Tell me if what I'm saying makes sense or not.

…

On Wed, May 13, 2020 at 10:20 AM Nikita Popov ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In Zend/zend_execute_API.c <#5543 (comment)>: > @@ -830,6 +830,12 @@ int zend_call_function(zend_fcall_info *fci, zend_fcall_info_cache *fci_cache) / /* We must re-initialize function again */ fci_cache->function_handler = NULL; } + + /* This flag is regularly checked while running user functions, but not internal + * So see whether we timed out while the function was running... */ + if (EG(timed_out)) { + zend_timeout(); + } I've added an extra test in 7f35610 <7f35610>. I was hoping to show that this leaks the time_nanosleep return value (which should be an array in this case), but due to the way array_map is implemented this doesn't happen... I think we should be repeating the EG(exception) check from a few lines above if the VM interrupt is executed, to make sure that code not freeing the retval on exception doesn't leak... — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#5543 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAIESX5G2QE4SWJJZUIO4JTRRJJ55ANCNFSM4M3RZA2A> .

nikic · 2020-05-13T10:45:54Z

Hmm. So both zend_timeout and zend_interrupt_function never return? I believe that that extensions can install their own zend_interrupt_function. As such, we may not be able to tell whether it will return or not. So is it not possible that the retval might still be needed? I think that zend_timeout never returns, so we could certainly free it in that case. Tell me if what I'm saying makes sense or not.

zend_timeout() is noreturn and causes unclean shutdown, so it's okay to leak. The case I have in mind is if zend_interrupt_function throws an exception, which means that zend_call_function throws an exception, and the caller might not free the return value. Though after checking use-sites, it looks like code does free the return value independently of whether an exception was thrown or not. I'll go ahead and merge this in its current form...

alexdowad · 2020-05-13T11:00:22Z

I wonder if there should be some kind of memory barrier or something to make sure the changes are visible on the thread which "owns" the executor globals in question.

It seems to me this is not a problem (presuming that our TLS usage is correct). However, I think we need some form of synchronization to avoid potential race conditions regarding tq_timer_cb().

@cmb69, any suggestion of what kind of synchronization to use? I don't know the codebase well enough yet to know what is already there.

cmb69 · 2020-05-13T12:37:21Z

There is a set of "mutex" functions (actually, these are implemented as critical sections on Windows) which would be appropriate.

nikic reviewed May 7, 2020

View reviewed changes

Zend/zend_execute_API.c Outdated Show resolved Hide resolved

alexdowad force-pushed the timeout-in-shutdown-functions branch 2 times, most recently from 392cb1d to 33d8ce6 Compare May 8, 2020 13:16

cmb69 reviewed May 8, 2020

View reviewed changes

ext/opcache/jit/zend_jit_x86.dasc Outdated Show resolved Hide resolved

nikic reviewed May 11, 2020

View reviewed changes

Zend/zend_execute_API.c Outdated Show resolved Hide resolved

alexdowad force-pushed the timeout-in-shutdown-functions branch from 0df76cd to a24b789 Compare May 11, 2020 11:18

nikic reviewed May 11, 2020

View reviewed changes

alexdowad force-pushed the timeout-in-shutdown-functions branch from a24b789 to b9fc273 Compare May 12, 2020 20:49

Add pcntl signals test

7f35610

php-pulls closed this in 555489d May 13, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Respect script execution time limit when invoking shutdown functions #5543

Respect script execution time limit when invoking shutdown functions #5543

alexdowad commented May 7, 2020

alexdowad commented May 8, 2020

nikic commented May 11, 2020

alexdowad commented May 11, 2020

alexdowad commented May 11, 2020

nikic commented May 11, 2020

alexdowad commented May 11, 2020

nikic May 11, 2020

alexdowad May 11, 2020

nikic May 11, 2020

alexdowad May 12, 2020

nikic May 13, 2020

alexdowad commented May 11, 2020 •

edited

Loading

cmb69 commented May 11, 2020

alexdowad commented May 12, 2020

cmb69 commented May 12, 2020

nikic commented May 12, 2020

alexdowad commented May 12, 2020

alexdowad commented May 12, 2020

cmb69 commented May 12, 2020

nikic commented May 12, 2020

nikic commented May 12, 2020

cmb69 commented May 12, 2020

alexdowad commented May 12, 2020

alexdowad commented May 13, 2020

alexdowad commented May 13, 2020 via email

nikic commented May 13, 2020

alexdowad commented May 13, 2020

cmb69 commented May 13, 2020

Respect script execution time limit when invoking shutdown functions #5543

Respect script execution time limit when invoking shutdown functions #5543

Conversation

alexdowad commented May 7, 2020

alexdowad commented May 8, 2020

nikic commented May 11, 2020

alexdowad commented May 11, 2020

alexdowad commented May 11, 2020

nikic commented May 11, 2020

alexdowad commented May 11, 2020

nikic May 11, 2020

Choose a reason for hiding this comment

alexdowad May 11, 2020

Choose a reason for hiding this comment

nikic May 11, 2020

Choose a reason for hiding this comment

alexdowad May 12, 2020

Choose a reason for hiding this comment

nikic May 13, 2020

Choose a reason for hiding this comment

alexdowad commented May 11, 2020 • edited Loading

cmb69 commented May 11, 2020

alexdowad commented May 12, 2020

cmb69 commented May 12, 2020

nikic commented May 12, 2020

alexdowad commented May 12, 2020

alexdowad commented May 12, 2020

cmb69 commented May 12, 2020

nikic commented May 12, 2020

nikic commented May 12, 2020

cmb69 commented May 12, 2020

alexdowad commented May 12, 2020

alexdowad commented May 13, 2020

alexdowad commented May 13, 2020 via email

nikic commented May 13, 2020

alexdowad commented May 13, 2020

cmb69 commented May 13, 2020

alexdowad commented May 11, 2020 •

edited

Loading