-
Notifications
You must be signed in to change notification settings - Fork 532
[ENH] New ResourceMonitor (replaces resource profiler) #2200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR revises the resource profiler with the following objectives: - Increase robustness (and making sure it does not crash nipype) - Extend profiling to all interfaces (including pure python) The increase of robustness will be expected from: 1. Trying to reduce (or remove at all if possible) the logger callback to register the estimations of memory and cpus. This could be achieved by making interfaces responsible or keeping track of their resources to then collect all results after execution of the node. 2. Centralize profiler imports, like the config or logger object so that the applicability of the profiler is checked only once. This first commit just creates one new module nipype.utils.profiler, and moves the related functions in there.
@satra Am I correct thinking that this: https://github.com/oesteban/nipype/blob/d2599e2f86df754bfa0fa41b8f6789e2408fb7bd/nipype/pipeline/engine/nodes.py#L1150-L1151 would ensure that |
@oesteban I think you actually want to pass |
Oh, good catch! - well, that would only make sense if the node will have different num_threads for each iterable... right? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be comfortable merging this, assuming tests pass. @satra any further comments?
@oesteban - correct, mapnodes inherit constraints. solid work - really appreciate the refactoring that was much needed. assuming tests pass, this looks good to me. |
We'll need to revisit the estimation of resources, but what we have with this PR is one step further from where we were before and it is enough to get going. |
@oesteban what's up with @pytest.mark.skipif(os.getenv('CI_SKIP_TEST', False), reason='disabled in CI tests')
@pytest.mark.parametrize("mem_gb,n_procs", [(0.5, 3), (2.2, 8), (0.8, 4), (1.5, 1)])
def test_cmdline_profiling(tmpdir, mem_gb, n_procs):
"""
Test runtime profiler correctly records workflow RAM/CPUs consumption
of a CommandLine-derived interface
"""
from nipype import config
config.set('execution', 'resource_monitor_frequency', '0.2') # Force sampling fast
tmpdir.chdir()
iface = UseResources(mem_gb=mem_gb, n_procs=n_procs)
result = iface.run()
assert abs(mem_gb - result.runtime.mem_peak_gb) < 0.3, 'estimated memory error above .3GB'
> assert int(result.runtime.cpu_percent / 100 + 0.2) == n_procs, 'wrong number of threads estimated'
E AssertionError: wrong number of threads estimated
E assert 3 == 1
E + where 3 = int(((366.80000000000001 / 100) + 0.2))
E + where 366.80000000000001 = Bunch(cmdline='/code/nipype/nipype/utils/tests/use_resources -g 1.500000 -p 1', command_path='/code/nipype/nipype/util...5.92338, 1508877125.970061]}, returncode=0, startTime='2017-10-24T20:32:03.693559', stderr='', stdout='', version=None).cpu_percent
E + where Bunch(cmdline='/code/nipype/nipype/utils/tests/use_resources -g 1.500000 -p 1', command_path='/code/nipype/nipype/util...5.92338, 1508877125.970061]}, returncode=0, startTime='2017-10-24T20:32:03.693559', stderr='', stdout='', version=None) = <nipype.interfaces.base.InterfaceResult object at 0x7f25b8754278>.runtime |
Hi, as I mentioned before it seems that the measurements we get are not very accurate and those tests fail, especially when running in circleci (I was more successful on my desktop). Since those tests were already disabled before refactoring, that code is left there for when we have the time and will of making sure that the cpu and memory estimates are accurate. I noted that we should revisit these tests in my last comment |
@oesteban sounds good - perhaps we should just always skip it until we can assure the estimates are accurate, WDYT? |
Supersedes #2193 (I removed the specs to make code review easier)
This PR revises the resource profiler (with a new name: resource monitor):
nipype.utils.profiler
where all the code related to resource monitoring is gathered together. Tests are also brought into thenipype.utils
module.CommandLine
-based andFunction
interfaces before).proc_dict
member of theruntime
object.runtime_monitor.json
file is generated, with all the traces and necessary information to reconstruct the performance profile with the resolution of the newruntime_monitor_frequency
option (default: every 1s).n_procs
> MultiProc's settingn_procs
(similarly with the estimated memory). Now, either a warning or an error are raised (depending on the new plugin argraise_insufficient
).Important: this PRs removes the old
filemanip
logger and replaces it by a more genericutils
. Documentation has been updated accordingly and Deprecation warnings are generated consequently.