-
Notifications
You must be signed in to change notification settings - Fork 532
ENH: Centralize virtual/physical $DISPLAYs #2203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR addresses several (related) problems: - Makes `$DISPLAY` and the config `display_variable` optional (close nipy#2055) - Should fix nipy#1403 since xvfb-run is not used anymore. - Relates to nipy#1400: - Will reduce its impact because now Xvfb is called only once and only if it is absolutely necessary - May make it worse in some cases when Xvfb fails to create a listener (root needed?). This PR adds a `config.get_display()` which identifies what display should be used, and creates a virtual one if necessary. Also adds a few unit tests to the config object to make sure precedence is fulfilled.
This strikes me as being outside the scope of nipype. If I set |
Thanks a lot for the comment.
Totally agree, and that's the reason why I commented that part out, just to have the code handy if needed. It was just a suggestion that could make nipype easier for the user with better error messages (instead of relying on the third party libraries for these infrastructure things). However, there is a hidden complexity: if you set the I'll leave it there for the future (commented out), if we want to check for it only when |
nipype/utils/config.py
Outdated
@@ -172,3 +173,52 @@ def update_matplotlib(self): | |||
def enable_provenance(self): | |||
self._config.set('execution', 'write_provenance', 'true') | |||
self._config.set('execution', 'hash_method', 'content') | |||
|
|||
def get_display(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it, if this has not been called before workflow runtime, then each node will get a copy of config
and thus start up its own Xvfb
object.
And I don't see anywhere where it's stopped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time you call it, one display will be locked. As a matter of fact, when I first wrote the tests (https://github.com/oesteban/nipype/blob/fix/VirtualDisplay/nipype/utils/tests/test_config.py#L9-L49) I realized I had to manally reset the config object all the times because once it is called it is fixed to that display. Therefore, all those config._display = None
and those re-setting/removing the nipype config setting.
Does this reply to your question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I meant to put at the end of the first sentence: "Is this the desired behavior?" Sounds like it is.
Similarly, the fact that self._display.stop()
is no longer called anywhere is intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was actually the problem: nipype would spawn an endless number of displays.
Well, it is probably nice to add a self._display.stop()
at the end when running a workflow. I'm not sure though about how to do that when running bare interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, added a callback when python exists the session :D.
What about in __del__, or whatever python's cleanup method is? Or if
there's a post run hook.
On Oct 3, 2017 5:38 PM, "Oscar Esteban" <[email protected]> wrote:
*@oesteban* commented on this pull request.
------------------------------
In nipype/utils/config.py
<#2203 (comment)>:
@@ -172,3 +173,52 @@ def update_matplotlib(self):
def enable_provenance(self):
self._config.set('execution', 'write_provenance', 'true')
self._config.set('execution', 'hash_method', 'content')
+
+ def get_display(self):
Yes, that was actually the problem: nipype would spawn an endless number of
displays.
Well, it is probably nice to add a self._display.stop() at the end when
running a workflow. I'm not sure though about how to do that when running
bare interfaces.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2203 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAFF8jtDUnCPtK9sbiE-95Zi6cByBhIGks5soqlrgaJpZM4PruNE>
.
|
Codecov Report
@@ Coverage Diff @@
## master #2203 +/- ##
==========================================
+ Coverage 72.24% 72.28% +0.04%
==========================================
Files 1177 1178 +1
Lines 58639 58733 +94
Branches 8447 8457 +10
==========================================
+ Hits 42366 42458 +92
+ Misses 14906 14900 -6
- Partials 1367 1375 +8
Continue to review full report at Codecov.
|
This PR fixed the problem in MRIQC, and the whole workflow run correctly: https://circleci.com/gh/oesteban/mriqc/1368 |
# As of AFNI 16.0.00, redirect_x is not needed | ||
if v[0] > 2015: | ||
# Between AFNI 16.0.00 and 16.2.07, redirect_x is not needed | ||
if v >= (2016, 0, 0) and v < (2016, 2, 7): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woah - is this really true - crazy!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah, tell me about it. I'll report this to AFNI, to make sure this is not a mistake linking.
doc/users/config_file.rst
Outdated
and you would like to redirect all spawned windows to | ||
it. If not set, nipype will try to configure a new virtual server using | ||
Xvfb. (possible values: any X server address; default value: not | ||
set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little unclear. If I understand it correctly, we use the input $DISPLAY
variable unless this is set? Under that assumption, here's a proposed rewrite:
Override the ``$DISPLAY`` environment variable for interfaces that require an X server.
This option is useful if there is a running X server, but ``$DISPLAY`` was not defined
in nipype's environment. For example, if an X server is listening on the default port of
6000, set ``display_variable = :0`` to enable nipype interfaces to use it. It may also
point to displays provided by VNC, `xnest
<http://www.x.org/archive/X11R7.5/doc/man/man1/Xnest.1.html>`_ or `Xvfb
<http://www.x.org/archive/X11R6.8.1/doc/Xvfb.1.html>`_. If neither ``display_variable``
nor the ``$DISPLAY`` environment variable are set, nipype will try to configure a new
virtual server using Xvfb. (possible values: ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. display_variable
should override $DISPLAY
. I'll change the code and use your proposed text for the documentation.
nipype/utils/tests/test_config.py
Outdated
config._display = None | ||
dispstr = ':10' | ||
config.set('execution', 'display_variable', dispstr) | ||
monkeypatch.setitem(os.environ, 'DISPLAY', dispstr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be something other than dispstr
?
nipype/utils/tests/test_config.py
Outdated
|
||
|
||
def test_display_empty_patched(monkeypatch): | ||
"""Check that when no display is specified, a virtual Xvfb is used""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstrings after this one are all the same.
from nipype import config | ||
from nipype import logging | ||
config.stop_display() | ||
logging.getLogger('interface').info('Closing display (if virtual)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If nipype is going to spawn many Xvfb
s, shouldn't the stop be associated with the interface, rather than just closing one when Python quits?
I think I must be missing something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the opposite. You want nipype to run only one Xvfb
if possible. Let's say you start an ipython session, run an interface with _redirect_x = True
and then another interface than again requires X. With the current implementation only one Xvfb
is started. Only when you exit ipython the Xvfb service is killed.
The rationale is precisely stop spawning many Xvfb
s which was the case before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. And you're testing to make sure you're not producing multiple by running multiple interfaces? Because my first comment noted that it looks like each Node makes a copy of config
before spawning an xvfb
, so it's not clear they would actually share it.
Again, this is from a very cursory look, so it's possible I'm missing things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@effigies - config makes copies of sections, not the entire config object. so as long as the config.get_display is called, nipype maintains a global config/state.
@oesteban - a couple of follow up questions to @effigies' comments: i'm enumerating some use cases:
in my head, 1 and 2 would reuse the same xvfb, however this will not work unless there is some reference counting in place and the one that shuts it down is the last one to finish. this could also result in race conditions of some kind. 3 would launch different xvfb's per user. but within each user's space considerations for 1 and 2 would apply. |
Will reuse same
Will open one new
Will open as many For use cases 2 and 3 we should generally recommend setting up VNC (for clusters) or a unique Xvfb, and setting There is a compromise here between how much we want to make it seamless to the user and basics of infrastructure. The former implementation was extremely inefficient (spawning literally hundreds of |
IGNORE THIS COMMENT i think we still need to support 2. imagine submitting slurm jobs on a cluster - it gets assigned to random nodes. if every job has to start a vnc server i don't think that's a good outcome. i'd rather have every job starting an xvfb session. |
@oesteban - nevermind my previous comment on case 2 - i think we are still ok - case 2 will still launch a different xvfb per process. |
@satra correct :). The vnc idea is when you have one node providing that, otherwise it's better to fall back to the default (ergo |
An important addendum to #2203 (comment) is that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright. @satra addressed my one confusion point. I think this all looks reasonable.
@oesteban - lgtm |
This PR addresses several (related) problems:
$DISPLAY
and the configdisplay_variable
optional (close DISPLAY variable should be optional #2055)This PR adds a
config.get_display()
which identifies what display should be used, and creates a virtual one if necessary. Also adds a few unit tests to the config object to make sure precedence is fulfilled. If the virtual framebuffer is created, it is created only once for the whole workflow (solves the problem of many virtual fbs on a mapnode with_redirect_x = True
.One open question is: do we want to double check that an X server is running if
$DISPLAY
ordisplay_variable
are:0
?https://github.com/oesteban/nipype/blob/e8e18f039ce6ab7c7bd19c045e083bf02275c7c0/nipype/utils/config.py#L180-L185