Skip to content

ENH: Disable symlinks on CIFS filesystems #1941

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 9, 2017

Conversation

effigies
Copy link
Member

@effigies effigies commented Apr 5, 2017

There appear to be cases where hacks to make symlinks work on filesystems that don't directly support them can cause FileNotFoundErrors. In particular, the CIFS network filesystem driver for Linux supports the mfsymlinks option, which enables symlinks at the VFS level, having the effect that symlinking can be subject to network timeouts or temporary inconsistencies.

CIFS+mfsymlinks is the Docker-supported way to mount filesystems into a container from a Windows host, which means this may be increasingly a problem for Nipype users, as containers are becoming a more popular environment for running workflows. An error reported on nipreps/fmriprep#398 ran into this issue when running an interface that requires 4 symlinks, which split into two mapflow directories (per dataset being analyzed), using the MultiProc plugin.

This PR permits users to disable symlinking in the config file with the disable_symbolic_links option.

Closes #1938.

@codecov-io
Copy link

codecov-io commented Apr 5, 2017

Codecov Report

Merging #1941 into master will increase coverage by 0.01%.
The diff coverage is 93.33%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1941      +/-   ##
=========================================
+ Coverage   72.49%   72.5%   +0.01%     
=========================================
  Files        1063    1063              
  Lines       54159   54189      +30     
  Branches     7811    7820       +9     
=========================================
+ Hits        39260   39288      +28     
- Misses      13680   13682       +2     
  Partials     1219    1219
Flag Coverage Δ
#smoketests 72.5% <93.33%> (+0.01%) ⬆️
#unittests 70.04% <90%> (+0.01%) ⬆️
Impacted Files Coverage Δ
nipype/utils/tests/test_filemanip.py 93.61% <100%> (+0.4%) ⬆️
nipype/utils/filemanip.py 84.89% <87.5%> (+0.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3107a6d...ae5c6e1. Read the comment docs.

@chrisgorgo
Copy link
Member

Determining the filesystem might be easier than we thought http://stackoverflow.com/a/25287521/616300

@effigies
Copy link
Member Author

effigies commented Apr 6, 2017

That requires psutil, but if it seems worth it to add the dependency, we can check for CIFS. Could you install psutil on your docker container and get a working is_cifs(path) function?

@chrisgorgo
Copy link
Member

Does not require psutil.

def is_cifs(path):
     exit_code, mount_str = subprocess.getstatusoutput("mount")
     if not exit_code:
         mount_points = [(line.split(" ")[2], line.split(" ")[4]) for line in mount_str.split("\n")]
         sorted_mount_points = sorted(mount_points, key=lambda x: len(x[0]), reverse=True)
         for k, v in sorted_mount_points:
             print(k,v)
             if path.startswith(k):
                 if v == "cifs":
                     return True
                 else:
                     return False
     else:
         return False

@chrisgorgo
Copy link
Member

I also run the workflow with working directory inside the container (not mounted) and I did not get an error.

This means we should avoid symlinks only if the destination is on cifs.

@effigies effigies changed the title ENH: add config option to disable symlinks ENH: Disable symlinks on CIFS filesystems Apr 6, 2017
@effigies
Copy link
Member Author

effigies commented Apr 6, 2017

@chrisfilo I refactored your test a little. Want to grab 0d5359d from me and verify that this works?

from nipype.utils.filemanip import _on_cifs
assert (_on_cifs('/'), _on_cifs('/scratch')) == (False, True)

I think creating a CIFS mount in the docker image for testing might be excessive.

Copy link
Member

@chrisgorgo chrisgorgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as advertised! Thanks for cleaning up my napkin quality code!

@@ -237,6 +238,30 @@ def hash_timestamp(afile):
return md5hex


def _on_cifs(fname):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make it public - could be useful in other places

@@ -288,6 +313,10 @@ def copyfile(originalfile, newfile, copy=False, create_new=False,
if hashmethod is None:
hashmethod = config.get('execution', 'hash_method').lower()

# Don't try creating symlinks on CIFS
if _on_cifs(newfile):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For performance reasons, I would only run this check if we are about to make a symlink.

@effigies
Copy link
Member Author

effigies commented Apr 6, 2017

The performance issue is a good point. If this function's going to get called a lot, it might make sense to do something a little more persistent. We could do something like:

  • At program start (or first call), call mount, and get (mountpoint, fstype) tuples.
  • Prune all but superstrings of CIFS mounts
  • Check target files against pruned list

Only check if symlinks might be tried
@effigies
Copy link
Member Author

effigies commented Apr 6, 2017

Latest commit may be breaking some style guidelines. Let me know if I should reorganize.

@chrisfilo could you verify this still works? Updated test:

from nipype.utils.filemanip import on_cifs
assert (on_cifs('/'), on_cifs('/scratch')) == (False, True)

@chrisgorgo
Copy link
Member

I'm seeing huge performance hit when using MultiProc. It seems that mount is called each time a Node is being run (or checked in it needs to be rerun) - even if no files are being copied.

@effigies
Copy link
Member Author

effigies commented Apr 7, 2017

On the most recent commit? The latest changes should make the mount call occur only once per run. Unless I misunderstand how things are being run...

@chrisgorgo
Copy link
Member

chrisgorgo commented Apr 7, 2017 via email

@chrisgorgo
Copy link
Member

I'm doing more debugging. It seems that rechecking EPIToT1Transform (nothing gets rewritten) takes longer than usual (it's really slow) however the mri_expand is happening in the background so I don't know if the total running time will differ.

But you are right - the table is generated only once. I'll do more debugging.

@effigies
Copy link
Member Author

effigies commented Apr 8, 2017

Re: EPIToT1, have you recently switched to content hashing, or using an EPI with more volumes? Because I've found that's always been slow when I used content hashing...

@effigies
Copy link
Member Author

effigies commented Apr 8, 2017

Unable to replicate the slowdown on mine.

@chrisgorgo
Copy link
Member

I think the slowdown is unrelated to the change in this PR. Everything looks good to me, but since this is a change at the core of nipype it would be nice to have @satra hava a look at it as well.

@effigies
Copy link
Member Author

effigies commented Apr 8, 2017

Do you have remove_unnecessary_outputs disabled? If not, that might be your slowdown.

@satra satra merged commit c15df99 into nipy:master Apr 9, 2017
@effigies effigies deleted the disable_symlinks branch April 9, 2017 00:38
@effigies effigies mentioned this pull request Apr 9, 2017
5 tasks
@afni-dglen
Copy link
Contributor

afni-dglen commented Feb 7, 2018

John Rogers-Lee and I spent some time looking at the cifs parsing because it ended up blocking the installation on my local Mac. I'll create a pull request with my suggested changes in filemanip.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants