-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Parallel Cythonization for spawned Processes #30862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@WillAyd The cython folks saw that Python 3.8 on macos issue here: conda-forge/cython-feedstock#59 (and maybe elsewhere) Once you are merged, ping me, and I'll test on Windows. |
I've tried this out on windows and it works for me. I can't however see if the build is running in parallel, maybe @Dr-Irv can comment on that? |
I now learned how to fetch this pull request and tried it out. Definitely goes parallel (I use -j 8 since I have 8 real cores). BUT there is an issue that some of the compiles of the C code are not finished before the final "cl" step. So that step fails, and I have to do |
What do you mean by "cl" step? Also is it failing during cythonize or during generation of the libraries? |
When I try a clean build, here is the end of what is on the screen:
So in the "link" part, the link is failing because it is still generating code for other modules. The "cl" part is the compilation that is being done of each module. |
Do you actually get an error message anywhere besides the ultimate exit status? Are you seeing that with -j4? |
Also note that this only affects the Cythonize piece (i.e. generating C code from the pyx modules) so I'm not sure that would be related to linker errors |
With |
So I did see this in the output:
I think there is a parallelization conflict in that the same file is being compiled by two different processes. Investigating this |
can we confirm this doesn't break things with -j 8? |
@Dr-Irv just to confirm you get that same issue on master with -j8 right? |
is this a common pattern in setup.py? |
What is your question in reference to? The |
both - iow is this the standard for parallel building if not why are we deviating ? |
It might help to separate the conversation of building from the conversation of cythonizing. This PR deals only with the latter Cython uses the |
setup.py
Outdated
) | ||
if __name__ == "__main__": | ||
# Freeze to support parallel compilation when using spawn instead of fork | ||
multiprocessing.freeze_support() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does this need to be here, why wouldn't you just put this at the top;
the checking if __main__
seems very odd (in setup.py), I have never seen this, can you point to an authoritative reference for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for some other examples - numpy does something like this:
https://github.com/numpy/numpy/blob/a16dfb5b397718c5caf32c63f024d1ce3518c5a1/setup.py#L484
likewise with scikit-learn:
I can move to the top but I think would be weird to include everything in that; especially with everything that is currently global
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, can you change to the setup_package
pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good - just did in latest. We probably could/should make it so that there aren't any global variables in the setup file, but it would be a rather large undertaking so kept most globals where they are for now
thanks @WillAyd |
Originally thought this only affected Windows, but parallel Cythonization is also broken on Python 3.8 on macOS as they shifted from using fork to start sub-processes to using spawn
https://bugs.python.org/issue33725
This fix should make things work regardless of platform / python version