Description
I guess this has been discussed before, but I'd personally appreciate understanding the dependencies better, and seeing whether they can be simplified. I think they often cause confusion (e.g. not installing the optional ones) and some errors (e.g. editing files that are automatically generated, or forgetting to run the script).
As far as I know, pandas has 3 dependencies: numpy, dateutil and pytz. Those live in setup.py, so they are required when packaging. No question about this part.
Then, for the development environment, I think "ideally" we would like to have an environment.yml file in the root of the project, so setting up a pandas environment is as easy as conda env create, and maintaining the list of dependencies is as easy as updating that file.
The questions then are:
- What is the reason for splitting into dev and optional? And is it enough to justify the increased complexity?
- Is it an option to provide only the dependencies for conda? If people are interested in using pip for a development environment, is it an option to provide just the script that generates requirements.txt (i.e. convert_deps.sh) instead of the file itself?
- For the CI dependencies, we've got 14 independent files. Is there something we could do to make things simpler? Would generating those files with a script simplify things, or make them more complex? Or at least, would it make sense to keep all those files in a separate directory (e.g. ci/requirements/)?
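
To make the second question more concrete, here is a rough Python sketch of what generating the pip requirements from environment.yml could look like. I'm not claiming this is what convert_deps.sh does today. The function name conda_to_pip, the CONDA_TO_PIP table, and the example environment with its version pin are all made up for illustration, and the sketch only understands a simple dependencies list rather than full YAML.

```python
import re
from typing import List

# Illustrative mapping for packages whose conda and PyPI names differ
# (assumption: extend as needed; only this one entry is used below).
CONDA_TO_PIP = {"pytables": "tables"}

# Entries that conda can install but pip cannot.
SKIP = {"python"}


def conda_to_pip(environment_yml: str) -> List[str]:
    """Convert the ``dependencies:`` list of an environment.yml to pip specs."""
    requirements = []
    in_dependencies = False
    for line in environment_yml.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if not line.startswith((" ", "-")):
            # A top-level key either opens or closes the dependencies section.
            in_dependencies = stripped == "dependencies:"
            continue
        if not in_dependencies or not stripped.startswith("- "):
            continue
        # Drop the list marker and any trailing comment.
        spec = stripped[2:].split("#")[0].strip()
        if spec.endswith(":"):
            # Something like "- pip:" opens a nested block, not a package.
            continue
        match = re.match(r"[A-Za-z0-9_.\-]+", spec)
        if match is None:
            continue
        name = match.group(0)
        version = spec[len(name):]
        if name in SKIP:
            continue
        # conda pins with a single "=", pip needs "==".
        version = re.sub(r"(?<![<>=!~])=(?!=)", "==", version)
        requirements.append(CONDA_TO_PIP.get(name, name) + version)
    return requirements


# Made-up environment file, only to exercise the function.
EXAMPLE = """\
name: pandas-dev
channels:
  - defaults
dependencies:
  - python=3.7
  - numpy=1.15
  - python-dateutil
  - pytz
  - pytables
"""

if __name__ == "__main__":
    print("\n".join(conda_to_pip(EXAMPLE)))
```

With something like this, environment.yml would be the only file edited by hand, and requirements.txt would become a build artifact that nobody needs to remember to regenerate or avoid editing.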
CC @jreback, @pandas-dev/pandas-core