
Simplifying dependencies #23115

Closed
@datapythonista


I guess this has been discussed before, but I'd personally appreciate understanding the dependencies better, and seeing whether they can be simplified. I think they often cause confusion (e.g. not installing the optional ones) and some errors (e.g. editing files that are automatically generated, or forgetting to run the script).

As far as I know, pandas has 3 required dependencies: numpy, dateutil and pytz. Those live in setup.py, so they are required when packaging. No question about this part.

Then, for the development environment, I think "ideally" we would like to have an environment.yml file in the root of the project, so that setting up a pandas environment is as easy as running conda env create, and maintaining the list of dependencies is as easy as updating that file.

The questions then are:

  • What is the reason for splitting the dependencies into dev and optional? And is it enough to justify the increased complexity?
  • Is it an option to provide only the dependencies for conda? If people are interested in using pip for a development environment, is it an option to provide just the script that generates the requirements.txt file (i.e. convert_deps.sh), instead of having the file itself?
  • For the CI dependencies, we've got 14 independent files. Is there something we could do to make things simpler? Would generating those files with a script simplify things, or make them more complex? Or at least, would it make sense to keep all those files in a separate directory (e.g. ci/requirements/)?
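
To make the second question more concrete, here is a rough sketch of what generating the pip requirements from a single environment.yml could look like. This is not the existing convert_deps.sh, just an illustration. The function name `conda_to_pip`, the rename table, the example file contents and the version numbers are all assumptions made up for the example, and the parsing only handles a flat `dependencies:` list.

```python
import re

# Hypothetical table of packages whose conda name differs from the pip name.
# Illustrative only, a real table would need to be maintained by hand.
CONDA_TO_PIP = {"pytables": "tables"}

# Entries that only make sense in a conda environment.
CONDA_ONLY = {"python"}

# Made-up example file; the versions are placeholders, not pandas' real pins.
EXAMPLE_ENVIRONMENT_YML = """\
name: pandas-dev
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy>=1.15
  - python-dateutil>=2.5.0
  - pytz
  - pytables  # optional
"""


def conda_to_pip(environment_yml):
    """Return pip requirement strings for a flat conda dependency list."""
    requirements = []
    in_dependencies = False
    for raw_line in environment_yml.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if not line.startswith("- "):
            # Any non-list line is a key; only "dependencies:" opens our list.
            in_dependencies = line == "dependencies:"
            continue
        if not in_dependencies:
            continue
        entry = line[2:].strip()
        if entry.endswith(":"):
            # Header of a nested block such as "pip:", not a package itself.
            continue
        match = re.match(r"([A-Za-z0-9_.\-]+)\s*(.*)", entry)
        if match is None:
            continue
        name, spec = match.groups()
        if name in CONDA_ONLY:
            continue
        name = CONDA_TO_PIP.get(name, name)
        # conda's single "=" is a prefix match, so "==" is a stricter
        # approximation of it rather than an exact equivalent.
        if spec.startswith("=") and not spec.startswith("=="):
            spec = "=" + spec
        requirements.append(name + spec)
    return requirements


if __name__ == "__main__":
    print("\n".join(conda_to_pip(EXAMPLE_ENVIRONMENT_YML)))
```

With something along these lines, environment.yml would stay the only file edited by hand, and requirements.txt would be either generated on demand or checked by the CI against the generated output.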

CC @jreback, @pandas-dev/pandas-core


Labels: CI (Continuous Integration), Dependencies (Required and optional dependencies), Needs Discussion (Requires discussion from core team before further action)
