Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
There are public functions and methods whose arguments have types that are not actually exported. This makes it hard to propagate those types to other functions that then call the pandas ones.
For instance, merge has a how argument that has a type _typing.MergeHow = Literal['cross', 'inner', 'left', 'outer', 'right'], but since _typing is protected, there is no good way to take it as an argument and instead I have to say
def foo(df: pd.DataFrame, ..., how: str):
    ...
    assert how in ['cross', 'inner', 'left', 'outer', 'right']
    pd.merge(..., how=how)
for my type checker to be OK with it. This is both annoyingly verbose and fragile to updates of pandas.
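Until such types are exported, one stopgap is to declare the literal locally and derive the runtime check from it, so the allowed values live in one place. The following is only a minimal sketch using the standard library: MergeHow here is a hand-made local copy (not the pandas alias) and check_how is a hypothetical helper name. The copy still has to be kept in sync with pandas by hand, so the fragility described above remains.

```python
from typing import Literal, cast, get_args

# Local copy of the literal quoted above; NOT imported from pandas,
# so it must be updated manually if pandas changes the accepted values.
MergeHow = Literal["cross", "inner", "left", "outer", "right"]


def check_how(how: str) -> MergeHow:
    """Validate a plain string and narrow it to the local MergeHow type."""
    allowed = get_args(MergeHow)  # tuple of the literal's string values
    if how not in allowed:
        raise ValueError(f"how must be one of {allowed}, got {how!r}")
    # cast() only informs the type checker; it does nothing at runtime.
    return cast(MergeHow, how)
```

A caller could then accept how: MergeHow in its own signature, or run check_how on a plain string before forwarding it to pd.merge.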
Feature Description
Add a typing module that exposes a (possible) subset of _typing.
I say "possible subset" because, from looking at the _typing module, there are clearly types that are for internal use only, and I'm guessing we don't want to make them public so that they can be changed more easily.
I would propose the subset be all types that are used as arguments of public functions and methods.
This way my function above could become:
import pandas as pd
import pandas.typing as pd_typing

def foo(df: pd.DataFrame, ..., how: pd_typing.MergeHow):
    ...
    pd.merge(..., how=how)
and have everything work.
Alternative Solutions
Technically these types are "available" when imported by other modules, so you can access MergeHow via pandas.core.reshape.merge.MergeHow or pandas.core.frame.MergeHow, but those are just imports from _typing, imported to be used by those modules themselves, not something users should rely on.
Other alternatives
A) Split the public types out of _typing into typing. _typing could then do from typing import * (importing from the new pandas module, not the standard library one) if we don't want to rewrite every place where the newly public types are used.
B) Just make all of _typing public. As someone who is not deep into pandas internals I have no strong opinion here, but my guess is that there are internal types that we don't want public.
Additional Context
I'm more than willing to take this PR myself; I just want feedback on whether it would be accepted.