"drop" fails by default when label to drop is not in the index, but "rename" silently passes by. Wouldn't it make sense to synchronize? #40427

Open
@gykovacs

Is your feature request related to a problem?

.drop and .rename have broadly similar signatures, and both implement basic DataFrame functionality. When drop is called with a label that is not in the index, it fails by default (because of the errors='raise' default parameter). However, when rename is called with a label that is not in the specified index, it silently does nothing by default. I think it would be a useful improvement to align the parameterization and default behavior of these functions, possibly by extending rename. I know that raising on a missing label might break compatibility with previous versions, but I also doubt whether renaming labels that might not be present in the DataFrame is an intentional and valid use case.

Describe the solution you'd like

rename should be extended with the same parameters as drop, possibly with the default setting of raising errors.
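For reference, and as far as I can tell, DataFrame.rename in recent pandas versions already accepts an errors keyword that defaults to 'ignore', so the strict behavior can be opted into today and the request is mainly about the default. A minimal sketch of the two modes, using a frame shaped like the one in the session further down (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]}, index=["b", "c", "d"])

# Default (errors="ignore"): a label missing from the axis is silently skipped.
unchanged = df.rename(columns={"e": "g"})

# Opt-in strictness: a missing label raises KeyError, as drop does by default.
try:
    df.rename(columns={"e": "g"}, errors="raise")
    raised = False
except KeyError:
    raised = True

print(list(unchanged.columns), raised)
```

With errors='raise', a typo in a label or a mapping aimed at the wrong axis surfaces at the rename call instead of further downstream.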

API breaking implications

If raising errors is not made the default, nothing should break. If raising errors becomes the default (as in drop), then code that attempts to rename labels not present in the index would fail. This could be a two-stage change: first emit a warning, then raise an error in a later release.
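The first stage of such a transition could look roughly like the sketch below. It is written as a user-level wrapper, not as a change to pandas itself; rename_with_warning is a hypothetical helper, and both the FutureWarning category and the message text are assumptions. For brevity it only understands axis='columns' or axis='index'.

```python
import warnings

import pandas as pd


def rename_with_warning(df, mapping, axis="columns"):
    """Hypothetical helper: rename, but warn about labels missing from the axis."""
    labels = df.columns if axis == "columns" else df.index
    missing = [key for key in mapping if key not in labels]
    if missing:
        # Stage one: warn only. A later release could raise KeyError here.
        warnings.warn(
            f"{missing} not found in axis; this may raise in a future version",
            FutureWarning,
            stacklevel=2,
        )
    return df.rename(mapping, axis=axis)


df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]}, index=["b", "c", "d"])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = rename_with_warning(df, {"e": "g", "a": "x"})
```

Labels that are present are still renamed; only the missing ones trigger the warning, so existing code keeps working during the warning stage.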

Describe alternatives you've considered

One can work around this in the calling code, but I don't see any alternative solutions.

Additional context

I came across this issue when working with data as part of my job. Columns that I expected at some point in the code were not there. The reason was that I did not pass axis='columns' to rename, so it silently tried to rename entries in the row index. If rename had failed as drop would have, it would have been easier to figure out where the issue was.

Then I wondered whether silently passing when there is no label to rename is a valid use case, and decided to write this request.
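The pitfall itself can be reproduced in a few lines. This is a sketch on a toy frame, not the original job code: when a mapping is passed to rename without axis, it is applied to the row index, so a mapping meant for the columns does nothing and raises nothing.

```python
import pandas as pd

tmp = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]}, index=["b", "c", "d"])

# No axis given: the mapping targets the row index, where 'a' does not exist,
# so the call returns an unchanged copy without any error.
forgot_axis = tmp.rename({"a": "x"})

# 'b' is both a row label and a column label; without axis only the row changes.
ambiguous = tmp.rename({"b": "z"})

# The intended call: the mapping is applied to the columns.
explicit = tmp.rename({"a": "x"}, axis="columns")

print(list(forgot_axis.columns), list(ambiguous.index), list(explicit.columns))
```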

>>> import pandas
>>> pandas.__version__
'1.3.0.dev0+1040.g454194e76a'
>>> tmp = pandas.DataFrame({'a': [1, 2, 3], 'b': [3, 2, 1]}, index=['b', 'c', 'd'])
>>> tmp
   a  b
b  1  3
c  2  2
d  3  1
>>> tmp.drop(['e'], axis='columns')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pandas/pandas/core/frame.py", line 4776, in drop
    return super().drop(
  File "/home/pandas/pandas/core/generic.py", line 4211, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/home/pandas/pandas/core/generic.py", line 4246, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/home/pandas/pandas/core/indexes/base.py", line 5925, in drop
    raise KeyError(f"{labels[mask]} not found in axis")
KeyError: "['e'] not found in axis"
>>> tmp
   a  b
b  1  3
c  2  2
d  3  1
>>> tmp.rename({'e': 'g'}, axis='columns')
   a  b
b  1  3
c  2  2
d  3  1
>>> tmp
   a  b
b  1  3
c  2  2
d  3  1

Any comments are welcome!

Labels: API - Consistency, Enhancement, Needs Discussion, rename