Bug: rename incapable of accepting tuples as new name #19497

Closed
@charlie0389

Description

Pandas cannot rename a label in a pandas.Index object to a tuple. Passing a tuple as new_name in pandas.DataFrame.rename({old_name: new_name}, axis="index") returns a DataFrame whose index is a pandas.MultiIndex object, and wrapping the tuple in a singleton tuple stores the wrapper itself as the label, which is also undesirable. See the code below (workaround at the bottom):

import pandas as pd
import numpy as np
df = pd.DataFrame(data = np.arange(5), index=[(x, x) for x in range(5)], columns=["Value"])
print(df) # Note that df.index is a pd.Index object of 2-length tuples

# Wish to rename axis label, but keep the same style
df2 = df.rename({(1,1):(1,5)}, axis="index") 

print(df2)  # Woah! - df2.index is of MultiIndex type
print(df2.index) # ... and here's proof

# Maybe I can get around this by passing it as a singleton tuple...
df3 = df.rename({(1,1):((1,5),)}, axis="index") 
print(df3) # ... apparently not

Will produce the output:

        Value
(0, 0)      0
(1, 1)      1
(2, 2)      2
(3, 3)      3
(4, 4)      4

     Value
0 0      0
1 5      1
2 2      2
3 3      3
4 4      4
MultiIndex(levels=[[0, 1, 2, 3, 4], [0, 2, 3, 4, 5]],
           labels=[[0, 1, 2, 3, 4], [0, 4, 1, 2, 3]])

           Value
(0, 0)         0
((1, 5),)      1
(2, 2)         2
(3, 3)         3
(4, 4)         4

Desired/Expected output:

        Value
(0, 0)      0
(1, 5)      1
(2, 2)      2
(3, 3)      3
(4, 4)      4

Problem description

The current behaviour is a problem for two reasons:

  1. It is unintuitive: I can't see why a user would expect renaming an index label to change the index's type.
  2. There is no way to rename labels of an Index object to tuples.
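
The type change described above resembles how the pandas.Index constructor treats a list of tuples. The sketch below only illustrates that constructor behaviour; that rename goes through the same inference is an assumption, not something confirmed in this report. The variable names are illustrative.

```python
import pandas as pd

labels = [(0, 0), (1, 5), (2, 2)]

# By default, a list of tuples is expanded into a MultiIndex.
inferred = pd.Index(labels)

# tupleize_cols=False keeps each tuple as a single label in a flat Index.
flat = pd.Index(labels, tupleize_cols=False)

print(type(inferred).__name__)
print(type(flat).__name__)
```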

I checked for similar issues by searching for the word rename. At the time of writing, pandas 0.22.0 is the latest released version.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-112-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: 3.0.3
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.25.1
numpy: 1.11.2
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2016.7
blosc: None
bottleneck: 1.1.0
tables: 3.3.0
numexpr: 2.6.1
feather: None
matplotlib: 1.5.3
openpyxl: 2.4.9
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.8.0
bs4: 4.5.1
html5lib: 1.0b10
sqlalchemy: 1.1.3
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Workaround

The workaround below uses the set_value function, which the documentation tells users to avoid (unless you really know what you're doing):

df.index.set_value(df.index.get_values(), (1,1), (1, 5)) 
df.reset_index(inplace=True)
df.set_index("index", inplace=True)
df.index.name = None # Arguably not necessary...
print(df)

Produces the output:

        Value
(0, 0)      0
(1, 5)      1
(2, 2)      2
(3, 3)      3
(4, 4)      4
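
A possible alternative that avoids set_value is to rebuild the index explicitly. This is a minimal sketch, assuming that pd.Index(..., tupleize_cols=False) keeps tuples as plain labels in the installed pandas version. rename_tuple_labels is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


def rename_tuple_labels(df, mapping):
    """Return a copy of df with index labels replaced according to mapping.

    Hypothetical helper: labels missing from mapping are left unchanged.
    """
    new_labels = [mapping.get(label, label) for label in df.index]
    out = df.copy()
    # tupleize_cols=False stops pandas from turning the tuples into a MultiIndex.
    out.index = pd.Index(new_labels, tupleize_cols=False, name=df.index.name)
    return out


index = pd.Index([(x, x) for x in range(5)], tupleize_cols=False)
df = pd.DataFrame({"Value": list(range(5))}, index=index)

renamed = rename_tuple_labels(df, {(1, 1): (1, 5)})
print(renamed)
```

Unlike the set_value approach, this leaves the original df untouched and needs no reset_index/set_index round trip.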
