BUG: Indexes still include values that have been deleted #2770

Closed
@darindillon

Description

Using pandas 0.10. If we create a DataFrame with a MultiIndex and then delete all the rows with value X, we'd expect the index to no longer contain value X. But it does.
Note the apparent inconsistency between "index" and "index.levels": the first shows that the values have been deleted, but the second still lists them.

import pandas

x = pandas.DataFrame([['deleteMe', 1, 9], ['keepMe', 2, 9], ['keepMeToo', 3, 9]], columns=['first', 'second', 'third'])
x = x.set_index(['first', 'second'], drop=False)

x = x[x['first'] != 'deleteMe']  # Chop off all the 'deleteMe' rows

print(x.index)  # Good: the index no longer has any rows with 'deleteMe'. But....

print(x.index.levels)  # Bad: the levels still list 'deleteMe'. But why? We deleted those rows.

print(x.groupby(level='first').sum())  # Bad: it's creating a dummy row for the rows we deleted!

We don't want the deleted values to show up in that groupby. Can we eliminate them?
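
For anyone reading this later: a possible workaround, sketched under the assumption that a much newer pandas than 0.10 is available (`MultiIndex.remove_unused_levels()` does not exist in 0.10; it was added in a later release). The variable names `stale_levels`, `clean_levels`, and `totals` are only for illustration, and the groupby is restricted to the `third` column to keep the output simple.

```python
import pandas as pd

df = pd.DataFrame(
    [["deleteMe", 1, 9], ["keepMe", 2, 9], ["keepMeToo", 3, 9]],
    columns=["first", "second", "third"],
)
df = df.set_index(["first", "second"], drop=False)

# Drop the 'deleteMe' rows; the index entries go away, the levels do not.
df = df[df["first"] != "deleteMe"]
stale_levels = list(df.index.levels[0])

# Rebuild the MultiIndex so its levels contain only values still in use.
# Assumption: a pandas version that provides remove_unused_levels().
df.index = df.index.remove_unused_levels()
clean_levels = list(df.index.levels[0])

# Group on the cleaned index; only the surviving keys appear.
totals = df.groupby(level="first")["third"].sum()
print(stale_levels)
print(clean_levels)
print(totals)
```

This only trims the level metadata; the rows and their order are unchanged, so it should be safe to call right after any filtering step.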
