API: what should a 2D indexing operation into a 1D Index do? (eg idx[:, None]) #27837

Closed
@jorisvandenbossche

Follow-up on #27775 and #27818.

Short recap of what those issues were about:

Currently, indexing into an Index with a 2D (or higher-dimensional) indexer results in an "invalid" Index backed by a multi-dimensional ndarray:

In [1]: idx = pd.Index([1, 2, 3])  

In [2]: idx2 = idx[:, None] 

In [3]: idx2
Out[3]: Int64Index([1, 2, 3], dtype='int64')

In [4]: idx2.values
Out[4]: 
array([[1],
       [2],
       [3]])

So from the repr it looks like a proper index, but the underlying values of an Index should always be 1D (such an invalid index will also lead to errors once you do operations on it).
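To make the 1D requirement concrete, here is a minimal sketch using only NumPy. The helper `is_valid_index_values` is a hypothetical name for illustration, not a pandas function; it only shows that indexing with `[:, None]` adds an axis to the underlying values, which is exactly what makes them unsuitable as Index values.

```python
import numpy as np


def is_valid_index_values(values):
    """Hypothetical check: Index values are expected to be one-dimensional."""
    return np.asarray(values).ndim == 1


values = np.array([1, 2, 3])
reshaped = values[:, None]  # None (np.newaxis) inserts a new axis of length 1

print(values.shape, is_valid_index_values(values))      # (3,) True
print(reshaped.shape, is_valid_index_values(reshaped))  # (3, 1) False
```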

Before pandas 0.25.0, the shape attribute of the index "correctly" returned the shape of the underlying values: (3, 1), but in 0.25.0 this was changed to (3,) (only checking the length). This caused a regression in matplotlib (#27775), and will be "fixed" in 0.25.1 by again returning the 2D shape of the underlying values (#27818). Of course, this is only about the shape attribute, while the root cause is this invalid Index.
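The difference between the two shape definitions can be sketched with a small stand-in class. `FakeIndex` and its two property names are hypothetical and are not the pandas implementation; they only contrast "shape of the underlying values" with "shape derived from the length".

```python
import numpy as np


class FakeIndex:
    """Hypothetical stand-in for an Index wrapping an ndarray (not pandas code)."""

    def __init__(self, values):
        self._values = np.asarray(values)

    def __len__(self):
        return len(self._values)

    @property
    def shape_from_values(self):
        # Reports the shape of the underlying array, whatever its dimensionality.
        return self._values.shape

    @property
    def shape_from_len(self):
        # Reports a 1D shape based only on the length.
        return (len(self),)


invalid = FakeIndex(np.array([1, 2, 3])[:, None])
print(invalid.shape_from_values)  # (3, 1)
print(invalid.shape_from_len)     # (3,)
```

For a valid 1D object the two definitions agree, so they only diverge when the underlying values are already multi-dimensional.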

I think it is clear that we should not allow such an invalid Index object to exist.
I currently know of two ways to end up in such a situation:

So let's use this issue to discuss what to do for this second way: a 2D indexing operation on a 1D object.

This is relevant for the Index, but we should probably try to have it consistent with Series as well.
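To compare Index and Series on this point, a small probe like the following can be run against whatever pandas version is installed. `probe_2d_indexing` is a hypothetical helper; it makes no assumption about which outcome pandas produces and simply reports whether `obj[:, None]` raised or what kind of object came back.

```python
import warnings

import numpy as np
import pandas as pd


def probe_2d_indexing(obj):
    """Apply obj[:, None] and describe the outcome (hypothetical helper)."""
    try:
        with warnings.catch_warnings():
            # Silence any deprecation warnings so only the outcome is reported.
            warnings.simplefilter("ignore")
            result = obj[:, None]
    except Exception as exc:
        return ("raised", type(exc).__name__)
    return (type(result).__name__, getattr(result, "ndim", None))


print(probe_2d_indexing(pd.Index([1, 2, 3])))
print(probe_2d_indexing(pd.Series([1, 2, 3])))
print(probe_2d_indexing(np.array([1, 2, 3])))  # plain NumPy: ('ndarray', 2)
```

The NumPy line serves as a baseline; the two pandas lines show whether the installed version treats Index and Series consistently.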

Labels

API Design
Deprecate: Functionality to remove in pandas
Index: Related to the Index class or subclasses
Indexing: Related to indexing on series/frames, not to indexes themselves
