Index API proposal: unified axis label lookup #7651

Closed
@immerrr


As the next step of the separation-of-concerns plan (#6744), I'd like to
propose adding a method (or several, actually) to the Index class that
would encapsulate the details of foo.loc[l1,l2,...] lookup.

Implementation Idea

Roughly, the idea is to make loc's getitem as simple as

def __getitem__(self, indexer):
    axes = self.obj.axes
    return self.obj.iloc[axes[0].lookup_labels_nd(indexer, axes[1:], typ='loc')]

Not quite, but hopefully you get the point. The default lookup_labels_nd implementation would then look something like this:

def lookup_labels_nd(self, indexer, other_axes, typ=None):
    if not isinstance(indexer, tuple):
        # a single indexer applies to this axis only
        return self.lookup_labels(indexer, typ=typ)
    else:
        # ndim mismatch error handling is omitted intentionally
        # indexer[0] belongs to this axis, the rest go to other_axes in order
        return (self.lookup_labels(indexer[0], typ=typ),) + \
               tuple(ax.lookup_labels(ix, typ=typ)
                     for ax, ix in zip(other_axes, indexer[1:]))
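
To make the shape of the result concrete, here is a minimal runnable sketch of the same idea. ToyIndex is a hypothetical stand-in written for this illustration, not the pandas Index class, and its lookup_labels handles only scalars, lists and label slices. The typ argument is accepted but unused here.

```python
class ToyIndex:
    """Hypothetical stand-in for an Index; not the pandas class."""

    def __init__(self, labels):
        self.labels = list(labels)
        self._positions = {label: i for i, label in enumerate(self.labels)}

    def lookup_labels(self, indexer, typ=None):
        # Translate labels for a single axis into integer positions.
        if isinstance(indexer, slice):
            # Label slices include both endpoints, as with .loc (step is ignored).
            start = 0 if indexer.start is None else self._positions[indexer.start]
            stop = (len(self.labels) if indexer.stop is None
                    else self._positions[indexer.stop] + 1)
            return slice(start, stop)
        if isinstance(indexer, list):
            return [self._positions[label] for label in indexer]
        return self._positions[indexer]  # raises KeyError for unknown labels

    def lookup_labels_nd(self, indexer, other_axes, typ=None):
        if not isinstance(indexer, tuple):
            return self.lookup_labels(indexer, typ=typ)
        # First element goes to this axis, the rest to the other axes in order.
        return (self.lookup_labels(indexer[0], typ=typ),) + tuple(
            ax.lookup_labels(ix, typ=typ)
            for ax, ix in zip(other_axes, indexer[1:]))


rows = ToyIndex(["a", "b", "c"])
cols = ToyIndex(["x", "y"])
positions = rows.lookup_labels_nd((slice("a", "b"), "y"), [cols], typ="loc")
single = rows.lookup_labels_nd("c", [cols], typ="loc")
```

Under these assumptions, positions is a tuple of positional indexers, one per axis, which is the kind of object that could be handed to iloc.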

The result should be an object that can be fed to the underlying
BlockManager to perform the requested operation. To support adding
new rows with "setitem", we only need to agree that lookup_labels_nd will
never return negative indices unless they reference newly appended
items along that axis.
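
One possible reading of that negative-index convention, as a rough sketch: labels missing from the axis are reported as negative positions counted from the end of the axis as it would look after the new labels are appended. The function name lookup_with_append and its return shape are assumptions made for illustration only, not part of the proposal.

```python
def lookup_with_append(labels, wanted):
    """Hypothetical helper: map wanted labels to positions in labels.

    Labels that are not present yet get a negative position that is valid
    once new_labels has been appended to the axis.
    """
    positions = {label: i for i, label in enumerate(labels)}
    # Missing labels in first-seen order, without duplicates.
    new_labels = [w for w in dict.fromkeys(wanted) if w not in positions]
    result = []
    for w in wanted:
        if w in positions:
            result.append(positions[w])
        else:
            # Offset from the end of the axis after the append.
            result.append(new_labels.index(w) - len(new_labels))
    return result, new_labels


indexer, appended = lookup_with_append(["a", "b"], ["b", "c", "d"])
extended_axis = ["a", "b"] + appended
resolved = [extended_axis[i] for i in indexer]
```

Here every non-negative entry points at an existing item, and every negative entry resolves correctly only against the extended axis, so the caller can tell the two cases apart.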

This would allow hiding Index-subclass-specific lookup peculiarities
in their respective overrides of lookup_labels_nd and lookup_labels (proposals for
better names are welcome), e.g.:

  • looking up str in DatetimeIndex/PeriodIndex
  • looking up int in Float64Index
  • looking up per-level slices in MultiIndex
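
As an illustration of the first bullet, a subclass-style override could keep partial-string handling entirely inside its own lookup_labels. The ToyDateIndex class below is hypothetical and far simpler than DatetimeIndex: it assumes sorted dates and understands only "YYYY-MM" strings.

```python
import datetime as dt


class ToyDateIndex:
    """Hypothetical sorted index of dates; not pandas' DatetimeIndex."""

    def __init__(self, dates):
        self.dates = sorted(dates)

    def lookup_labels(self, indexer, typ=None):
        if isinstance(indexer, str):
            # Partial-string lookup: "YYYY-MM" selects every date in that month.
            year, month = (int(part) for part in indexer.split("-"))
            hits = [i for i, d in enumerate(self.dates)
                    if (d.year, d.month) == (year, month)]
            if not hits:
                raise KeyError(indexer)
            # Dates are sorted, so the matches form one contiguous run.
            return slice(hits[0], hits[-1] + 1)
        try:
            return self.dates.index(indexer)  # exact match on a date object
        except ValueError:
            raise KeyError(indexer) from None


idx = ToyDateIndex([
    dt.date(2014, 6, 30),
    dt.date(2014, 7, 1),
    dt.date(2014, 7, 15),
    dt.date(2014, 8, 1),
])
july = idx.lookup_labels("2014-07")
exact = idx.lookup_labels(dt.date(2014, 8, 1))
```

The caller never needs an isinstance check or a try/except fallback: the string case is resolved where the index-specific knowledge lives.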

Benefits

  • no more confusing errors caused by a try/except block blanket-catching
    a logic error, because corner cases will be handled precisely where
    they are needed and nowhere else
  • no more relying on isinstance checks and exceptions to search for
    alternative lookup scenarios, which should improve performance
  • the API will provide a contract that is simple to grasp, test, benchmark and,
    eventually, cythonize (as a side effect of this point I'd like to try putting
    up a wiki page with an indexing API reference)


Labels: API Design, Indexing (related to indexing on series/frames, not to indexes themselves)
