Indexing API Reference

immerrr edited this page Jul 23, 2014 · 1 revision

Rationale

This page is an attempt to write down (and hopefully simplify) the rules of pandas label lookup, in the spirit of the #7651 API proposal.

Label Lookup Methods

The first idea is to add the following method to the Index API:

def label_lookup(self, key, typ)

When given a label-based key, this method should return a location-based key referencing the requested location(s) along a single axis. This is where datetime and float indices may override the default behaviour. For example:

foo = series.loc[key]  # series.index.label_lookup(key, typ='loc')

frame.xs[rows, cols] = foobar  # frame.index.label_lookup(rows, typ='xs-set')
                               # frame.columns.label_lookup(cols, typ='xs-set')  
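The contract described above can be sketched with a toy index. This is only an illustration under stated assumptions: ToyIndex and ToyFloatIndex are hypothetical classes, not pandas classes, and the tolerance-based float matching is just one conceivable override, not something the proposal specifies.

```python
class ToyIndex:
    """Hypothetical stand-in for an Index; not the pandas class."""

    def __init__(self, labels):
        self.labels = list(labels)
        self._pos = {label: i for i, label in enumerate(self.labels)}

    def label_lookup(self, key, typ):
        """Translate a label-based key into a location-based key."""
        if isinstance(key, slice):
            # Assumption: label slices include both endpoints.
            start = 0 if key.start is None else self._pos[key.start]
            stop = (len(self.labels) if key.stop is None
                    else self._pos[key.stop] + 1)
            return slice(start, stop)
        if isinstance(key, list):
            return [self._pos[k] for k in key]
        return self._pos[key]  # raises KeyError for unknown labels


class ToyFloatIndex(ToyIndex):
    """Hypothetical override: match float labels within a tolerance."""

    def label_lookup(self, key, typ):
        if isinstance(key, float):
            for i, label in enumerate(self.labels):
                if abs(label - key) < 1e-9:
                    return i
            raise KeyError(key)
        return super().label_lookup(key, typ)


idx = ToyIndex(["a", "b", "c"])
scalar_pos = idx.label_lookup("b", typ="loc")
list_pos = idx.label_lookup(["c", "a"], typ="loc")
slice_pos = idx.label_lookup(slice("a", "b"), typ="loc")
float_pos = ToyFloatIndex([0.1, 0.2, 0.3]).label_lookup(0.1 + 0.2, typ="loc")
```

The typ argument is unused in this sketch; it is passed through so that subclasses could vary their behaviour per operation.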

The second idea is to allow more flexibility for MultiIndex cases by adding another method to the Index class API:

def label_lookup_nd(self, key, other_axes, typ)

This method is given enough information for the first Index of the container to decide which indexing scheme is wanted, e.g.:

foo = series.loc[key]  # series.index.label_lookup_nd(key, tuple(), typ='loc')

frame.xs[rows, cols] = foobar  # frame.index.label_lookup_nd((rows, cols), (frame.columns,), typ='xs-set')

The default behaviour should be to apply the necessary dimensionality checks and invoke the corresponding label_lookup method on each axis, but this can be overridden to provide convenience accessors for containers with MultiIndex axes.
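A minimal sketch of that default behaviour might look as follows. ToyIndex is again a hypothetical class, and several details are assumptions made for illustration: the result is always a tuple with one entry per axis, missing trailing keys select the whole axis, and too many keys raise IndexError.

```python
class ToyIndex:
    """Hypothetical stand-in for an Index; not the pandas class."""

    def __init__(self, labels):
        self.labels = list(labels)

    def label_lookup(self, key, typ):
        # Scalar labels only, to keep the sketch short.
        try:
            return self.labels.index(key)
        except ValueError:
            raise KeyError(key) from None

    def label_lookup_nd(self, key, other_axes, typ):
        axes = (self,) + tuple(other_axes)
        # With a single axis the key is taken as-is; a MultiIndex
        # subclass could override this to interpret tuples differently.
        keys = key if (other_axes and isinstance(key, tuple)) else (key,)
        # Dimensionality check.
        if len(keys) > len(axes):
            raise IndexError("too many indexers")
        # Assumption: omitted trailing keys select the whole axis.
        keys = keys + (slice(None),) * (len(axes) - len(keys))
        result = []
        for axis, k in zip(axes, keys):
            if isinstance(k, slice) and k == slice(None):
                result.append(k)
            else:
                result.append(axis.label_lookup(k, typ))
        return tuple(result)


rows = ToyIndex(["r0", "r1"])
cols = ToyIndex(["x", "y", "z"])
both = rows.label_lookup_nd(("r1", "z"), (cols,), typ="xs-set")
rows_only = rows.label_lookup_nd("r1", (cols,), typ="loc")
series_like = rows.label_lookup_nd("r0", tuple(), typ="loc")
```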

Label Lookup Types

Values of the typ= kwarg are strings consisting of:

  • a description of the operation to be performed:
    • imm for immediate indexing on the container, e.g. series[key]
    • loc for series.loc[key]
    • at for series.at[key]
    • xs for series.xs[key]
  • an optional -set suffix for setitem operations, e.g. series.loc[key] = foobar
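The string format listed above is simple enough to validate in a few lines. The helper below is a hypothetical sketch; parse_typ and OPERATIONS are illustrative names that do not appear in the proposal.

```python
# Operation names taken from the list above.
OPERATIONS = ("imm", "loc", "at", "xs")


def parse_typ(typ):
    """Split a typ string into (operation, is_setitem)."""
    op, sep, suffix = typ.partition("-")
    if op not in OPERATIONS:
        raise ValueError("unknown operation: %r" % (op,))
    # Only the -set suffix is defined; anything else is rejected.
    if sep and suffix != "set":
        raise ValueError("unknown suffix: %r" % (suffix,))
    return op, suffix == "set"


getitem_case = parse_typ("imm")
setitem_case = parse_typ("xs-set")
```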

Considerations:

  • Strings should be fast enough in plain Python; replacing them with integer constants (enums) is worth considering only when cythonizing.
  • The -set suffix exists so that container axes can be extended when a non-existent key is assigned to. A -del suffix is not necessary because the key must already be present, as in the getitem scenario.
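The role of the -set suffix can be illustrated with a toy index that grows on assignment to a missing label. GrowableToyIndex is a hypothetical class, and mutating the index in place is done purely for brevity; how a real container would extend its axis is not specified here.

```python
class GrowableToyIndex:
    """Hypothetical index that can be extended by setitem lookups."""

    def __init__(self, labels):
        self.labels = list(labels)

    def label_lookup(self, key, typ):
        try:
            return self.labels.index(key)
        except ValueError:
            if typ.endswith("-set"):
                # setitem on a missing label: extend the axis and
                # return the position of the new label.
                self.labels.append(key)
                return len(self.labels) - 1
            # getitem (and a would-be delitem) need an existing key.
            raise KeyError(key) from None


idx = GrowableToyIndex(["a", "b"])
new_pos = idx.label_lookup("c", typ="loc-set")
old_pos = idx.label_lookup("a", typ="loc")
```

A lookup without the suffix, such as idx.label_lookup("z", typ="loc"), raises KeyError in this sketch, which is why no separate -del variant is needed.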