Skip to content

ENH: specificy missing labels in loc calls GH34272 #34912

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

timhunderwood
Copy link
Contributor

@timhunderwood timhunderwood commented Jun 20, 2020

@timhunderwood
Copy link
Contributor Author

This should address #34272.

Would it be useful to add max limit to the number of missing labels it would print? I think this will be handled better by the str method of the Series class, already.

Fix ipython directive to code block so that error message can be included.
.. code-block:: ipython

s = pd.Series({"a": 1, "b": 2, "c": 3})
s.loc[["a", "b", "missing_0", "c", "missing_1", "missing_2"]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don’t need a section for this just a note in the whats new

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I've removed the code block and just left the text entry in the what's new.

@@ -1302,10 +1303,12 @@ def _validate_read_indexer(
# code, so we want to avoid warning & then
# just raising
if not ax.is_categorical():
not_found = list(key[missing_mask])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of doing this you can create an Index which has the natural formatters

Copy link
Contributor Author

@timhunderwood timhunderwood Jun 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and I now also use the option context to limit the number of items displayed.

Note, the if raise_missing block just above this uses a list to print the missing keys here. Independently of this PR, if raise_missing==False, an error would be raised directly below regardless (unless the axis is categorical). Not sure if this is the expected/desired behaviour? Perhaps one for another ticket/PR.

@@ -1075,3 +1075,12 @@ def test_setitem_with_bool_mask_and_values_matching_n_trues_in_length():
result = ser
expected = pd.Series([None] * 3 + list(range(5)) + [None] * 2).astype("object")
tm.assert_series_equal(result, expected)


def test_missing_labels_inside_loc():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u also try with a lot of missing labels (testing that message is constrained)

you will need to set the max line width via option context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done. I added a test for many labels and a test for very long labels.

Note because we are raising a KeyError the message is wrapped in quotes (see stack overflow and https://bugs.python.org/issue2651 . This means the \n characters appear in the message.

The pre-existing error message text here is also quite long as it includes a link to the docs and text on "no longer supported". Let me know if you think it should be shortened.

@jreback jreback added Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves labels Jun 20, 2020
PR feedback: what's new code block removed; additional test for many or long missing labels, context manager for display options.
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments ping on green

"https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike" # noqa:E501
)
not_found = key[missing_mask]
from pandas import option_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import at the top

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Previously, if labels were missing for a loc call, a Key Error was raised stating that this was no longer supported.

Now the error message also includes a list of the missing labels (max 10 items, display width 80 characters).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u add the issue number here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you do it like other issue references, e.g. :issue:`34272`

Copy link
Contributor Author

@timhunderwood timhunderwood Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback - sure, this is now done.

PR feedback: reference issue in what's new and import position.
PR feedback: import sorting (isort) fix
@jreback jreback added this to the 1.1 milestone Jun 25, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping when green

Previously, if labels were missing for a loc call, a Key Error was raised stating that this was no longer supported.

Now the error message also includes a list of the missing labels (max 10 items, display width 80 characters).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you do it like other issue references, e.g. :issue:`34272`

@jreback jreback merged commit deec940 into pandas-dev:master Jun 26, 2020
@jreback
Copy link
Contributor

jreback commented Jun 26, 2020

thanks @timhunderwood very nice

@timhunderwood
Copy link
Contributor Author

thanks, Happy to help!

fangchenli pushed a commit to fangchenli/pandas that referenced this pull request Jun 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: specify missing keys in KeyError when passing list-like to .loc
3 participants