KeyErrors are inconsistent and too verbose #25996

Open
@naught101

Code Sample, a copy-pastable example if possible

In [1]: df['asda'] 
KeyError: 'asda'

In [2]: df[['asda']]
KeyError: "None of [Index(['asda'], dtype='object')] are in the [columns]"

In [3]: df[['asda', 'lat']]  # lat is a valid column                                                              
KeyError: "['asda'] not in index"

In [4]: df.loc['asda']                                                                                            
KeyError: 'asda'

In [5]: df.loc[['asda']]
KeyError: "None of [Index(['asda'], dtype='object')] are in the [index]"

In [6]: df.loc[['asda', 0]]                                                                                       
/home/nedcr/miniconda3/envs/ana/bin/ipython:1: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
https://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
  #!/home/nedcr/miniconda3/envs/ana/bin/python
Out[9]: 
          id      lat       lng  location BL  ...  location 2030  scale 2030  location 2070  scale 2070
asda     NaN      NaN       NaN          NaN  ...            NaN         NaN            NaN         NaN
0     1009.0 -15.4875  124.5222    38.224253  ...      38.860162    0.947698      40.494363    0.987552

[2 rows x 10 columns]

Problem description

The last example is not really relevant, since that behaviour is deprecated and will raise a KeyError in the future.

The 2nd and 5th examples are annoying, because it is often useful to catch KeyErrors and use their values to return nice error messages. That's difficult when the missing keys are buried inside a message string like this.

They are also slightly misleading: I didn't pass an Index as the indexer, so the message can be confusing when debugging.
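
To make the difficulty concrete, here is a minimal sketch. The frame contents and the helper name `key_error_payload` are made up for illustration and are not part of the original report. It catches the KeyError for a scalar indexer and for a list indexer and compares what each exception carries.

```python
import pandas as pd

# Hypothetical stand-in for the frame used in the report.
df = pd.DataFrame({"lat": [-15.4875], "lng": [124.5222]})


def key_error_payload(indexer):
    """Return the first argument of the KeyError raised by df[indexer], or None."""
    try:
        df[indexer]
    except KeyError as err:
        return err.args[0]
    return None


# Scalar indexer: the exception carries the missing label itself.
scalar_payload = key_error_payload("asda")

# List indexer: the exception carries a formatted message string,
# so the missing labels would have to be parsed back out of it.
list_payload = key_error_payload(["asda"])

print(repr(scalar_payload))
print(repr(list_payload))
```

In the scalar case the caller can reuse the payload directly. In the list case the payload is a sentence, and its exact wording is not something code should rely on.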

Expected Output

Example 2 should have a message like either

KeyError: ['asda'] not in columns

or just

KeyError: ['asda']

Similarly for example 5.
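
Until the messages change, one possible workaround is to check for missing labels before indexing and raise a KeyError that carries them as a list. This is only a sketch: `select_columns` is a hypothetical helper, not a pandas function, and the sample frame is invented.

```python
import pandas as pd


def select_columns(frame, labels):
    """Hypothetical helper: select columns, raising KeyError(list_of_missing_labels)."""
    missing = [label for label in labels if label not in frame.columns]
    if missing:
        # The exception value is the list of missing labels, not a sentence.
        raise KeyError(missing)
    return frame[labels]


# Invented sample data for illustration.
df = pd.DataFrame({"lat": [-15.4875], "lng": [124.5222]})

try:
    select_columns(df, ["asda", "lat"])
except KeyError as err:
    missing_labels = err.args[0]
    print("Unknown columns:", ", ".join(missing_labels))
```

The same pattern would work for row labels by checking against `frame.index` instead of `frame.columns`.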

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-16-lowlatency
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: en_AU.UTF-8

pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: 0.12.0
IPython: 7.3.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.2
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels: Enhancement, Error Reporting, Indexing