Description
Code Sample
import pandas as pd
import numpy as np
# Generate data
people = ['Susie', 'Alejandro']
day = ['Monday', 'Tuesday', 'Wednesday']
data = [[person, d, *np.random.randint(0, 5, 2)] for person in people for d in day]
df = pd.DataFrame(data, columns=['Name', 'day', 'burgers', 'fries'])
df.head()
Name | day | burgers | fries |
---|---|---|---|
Susie | Monday | 4 | 0 |
Susie | Tuesday | 0 | 1 |
Susie | Wednesday | 0 | 1 |
Alejandro | Monday | 4 | 2 |
Alejandro | Tuesday | 2 | 0 |
# Melt on column that's not present in `df`
df.melt(['Name', 'day'], ['Burgers', 'fries'])
Outputs warning:
/home/ubuntu/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py:1472: FutureWarning:
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.
See the documentation here:
https://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
return self._getitem_tuple(key)
Problem description
This call should raise an error. Warnings aren't always taken seriously, and if this melt were part of a larger chain of operations, it would be unclear what caused the warning, since the message does not address the actual problem (a column name that is not present in the DataFrame). There is no scenario in which melting on non-present columns would be beneficial, and there are clear reasons to alert users when they've made this mistake.
Expected Output
`df.melt(['Name', 'day'], ['Burgers', 'fries'])` should be treated like `df.melt(['Name', 'Day'])` (where `Day` is not present in `df`) and produce a traceback.
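Until melt itself raises, a small wrapper can give the fail-fast behavior requested above. This is only a sketch of a workaround, not pandas' own implementation: `checked_melt` is a hypothetical helper name, and the sample frame uses fixed values instead of the random data from the report.

```python
import pandas as pd


def checked_melt(df, id_vars, value_vars):
    """Hypothetical helper: raise if any requested column is absent, then melt."""
    requested = list(id_vars) + list(value_vars)
    # Column labels are case-sensitive, so 'Burgers' does not match 'burgers'.
    missing = [col for col in requested if col not in df.columns]
    if missing:
        raise KeyError(f"Columns not present in the DataFrame: {missing}")
    return df.melt(id_vars=id_vars, value_vars=value_vars)


# Fixed sample data (an assumption; the report generates random values).
df = pd.DataFrame(
    {
        "Name": ["Susie", "Alejandro"],
        "day": ["Monday", "Monday"],
        "burgers": [4, 4],
        "fries": [0, 2],
    }
)

caught = None
try:
    # 'Burgers' is misspelled on purpose, as in the report.
    checked_melt(df, ["Name", "day"], ["Burgers", "fries"])
except KeyError as err:
    caught = err
    print("KeyError:", err)
```

With the correct spelling, `checked_melt(df, ['Name', 'day'], ['burgers', 'fries'])` simply returns the melted frame, so the wrapper only changes behavior for the mistaken call.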