Description
Code Sample, a copy-pastable example if possible
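import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'acol': np.arange(5),
    'bcol': 2 * np.arange(5),
})
# Attempt to drop the rows where bcol > 2 by passing a boolean Series as labels
df.drop(df.bcol > 2, axis=0, inplace=True)
print(df)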
Expected Output
   acol  bcol
0     0     0
1     1     2
Observed Output
   acol  bcol
2     2     4
3     3     6
4     4     8
Problem description
The anticipated behavior was that rows with bcol > 2 would be dropped. The actual behavior is that the boolean values get converted to 0/1 and are then treated as index labels. So the rows labelled 0 and/or 1 are dropped, and all other rows are kept, regardless of which rows satisfy the condition.
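To illustrate why booleans can be mistaken for the labels 0 and 1, here is a plain-Python sketch. It does not use pandas and is not a trace of what .drop() does internally. The `labels` dict is a hypothetical stand-in for an integer index.

```python
# Plain-Python illustration only; `labels` is a hypothetical stand-in for an index.
labels = {0: "row 0", 1: "row 1", 2: "row 2"}

# bool is a subclass of int, so False/True compare and hash like 0/1.
assert isinstance(True, int)
assert False == 0 and True == 1
assert hash(True) == hash(1)

# Looking up booleans therefore lands on the integer keys 0 and 1,
# no matter how many True/False values the mask contains.
matched = sorted({labels[flag] for flag in [False, False, True, True]})
print(matched)
```

Under this reading, a mask of any length can only ever refer to the labels 0 and 1, which matches the observed output.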
The documentation did not make it clear what was happening.
Solutions might include documentation clarifying that .drop() expects index labels and cannot be used with a boolean mask, or a warning when .drop() receives what looks like a boolean mask.
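For reference, a minimal sketch of two ways to get the expected result without passing the mask to .drop() directly. It assumes a frame shaped like the one in the code sample. The variable names (`mask`, `kept`, `dropped`) are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "acol": np.arange(5),
    "bcol": 2 * np.arange(5),
})

mask = df.bcol > 2  # boolean Series, True for the rows that should go

# Option 1: boolean indexing, keeping the rows where the mask is False.
kept = df[~mask]

# Option 2: turn the mask into index labels first, then drop those labels.
dropped = df.drop(df[mask].index, axis=0)

print(kept)
```

Both options leave only the rows with bcol <= 2. Option 2 stays closest to the original call, since .drop() still receives labels rather than booleans.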
Output of pd.show_versions()
pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.1
xarray: 0.9.6
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.5.0a1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.8.0
bs4: 4.5.3
html5lib: 0.9999999
sqlalchemy: 1.1.11
pymysql: 0.7.9.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.1.1
pandas_gbq: None
pandas_datareader: None