Indexing by 'partial' boolean DataFrame does not work #17170

Closed
@sinowood

Description

Code Sample, a copy-pastable example if possible

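import numpy as np
import pandas as pd

# x is the full 3x3 frame; y covers only rows a-b and columns A-B.
x = pd.DataFrame(np.arange(9.).reshape(3, 3), index=list('abc'), columns=list('ABC'))
y = pd.DataFrame(1, index=list('ab'), columns=list('AB'))

# Index x with the smaller ('partial') boolean frame.
x[y.notnull()]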

Expected Output

     A    B   C
a  0.0  1.0 NaN
b  3.0  4.0 NaN
c  NaN  NaN NaN

Observed Output

ValueError: Boolean array expected for the condition, not float64

Problem description

pandas 0.19 produces the expected output, but 0.20 (tested with 0.20.3) raises the ValueError shown above.

When picking data out of a large DataFrame, it is often more convenient and flexible to filter with a 'partial' boolean DataFrame, that is, one whose index and columns cover only a subset of the target's labels.
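A possible workaround, sketched under the assumption that labels missing from the partial mask should be treated as False: expand the mask to the labels of x with `reindex` before filtering. The names `mask` and `result` are introduced here for illustration only, and the extra `astype(bool)` is a defensive step in case reindexing changes the dtype.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(np.arange(9.0).reshape(3, 3), index=list("abc"), columns=list("ABC"))
y = pd.DataFrame(1, index=list("ab"), columns=list("AB"))

# Expand the partial mask to x's labels.
# Assumption: labels that y does not cover count as False.
mask = (
    y.notnull()
    .reindex(index=x.index, columns=x.columns, fill_value=False)
    .astype(bool)  # defensive: make sure the condition is a boolean frame
)

# Keep values where the mask is True; everything else becomes NaN.
result = x.where(mask)
print(result)
```

With a full-shaped boolean mask, `x.where(mask)` keeps only the four values in rows a-b and columns A-B and sets the rest to NaN, which is the shape of result described under Expected Output.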

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8.1
s3fs: None
pandas_gbq: None
pandas_datareader: None

Labels: Indexing (related to indexing on series/frames, not to indexes themselves), Testing (pandas testing functions or related to the test suite)