Description
Code Sample
In [1]: from io import StringIO
In [2]: import pandas as pd
In [3]: data = StringIO("""issue_date,issue_date_dt
...: ,
...: ,
...: 19600215.0,1960-02-15
...: ,
...: ,""")
In [4]: df = pd.read_csv(data, parse_dates=[1])
In [5]: df
Out[5]:
   issue_date issue_date_dt
0         NaN           NaT
1         NaN           NaT
2  19600215.0    1960-02-15
3         NaN           NaT
4         NaN           NaT
In [6]: df.any(axis=0)
Out[6]:
issue_date       True
issue_date_dt    True
dtype: bool
In [7]: df.any(axis=1)
Out[7]:
0 False
1 False
2 False
3 False
4 False
dtype: bool
Problem description
df.any(axis=0) behaves as expected: it returns True for both columns. However, df.any(axis=1) returns False for every row, including row 2, which holds non-null values in both columns.
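When narrowing this down, it may help to confirm the column dtypes first. The sketch below rebuilds the sample frame and checks that it holds one float column and one datetime column. Whether the mixed dtypes are what trips up the row-wise reduction is an assumption, not something established in this report. The names `csv_text` and `column_kinds` are illustrative only.

```python
from io import StringIO

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_float_dtype

# Same sample data as in the code sample above.
csv_text = """issue_date,issue_date_dt
,
,
19600215.0,1960-02-15
,
,"""

df = pd.read_csv(StringIO(csv_text), parse_dates=[1])

# Show the dtype of each column.
print(df.dtypes)

# Record which kind of dtype each column has (float vs. datetime).
column_kinds = {
    "issue_date": is_float_dtype(df["issue_date"]),
    "issue_date_dt": is_datetime64_any_dtype(df["issue_date_dt"]),
}
print(column_kinds)
```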
Note: A question about a similar issue can be found here
Note: Calling notnull first gives the required output:
In [9]: df.notnull().any(1)
Out[9]:
0 False
1 False
2 True
3 False
4 False
dtype: bool
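As a standalone script, the notnull workaround might look like the following minimal sketch. It reduces a purely boolean mask rather than the mixed-dtype frame itself. The names `csv_text`, `has_value` and `non_empty_rows` are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the code sample above.
csv_text = """issue_date,issue_date_dt
,
,
19600215.0,1960-02-15
,
,"""

df = pd.read_csv(StringIO(csv_text), parse_dates=[1])

# notnull() yields an all-boolean frame, so the row-wise any() only
# ever sees bool values.
has_value = df.notnull().any(axis=1)
print(has_value.tolist())

# Keep only the rows that contain at least one non-null entry.
non_empty_rows = df[has_value]
print(non_empty_rows)
```

Note that this checks whether a row has any non-null entry, which is not the same as truthiness: a row containing only 0 would count as having a value here.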
Expected Output
df.any(axis=1) should return True for every row that contains at least one truthy value (row 2 in the sample above).
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 39.2.0
Cython: None
numpy: 1.15.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None