Description
Code Sample, a copy-pastable example if possible
import pandas as pd
A = pd.DataFrame()
A["author"] = ["X", "Y", "Z"]
A["publisher"] = ["BBC", "NBC", "N24"]
A["date"] = pd.to_datetime(['17-10-2010 07:15:30', '13-05-2011 08:20:35', "15-01-2013 09:09:09"])
# the following produces the faulty result
A.apply(lambda x: {}, axis=1)
Problem description
The last line returns a dataframe with all entries replaced by NaN. This only happens if the following two conditions are both satisfied:
- a column with dtype datetime64[ns] is present in the dataframe (in the above example, the column named date)
- the function applied to the dataframe returns a dictionary
When using a DataFrame without the datetime column, the code returns the expected result (for the above example, a pd.Series of empty dictionaries).
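To make the two conditions easier to check, here is a minimal sketch that runs the same apply call on the frame with and without the datetime column. The variable names (datetime_cols, with_date, without_date) are only illustrative, and dayfirst=True is added as an assumption to make the day-first date strings unambiguous. What with_date looks like depends on the installed pandas version, so no output is shown for it.

```python
import pandas as pd

# Rebuild the frame from the code sample above.
A = pd.DataFrame()
A["author"] = ["X", "Y", "Z"]
A["publisher"] = ["BBC", "NBC", "N24"]
A["date"] = pd.to_datetime(
    ["17-10-2010 07:15:30", "13-05-2011 08:20:35", "15-01-2013 09:09:09"],
    dayfirst=True,  # assumption: the strings are day-month-year
)

# Condition 1: is there a datetime column in the frame?
datetime_cols = [
    col for col in A.columns
    if pd.api.types.is_datetime64_any_dtype(A[col])
]
print("datetime columns:", datetime_cols)

# Condition 2: the applied function returns a dictionary.
with_date = A.apply(lambda x: {}, axis=1)
without_date = A.drop(columns=["date"]).apply(lambda x: {}, axis=1)

# Compare the result types; on an affected version they differ.
print(type(with_date).__name__, type(without_date).__name__)
```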
Why this is a (significant) problem:
The output of apply depends on the presence of a column that the applied function never uses.
Potentially related:
I searched for similar issues and found some that are already closed.
However, those issues were fixed and have been closed since 2015.
Expected Output
The expected output can be produced by removing the line that adds the date column (A["date"] = ...):
>>> A.apply(lambda x: {}, axis=1)
0 {}
1 {}
2 {}
dtype: object
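Until the behaviour is fixed, one possible workaround is to build the result Series by hand instead of going through apply. The sketch below assumes that iterating over rows is acceptable for the size of the data. apply_rowwise is a hypothetical helper name, not part of pandas, and dayfirst=True is again an assumption about the date strings.

```python
import pandas as pd

A = pd.DataFrame({
    "author": ["X", "Y", "Z"],
    "publisher": ["BBC", "NBC", "N24"],
    "date": pd.to_datetime(
        ["17-10-2010 07:15:30", "13-05-2011 08:20:35", "15-01-2013 09:09:09"],
        dayfirst=True,  # assumption: the strings are day-month-year
    ),
})


def apply_rowwise(frame, func):
    """Hypothetical helper: call func on every row and collect the
    return values in an object Series, bypassing DataFrame.apply."""
    values = [func(row) for _, row in frame.iterrows()]
    return pd.Series(values, index=frame.index)


# The datetime column stays in the frame, yet each row maps to a dict.
result = apply_rowwise(A, lambda row: {})
print(result)
```

This is slower than apply on large frames, but the result no longer depends on the dtypes of columns the function does not use.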
Output of pd.show_versions()
Checked with newest version of pandas:
- pandas (0.21.1)
- numpy (1.13.3)
- Python 3.6.2
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_CA.UTF-8
pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None