DataFrame.apply returns NaN if DataFrame contains datetime column #18775

Closed
@AlexHentschel

Code Sample, a copy-pastable example if possible

import pandas as pd

A = pd.DataFrame()
A["author"] = ["X", "Y", "Z"]
A["publisher"] = ["BBC", "NBC", "N24"]
# dates are written day-first (DD-MM-YYYY), so state that explicitly
A["date"] = pd.to_datetime(["17-10-2010 07:15:30", "13-05-2011 08:20:35", "15-01-2013 09:09:09"], dayfirst=True)

# the following produces the faulty result
A.apply(lambda x: {}, axis=1)

Problem description

The last line returns a DataFrame with all entries replaced by NaN. This happens only if both of the following conditions are satisfied:

  • the DataFrame contains a datetime64[ns] column (the date column in the example above)
  • the function passed to apply returns a dictionary

Without the datetime column, the same code returns the expected result (for the example above, a pd.Series of empty dictionaries).
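The two conditions can be checked side by side with a small script. This is only a sketch: the helper name `summarize_apply_result` is hypothetical, and the dates are rewritten in ISO format to avoid day/month ambiguity. On the affected version (0.21.1) the frame with the date column is the one reported to come back all-NaN; other pandas versions may behave differently, so no output is shown for that case.

```python
import pandas as pd


def summarize_apply_result(frame):
    """Apply a dict-returning function row-wise and describe what came back.

    Hypothetical helper for illustration; not part of pandas.
    """
    result = frame.apply(lambda row: {}, axis=1)
    return {
        # "Series" is the expected container, "DataFrame" the reported faulty one
        "type": type(result).__name__,
        # True if every entry of the result is NaN (works for Series and DataFrame)
        "all_nan": bool(pd.isna(result).values.all()),
    }


with_date = pd.DataFrame({
    "author": ["X", "Y", "Z"],
    "publisher": ["BBC", "NBC", "N24"],
    "date": pd.to_datetime(
        ["2010-10-17 07:15:30", "2011-05-13 08:20:35", "2013-01-15 09:09:09"]
    ),
})
# Same data, minus the datetime64[ns] column
without_date = with_date.drop(columns=["date"])

print("with date column:   ", summarize_apply_result(with_date))
print("without date column:", summarize_apply_result(without_date))
```

If the two summaries differ, the result of apply depends on the date column even though the applied function never reads it.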

Why this is a (significant) problem:
The output of apply depends on the presence of a column that the applied function never uses.

Potentially related:
I searched for similar issues and found only ones that are already closed.

However, those issues were fixed and have been closed since 2015.

Expected Output

The expected output can be reproduced by removing the 6th line of the code sample (A["date"] = ...):

>>> A.apply(lambda x: {}, axis=1)
0    {}
1    {}
2    {}
dtype: object
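For comparison, the same Series of dictionaries can be built without DataFrame.apply by iterating over the rows explicitly, which leaves the date column in place. This is a sketch, not a fix for the underlying bug: the helper name `rows_to_dicts` is hypothetical, and the dates are again written in ISO format.

```python
import pandas as pd

A = pd.DataFrame({
    "author": ["X", "Y", "Z"],
    "publisher": ["BBC", "NBC", "N24"],
})
A["date"] = pd.to_datetime(
    ["2010-10-17 07:15:30", "2011-05-13 08:20:35", "2013-01-15 09:09:09"]
)


def rows_to_dicts(frame, func):
    """Call func on every row and collect the results in an object Series.

    Hypothetical helper: bypasses DataFrame.apply(axis=1) entirely, so the
    result does not depend on how apply handles dict return values.
    """
    values = [func(row) for _, row in frame.iterrows()]
    return pd.Series(values, index=frame.index)


expected = rows_to_dicts(A, lambda row: {})
print(expected)
```

Explicit iteration is slower than apply on large frames, so this is mainly useful for checking what the result should look like.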

Output of pd.show_versions()

Checked with the newest version of pandas:

  • pandas (0.21.1)
  • numpy (1.13.3)
  • Python 3.6.2

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_CA.UTF-8
pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Apply, Duplicate Report, Reshaping
