DataFrame.from_dict multilevel dicts with datetime.date objects as keys sets the values as NaN in the DataFrame #19993

Closed
@hodossy

Code Sample, from Jupyter notebook

import pandas as pd
import datetime

example_multilevel_dict = {
  (0, datetime.date(2018, 3, 3)): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (0, datetime.date(2018, 3, 4)): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (1, datetime.date(2018, 3, 3)): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (1, datetime.date(2018, 3, 4)): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20}
}

# MultiIndex rows are wanted, but this gives MultiIndex columns
multicolumn_df = pd.DataFrame.from_dict(example_multilevel_dict)
print(multicolumn_df)

# orient='index' should give MultiIndex rows, but all values come out as NaN
multiindex_df = pd.DataFrame.from_dict(example_multilevel_dict, orient='index')
print(multiindex_df)

# transposing multicolumn_df, however, gives the expected result
print(multicolumn_df.T)

# This is not the case when the tuple contains strings instead of dates
ok_multilevel_dict = {
  (0, 'a'): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (0, 'b'): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (1, 'a'): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20},
  (1, 'b'): {'A': 5, 'B': 1344, 'C': 0, 'D': 48, 'E': 20}
}
print(pd.DataFrame.from_dict(ok_multilevel_dict, orient='index'))

Output:

           0                     1           
  2018-03-03 2018-03-04 2018-03-03 2018-03-04
A          5          5          5          5
B       1344       1344       1344       1344
C          0          0          0          0
D         48         48         48         48
E         20         20         20         20
               A   B   C   D   E
0 2018-03-03 NaN NaN NaN NaN NaN
  2018-03-04 NaN NaN NaN NaN NaN
1 2018-03-03 NaN NaN NaN NaN NaN
  2018-03-04 NaN NaN NaN NaN NaN
              A     B  C   D   E
0 2018-03-03  5  1344  0  48  20
  2018-03-04  5  1344  0  48  20
1 2018-03-03  5  1344  0  48  20
  2018-03-04  5  1344  0  48  20

Problem description

When a DataFrame is constructed from a dictionary whose tuple keys contain datetime.date objects, all values are set to NaN if orient='index' is used. See the examples above.

I would expect orient='index' to give the same result as transposing the DataFrame built with the default orientation.
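Until this is resolved, a possible workaround is to avoid orient='index' altogether. The sketch below is only an illustration, not the library's recommended approach: it builds the row MultiIndex explicitly from the tuple keys, and also shows the transpose route from the code sample. The names `data`, `explicit_df`, and `transposed_df` are made up for this example, and the dictionary is a shortened version of the one above.

```python
import datetime

import pandas as pd

# Shortened version of the dictionary from the code sample (illustrative data).
data = {
    (0, datetime.date(2018, 3, 3)): {'A': 5, 'B': 1344},
    (0, datetime.date(2018, 3, 4)): {'A': 5, 'B': 1344},
    (1, datetime.date(2018, 3, 3)): {'A': 5, 'B': 1344},
}

# Option 1: build the row MultiIndex from the tuple keys and pass the
# inner dicts as a list of records, so no key alignment is involved.
index = pd.MultiIndex.from_tuples(list(data.keys()))
explicit_df = pd.DataFrame(list(data.values()), index=index)

# Option 2: build with the default orientation (tuple keys become
# MultiIndex columns) and transpose, as in the code sample.
transposed_df = pd.DataFrame.from_dict(data).T

print(explicit_df)
print(transposed_df)
```

Option 1 sidesteps the lookup of the date keys during construction, which is where the NaN values appear to come from; that explanation is an assumption, not something confirmed in this report.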

Expected Output

print(pd.DataFrame.from_dict(example_multilevel_dict, orient='index'))

              A     B  C   D   E
0 2018-03-03  5  1344  0  48  20
  2018-03-04  5  1344  0  48  20
1 2018-03-03  5  1344  0  48  20
  2018-03-04  5  1344  0  48  20

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.14.1
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
