DataFrame(recarray, columns=MultiIndex) disregards input data, gives empty DataFrame #13415

Closed
@jzwinck

Description

I previously posted this as a question (not knowing it was a bug) here: http://stackoverflow.com/questions/37732403/pandas-dataframe-from-multiindex-and-numpy-structured-array-recarray

First I create a two-level MultiIndex:

import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([('X','Y'), ('a','b')])

I can use it like this:

pd.DataFrame(np.zeros((3,4)), columns=ind)

Which gives:

     X         Y     
     a    b    a    b
0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0

Now I'm trying to do the same thing with a NumPy structured array:

dtype = [('Xa','f8'), ('Xb','i4'), ('Ya','f8'), ('Yb','i4')]
pd.DataFrame(np.zeros(3, dtype), columns=ind)

But that gives me an empty DataFrame!

Empty DataFrame
Columns: [(X, a), (X, b), (Y, a), (Y, b)]
Index: []
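
A quick check that may help narrow this down. My guess, which I have not confirmed against the pandas source, is that when the input is a structured array the constructor treats columns= as a selection of existing field names rather than as new labels. If so, nothing would be selected here, because none of the field names equals any of the MultiIndex tuples. The sketch below only demonstrates that the two sets of labels do not overlap; the variable names rec, field_names, labels and overlap are mine.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([('X', 'Y'), ('a', 'b')])
dtype = [('Xa', 'f8'), ('Xb', 'i4'), ('Ya', 'f8'), ('Yb', 'i4')]
rec = np.zeros(3, dtype)

# Field names carried by the structured array itself.
field_names = list(rec.dtype.names)

# Labels requested through columns=; iterating a MultiIndex yields tuples.
labels = list(ind)

# Assumption: if the constructor matches columns= against field names,
# only names present in both lists could be selected.
overlap = [name for name in field_names if name in labels]

print(field_names)
print(labels)
print(overlap)
```

The overlap is empty, which would be consistent with the empty result if the selection guess is right.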

I expected pd.DataFrame(np.zeros(3, dtype), columns=ind) to give the same result as this:

df = pd.DataFrame(np.zeros(3, dtype))
df.columns = ind
df

Which is:

     X       Y   
     a  b    a  b
0  0.0  0  0.0  0
1  0.0  0  0.0  0
2  0.0  0  0.0  0
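
For anyone hitting the same thing, the two-step approach above can be wrapped in a small helper. This is only a sketch of the workaround, not a fix for the constructor; frame_from_structured is a hypothetical name of my own, not a pandas function, and the length check is an extra guard I added.

```python
import numpy as np
import pandas as pd


def frame_from_structured(rec, columns):
    """Hypothetical helper: build the frame first, then relabel its columns."""
    # Guard (my addition): one label is needed per field in the array.
    if len(rec.dtype.names) != len(columns):
        raise ValueError("need exactly one column label per field")
    # Step 1: let pandas create one column per field.
    df = pd.DataFrame(rec)
    # Step 2: replace the field names with the desired labels, in field order.
    df.columns = columns
    return df


ind = pd.MultiIndex.from_product([('X', 'Y'), ('a', 'b')])
dtype = [('Xa', 'f8'), ('Xb', 'i4'), ('Ya', 'f8'), ('Yb', 'i4')]
rec = np.zeros(3, dtype)

df = frame_from_structured(rec, ind)
print(df)
```

Labels are assigned positionally, so the order of the MultiIndex has to follow the order of the fields in the dtype.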

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-86-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
pip: 8.1.1
setuptools: 20.7.0
numpy: 1.10.0
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.4.3
