read_excel gives different data for one and more than one elements in parse_cols setting #15316

Closed
@fortooon

Description

Related to #12292.

I am trying to read only the first column of a sheet, where that column contains empty cells:

import pandas as pd
df = pd.DataFrame([["", 1, 100], [3, 2, 200], ["", 3, 300], ["", "", 400]])
df.to_excel("test_excel.xls", index=False, header=False)
fst_col = pd.read_excel("test_excel.xls", parse_cols=[0], header=None).values
fst_cols = pd.read_excel("test_excel.xls", parse_cols=[0,1], header=None).values
print(fst_col)
print("V.S.")
print(fst_cols)

...[Out]
 [[3]]
 V.S.
 [[ nan   1.]
  [  3.   2.]
  [ nan   3.]
  [ nan   nan]]

The same column is read differently in the two cases.

Shouldn't the output for the first column be the same in both cases, so that reading data behaves consistently?
How can I get the full data (including empty values) from the first column using parse_cols=[0]?

Expected Output

[[ nan ]
[  3. ]
[ nan ]
[ nan ]]

pandas: 0.18.1
xlrd: 0.9.4
python: 2.7.7.final.0
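
A possible workaround for the question above, as a minimal sketch: read at least two columns (which, per the output above, keeps the empty rows) and then keep only the first one. To stay self-contained, the sketch works in memory and does not touch the .xls file; the `two_cols` frame is an assumption that stands in for what `pd.read_excel("test_excel.xls", parse_cols=[0,1], header=None)` returned above, and the names `raw`, `two_cols` and `first_col` are illustrative only.

```python
import numpy as np
import pandas as pd

# Same data as in the report; empty strings stand in for empty Excel cells.
raw = pd.DataFrame([["", 1, 100], [3, 2, 200], ["", 3, 300], ["", "", 400]])

# Assumption: mimic the two-column read, where empty cells come back as NaN.
two_cols = raw.iloc[:, [0, 1]].apply(pd.to_numeric, errors="coerce")

# Keep only the first column. Selecting with the list [0] (not the scalar 0)
# preserves the 2-D shape, so the result has one row per sheet row.
first_col = two_cols.iloc[:, [0]].values

print(first_col.shape)
print(first_col)
```

With the real file, the same slicing step would be applied to the frame returned by the two-column read. Note that newer pandas releases call this parameter `usecols` instead of `parse_cols`.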
