Description
Reproducible example:
In [10]: df = pd.DataFrame({'col': range(9)}, index=pd.MultiIndex.from_product([['A0', 'A1', 'A2'], ['B0', 'B1', 'B2']], names=[1,2]))
In [11]: df
Out[11]:
       col
1  2
A0 B0    0
   B1    1
   B2    2
A1 B0    3
   B1    4
   B2    5
A2 B0    6
   B1    7
   B2    8
In [12]: pd.options.display.max_rows = 4
In [13]: df
Out[13]:
       col
1  2
A0 A0    0
   A0    1
...    ...
A2 A2    7
   A2    8
[9 rows x 1 columns]
So the truncated repr incorrectly shows the first index level (with integer level name 1) again for the second level.
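Since the example uses integer level names, it may help to see how pandas resolves an integer passed to `MultiIndex.get_level_values`. The report does not identify a cause, so the sketch below only illustrates the lookup rule (level names are matched before positions) and is not a diagnosis of the repr code. The variable names are chosen for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['A0', 'A1', 'A2'], ['B0', 'B1', 'B2']], names=[1, 2]
)

# Integer arguments are matched against level names before positions.
by_name_1 = index.get_level_values(1)      # level *named* 1, i.e. the first level
by_position_0 = index.get_level_values(0)  # no level is named 0, so position 0
by_name_2 = index.get_level_values(2)      # level *named* 2, i.e. the second level

# Iterating the index yields tuples; element 1 is always the second level.
second_level = [labels[1] for labels in index]

print(list(by_name_1))
print(list(by_name_2))
print(second_level)
```

With names `[1, 2]`, asking for level `1` therefore returns the `A…` labels rather than the `B…` labels, which is the same substitution visible in the truncated output above.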
Original post:
Code Sample, a copy-pastable example if possible
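import pickle
import pandas  # must be importable so the pickled DataFrame can be loaded
import wget
url = 'https://www.dropbox.com/s/aldllo0bi3m3wkl/stock?dl=1'
filename = wget.download(url)
# Pickle files are binary, so open the download in 'rb' mode
with open(filename, 'rb') as f:
    df = pickle.load(f)
df         # bad display: second index level duplicated from the first?
df.head()  # expected display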
Problem description
| 1 | 2 | merged |
|---|---|---|
| a. | a. | 2 |
| abel | abel | 1 |
| agnes | agnes | 2 |
| alain | alain | 8 |
| | alain | 2 |
I have created a MultiIndex based on two columns.
Those two index levels do not display properly: level "2" shows the values of level "1".
When displaying up to the first 60 rows of the DataFrame, the output is fine; beyond that, the values of level 1 are repeated in level 2.
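The 60-row threshold matches the usual default of `display.max_rows`, so the problem can be probed without the original pickle. The following is a minimal sketch under stated assumptions: the data is a hypothetical stand-in (61 rows with made-up labels), and `second_level_visible` is a helper invented here, not part of pandas. It only checks whether the first row's second-level label shows up in the rendered text.

```python
import pandas as pd

# Hypothetical stand-in data: 61 rows, one more than the 60-row threshold
# mentioned above, with integer level names as in the report.
n_rows = 61
index = pd.MultiIndex.from_arrays(
    [[f"a{i:03d}" for i in range(n_rows)],
     [f"b{i:03d}" for i in range(n_rows)]],
    names=[1, 2],
)
df = pd.DataFrame({"merged": range(n_rows)}, index=index)


def second_level_visible(frame, max_rows=60):
    """Return True if the first row's second-level label appears in the repr."""
    expected_label = frame.index[0][1]
    with pd.option_context("display.max_rows", max_rows):
        text = repr(frame)
    return expected_label in text


print(second_level_visible(df.head()))  # 5 rows, so no truncation
print(second_level_visible(df))         # 61 rows, so the repr is truncated
```

If the second call prints `False` while the first prints `True`, the installed pandas version shows the behaviour described here; the result for the truncated case depends on the version in use.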
Expected Output
| 1 | 2 | merged |
|---|---|---|
| a. | masson-dubois | 2 |
| abel | pinchard | 1 |
| agnes | paquet | 2 |
| alain | corcia | 8 |
| | hudelot-noellat | 2 |
Output of pd.show_versions()
pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.11.2
scipy: 0.18.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.5.0
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.1
html5lib: None
httplib2: 0.9.2
apiclient: 1.5.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None