Description
I haven't found a small reproducible example yet, but with the actual (also small) data, I see the following problem:
In [47]: subjects_url = 'https://physionet.org/pn4/sleep-edfx/ST-subjects.xls'
...: data = pd.read_excel(subjects_url, header=[0, 1])
In [48]: data.head()
Out[48]:
  Subject - age - sex           Placebo night            Temazepam night
                   Nr Age M1/F2      night nr lights off        night nr lights off
0                   1  60     1             1   23:01:00               2   23:48:00
1                   2  35     2             2   23:27:00               1   00:00:00
2                   4  18     2             1   23:53:00               2   22:37:00
3                   5  32     2             2   23:23:00               1   23:34:00
4                   6  35     2             1   23:28:00               2   23:26:00
Calling set_index with a key from only the first level of the column MultiIndex (which I think is not supported) does not raise. It returns a result, but an invalid one, as shown by the repr raising an error:
In [49]: res = data.set_index('Subject - age - sex')
In [50]: res
Out[50]: ---------------------------------------------------------------------------
...
TypeError: unsupported format string passed to numpy.ndarray.__format__
The invalid part is that res.index seems to be an Int64Index, but is backed by a 2D array:
In [51]: res.index
Out[51]:
Int64Index([ 1, 60, 1, 2, 35, 2, 4, 18, 2, 5, 32, 2, 6, 35, 2, 7, 51,
2, 8, 66, 2, 9, 47, 1, 10, 20, 2, 11, 21, 2, 12, 21, 1, 13,
22, 1, 14, 20, 1, 15, 66, 2, 16, 79, 2, 17, 48, 2, 18, 53, 2,
19, 28, 2, 20, 24, 1, 21, 34, 2, 22, 56, 1, 24, 48, 2],
dtype='int64', name='Subject - age - sex')
In [52]: res.index.values
Out[52]:
array([[ 1, 60, 1],
[ 2, 35, 2],
[ 4, 18, 2],
[ 5, 32, 2],
[ 6, 35, 2],
[ 7, 51, 2],
[ 8, 66, 2],
[ 9, 47, 1],
[10, 20, 2],
[11, 21, 2],
[12, 21, 1],
[13, 22, 1],
[14, 20, 1],
[15, 66, 2],
[16, 79, 2],
[17, 48, 2],
[18, 53, 2],
[19, 28, 2],
[20, 24, 1],
[21, 34, 2],
[22, 56, 1],
[24, 48, 2]])
This was run on up-to-date master (0.24.dev).
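A smaller stand-in for the Excel file might look like the sketch below. The values are copied from the first three rows of the output above and only four of the columns are included, so this is an assumption about what is enough to show the cause, not a confirmed reproduction of the bug. It shows why a first-level key leads to 2D data (the selection is a DataFrame, not a Series), and that passing the full two-level key gives a regular 1D index.

```python
import pandas as pd

# Hypothetical stand-in for ST-subjects.xls: first three rows of the
# output above, with only a subset of the two-level columns.
columns = pd.MultiIndex.from_tuples([
    ("Subject - age - sex", "Nr"),
    ("Subject - age - sex", "Age"),
    ("Subject - age - sex", "M1/F2"),
    ("Placebo night", "night nr"),
])
data = pd.DataFrame(
    [[1, 60, 1, 1], [2, 35, 2, 2], [4, 18, 2, 1]],
    columns=columns,
)

# A first-level key selects every column under it, so the result is a
# DataFrame (2D), not a Series.
selected = data["Subject - age - sex"]
print(type(selected).__name__, selected.shape)

# A full (level 0, level 1) key names exactly one column, so the
# resulting index is 1D.
res = data.set_index([("Subject - age - sex", "Nr")])
print(res.index.ndim, list(res.index))
```

What data.set_index('Subject - age - sex') itself does with this frame may differ between pandas versions, so it is left out of the sketch.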