ENH: MultiIndex.from_frame #23141
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1189,11 +1189,11 @@ def to_frame(self, index=True, name=None): | |
else: | ||
idx_names = self.names | ||
|
||
result = DataFrame({(name or level): | ||
self._get_level_values(level) | ||
for name, level in | ||
zip(idx_names, range(len(self.levels)))}, | ||
copy=False) | ||
result = DataFrame( | ||
|
||
list(self), | ||
columns=[n or i for i, n in enumerate(idx_names)] | ||
) | ||
|
||
if index: | ||
result.index = self | ||
return result | ||
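The new column-name expression in this hunk falls back to the level's position when a level has no name. A minimal sketch in plain Python illustrates that behavior; the helper name `frame_column_names` is hypothetical and not part of pandas. It also shows a side effect worth keeping in mind: any falsy name (an empty string, or the integer 0) is replaced by its position as well.

```python
def frame_column_names(idx_names):
    """Mirror the `[n or i for i, n in enumerate(idx_names)]` expression
    from the diff (hypothetical helper, for illustration only)."""
    return [n or i for i, n in enumerate(idx_names)]


# Named levels keep their names.
print(frame_column_names(["number", "color"]))  # ['number', 'color']

# An unnamed level (None) falls back to its position.
print(frame_column_names([None, "color"]))      # [0, 'color']

# Falsy but valid names are also replaced by their position.
print(frame_column_names(["", "color"]))        # [0, 'color']
print(frame_column_names(["a", 0]))             # ['a', 1]
```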
|
@@ -1412,6 +1412,45 @@ def from_product(cls, iterables, sortorder=None, names=None): | |
labels = cartesian_product(labels) | ||
return MultiIndex(levels, labels, sortorder=sortorder, names=names) | ||
|
||
@classmethod | ||
def from_frame(cls, df, squeeze=True): | ||
""" | ||
Make a MultiIndex from a dataframe | ||
Review comment: Finish with period. Run Use
||
|
||
|
||
Parameters | ||
---------- | ||
df : pd.DataFrame | ||
DataFrame to be converted to MultiIndex | ||
Review comment: Can you finish with a period.
||
squeeze : bool | ||
Review comment: Can you add the default.
||
If df is a single column, squeeze multiindex to be a regular index. | ||
Review comment: multiiindex --> MultiIndex
||
|
||
Returns | ||
------- | ||
index : MultiIndex | ||
Review comment: no need for giving a name, just leave the type. Add a description in the next line (indented).
||
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame([[0, u'green'], [0, u'purple'], [1, u'green'], | ||
[1, u'purple'], [2, u'green'], [2, u'purple']], | ||
columns=[u'number', u'color']) | ||
Review comment: this is missing the
||
>>> pd.MultiIndex.from_frame(df) | ||
Review comment: can you put a blank line between cases
Review comment: adding a comment on the case if its not obvious
Reply: removed other examples as I have removed the callable names /squeeze parameter
|
||
MultiIndex(levels=[[0, 1, 2], [u'green', u'purple']], | ||
labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]], | ||
names=[u'number', u'color']) | ||
|
||
""" | ||
from pandas import DataFrame | ||
|
||
# just let column level names be the tuple of the meta df columns | ||
# since they're not required to be strings | ||
if not isinstance(df, DataFrame): | ||
raise TypeError("Input must be a DataFrame") | ||
columns = list(df) | ||
mi = cls.from_tuples(list(df.values), names=columns) | ||
Review comment: Again, I would expect that the dtypes are preserved here, but it seems like they aren't.
Reply: Will try out something along these lines when I’m at a computer.
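The dtype concern raised in the review can be illustrated with a small sketch. It assumes a frame with one integer and one float column (the column names are made up for the example). Going through `df.values` forces all columns into one 2-D array with a common dtype, whereas building the index column by column with `MultiIndex.from_arrays` keeps each column's dtype. This is one possible direction, not necessarily what the PR ended up doing.

```python
import pandas as pd

# Hypothetical example frame: one integer column, one float column.
df = pd.DataFrame({"number": [1, 2, 3], "weight": [0.5, 1.5, 2.5]})

# Row-wise access goes through a single 2-D array, so the integer column
# is upcast to the common dtype before any tuples are built.
print(df.values.dtype)

# Column-wise construction hands each column over separately,
# so each level keeps the dtype of its source column.
mi = pd.MultiIndex.from_arrays(
    [df[col] for col in df.columns],
    names=list(df.columns),
)
print([level.dtype for level in mi.levels])
```

With mixed string and numeric columns the row-wise array becomes `object` instead, so the original per-column dtypes are likewise not carried through.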
|
||
if squeeze: | ||
return mi.squeeze() | ||
else: | ||
return mi | ||
Review comment: Personal nit: change the four lines above to: return mi.squeeze() if squeeze else mi
||
|
||
def _sort_levels_monotonic(self): | ||
""" | ||
.. versionadded:: 0.20.0 | ||
|
@@ -1474,6 +1513,28 @@ def _sort_levels_monotonic(self): | |
names=self.names, sortorder=self.sortorder, | ||
verify_integrity=False) | ||
|
||
def squeeze(self): | ||
""" | ||
Squeeze a single level multiindex to be a regular Index instane. If | ||
the MultiIndex is more than a single level, return a copy of the | ||
MultiIndex. | ||
Review comment: Can you run
||
|
||
Review comment: I don't think we need to make this a public method.
Reply: changed squeeze -> _squeeze
Review comment: I am re-thinking if this should be public, see #22866
||
Returns | ||
------- | ||
index : Index | MultiIndex | ||
|
||
Examples | ||
-------- | ||
>>> mi = pd.MultiIndex.from_tuples([('a',), ('b',), ('c',)]) | ||
>>> mi.squeeze() | ||
|
||
Index(['a', 'b', 'c'], dtype='object') | ||
|
||
""" | ||
if len(self.levels) == 1: | ||
return self.levels[0][self.labels[0]] | ||
else: | ||
return self.copy() | ||
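The single-level branch, `self.levels[0][self.labels[0]]`, relies on how a MultiIndex stores its data: each level holds the unique values, and the labels are integer positions into that level. A minimal plain-Python sketch of that lookup follows; the function name `take_level_values` is hypothetical, and the sketch assumes there are no missing entries.

```python
def take_level_values(level, codes):
    """Expand integer codes into the values they point at in `level`.

    Hypothetical stand-in for `self.levels[0][self.labels[0]]`;
    assumes every code is a valid, non-negative position.
    """
    return [level[code] for code in codes]


levels = [["a", "b", "c"]]   # unique values of the only level
labels = [[0, 1, 2, 1, 0]]   # one integer position per index entry

print(take_level_values(levels[0], labels[0]))  # ['a', 'b', 'c', 'b', 'a']
```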
|
||
def remove_unused_levels(self): | ||
""" | ||
Create a new MultiIndex from the current that removes | ||
|