Skip to content

call __finalize__ in more methods #37186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 20, 2020
Merged

Conversation

arw2019
Copy link
Member

@arw2019 arw2019 commented Oct 17, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

The goal here is to make progress on some of the more uncontroversial/easier content of #28283. I'll split off into another PR if this gets too big

@arw2019 arw2019 force-pushed the GH28283-duplicated branch from 0cc664f to b1747d2 Compare October 17, 2020 03:42
@arw2019 arw2019 marked this pull request as draft October 17, 2020 04:00
@jreback jreback added the metadata _metadata, .attrs label Oct 17, 2020
@jreback
Copy link
Contributor

jreback commented Oct 17, 2020

cc @TomAugspurger

lgtm.

@arw2019 arw2019 marked this pull request as ready for review October 17, 2020 16:32
@arw2019 arw2019 changed the title WIP: call __finalize__ in more methods call __finalize__ in more methods Oct 17, 2020
return stack(self, level, dropna=dropna)
result = stack(self, level, dropna=dropna)

return result.__finalize__(self, method="stack")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this falls into the name issue I spotted in #37037. Essentially, if df has a column named name we try to set result.name = self.name, i.e. the series name.

In [10]: df = pd.DataFrame({"A": [1, 2], "name": [3, 4]})

In [11]: df.stack().__finalize__(df)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-899f15844169> in <module>
----> 1 df.stack().__finalize__(df)

~/Envs/dask-dev/lib/python3.8/site-packages/pandas/core/generic.py in __finalize__(self, other, method, **kwargs)
   5157         if isinstance(other, NDFrame):
   5158             for name in self._metadata:
-> 5159                 object.__setattr__(self, name, getattr(other, name, None))
   5160         return self
   5161

~/Envs/dask-dev/lib/python3.8/site-packages/pandas/core/series.py in name(self, value)
    464     def name(self, value):
    465         if value is not None and not is_hashable(value):
--> 466             raise TypeError("Series.name must be a hashable type")
    467         object.__setattr__(self, "_name", value)
    468

TypeError: Series.name must be a hashable type

I'll think the solution is to only iterate over the intersection of self._metadata and other._metadata. I'll put up a PR.

@TomAugspurger
Copy link
Contributor

#37205 handles the dataframe.name issue.

I don't know if we want to add a test case for all of those, but one or two might not hurt?

@arw2019
Copy link
Member Author

arw2019 commented Oct 17, 2020

#37205 handles the dataframe.name issue.

I don't know if we want to add a test case for all of those, but one or two might not hurt?

Great!

I can add the tests here and then probably we would want to merge #37205 before this goes in

@jreback jreback added this to the 1.2 milestone Oct 20, 2020
@jreback
Copy link
Contributor

jreback commented Oct 20, 2020

can you merge master and ping on green as merged @TomAugspurger patch

@arw2019 arw2019 force-pushed the GH28283-duplicated branch from b1747d2 to 8475562 Compare October 20, 2020 06:06
@arw2019
Copy link
Member Author

arw2019 commented Oct 20, 2020

Green

@jreback jreback merged commit 2f55283 into pandas-dev:master Oct 20, 2020
@jreback
Copy link
Contributor

jreback commented Oct 20, 2020

thanks @arw2019

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Oct 26, 2020
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metadata _metadata, .attrs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants