Skip to content

CLN: Exception*2 in groupby wrapper #28771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 8, 2019
Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 35 additions & 26 deletions pandas/core/groupby/groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ class providing the base-class of operations.
from contextlib import contextmanager
import datetime
from functools import partial, wraps
import re
import types
from typing import FrozenSet, List, Optional, Tuple, Type, Union

Expand Down Expand Up @@ -617,44 +618,52 @@ def _make_wrapper(self, name):
def wrapper(*args, **kwargs):
# a little trickery for aggregation functions that need an axis
# argument
kwargs_with_axis = kwargs.copy()
if "axis" not in kwargs_with_axis or kwargs_with_axis["axis"] is None:
kwargs_with_axis["axis"] = self.axis

def curried_with_axis(x):
return f(x, *args, **kwargs_with_axis)
kwargs2 = kwargs.copy()

def curried(x):
return f(x, *args, **kwargs)
return f(x, *args, **kwargs2)

# preserve the name so we can detect it when calling plot methods,
# to avoid duplicates
curried.__name__ = curried_with_axis.__name__ = name
curried.__name__ = name

# special case otherwise extra plots are created when catching the
# exception below
if name in base.plotting_methods:
return self.apply(curried)

try:
return self.apply(curried_with_axis)
except Exception:
try:
return self.apply(curried)
except Exception:

# related to : GH3688
# try item-by-item
# this can be called recursively, so need to raise
# ValueError
# if we don't have this method to indicated to aggregate to
# mark this column as an error
try:
return self._aggregate_item_by_item(name, *args, **kwargs)
except AttributeError:
# e.g. SparseArray has no flags attr
raise ValueError
import inspect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import at the top of the module

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its an expensive import im trying to get rid of in the one other place we use it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really? interesting

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we consider an "expensive" import? It seems natural to me to import the stdlib at the top of a module so would prefer to do that as well unless there really is a lot to be gained here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About 2.6 ms (of about 400 ms total). So its not massive, but at this point it is among the lower-hanging fruit. Fine to move I guess.

sig = inspect.signature(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be done before the currying? IIUC correctly we are mutating the kwargs after the function has been defined but before it gets executed. I think would be more intuitive if we set up the appropriate kwargs before the currying even happens

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wasnt sure what the semantics are of editing kwargs. is it necessarily a new dict, or could we be editing a dict that exists in the calling namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know definitely but I think the same rules would apply to kwargs as to default arguments, i.e. the parameter values get evaluated on function execution and not definition

https://docs.python.org/3/reference/compound_stmts.html#function-definitions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there wasn't the if name in base.plotting_methods case just above this I would think the answer is "yes" we can do this earlier, but I'm not confident as-is. Good candidate for follow-up to see if we can get that case out of this wrapper.

if "axis" in sig.parameters:
if kwargs.get("axis", None) is None:
kwargs2["axis"] = self.axis

try:
return self.apply(curried)
except TypeError as err:
if not re.search(
"reduction operation '.*' not allowed for this dtype", str(err)
):
# We don't have a cython implementation
# TODO: is the above comment accurate?
raise

# related to : GH3688
# try item-by-item
# this can be called recursively, so need to raise
# ValueError
# if we don't have this method to indicated to aggregate to
# mark this column as an error
try:
return self._aggregate_item_by_item(name, *args, **kwargs)
except AttributeError:
# e.g. SparseArray has no flags attr
# FIXME: 'SeriesGroupBy' has no attribute '_aggregate_item_by_item'
# occurs in idxmax() case
# in tests.groupby.test_function.test_non_cython_api
raise ValueError

wrapper.__name__ = name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we don't just use functools.wraps? Maybe a general question in our code base but unclear to me why we choose to assign to the dunders directly instead of using functools

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea

return wrapper

def get_group(self, name, obj=None):
Expand Down