Skip to content

BUG: fix reverse comparison operations for Categorical #8706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 2, 2014

Conversation

immerrr
Copy link
Contributor

@immerrr immerrr commented Nov 2, 2014

This PR fixes #8658 and also adds pd.lib.unbox_if_zerodim that extracts values from zerodim arrays and adds detection of pd.Period, datetime.date and datetime.time in pd.lib.isscalar.

I tried making unbox_if_zerodim usage more implicit but there's just too much to test when this is introduced (think every method that accepts a scalar must have another test to check it also accepts zerodim array). Definitely too much to add for a single PR, but overall zerodim support could be added later on class-by-class basis.

@jreback
Copy link
Contributor

jreback commented Nov 2, 2014

hmm
what about nan/NaT ?

@immerrr
Copy link
Contributor Author

immerrr commented Nov 2, 2014

what about nan/NaT?

nan is good as long as it's some sort of float.
NaT is good as long as it's an instance of datetime.date:

In [4]: type(pd.NaT).mro()
Out[4]: 
[pandas.tslib.NaTType,
 pandas.tslib._NaT,
 pandas.tslib._Timestamp,
 datetime.datetime,
 datetime.date,
 object]

But OK, I'll add the tests

or PyDelta_Check(val)
or PyTime_Check(val)
or isinstance(val, PandasScalarTypes))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think u need this PandasScalarTypes
all pandss objects now have a _typ property
so for period u can just check it directly
eg
getsttr(obj,'_typ',None) in ['period]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [11]:  getattr(pd.Period('2014-01-01') ,'_typ', 'i dont have a _typ')
Out[11]: 'i dont have a _typ'

PeriodIndex has _typ = 'periodindex', but I'm not sure what does it suppose to mean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be solved by splitting out a PandasScalarObject class from PandasObject. It may add no functionality to PandasObject mixin, but one could inherit from it to underline that it is a scalar and should be treated accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that I didn't fix Period like I did Categorical, see here:https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L85

if you add _typ to Period then you can create ABCPeriod and then import it in cython and do a direct isinstance check, no need for PandasScalarObject (though you could create that as a congolomerate type as ABCScalar if you want ('categorical','period','timedelta','timestamp')) I think would work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me. I was always wondering why are those implemented in pandas by hand instead of using abc.ABCMeta?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure - maybe was a py2/py3 issue? (these were done a while ago)

@jreback
Copy link
Contributor

jreback commented Nov 2, 2014

pls also create an issue to change use of np.isscalar to use lib.isscalar

@immerrr
Copy link
Contributor Author

immerrr commented Nov 2, 2014

I'm also not sure about the name of unbox_if_zerodim, it may create an impression that it's related to timestamp boxing/unboxing, but I couldn't come up with a better yet concise name.

@immerrr immerrr force-pushed the add-basic-zerodim-array-support branch 2 times, most recently from 4b76138 to 2a17bd3 Compare November 2, 2014 11:06
@jreback
Copy link
Contributor

jreback commented Nov 2, 2014

how about _maybe_to_scalar ?

@jreback jreback added Bug Categorical Categorical Data Type labels Nov 2, 2014
@jreback jreback added this to the 0.15.1 milestone Nov 2, 2014
@immerrr
Copy link
Contributor Author

immerrr commented Nov 2, 2014

how about _maybe_to_scalar ?

It kind of loses the zerodim part and, besides, array may contain a non-scalar, e.g. np.array({}).item() == {}. Maybe item_if_zerodim?

@immerrr
Copy link
Contributor Author

immerrr commented Nov 2, 2014

Or, given there's a values_from_object, item_from_zerodim?

@jreback
Copy link
Contributor

jreback commented Nov 2, 2014

item_from_zerodim sounds good (maybe doc better that I did values_from_object :)

@immerrr immerrr force-pushed the add-basic-zerodim-array-support branch 5 times, most recently from 6194177 to c227de7 Compare November 2, 2014 14:10
By the time the comparison gets dispatched to _cat_compare_op, the first
argument becomes zerodim array and no longer passes isscalar test.

ENH: add pd.lib.item_from_zerodim to extract values from zerodim arrays

ENH: make pd.lib.isscalar detect pd.Period, datetime.date and datetime.time
@immerrr immerrr force-pushed the add-basic-zerodim-array-support branch from c227de7 to 1a6da8d Compare November 2, 2014 15:27
@jreback jreback merged commit 1a6da8d into pandas-dev:master Nov 2, 2014
@jreback
Copy link
Contributor

jreback commented Nov 2, 2014

merged via 02de853

I changed it very slightly to make the period checking more consistent with internals. (and IIRC need this somewhere else anyhow).

thanks!

these subtle changes are excellent! all towards better code!

@immerrr immerrr deleted the add-basic-zerodim-array-support branch November 2, 2014 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Categorical Categorical Data Type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Categoricals cannot compare to ndim=0 array values
2 participants