Skip to content

[ArrayManager] TST: Enable extension tests #40348

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jorisvandenbossche
Copy link
Member

xref #39146

@jorisvandenbossche jorisvandenbossche added the Internals Related to non-user accessible pandas implementation label Mar 10, 2021
@jorisvandenbossche
Copy link
Member Author

General note: I am using the if hasattr(result._mgr, "blocks"): to check whether we are using BlockManager or ArrayManager in the base extensions tests, to not rely on fixtures (like using_array_manager) that are defined higher up in pandas but are not available for downstream packages.

@jorisvandenbossche
Copy link
Member Author

@jreback @jbrockmendel any comments here? (trying to get all testing PRs merged to reduce the custom selection)

@jbrockmendel
Copy link
Member

Will take a look now.

return self._get_data_subset(lambda arr: is_numeric_dtype(arr.dtype))
return self._get_data_subset(
lambda arr: is_numeric_dtype(arr.dtype)
or getattr(arr.dtype, "_is_numeric", False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like is_numeric_dtype should catch this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It maybe should, but it doesn't at the moment.

The above is mimicking the logic of the BlockManager (ExtensionBlock.is_numeric checks this attribute of the dtype). But it's certainly inconsistent .. The consequence of the inconsistency is that eg decimal is considered numeric for groupby operations, but is_numeric_dtype(decimal_dtype) will return False (not sure where this would have a direct impact).

@@ -388,7 +388,8 @@ def test_loc_len1(self, data):
# see GH-27785 take_nd with indexer of len 1 resulting in wrong ndim
df = pd.DataFrame({"A": data})
res = df.loc[[0], "A"]
assert res._mgr._block.ndim == 1
if hasattr(res._mgr, "blocks"):
assert res._mgr._block.ndim == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think OK to just check res._mgr.ndim (or even just res.ndim)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check for the ndim of mgr (so there is also a check for ArrayManager), but left the block ndim check in place, because the original bug was that only the block ndim was wrong (the manager ndim was correct)

@jreback jreback added this to the 1.3 milestone Mar 16, 2021
@jreback jreback added the Testing pandas testing functions or related to the test suite label Mar 16, 2021
@jreback
Copy link
Contributor

jreback commented Mar 16, 2021

looks fine, may want to rebase

@jorisvandenbossche jorisvandenbossche merged commit 88fb114 into pandas-dev:master Mar 18, 2021
@jorisvandenbossche jorisvandenbossche deleted the am-tests-extension branch March 18, 2021 07:40
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants