BUG: merge_asof raising KeyError for extension dtypes #53458

Merged: 3 commits, Jun 1, 2023

lukemanley
Member

lukemanley added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and ExtensionArray (Extending pandas with custom dtypes or arrays) labels on May 30, 2023
lukemanley added this to the 2.1 milestone on May 30, 2023

pa_type = self._pa_array.type
assert pa.types.is_timestamp(pa_type)
np_dtype = np.dtype(f"M8[{pa_type.unit}]")
Member

IIRC we do something like the below in another area of this file, correct? If so, can we reuse it?

Member Author

yes, updated a few locations to reuse these methods.

@mroeschke mroeschke merged commit 470e945 into pandas-dev:main Jun 1, 2023
@mroeschke
Member

Thanks @lukemanley

@lukemanley lukemanley deleted the merge-asof-extension-dtypes branch June 3, 2023 10:50
topper-123 pushed a commit to topper-123/pandas that referenced this pull request Jun 5, 2023
* fix merge_asof raising KeyError for extension dtypes

* reuse new methods elsewhere
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* fix merge_asof raising KeyError for extension dtypes

* reuse new methods elsewhere
@0x26res
0x26res commented Aug 3, 2023

@lukemanley thanks for fixing this. I was wondering, do you expect it to work for timezone-aware pyarrow timestamps, e.g. pa.timestamp("ns", "UTC")?

@lukemanley
Member Author

@0x26res - yes, tz-aware pyarrow timestamps should work in pandas 2.1 (to be released later this month):

In [1]: import pandas as pd

In [2]: import pyarrow as pa

In [3]: dtype = pd.ArrowDtype(pa.timestamp('ns', 'UTC'))

In [4]: df1 = pd.DataFrame({"a": [1, 3]}, index=pd.Index([1, 3], dtype=dtype))

In [5]: df2 = pd.DataFrame({"b": [0, 2]}, index=pd.Index([0, 2], dtype=dtype))

In [6]: pd.merge_asof(df1, df2, left_index=True, right_index=True)
Out[6]: 
                                     a  b
1970-01-01 00:00:00.000000001+00:00  1  0
1970-01-01 00:00:00.000000003+00:00  3  2
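
To see why the rows with index 1 and 3 pick up b = 0 and b = 2, it may help to spell out the default matching rule of merge_asof (direction="backward", exact matches allowed): each left key is paired with the last right key that is less than or equal to it. The following is only a minimal pure-Python sketch of that rule on plain integers, not the pandas implementation; the helper name asof_backward is hypothetical and it assumes the right keys are already sorted.

```python
from bisect import bisect_right


def asof_backward(left_keys, right_keys, right_values):
    """Hypothetical helper: for each left key, return the right value whose
    key is the largest one <= the left key, or None if there is no such key.

    Assumes right_keys is sorted in ascending order.
    """
    matches = []
    for key in left_keys:
        # Number of right keys that are <= key.
        pos = bisect_right(right_keys, key)
        matches.append(right_values[pos - 1] if pos > 0 else None)
    return matches


# Same keys and values as the session above, with the timestamps
# reduced to their integer nanosecond offsets.
print(asof_backward([1, 3], [0, 2], [0, 2]))
```

A left key that is smaller than every right key gets no match; in this sketch that is None, whereas pandas fills the missing value with NA/NaN.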

Development

Successfully merging this pull request may close these issues.

ENH: Support pyarrow timestamps in merge_asof