
DEPR: pd.concat special cases DatetimeIndex to sort even when sort=False #57335

Open
@lukemanley

Description

pd.concat has a special case that always sorts a DatetimeIndex when join="outer" and the non-concatenation axes are not aligned, even if sort=False is passed. This was undocumented (prior to 2.2.1) and is inconsistent with all other index types and data types, including other temporal types such as pyarrow timestamps/dates/times.
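A minimal sketch of the inconsistency, assuming a pandas version in which the carve-out is still present. The helper `concat_columns` and the sample data are hypothetical and exist only for illustration; the printed order depends on the installed pandas version, so no output is shown.

```python
import pandas as pd


def concat_columns(left_index, right_index):
    """Concatenate two small frames column-wise with sort=False (illustrative helper)."""
    left = pd.DataFrame({"a": [1, 2]}, index=left_index)
    right = pd.DataFrame({"b": [3, 4]}, index=right_index)
    # axis=1 makes the row index the non-concatenation axis, which is not aligned here.
    return pd.concat([left, right], axis=1, join="outer", sort=False)


# Integer index: the row labels of the two frames overlap but are not aligned.
int_result = concat_columns(pd.Index([2, 3]), pd.Index([1, 2]))

# DatetimeIndex with the same overlap pattern.
dti_result = concat_columns(
    pd.DatetimeIndex(["2024-01-02", "2024-01-03"]),
    pd.DatetimeIndex(["2024-01-01", "2024-01-02"]),
)

print(list(int_result.index))
print(list(dti_result.index))
# With the carve-out described above, this is expected to be True despite sort=False.
print(dti_result.index.is_monotonic_increasing)
```

Comparing the two printed indexes on a given pandas version shows whether the DatetimeIndex result was sorted while the integer result kept order of appearance.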

An attempt to treat this as a bug fix highlighted that this has been long-standing behavior that users may be accustomed to. (xref #57006)

Here are two options that come to mind:

  1. Deprecate the existing behavior and do not sort by default for DatetimeIndex. This would simplify things by removing the carve-out and treating all index types / data types consistently. Users would then need to pass sort=True explicitly when concatenating two frames with monotonic indexes if they want to ensure a monotonic result.

  2. Always sort the non-concatenation axis when join="outer" (for all dtypes). This would be consistent with how pd.merge and DataFrame.join handle outer merges and in practice may be more useful behavior since concatenating two frames with monotonic indexes will return a frame with a monotonic index.
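As a rough illustration of what each option implies for user code, the sketch below passes sort=True explicitly (what option 1 would require of users who want a monotonic result) and compares the result index with DataFrame.join using how="outer" (the behavior option 2 would align with). The frames are made-up sample data.

```python
import pandas as pd

left = pd.DataFrame(
    {"a": [1, 2]},
    index=pd.DatetimeIndex(["2024-01-02", "2024-01-03"]),
)
right = pd.DataFrame(
    {"b": [3, 4]},
    index=pd.DatetimeIndex(["2024-01-01", "2024-01-02"]),
)

# Option 1 from the user's side: request sorting explicitly instead of
# relying on the DatetimeIndex special case.
explicit = pd.concat([left, right], axis=1, join="outer", sort=True)

# Option 2 reference point: an outer join on the index sorts the result.
joined = left.join(right, how="outer")

print(explicit.index.is_monotonic_increasing)
print(explicit.index.equals(joined.index))
```

Passing sort=True explicitly does not depend on the special case, so it should behave the same whichever option is chosen.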

Labels: Deprecate (functionality to remove in pandas); Needs Discussion (requires discussion from core team before further action); Reshaping (concat, merge/join, stack/unstack, explode); Sorting (e.g. sort_index, sort_values)
