ENH: also check type of right DataFrame to determine (sub)class of result DataFrame #44054

Open
@jorisvandenbossche

Description

Currently, we use the type of the left object to construct the result of pd.merge:

typ = self.left._constructor
result = typ(result_data).__finalize__(self, method=self._merge_type)

For GeoPandas users, this means that argument order matters: pd.merge(df, gdf, ...) (or df.merge(gdf, ...)) returns a DataFrame, while pd.merge(gdf, df, ...) (or gdf.merge(df, ...)) returns a GeoDataFrame. This can be surprising for users.
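
The order dependence can be reproduced without GeoPandas. The sketch below uses a hypothetical minimal subclass, TaggedFrame (not part of pandas or GeoPandas), as a stand-in for GeoDataFrame. It only prints the result types; with the left-based behaviour described above, the first merge is expected to give a plain DataFrame and the second a TaggedFrame.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, standing in for GeoDataFrame."""

    @property
    def _constructor(self):
        # Tell pandas to build results of this class when it is in charge.
        return TaggedFrame


df = pd.DataFrame({"key": [1, 2], "a": [10, 20]})
sub = TaggedFrame({"key": [1, 2], "b": [0.1, 0.2]})

# Plain DataFrame on the left, subclass on the right.
left_plain = pd.merge(df, sub, on="key")

# Subclass on the left, plain DataFrame on the right.
left_sub = pd.merge(sub, df, on="key")

print(type(left_plain).__name__)
print(type(left_sub).__name__)
```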

I think it should be fairly easy and safe to also check the type of the right object and use that class when left is a plain (non-subclassed) pandas.DataFrame. If left is itself a subclass, it would still have "precedence".
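
One way to read this proposal is as a small selection rule for the result constructor. The sketch below is only an illustration: pick_result_constructor and TaggedFrame are hypothetical names, not pandas API, and it assumes both operands are already DataFrames.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, standing in for GeoDataFrame."""

    @property
    def _constructor(self):
        return TaggedFrame


def pick_result_constructor(left, right):
    """Hypothetical helper: choose the class used to build a merge result."""
    # A subclassed left operand keeps precedence, as it does today.
    if type(left) is not pd.DataFrame:
        return left._constructor
    # Left is a plain DataFrame: fall back to a subclassed right operand.
    if type(right) is not pd.DataFrame:
        return right._constructor
    # Neither side is a subclass: plain DataFrame.
    return left._constructor


df = pd.DataFrame({"key": [1, 2]})
sub = TaggedFrame({"key": [1, 2]})

print(pick_result_constructor(df, sub).__name__)
print(pick_result_constructor(sub, df).__name__)
print(pick_result_constructor(df, df).__name__)
```

Under this rule the result class no longer depends on argument order when only one side is a subclass. When both sides are different subclasses, left still wins.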

API breaking implications

This can change the class of the returned object. But for subclasses that are mostly compatible with pandas.DataFrame (and only add functionality), this should not have a big impact.

Labels: Enhancement; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode); Subclassing (Subclassing pandas objects)