DEPR (PDEP-7/CoW): deprecate and remove copy keyword (except in constructors) #56022

Open
@jorisvandenbossche

PDEP-7 did not spell it out explicitly, but a consequence of Copy-on-Write is that the copy keyword is no longer very useful.

Currently a bunch of methods have this keyword (astype, rename, reindex, ..., full list at #50535), for example:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> df2 = df.rename(columns=str.upper)  # has a default of `copy=True`

With the current default behaviour, df2 is a full copy of df. When CoW is enabled, this copy=True default will change so that it no longer copies eagerly and instead acts as a "delayed" copy. Today users can pass copy=False to avoid the full copy, but this will no longer be possible with CoW (the previous concept of a "shallow copy" no longer exists, xref #36195 (comment)). So passing copy=False is something we will have to deprecate anyhow.
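A minimal sketch of the "delayed" copy, assuming pandas 2.x where CoW is opt-in via the `mode.copy_on_write` option (the option handling is wrapped defensively because other versions may reject or deprecate it). The variable names are illustrative only. With CoW enabled, `shares_before` is expected to be true; without it, rename makes a full copy up front.

```python
import warnings

import numpy as np
import pandas as pd

# Assumption: pandas 2.x, where Copy-on-Write is enabled through this option.
# Other versions may warn or raise here, so failures are tolerated.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except Exception:
        pass

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = df.rename(columns=str.upper)  # no `copy` keyword passed

# Under CoW the result still points at the parent's data at this stage.
shares_before = np.shares_memory(df["a"].to_numpy(), df2["A"].to_numpy())

# Writing to df2 triggers the actual copy; df is never modified.
df2.iloc[0, 0] = 100
shares_after = np.shares_memory(df["a"].to_numpy(), df2["A"].to_numpy())
original_value = int(df.loc[0, "a"])

print(shares_before, shares_after, original_value)
```

In either mode the parent keeps its original value after the write; the difference is only in when the data gets copied.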

In theory we could keep copy=True as a non-default option, which would result in an actual hard copy instead of the CoW-tracked view. However, in #50535 we essentially already decided not to do this, and in CoW mode copy=True is currently simply ignored.
The idea is that if a user really wants a hard copy, they can add a .copy() to the chain (e.g. df2 = df.rename(...).copy() instead of df2 = df.rename(..., copy=True)). In #50535 we felt that it was not worth keeping a whole keyword for such a minor use case, which has a clear and easy alternative.
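The `.copy()` alternative can be sketched as follows. This is an illustration rather than a prescribed pattern; `independent` is a name chosen here for the check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Explicit hard copy at the end of the chain, instead of rename(..., copy=True).
df2 = df.rename(columns=str.upper).copy()

# The result owns its data: no memory is shared with the parent.
independent = not np.shares_memory(df["a"].to_numpy(), df2["A"].to_numpy())
print(independent)
```

Since `DataFrame.copy()` defaults to a deep copy, this should hold whether or not CoW is enabled.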

So the consequence of the current behaviour with CoW enabled is that we can deprecate the copy keyword altogether. The idea is that we can already start doing this slowly with a DeprecationWarning in pandas 2.2, whose message can at the same time point people to enabling CoW as an alternative.
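For illustration, a keyword deprecation of this kind is commonly implemented with a sentinel default, so that a warning is raised only when the caller passes the keyword explicitly. The sketch below is not pandas code: `rename_columns`, `_NO_DEFAULT`, and the warning text are all hypothetical stand-ins.

```python
import warnings

# Hypothetical sentinel meaning "the caller did not pass `copy` at all".
_NO_DEFAULT = object()


def rename_columns(columns, mapper, copy=_NO_DEFAULT):
    """Hypothetical stand-in for a method that still accepts `copy`."""
    if copy is not _NO_DEFAULT:
        # Warn for copy=True and copy=False alike; the value is then ignored.
        warnings.warn(
            "The 'copy' keyword is deprecated. Enable Copy-on-Write, or call "
            ".copy() on the result if an eager copy is needed.",
            DeprecationWarning,
            stacklevel=2,
        )
    return [mapper(name) for name in columns]


# Passing the keyword warns; omitting it stays silent.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    renamed = rename_columns(["a", "b"], str.upper, copy=False)
    warned_count = len(caught)
    rename_columns(["a", "b"], str.upper)
    silent_count = len(caught) - warned_count
```

Using a sentinel rather than `copy=True` as the default is what lets the function tell an explicit `copy=True` apart from the keyword being omitted.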

While it's a consequence of the CoW behaviour changes, it's still deprecating a keyword in 15+ methods, so opening this issue for visibility. cc @pandas-dev/pandas-core
