BUG: df.duplicated treats None as np.nan in object columns

Found while writing tests for `.duplicated` in #21645 (until now, `.duplicated` was tested almost exclusively implicitly, through `.drop_duplicates`).

At first I thought this was intended behaviour for `DataFrame.duplicated()`, but `Series.duplicated()` does not behave the same way: it keeps `None` and `np.nan` distinct. That makes sense to me, since as objects `None is not np.nan`, so I have labelled this as a bug.

```
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 3, 3, None, np.nan], dtype=object)
s
# 0     NaN
# 1       3
# 2       3
# 3    None
# 4     NaN
# dtype: object

s.duplicated()
# 0    False
# 1    False
# 2     True
# 3    False
# 4     True
# dtype: bool

s.to_frame().duplicated()
# 0    False
# 1    False
# 2     True
# 3     True  <- CHANGED (None is flagged as a duplicate of NaN)
# 4     True
# dtype: bool
```
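
To make the discrepancy easier to check across versions, here is a minimal sketch. `duplicated_mismatch` is a hypothetical helper written for this report, not part of pandas. It first confirms that `None` and `np.nan` are distinct objects that `pd.isna` nonetheless both reports as missing, then flags the positions where the Series and DataFrame results disagree.

```python
import numpy as np
import pandas as pd

# As Python objects, the two missing-value markers are distinct ...
assert None is not np.nan
# ... but pandas' missing-value check reports both as missing.
assert pd.isna(None) and pd.isna(np.nan)


def duplicated_mismatch(values):
    """Hypothetical helper: flag positions where Series.duplicated() and
    DataFrame.duplicated() disagree for the same object-dtype data."""
    s = pd.Series(values, dtype=object)
    # Both results are boolean Series sharing the same default index.
    return s.duplicated() != s.to_frame().duplicated()


mismatch = duplicated_mismatch([np.nan, 3, 3, None, np.nan])
# Positions where the two methods disagree (empty if they agree).
print(mismatch[mismatch].index.tolist())
```

Going by the output shown above, this should print `[3]` on an affected version, and an empty list on any version where the two methods agree.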

BUG: df.duplicated treats None as np.nan in object columns #21720

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

BUG: df.duplicated treats None as np.nan in object columns #21720

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions