BUG?: using `None` as replacement value in `replace()` typically upcasts to object dtype

I noticed that in certain cases, when replacing a value with `None`, we always cast to object dtype, regardless of whether the dtype of the calling Series can actually hold `None` (at least when treating `None` as a generic "missing value" indicator).

For example, a float Series can hold `None` in the sense of holding missing values, which is how `None` is treated in setitem:

```python
>>> ser = pd.Series([1, 2, 3], dtype="float")
>>> ser[1] = None
>>> ser
0    1.0
1    NaN
2    3.0
dtype: float64
```

However, when using `replace()` to replace the value 2.0 with `None`, the result depends on exactly how the `to_replace`/`value` combination is specified, but it typically upcasts to object:

```python
# with list
>>> ser.replace([1, 2], [10, None])
0    10.0
1    None
2     3.0
dtype: object

# with Series -> here it gives NaN but that is because the Series constructor already coerces the None
>>> ser.replace(pd.Series({1: 10, 2: None}))
0    10.0
1     NaN
2     3.0
dtype: float64

# with scalar replacements
>>> ser.replace(1, 10).replace(2, None)
0    10.0
1    None
2     3.0
dtype: object
```

In all of the above cases, using `np.nan` instead of `None` as the replacement value of course just results in a float Series with NaN.
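
For comparison, here is a minimal sketch of the same list and scalar calls with `np.nan` as the replacement value. The variable names `with_list` and `with_scalar` are only for illustration; per the statement above, both results should keep the float dtype.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, 2, 3], dtype="float")

# Same calls as in the examples above, but with np.nan instead of None.
with_list = ser.replace([1, 2], [10, np.nan])
with_scalar = ser.replace(1, 10).replace(2, np.nan)

for result in (with_list, with_scalar):
    # The dtype stays float64 and only the replaced position is missing.
    print(result.dtype, result.isna().tolist())
```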


The reason for this is two-fold. First, in `Block._replace_coerce` there is a check specifically for `value is None` and in that case we always cast to object dtype:

https://github.com/pandas-dev/pandas/blob/5f23aced2f97f2ed481deda4eaeeb049d6c7debe/pandas/core/internals/blocks.py#L906-L910

The above is used when replacing with a list of values. In the scalar case we also cast to object dtype, but for a different reason: there we check `if self._can_hold_element(value)` and, if that passes, do the replacement with a simple setitem; if it does not, we cast to object dtype first and try again. And it seems that `can_hold_element(np.array([], dtype=float), None)` gives `False`.
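
The check can be reproduced in isolation with the sketch below. It assumes that `can_hold_element` is importable from `pandas.core.dtypes.cast`; this is a private helper, so the import path and behaviour may differ between versions. The `np.nan` call is added only as a contrast.

```python
import numpy as np
from pandas.core.dtypes.cast import can_hold_element  # private API, assumed location

arr = np.array([], dtype=float)

# np.nan is itself a float, so a float array can hold it.
holds_nan = can_hold_element(arr, np.nan)

# None is not treated as a missing-value indicator by this check.
holds_none = can_hold_element(arr, None)

print("np.nan:", holds_nan)
print("None:  ", holds_none)  # reported as False in this issue
```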

---

Everything was tested with current main (3.0.0.dev), but I see the same behaviour on older releases (2.0 and 1.5).


---

Somewhat related issue:

* https://github.com/pandas-dev/pandas/issues/29024

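Excerpt from the `blocks.py` permalink above (the `value is None` check in `Block._replace_coerce`):

    if value is None:
        # gh-45601, gh-45836, gh-46634
        if mask.any():
            has_ref = self.refs.has_reference()
            nb = self.astype(np.dtype(object))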
