Pandas version checks
- I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
https://pandas.pydata.org/docs/reference/api/pandas.Series.diff.html#pandas.Series.diff
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.diff.html#pandas.DataFrame.diff
Documentation problem
The documentation for pandas.Series.diff and pandas.DataFrame.diff states that, no matter the dtype of the original series/column, the output will be of dtype float64. This is not true for series/columns of dtype bool: there, the output is of dtype object.
For example:
import pandas as pd
# pd.__version__ == '2.2.0'
s = pd.Series([True, True, False, False, True])
d = s.diff()
# d.dtype is now 'object'
Indeed, the underlying function algorithms.diff explicitly differentiates between boolean and integer dtypes.
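To make the contrast concrete, here is a minimal sketch comparing a boolean series with an integer series holding the same pattern. The variable names are illustrative only, and the dtypes noted in the comments are what I would expect on pandas 2.2.0 as reported above; other versions may behave differently.

```python
import pandas as pd

# Same pattern, once as bool and once as the default int64.
bool_series = pd.Series([True, True, False, False, True])
int_series = pd.Series([1, 1, 0, 0, 1])

bool_diff = bool_series.diff()
int_diff = int_series.diff()

print(bool_diff.dtype)  # reported as object on pandas 2.2.0
print(int_diff.dtype)   # float64, as the current docs describe

# After the leading missing value, the boolean result should match an
# element-wise XOR of each value with its predecessor.
values = bool_series.tolist()
xor_expected = [cur ^ prev for prev, cur in zip(values, values[1:])]
print(bool_diff.iloc[1:].tolist() == xor_expected)
```

The integer case is included only as a baseline for the float64 claim in the current Notes section.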
Suggested fix for documentation
The Notes section should read something like this:
Notes
-----
For boolean dtypes, this uses :meth:`operator.xor` rather than
:meth:`operator.sub` and the result's dtype is ``object``.
Otherwise, the result is calculated according to the current dtype in {klass},
however the dtype of the result is always float64.