Code Sample, a copy-pastable example if possible
Input:
import numpy as np
import pandas as pd
df2 = pd.DataFrame(dict(x=0, y=[np.nan]*9 + [1]*9))
print(df2.head())
print(df2.groupby('x').ffill().head())
Output:
   x   y
0  0 NaN
1  0 NaN
2  0 NaN
3  0 NaN
4  0 NaN
   x    y
0  0  NaN
1  0  1.0
2  0  1.0
3  0  1.0
4  0  1.0
Problem description
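The new groupby().ffill() implementation in pandas 0.23.0 (#19673) returns incorrect results and appears to permute the input before filling. In the example above, rows 1 to 4 come back as 1.0 even though no non-NaN value precedes them in the group, so a forward fill should leave them as NaN.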
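For comparison, here is a minimal sketch of how the result could be checked against a slow per-group reference. The helper name reference_group_ffill is hypothetical (it is not part of pandas); it only relies on Series.ffill and pd.concat, and it assumes the original index is unique so that sort_index restores the input order.

```python
import numpy as np
import pandas as pd

# Same frame as in the report: a single group, 9 NaNs followed by 9 ones.
df2 = pd.DataFrame(dict(x=0, y=[np.nan] * 9 + [1] * 9))


def reference_group_ffill(frame, key, col):
    """Forward-fill `col` one group at a time (hypothetical helper, not pandas API)."""
    parts = [grp[col].ffill() for _, grp in frame.groupby(key)]
    # Restore the original row order (assumes a unique index).
    return pd.concat(parts).sort_index()


expected = reference_group_ffill(df2, 'x', 'y')
actual = df2.groupby('x')['y'].ffill()

# True means groupby().ffill() agrees with the per-group reference;
# False would reproduce the behaviour described in this report.
print(actual.equals(expected))
```

The reference keeps the first 9 values as NaN, which matches the expected output below.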
Expected Output
   x   y
0  0 NaN
1  0 NaN
2  0 NaN
3  0 NaN
4  0 NaN
   x   y
0  0 NaN
1  0 NaN
2  0 NaN
3  0 NaN
4  0 NaN
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.14-200.fc26.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8
pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 4.2.1
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.2
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: 0.1.5
pandas_gbq: None
pandas_datareader: None