pd.groupby(pd.TimeGrouper()) mishandles null values in dates #17575

#### Code Sample, a copy-pastable example if possible

Note: the code has been updated following some comments.

```python
import pandas as pd
import random
from random import randint

random.seed(2)
data = [['2010-01-06', randint(1, 9)],
        ['2010-08-26', randint(1, 9)],
        ['2010-09-06', randint(1, 9)],
        ['2010-09-16', 10],
        ['2010-09-20', 10],
        ['2010-09-23', 10],
        ['2010-09-24', randint(1, 9)],
        ['2010-09-20', randint(1, 9)]]

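# Add 1270 rows with random dates in Oct-Dec 2010. Days run from 1 to 32,
# so some strings (e.g. day 32, or '2010-11-31') are deliberately invalid.
for m in range(1270):
    data.append(['2010' + '-' + str(randint(10, 12)).zfill(2) + '-' + str(randint(1, 32)).zfill(2),
                randint(1, 121111)])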

df = pd.DataFrame(data)
df.columns = ['date', 'n']
# Invalid date strings are coerced to NaT
df['date'] = pd.to_datetime(df['date'], errors='coerce')
# Same data with the NaT rows removed up front
df_r = df[df['date'].notnull()]

# Monthly distinct counts: g1 still contains NaT dates, g2 does not
g1 = df.groupby(pd.TimeGrouper(key='date', freq='M'))['n'].nunique()
g2 = df_r.groupby(pd.TimeGrouper(key='date', freq='M'))['n'].nunique()
# This should print 'True' but it prints 'False'
print((g1==g2).mean() == 1)
```
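
The final check above collapses everything into a single boolean, which hides which months disagree. The following is a minimal sketch of a diagnostic that lists the mismatching labels side by side. `diff_groupings` is a hypothetical helper (not part of pandas), and the two demo Series are made-up stand-ins for `g1` and `g2`, not output from this report.

```python
import pandas as pd


def diff_groupings(left, right):
    """Return the labels at which two grouped Series disagree.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Outer-align so labels present in only one Series show up as NaN
    # instead of raising on mismatched indexes.
    left_al, right_al = left.align(right, join="outer")
    both_missing = left_al.isna() & right_al.isna()
    mismatch = ~((left_al == right_al) | both_missing)
    table = pd.DataFrame({"left": left_al, "right": right_al})
    return table[mismatch]


# Toy stand-ins for g1 and g2 (made-up numbers)
month_ends = pd.to_datetime(["2010-01-31", "2010-02-28", "2010-03-31"])
g1_demo = pd.Series([3, 5, 7], index=month_ends)
g2_demo = pd.Series([3, 4, 7], index=month_ends)

print(diff_groupings(g1_demo, g2_demo))
```

Calling `diff_groupings(g1, g2)` on the real results should show whether the difference is confined to a few months or whether one series looks shifted relative to the other.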

#### Problem description
When a column is used as the key of a `TimeGrouper`, null values (`NaT`) are supposed to be ignored. This works correctly when the dataset is small. However, the code above demonstrates that with a larger dataset, null values are sometimes distributed into legitimate dates. Worst of all, on one occasion it inserted a value into a row and shifted the entire time series downwards. When I compared the two grouped series, it looked as if one was leading the other by one month, which caused a significant waste of resources, as I was developing a financial model based on large datasets.

Updated comments after further investigation:
The same code behaves differently across pandas versions, although none of them, including the latest (0.20.3), produces correct results.
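
One way to sidestep the question of how the grouper treats `NaT` is to drop those rows explicitly and group by calendar month directly. The sketch below does that with `Series.dt.to_period('M')` instead of `TimeGrouper`, so it is a different technique from the one in the report and, unlike a time grouper, it does not emit empty months. `monthly_nunique` is a hypothetical helper name, the sample frame is made up, and this has not been checked against the affected pandas versions.

```python
import pandas as pd


def monthly_nunique(frame, date_col="date", value_col="n"):
    """Count distinct values per calendar month, skipping rows whose date is NaT.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Remove null dates up front so the grouping step never sees them
    valid = frame[frame[date_col].notna()]
    # Map each timestamp to its calendar month (a Period such as 2010-01)
    months = valid[date_col].dt.to_period("M")
    return valid.groupby(months)[value_col].nunique()


# Made-up sample: '2010-02-30' is not a real date and is coerced to NaT
sample = pd.DataFrame({
    "date": pd.to_datetime(
        ["2010-01-06", "2010-01-20", "2010-02-30", "2010-03-01"],
        errors="coerce",
    ),
    "n": [1, 2, 3, 3],
})

result = monthly_nunique(sample)
print(result)
```

The row with the invalid date contributes to no month, which is the behaviour the report expects from the grouper itself.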


#### Expected Output
True

#### Output of ``pd.show_versions()``


<details>

(This section has also been updated.)

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 0, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
</details>
