Description
Code Sample, a copy-pastable example if possible
1st example:
Input:
import numpy as np
import pandas as pd
series = pd.Series([1, 1, 2, 0, 1, np.nan, 4, 4, np.nan, 3])
series.value_counts(normalize=True, bins=2)
Output:
(-0.005, 2.0] 0.5
(2.0, 4.0] 0.3
dtype: float64
Expected output:
(-0.005, 2.0] 0.625
(2.0, 4.0] 0.375
dtype: float64
2nd example:
Input:
series = pd.Series([1, 1, 2, 0, 1, np.nan, 4, 4, np.nan, 3])
series.value_counts(normalize=True, bins=2, dropna=False)
Output:
(-0.005, 2.0] 0.5
(2.0, 4.0] 0.3
dtype: float64
Expected output:
(-0.005, 2.0] 0.5
(2.0, 4.0] 0.3
NaN 0.2
dtype: float64
Problem description
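The dropna argument of value_counts() appears to have no effect when bins is not None: NaN values are never shown as a separate row, yet they are still counted in the denominator when normalize=True, so the shares do not sum to 1. The expected behaviour would be more useful: with dropna=True the shares would sum to 1, and with dropna=False the NaN values would appear as their own row.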
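A possible workaround, offered as a sketch and not as a fix for value_counts() itself, is to bin explicitly with pd.cut and call value_counts on the binned result, where dropna does take effect. The names binned, shares_dropped and shares_kept are illustrative, and include_lowest=True is an assumption made so that the bin labels resemble the ones shown above.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 1, 2, 0, 1, np.nan, 4, 4, np.nan, 3])

# Bin first; NaN inputs stay NaN in the resulting categorical Series.
# include_lowest=True is an assumption, chosen to mimic the labels above.
binned = pd.cut(series, bins=2, include_lowest=True)

# dropna=True: NaN is excluded, so shares are taken over the 8 valid values.
shares_dropped = binned.value_counts(normalize=True, dropna=True)

# dropna=False: NaN gets its own row and shares are taken over all 10 values.
shares_kept = binned.value_counts(normalize=True, dropna=False)

print(shares_dropped)
print(shares_kept)
```

With this data, both results sum to 1: the first contains the shares 0.625 and 0.375, and the second contains 0.5, 0.3 and 0.2, with the 0.2 row indexed by NaN.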
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.2
pytest: None
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None