Description
crosstab also has a bug: it counts np.nan in margin totals even when dropna=True.
Appears independent of #12569 and #4003
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 2, 2, 2, np.nan], 'b': [3, 3, 4, 4, 4, 4]})
pd.crosstab(df.a, df.b, margins=True)
Out[233]:
b    3  4  All
a
1.0  1  0    1
2.0  1  3    4
All  2  4    6
Expected Output
Out[233]:
b    3  4  All
a
1.0  1  0    1
2.0  1  3    4
All  2  3    5
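A possible workaround, sketched here rather than taken from the pandas source: drop the rows with a missing key before calling crosstab, so the margins are computed from the same rows as the table body. It assumes only the frame from the report; the names clean and table are illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in the report: one missing value in column "a".
df = pd.DataFrame({"a": [1, 2, 2, 2, 2, np.nan], "b": [3, 3, 4, 4, 4, 4]})

# Drop rows where either crosstab key is missing, so the margin totals
# only see rows that also appear in the body of the table.
clean = df.dropna(subset=["a", "b"])

table = pd.crosstab(clean["a"], clean["b"], margins=True)
print(table)
```

This does not fix the margin calculation itself. It only removes the rows that crosstab would have excluded from the body anyway.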
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.4.4.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.17.1
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.16.1
statsmodels: None
IPython: 4.0.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Jinja2: 2.8