Description
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd

dfa = pd.Series(np.array([1, 2, 3, 4, 5], dtype=np.int64),
                index=pd.Int64Index([1, 2, 3, 4, 5], name='IND'), name='PN').to_frame()
dfb = pd.Series([1, 1, 3, 1, 3], dtype='category',
                index=pd.Int64Index([1, 2, 7, 8, 9], name='PN'), name='CAT').to_frame()
dfa.join(dfb, on='PN')
Problem description
The code above (a join that produces missing values) raises ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long', which is apparently ignored. The result is nevertheless correct:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'
Exception ValueError: "Buffer dtype mismatch, expected 'Python object' but got 'long long'" in 'pandas._libs.lib.is_bool_array' ignored
     PN  CAT
IND
1     1  1.0
2     2  1.0
3     3  NaN
4     4  NaN
5     5  NaN
P.S. This may be related to #17187, though I'm not sure.
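For anyone trying to confirm that the joined values are as expected, here is a minimal sketch of a programmatic check. It assumes that building the indexes with `pd.Index(..., dtype="int64")` is an acceptable stand-in for `pd.Int64Index` (which was removed in pandas 2.0). The names `result` and `matched` are only illustrative, and this sketch does not by itself show whether the ignored ValueError is printed on a given version.

```python
import numpy as np
import pandas as pd

# Left frame: integer column 'PN' that serves as the join key.
dfa = pd.Series(
    np.array([1, 2, 3, 4, 5], dtype=np.int64),
    index=pd.Index([1, 2, 3, 4, 5], dtype="int64", name="IND"),
    name="PN",
).to_frame()

# Right frame: categorical column 'CAT', indexed by 'PN'.
# Only the keys 1 and 2 also appear in dfa['PN'].
dfb = pd.Series(
    [1, 1, 3, 1, 3],
    dtype="category",
    index=pd.Index([1, 2, 7, 8, 9], dtype="int64", name="PN"),
    name="CAT",
).to_frame()

# Left join of dfa's 'PN' column against dfb's index.
result = dfa.join(dfb, on="PN")

# Rows whose key was found in dfb get a category value; the rest are missing.
matched = result["CAT"].notna().tolist()
print(result)
print(matched)
```

Checking `matched` rather than the printed frame avoids depending on how a particular pandas version displays categorical values.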
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.0
pytest: 3.0.5
pip: 9.0.1
setuptools: 36.6.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.6.0
bs4: 4.5.3
html5lib: 1.0b10
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.5.0