ValueError (Buffer dtype mismatch) on the join with categorical data #18646

Closed
@vfilimonov

Code Sample, a copy-pastable example if possible

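import numpy as np
import pandas as pd

# Left frame: the join key is the int64 column 'PN'
dfa = pd.Series(np.array([1, 2, 3, 4, 5], dtype=np.int64),
                index=pd.Int64Index([1, 2, 3, 4, 5], name='IND'), name='PN').to_frame()

# Right frame: categorical column 'CAT', indexed by 'PN'
dfb = pd.Series([1, 1, 3, 1, 3], dtype='category',
                index=pd.Int64Index([1, 2, 7, 8, 9], name='PN'), name='CAT').to_frame()

dfa.join(dfb, on='PN')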

Problem description

The code above (a join in which some keys have no match, so the result contains missing values) triggers ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'. The exception is apparently ignored, and the join still returns the correct result:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'

Exception ValueError: "Buffer dtype mismatch, expected 'Python object' but got 'long long'" in 'pandas._libs.lib.is_bool_array' ignored

     PN  CAT
IND         
1     1  1.0
2     2  1.0
3     3  NaN
4     4  NaN
5     5  NaN
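
For reference, here is a minimal sketch of why only the first two rows get a value: `join(..., on='PN')` looks up each value of the left `PN` column in the right frame's index, and only keys 1 and 2 are present there. The sketch assumes a recent pandas, so it builds the indexes with `pd.Index` rather than `pd.Int64Index`. It is not meant to reproduce the ignored `ValueError`, which was reported on pandas 0.21.0.

```python
import numpy as np
import pandas as pd

# Same data as in the report, built with pd.Index
# (assumption: a pandas version where pd.Int64Index is unavailable).
dfa = pd.DataFrame(
    {"PN": np.array([1, 2, 3, 4, 5], dtype=np.int64)},
    index=pd.Index([1, 2, 3, 4, 5], name="IND"),
)
dfb = pd.DataFrame(
    {"CAT": pd.Categorical([1, 1, 3, 1, 3])},
    index=pd.Index([1, 2, 7, 8, 9], name="PN"),
)

# Left join: values of dfa['PN'] are matched against dfb's index.
joined = dfa.join(dfb, on="PN")

# The same join spelled out as a merge.
merged = dfa.merge(dfb, left_on="PN", right_index=True, how="left")

# Rows whose key was found in dfb's index.
matched = joined["CAT"].notna()
print(joined)
print(matched.tolist())
```

Keys 3, 4 and 5 are absent from `dfb`'s index, so those rows end up with a missing `CAT` value, which matches the output shown above.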

P.S. This may be related to #17187, though I'm not sure.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: 3.0.5
pip: 9.0.1
setuptools: 36.6.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.6.0
bs4: 4.5.3
html5lib: 1.0b10
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.5.0

Labels: Categorical, Duplicate Report, Reshaping, Testing