Description
While trying to get dask's CI passing (dask/dask#6996), I noticed another error related to concat. Dask concatenates the empty "meta" dataframes to determine the shape/dtypes of the resulting dataframe, and that concat now raises an AssertionError.
Small reproducer without dask:
>>> df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
>>> df2 = pd.DataFrame({'b': [1, 2, 3], 'c': [4, 5, 6]})
>>> pd.concat([df1[0:0], df2[0:0], df1[0:0]])
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-151-b1206da866d9> in <module>
----> 1 pd.concat([df1[0:0], df2[0:0], df1[0:0]])
~/scipy/pandas/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
297 )
298
--> 299 return op.get_result()
300
301
~/scipy/pandas/pandas/core/reshape/concat.py in get_result(self)
518
519 new_data = concatenate_block_managers(
--> 520 mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy
521 )
522 if not self.copy:
~/scipy/pandas/pandas/core/internals/concat.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
89 else:
90 b = make_block(
---> 91 _concatenate_join_units(join_units, concat_axis, copy=copy),
92 placement=placement,
93 ndim=len(axes),
~/scipy/pandas/pandas/core/internals/concat.py in _concatenate_join_units(join_units, concat_axis, copy)
325 join_units = nonempties
326
--> 327 empty_dtype, upcasted_na = _get_empty_dtype_and_na(join_units)
328
329 to_concat = [
~/scipy/pandas/pandas/core/internals/concat.py in _get_empty_dtype_and_na(join_units)
436
437 msg = "invalid dtype determination in get_concat_dtype"
--> 438 raise AssertionError(msg)
439
440
AssertionError: invalid dtype determination in get_concat_dtype
The error occurs when reindexing is needed (the dataframes' columns are not fully aligned), and apparently at least 3 dataframes are needed to trigger it: the same example with only 2 dataframes passed to concat doesn't fail.
I suppose this might be related to #38843 (that change is only on master, so this is not a target for 1.2.1). cc @jbrockmendel
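To compare the two-dataframe and three-dataframe cases side by side, something like the following sketch could be used. It reuses `df1` and `df2` from the reproducer; `try_concat` is a hypothetical helper name introduced only for this illustration, not part of pandas or dask. Whether the three-dataframe call fails depends on the pandas build in use, so no output is shown.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
df2 = pd.DataFrame({'b': [1, 2, 3], 'c': [4, 5, 6]})


def try_concat(frames):
    """Hypothetical helper: return (result, error) for pd.concat(frames)."""
    try:
        return pd.concat(frames), None
    except AssertionError as exc:
        # The failure reported above surfaces as an AssertionError.
        return None, exc


# Empty slices keep the columns but drop all rows, like dask's "meta" frames.
two, err_two = try_concat([df1[0:0], df2[0:0]])
three, err_three = try_concat([df1[0:0], df2[0:0], df1[0:0]])

print("2 frames:", "ok" if err_two is None else f"failed: {err_two}")
print("3 frames:", "ok" if err_three is None else f"failed: {err_three}")
```

On a build without the regression both calls should report "ok", and the result should be an empty dataframe with columns a, b and c.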