xref #27873
In many arithmetic/comparison ops we do something like

```python
if is_list_like(other) and not hasattr(other, "dtype"):
    other = np.asarray(other)
if is_list_like(other) and len(other) != len(self):
    raise ValueError("Lengths must match")
```

but we are not entirely consistent about this (see below for summary of what we do where). AFAICT the only case where we wouldn't want to do both of these consistently is if we have an object-dtype, the relevant test case being something like:

```python
ser = pd.Series([["A"], ("A", "B"), frozenset("C")], dtype=np.object_)
ser == ("A", "B")
```

(Note that in this example, ATM `ser == ser[0]` will raise `ValueError`.)
We should be consistent in what we are doing for these, but need to decide what that consistent behavior should be.
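As a starting point for discussion, the two-step pattern could be factored into a single helper. This is only a sketch: `normalize_other` and its `length` parameter are hypothetical names, not existing pandas functions, and the object-dtype case discussed above is deliberately left out because the desired behavior there is still undecided. Sets are also not handled specially here.

```python
import numpy as np
from pandas.api.types import is_list_like


def normalize_other(other, length):
    """Hypothetical helper: wrap dtype-less list-likes, then check length.

    Mirrors the two-step pattern quoted in this issue; it is not a pandas API.
    """
    # Step 1: list-likes without a dtype (list, tuple, ...) become ndarrays.
    if is_list_like(other) and not hasattr(other, "dtype"):
        other = np.asarray(other)
    # Step 2: any remaining list-like must match the expected length.
    if is_list_like(other) and len(other) != length:
        raise ValueError("Lengths must match")
    return other


# A list of matching length is wrapped and returned as an ndarray.
arr = normalize_other([1, 2, 3], length=3)

# A tuple of the wrong length is rejected up front.
try:
    normalize_other(("A", "B"), length=3)
except ValueError as err:
    message = str(err)
```

Scalars pass through both checks untouched, so callers could invoke such a helper unconditionally at the top of each op.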
Summary of current behavior:
- Series comparison
  - wraps list in ndarray (only list, not other list-likes)
  - checks length matching for list, ndarray, Index, Series (not tuple, set, ...)
- Series arithmetic
  - doesn't wrap list-likes at all
  - length checks are left implicit
- Categorical comparisons do the "full" checks
  - What if we have list-like categories?
- DatetimeArray/TimedeltaArray/PeriodArray arithmetic
  - no wrapping, no explicit checks
- DatetimeArray comparisons
  - wrap only list
  - check length match for all list-likes, but not at the beginning
- TimedeltaArray comparisons
  - wrap any list-like (though indirectly)
  - check length match for all list-likes, but not at the beginning
- PeriodArray comparisons
  - don't wrap any list-likes
  - check all list-likes up-front
Not yet reviewed: IntegerArray, PandasArray, SparseArray, Index, FooIndex
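
To re-check the summary above against whichever pandas version is installed, a small probe along these lines may help. It is a sketch only: `probe`, `candidates`, and `report` are names made up for illustration, it covers only `Series` with `==` and `+`, and it records what happens rather than asserting any particular outcome.

```python
import operator

import numpy as np
import pandas as pd

ser = pd.Series([1, 2, 3])

# Operands whose length (2) deliberately differs from len(ser) (3).
candidates = {
    "list": [1, 2],
    "tuple": (1, 2),
    "ndarray": np.array([1, 2]),
    "Index": pd.Index([1, 2]),
    "Series": pd.Series([1, 2]),
}


def probe(op, left, right):
    """Describe what op(left, right) does without assuming the result."""
    try:
        result = op(left, right)
    except Exception as err:  # broad on purpose: we only record the outcome
        return f"raises {type(err).__name__}"
    return f"returns {type(result).__name__}"


# Map each operand type to (comparison outcome, arithmetic outcome).
report = {
    name: (probe(operator.eq, ser, other), probe(operator.add, ser, other))
    for name, other in candidates.items()
}

for name, (cmp_outcome, arith_outcome) in report.items():
    print(f"{name:8} ==: {cmp_outcome:24} +: {arith_outcome}")
```

The same loop could be pointed at the array types listed as not yet reviewed by swapping out `ser`.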