Timestamp comparison inconsistency #22017

Closed
@jbrockmendel

While trying to align the behavior of DataFrame and Series ops, a sticking point turns out to be Timestamp comparisons:

import pandas as pd

ts = pd.Timestamp.now()
df = pd.DataFrame({'A': [1.0, 2.0],
                   'B': [1, 2],
                   'C': ['foo', 'bar']})

>>> df < ts
      A     B     C
0  True  True  True
1  True  True  True
>>> ts < df
      A     B     C
0  True  True  True
1  True  True  True

So first off, this behavior is just nuts. But let's look at Series behavior:

>>> df['A'] < ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1167, in na_op
    result = method(y)
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'float'

>>> df['B'] == ts
0    False
1    False
Name: B, dtype: bool

>>> df['B'] > ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1167, in na_op
    result = method(y)
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'int'

>>> df['C']
0    foo
1    bar
Name: C, dtype: object
>>> df['C'] <= ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1143, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1122, in _comp_method_OBJECT_ARRAY
    result = libops.scalar_compare(x, y, op)
  File "pandas/_libs/ops.pyx", line 98, in pandas._libs.ops.scalar_compare
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'str'

The Series behavior defers to the Timestamp implementation, which is Technically Correct: equality against a non-datetime value returns False, and ordering comparisons raise TypeError. @jreback Are there backward-compat reasons for the existing behavior of the DataFrame implementation, or can it go?
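
For reference, the same pattern is visible without pandas. `pd.Timestamp` subclasses the standard library's `datetime.datetime`, and the sketch below uses plain `datetime` to show equality returning a boolean while ordering raises. The `describe` helper and the fixed date are made up for this illustration; this is not the Timestamp code path itself.

```python
import operator
from datetime import datetime


def describe(op, left, right):
    """Return the comparison result, or the exception name if it raises TypeError."""
    try:
        return op(left, right)
    except TypeError as exc:
        return type(exc).__name__


stamp = datetime(2018, 7, 22)  # arbitrary fixed date, chosen only for the example

# Equality and inequality against unrelated types fall back to identity checks.
eq_result = describe(operator.eq, stamp, 1)
ne_result = describe(operator.ne, stamp, 1.0)

# Ordering has no sensible fallback, so Python raises TypeError.
lt_result = describe(operator.lt, stamp, 1.0)
ge_result = describe(operator.ge, "foo", stamp)

print(eq_result, ne_result, lt_result, ge_result)
```

This mirrors the Series results above: `==` gives False for every element, while `<`, `>`, and `<=` raise.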

If we're OK with raising, then I'm close to being ready to dispatch DataFrame ops to their Series counterparts, at which point we can get rid of BlockManager.eval and Block.eval.
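
A rough sketch of what dispatching to the Series counterparts could look like from the outside, assuming unique column labels. `compare_columnwise` is a hypothetical helper written for this illustration, not the proposed pandas internals, and the fixed timestamp replaces `Timestamp.now()` only to keep the example deterministic.

```python
import operator

import pandas as pd


def compare_columnwise(frame, other, op):
    """Apply a comparison to each column as a Series and reassemble the results."""
    # Every column goes through the Series comparison path, so a TypeError
    # raised there propagates to the caller instead of being masked.
    results = {col: op(frame[col], other) for col in frame.columns}
    return pd.DataFrame(results, index=frame.index)


ts = pd.Timestamp("2018-07-22")
df = pd.DataFrame({"A": [1.0, 2.0],
                   "B": [1, 2],
                   "C": ["foo", "bar"]})

# Equality is well defined against any type, so this yields an all-False frame.
equal = compare_columnwise(df, ts, operator.eq)

# Ordering comparisons raise as soon as the first column is compared.
try:
    compare_columnwise(df, ts, operator.lt)
    ordering_raised = False
except TypeError:
    ordering_raised = True
```

With this shape, the DataFrame result is by construction whatever the per-column Series comparison produces, which is the consistency the issue is after.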

Labels: Algos, Bug, Datetime
