from datetime import datetime
from pandas import DataFrame
import numpy as np
max_int = np.iinfo(np.int64).max
min_int = np.iinfo(np.int64).min
df = DataFrame([max_int, min_int], index=[datetime(2013, 1, 1), datetime(2013, 1, 1)])
assert df.resample("M").apply(np.sum)[0][0] == -1
...
AssertionError
The assertion error occurs because, during the aggregation, pandas checks in `cython_operation` (in `core/groupby.py`), via `_is_cython_func` (from `core/base.py`), whether there are any "missing" integer values (assuming the data is integer) before and after the aggregation. A "missing" integer is defined as `iNaT = -9223372036854775808`. If there are any such values, we automatically cast the data to `float`.
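To make the precision problem concrete, here is a minimal NumPy-only sketch of what a cast to `float` does to these two values. It does not call into pandas internals; it only assumes that the cast described above behaves like a plain `astype(np.float64)`, and the variable names are illustrative.

```python
import numpy as np

max_int = np.iinfo(np.int64).max
min_int = np.iinfo(np.int64).min

# The integer representation of NaT is the same bit pattern as the smallest int64.
inat = np.array(["NaT"], dtype="datetime64[ns]").view("i8")[0]

values = np.array([max_int, min_int], dtype=np.int64)

# Summing as int64 is exact: (2**63 - 1) + (-2**63) == -1, with no overflow.
int_sum = values.sum()

# Assumption: the float cast is equivalent to astype(np.float64).
# float64 cannot represent 2**63 - 1, so max_int rounds up to 2**63
# and the two values cancel exactly.
float_sum = values.astype(np.float64).sum()

print(inat == min_int)  # True
print(int_sum)          # -1
print(float_sum)        # 0.0
```

So even before any value is treated as missing, the cast alone is enough to change the result of the sum from `-1` to `0.0`.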
This logic is quite prevalent in the codebase, but it does seem quite fraught with pitfalls. For example, what if the output of a computation happens to be `-9223372036854775808`? Also, what if the user intended to use `-9223372036854775808` as a legitimate data point?
Unlikely, sure. But reasonable, absolutely.
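As an illustration of the first pitfall, the sketch below shows a sum whose inputs contain no sentinel but whose exact, non-overflowing result is the sentinel. `looks_missing` is a hypothetical stand-in for a sentinel comparison, not a pandas function.

```python
import numpy as np

# The value the issue quotes for iNaT.
INAT = np.iinfo(np.int64).min


def looks_missing(values):
    """Hypothetical sentinel check: flag every element equal to INAT."""
    return np.asarray(values, dtype=np.int64) == INAT


# Neither input is the sentinel...
halves = np.array([-2**62, -2**62], dtype=np.int64)
print(looks_missing(halves).any())    # False

# ...but their exact sum is -2**63, which is the sentinel.
total = halves.sum()
print(looks_missing([total]).all())   # True
```

A check of this shape has no way to tell a legitimately computed `-9223372036854775808` apart from a missing value.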