Incorrect result when resampling DatetimeIndex due to hidden cast of ints to floats

To summarize: pandas should raise whenever casting ints to floats for a computation would lose precision (everywhere!).

The wrong result is returned in the following case because of unnecessary casting of ints to floats.

```
import numpy as np
from datetime import datetime
from pandas import DataFrame

max_int = np.iinfo(np.int64).max
min_int = np.iinfo(np.int64).min
assert max_int + min_int == -1
assert DataFrame([max_int, min_int], index=[datetime(2013, 1, 1), datetime(2013, 1, 1)]).resample('M', how=np.sum)[0][0] == -1
```

This is the offending code in `groupby.py` that does the casting:

```
class NDFrameGroupBy(GroupBy):
    def _cython_agg_blocks(self, how, numeric_only=True):
        .....
        if is_numeric:
            values = com.ensure_float(values)
```

Even worse, it then silently casts the floats back to integers, hiding the problem!

```
        # see if we can cast the block back to the original dtype
        result = block._try_cast_result(result)
```
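
The effect of that round trip can be reproduced with NumPy alone. This is a minimal sketch that assumes the cast behaves like a plain `astype(np.float64)`; it does not call the pandas internals quoted above. `max_int` is `2**63 - 1`, which float64 cannot represent, so it rounds up to `2**63` and the float sum loses the `-1`:

```python
import numpy as np

max_int = np.iinfo(np.int64).max  # 2**63 - 1
min_int = np.iinfo(np.int64).min  # -2**63

values = np.array([max_int, min_int], dtype=np.int64)

# Summing in int64 is exact here: (2**63 - 1) + (-2**63) == -1
int_sum = values.sum()

# Casting to float64 first rounds max_int up to 2**63, so the sum is 0.0
float_sum = values.astype(np.float64).sum()

# Casting the float result back to int64 hides that a float was ever involved
round_tripped = np.int64(float_sum)

print(int_sum, float_sum, round_tripped)
```

The last value has an integer dtype again, so nothing in the result hints that precision was lost along the way.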

It should be possible to perform this calculation without any casting. For example, the `sum()` of a DataFrame returns the correct result:

```
assert DataFrame([max_int, min_int]).sum()[0] == -1
```

I am working with financial data that needs to stay as int to protect its precision. In my opinion, any operation that casts values to floats to do a calculation, but then returns the results as ints, is _very wrong_. At the very least, the results should be returned as floats to show that a cast has been done. Even better would be to perform the calculations with ints whenever possible.
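
As a rough illustration of the "raise on loss of precision" behaviour, here is a sketch of a checked cast. `checked_float_cast` is a hypothetical helper invented for this example, not an existing pandas or NumPy function, and the element-wise comparison through Python ints is chosen for exactness rather than speed:

```python
import numpy as np


def checked_float_cast(values):
    """Hypothetical helper: cast to float64, raising if any integer changes value."""
    values = np.asarray(values)
    as_float = values.astype(np.float64)
    if values.dtype.kind in "iu":  # only integer inputs can lose precision this way
        for original, converted in zip(values.ravel(), as_float.ravel()):
            # Python ints are arbitrary precision, so this comparison is exact
            if int(converted) != int(original):
                raise ValueError(
                    "casting %d to float64 loses precision" % int(original)
                )
    return as_float


# Small integers survive the cast unchanged
print(checked_float_cast([1, 2, 3]))

# The int64 maximum does not, so the helper raises instead of returning a wrong value
try:
    checked_float_cast([np.iinfo(np.int64).max])
except ValueError as exc:
    print("raised:", exc)
```

A real fix would more likely keep the aggregation in integer arithmetic; the check above only shows that the lossy cases can be detected before a wrong result is returned.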


Incorrect result when resampling DatetimeIndex due to hidden cast of ints to floats #3707
