Description
On a groupby with a composite key, if the product of the number of possible values of each key column is larger than 2^63, calling len(grouped_data) raises a ValueError: "negative dimensions are not allowed".
A simple version to reproduce it:
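import pandas as pd

# 55109 is the smallest n for which n**4 exceeds 2**63
values = range(55109)
data = pd.DataFrame.from_dict({'a': values, 'b': values, 'c': values, 'd': values})
grouped = data.groupby(['a', 'b', 'c', 'd'])
len(grouped)  # ValueError: negative dimensions are not allowed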
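The arithmetic behind the 2^63 threshold can be checked without pandas. The sketch below assumes the size of the combined key space ends up in a signed 64-bit integer, which the threshold suggests but which I have not confirmed in the pandas source. The helper names key_space_size and wrap_int64 are made up for illustration.

```python
INT64_MAX = 2**63 - 1


def key_space_size(cardinalities):
    """Product of the number of distinct values in each key column."""
    size = 1
    for count in cardinalities:
        size *= count
    return size


def wrap_int64(n):
    """Reinterpret a Python int as a two's-complement signed 64-bit value."""
    return (n + 2**63) % 2**64 - 2**63


# Four key columns with 55109 distinct values each, as in the example above.
size = key_space_size([55109] * 4)
overflows = size > INT64_MAX
wrapped = wrap_int64(size)

print(overflows)    # the true product no longer fits in int64
print(wrapped < 0)  # the wrapped value is negative, hence "negative dimensions"

# One fewer distinct value per column still fits.
print(key_space_size([55108] * 4) > INT64_MAX)
```

With 55108 values per column the product stays below 2^63, which would explain why 55109 is the smallest size that triggers the error in this four-column setup.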
A side effect of this error is that NaN values among the keys are not ignored. Instead, they are replaced with other values present in the index.
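Until this is fixed, one possible way to count the groups without going through len(grouped) is to count the distinct key combinations directly. This is only a sketch of a workaround, not something taken from the pandas internals: count_groups is a hypothetical helper, and it assumes the desired behaviour is that rows with NaN in any key column are ignored.

```python
import pandas as pd


def count_groups(frame, keys):
    """Count distinct key combinations, skipping rows with NaN in any key."""
    without_nan = frame.dropna(subset=keys)
    return len(without_nan.drop_duplicates(subset=keys))


# Small illustrative frame: two distinct complete keys and one key with NaN.
sample = pd.DataFrame({
    'a': [1.0, 1.0, 2.0, float('nan')],
    'b': [1.0, 1.0, 2.0, 3.0],
})
n_groups = count_groups(sample, ['a', 'b'])
print(n_groups)
```

Because this works on the rows actually present instead of the product of all possible key values, it should not depend on the size of the combined key space.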
Here is a complete IPython notebook example that reproduces it:
http://nbviewer.ipython.org/gist/jordeu/cd86fc99f5f89451cf93