I think in general we try to return Python scalars instead of NumPy scalars in to_dict (similar to tolist or iteration).
Eg:
In [27]: df = pd.DataFrame({'a': [1, 2], 'b': [.1, .2]})
In [28]: df.to_dict()
Out[28]: {'a': {0: 1, 1: 2}, 'b': {0: 0.1, 1: 0.2}}
In [29]: type(df.to_dict()['a'][0])
Out[29]: int
However, this is not consistent, e.g. when using orient='records':
In [31]: df.to_dict(orient='records')
Out[31]: [{'a': 1.0, 'b': 0.10000000000000001}, {'a': 2.0, 'b': 0.20000000000000001}]
In [32]: type(df.to_dict(orient='records')[0]['a'])
Out[32]: numpy.float64
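In this case, that is because the 'records' implementation iterates over self.values. For this frame, self.values is a single float64 array (which is why 'a' comes back as 1.0), so iterating over it yields NumPy scalars. It also means that if you have a string column, self.values will be object dtype, and you actually get Python scalars.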
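To make the self.values behaviour easy to check, here is a minimal sketch. The first part only uses existing pandas/NumPy calls; `records_as_python` is a hypothetical helper I made up for illustration (not pandas API, and not a proposed fix), and it assumes unique column names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [.1, .2]})

# Mixed int/float columns are consolidated into one float64 array,
# so indexing or iterating over it yields NumPy scalars.
values = df.values
first = values[0][0]

# Adding a string column makes the consolidated array object dtype.
df_obj = df.assign(c=['x', 'y'])
obj_values = df_obj.values


def records_as_python(frame):
    """Hypothetical helper: build 'records' output from per-column tolist().

    Series.tolist() returns Python scalars and keeps each column's own
    type, so an int column stays int instead of being upcast to float.
    Assumes column names are unique.
    """
    columns = {col: frame[col].tolist() for col in frame.columns}
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


records = records_as_python(df)
```

Going column by column avoids the upcast that happens when the whole frame is turned into one array, at the cost of one tolist() call per column.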
There are a bunch of other issues related to iteration (e.g. #20791, #13468), but I didn't see one specifically related to to_dict.