
ENH: json_normalize() avoid loss of precision for int64 with missing values #16918

Closed
@jzwinck

Description

This code:

import pandas as pd

x = 1234567890123456789
x - pd.io.json.json_normalize([{'x': x}, {}]).loc[0, 'x'].astype(int)

gives 21, when reasonable users might expect it to give 0.

This inaccuracy occurs when a field holds an int64 value but is missing from some records, which triggers conversion of the Series dtype to float64. Pandas does this conversion so that it can put NaN where no value exists.
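The off-by-21 result can be reproduced without pandas at all; it is pure float64 rounding (an illustrative sketch, not part of the original report):

```python
x = 1234567890123456789

# float64 has a 53-bit significand; near 1.2e18, adjacent representable
# floats are 2**(60-52) = 256 apart, so converting x to float rounds it
# to the nearest multiple of 256, which is 21 below x.
print(x - int(float(x)))   # 21
print(float(x) == x - 21)  # True
```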

One solution could be to add a fill_value parameter, as seen in add(), unstack(), and other Pandas functions. It would be good for this to support a dict as well as a single value, in case different fill values are required for different columns.

The usage might be like this:

pd.io.json.json_normalize([{'x': x}, {}], fill_value=-1).x
# or
pd.io.json.json_normalize([{'x': x}, {}], fill_value={'x': -1}).x
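Until such a parameter exists, one workaround is to pre-fill missing keys before normalizing, so no NaN (and hence no float64 upcast) is ever introduced. A minimal sketch for flat, non-nested records; `json_normalize_filled` is a hypothetical helper, not a pandas API:

```python
import pandas as pd

def json_normalize_filled(records, fill_value):
    """Hypothetical helper: fill missing keys before normalizing so that
    json_normalize never introduces NaN and never upcasts int64 to float64.
    Only handles flat (non-nested) records."""
    if not isinstance(fill_value, dict):
        # promote a scalar fill_value to a per-column dict
        keys = set().union(*(r.keys() for r in records))
        fill_value = {k: fill_value for k in keys}
    filled = [{**fill_value, **r} for r in records]
    # pd.json_normalize is the modern spelling of pd.io.json.json_normalize
    return pd.json_normalize(filled)

x = 1234567890123456789
df = json_normalize_filled([{'x': x}, {}], fill_value=-1)
print(df['x'].dtype)       # int64
print(df.loc[0, 'x'] - x)  # 0
```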

Then instead of the current result:

0    1.234568e+18
1             NaN
Name: x, dtype: float64

The result would be:

0    1234567890123456789
1                     -1
Name: x, dtype: int64

I'm using Pandas 0.20.1.
