Closed
Description
I'm using `to_gbq()` to load a local DataFrame into BigQuery. I'm running into an issue where floating point numbers are gaining significant figures and therefore causing numerical overflow errors when loaded to BigQuery.
The `load.py` module's `encode_chunk()` function writes to a local CSV buffer using Pandas' `to_csv()` function, which has a known issue regarding added significant figures on some operating systems (read more here). In my case, `0.208` was transformed to `0.20800000000000002`.
I've been able to solve the issue locally by setting the `float_format` parameter to `'%g'` in the `encode_chunk()` function's `DataFrame.to_csv()` call:
```python
dataframe.to_csv(
    csv_buffer, index=False, header=False, encoding='utf-8',
    float_format='%g', date_format='%Y-%m-%d %H:%M:%S.%f')
```
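A minimal sketch of the behavior in question (writing to an in-memory buffer rather than the module's real CSV buffer): `0.208` has no exact binary representation, so repr-based formatting can emit the extra digits on some platforms, while `'%g'` produces the short form.

```python
import io
import pandas as pd

df = pd.DataFrame({"value": [0.208]})

# With float_format='%g', the value serializes as '0.208' rather than
# a repr like '0.20800000000000002'.
buf = io.StringIO()
df.to_csv(buf, index=False, header=False, float_format='%g')
print(buf.getvalue())  # 0.208
```

One caveat for making this the default: `'%g'` caps output at 6 significant digits (`'%g' % 0.12345678` gives `'0.123457'`), so it would silently round high-precision floats. Something like `'%.15g'` would keep more precision while still suppressing the spurious trailing digits.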
Can this be safely applied as a default?
Versions:
pandas==0.22.0
pandas-gbq==0.5.0
OS details:
macOS 10.13.4