Description
Environment details
- OS type and version: macOS 13
- Python version: Python 3.10.7
- pip version: pip 23.2
- google-cloud-bigquery version: 3.11.4
Steps to reproduce
- Create a table with a BIGNUMERIC type column
- Use Client.load_table_from_dataframe() to load a column with the value Decimal("0.12345678901234560000000000000000000000")
Code example
from decimal import Decimal
import pandas as pd
from google.cloud import bigquery

df = pd.DataFrame({"TEST_COL": [Decimal("0.12345678901234560000000000000000000000")]})
with bigquery.Client(project=project, credentials=credentials) as client:
    load_job = client.load_table_from_dataframe(df, table_name)
    result = load_job.result()
Stack trace
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
File "pyarrow/array.pxi", line 1044, in pyarrow.lib.Array.from_pandas
File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Rescaling Decimal128 value would cause data loss
Details
I poked around a bit, and I think I see the problem. To determine the column type, you build an Arrow table and then look up the BigQuery type by Arrow type ID. My test decimal value can be represented by an Arrow 128-bit decimal type:
>>> pyarrow.array(pd.DataFrame({"x": [Decimal("0.12345678901234560000000000000000000000")]})["x"]).type
Decimal128Type(decimal128(38, 38))
But AFAICT that has the same type ID as every other Decimal128Type, regardless of precision and scale, so it maps to "NUMERIC" in your lookup table.
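For example, a quick check in a shell (the numeric value of .id is a pyarrow implementation detail; the point is only that both types share it):
>>> import pyarrow
>>> pyarrow.decimal128(38, 38).id == pyarrow.decimal128(38, 9).id
True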
So when you convert to pyarrow before uploading, the column is coerced to your hardcoded type for "NUMERIC", decimal128(38, 9), and that fails because decimal128(38, 38) is needed to represent the value. I think you will need to detect when a 128-bit decimal type can't fit in your NUMERIC subtype and upgrade those columns to BIGNUMERIC, along the lines of the sketch below.
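A rough sketch of that check, assuming the decision can be made from the Arrow type's precision and scale (the function name and constants here are mine, not the library's):

import pyarrow

# BigQuery NUMERIC is decimal128(38, 9); BIGNUMERIC allows scale up to 38.
NUMERIC_PRECISION, NUMERIC_SCALE = 38, 9

def bigquery_decimal_type(arrow_type):
    # Hypothetical helper: choose NUMERIC only if the column's Arrow decimal
    # type fits in (38, 9) without rescaling losing digits on either side.
    fits_scale = arrow_type.scale <= NUMERIC_SCALE
    fits_integer_digits = (arrow_type.precision - arrow_type.scale) <= (NUMERIC_PRECISION - NUMERIC_SCALE)
    if fits_scale and fits_integer_digits:
        return "NUMERIC"
    return "BIGNUMERIC"

With something like that, the (38, 38) column from my example would be routed to BIGNUMERIC instead of failing the rescale:
>>> bigquery_decimal_type(pyarrow.decimal128(38, 38))
'BIGNUMERIC'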
Hope that helps!