feat: [WIP] add dry_run option for read_gbq #923

Open: sycai wants to merge 3 commits into base: main

@sycai (Contributor) commented Apr 28, 2025

The basic logic is:

  • If the input is a query, dry run it
  • If the input is a table reference, convert it to a SQL query ("SELECT * FROM ...") and dry run that.

The dry run stats are returned in a pandas Series, similar to https://github.com/googleapis/python-bigquery-dataframes/blob/30a62372b2fd72deaf7dbc1dbce8b48f03b3041c/bigframes/core/blocks.py#L818-L885
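The logic above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `is_query`, `to_dry_run_sql`, and `dry_run_stats` are hypothetical names, the whitespace heuristic for telling a query from a table reference is an assumption, and the stats keys are chosen for the example. It returns a plain dict, which could be wrapped with `pandas.Series(...)` to match the description.

```python
import re


def is_query(query_or_table: str) -> bool:
    """Assumed heuristic: input containing inner whitespace is treated as SQL."""
    return re.search(r"\s", query_or_table.strip()) is not None


def to_dry_run_sql(query_or_table: str) -> str:
    """Return the SQL to dry run: the query itself, or SELECT * for a table."""
    if is_query(query_or_table):
        return query_or_table
    return f"SELECT * FROM `{query_or_table}`"


def dry_run_stats(query_or_table: str, client=None) -> dict:
    """Dry run the input and collect a few stats (hypothetical helper)."""
    # Imported lazily so the helpers above work without the client library.
    from google.cloud import bigquery

    client = client or bigquery.Client()
    # dry_run=True asks BigQuery to validate and estimate without executing.
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(to_dry_run_sql(query_or_table), job_config=job_config)

    schema = job.schema or []
    return {
        "totalBytesProcessed": job.total_bytes_processed,
        "columnCount": len(schema),
        "columnNames": [field.name for field in schema],
    }
```

Because the table-reference path goes through the same SQL path as a query, both inputs share one dry-run code path; the cost is that the stats describe the generated `SELECT *` rather than the table itself.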

I didn't implement the type conversion from BQ schema to pandas dtypes because it takes non-trivial effort. We could either:

  • Re-use BQ Python client's conversion function, which is buried somewhere deep in the codebase, and is possibly not supposed to be invoked by external packages.
  • Define our own conversion function, which needs to be kept in sync with the BQ client library.
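To make the second option concrete, here is a minimal sketch of what a locally defined conversion could look like. The mapping is illustrative only: the dtype chosen for each BigQuery type is an assumption for this example, not what the BQ client library or pandas-gbq actually uses, and `bq_field_to_dtype` is a hypothetical name.

```python
# Illustrative mapping from BigQuery field types to pandas dtype names.
# Every entry is an assumption of this sketch and would need to be checked
# against (and kept in sync with) the BQ client library's behavior.
_BQ_TO_PANDAS_DTYPE = {
    "INTEGER": "Int64",
    "INT64": "Int64",
    "FLOAT": "float64",
    "FLOAT64": "float64",
    "BOOLEAN": "boolean",
    "BOOL": "boolean",
    "STRING": "object",
    "TIMESTAMP": "datetime64[ns, UTC]",
}


def bq_field_to_dtype(field_type: str, mode: str = "NULLABLE") -> str:
    """Return a pandas dtype name for one BigQuery schema field."""
    # Repeated (array) fields hold Python lists, so fall back to object.
    if mode.upper() == "REPEATED":
        return "object"
    # Unknown or unmapped types also fall back to object.
    return _BQ_TO_PANDAS_DTYPE.get(field_type.upper(), "object")
```

A table like this is easy to read and test, but it illustrates the maintenance concern raised above: any new BigQuery type or change in the client library's defaults has to be mirrored here by hand.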

@product-auto-label bot added the "size: m" and "api: bigquery" labels Apr 28, 2025
@sycai sycai marked this pull request as ready for review April 28, 2025 23:35
@sycai sycai requested review from a team as code owners April 28, 2025 23:35
@sycai sycai requested review from shollyman and GarrettWu April 28, 2025 23:35
@sycai sycai requested review from tswast and chalmerlowe and removed request for shollyman and GarrettWu April 28, 2025 23:36
@sycai sycai added the "do not merge" label Apr 29, 2025
@sycai sycai changed the title from "feat: add dry_run option for read_gbq" to "feat: [WIP] add dry_run option for read_gbq" Apr 29, 2025
@sycai sycai removed request for tswast and chalmerlowe April 29, 2025 21:20
@sycai (Contributor, Author) commented Apr 29, 2025

The next step is to add the type-conversion logic from BQ schema to pandas dtypes.
