DIS: Keywords for multi-threading capabilities #43313

With the addition of the new pyarrow engine, we now have the option to use multiple threads to read a CSV file. (The number of threads can also be controlled through ``pyarrow.set_cpu_count``.)

Should we expose a keyword (``num_threads``, maybe) to the user, or just add an example in the docs (in this case, one that redirects to ``pyarrow.set_cpu_count``)? In the case of ``read_csv``, this keyword would probably only apply to the ``pyarrow`` engine. However, it is worth noting that we have had multiple feature requests for parallel CSV reading (e.g. #37955), and if we offer multithreading, it is probably worth being able to configure the number of threads used.

Personally, I would prefer a keyword: if we decide to add more I/O engines with multithreading capabilities, it would be more convenient to control the thread count the same way for all of them.

cc @pandas-dev/pandas-core 
