Description
Hi. Following the TFX examples, I pass pipeline_options to generate_statistics_from_csv, setting --direct_num_workers=16, like this:
pipeline_options = PipelineOptions(['--direct_num_workers=16'])
It seems that this option does not speed up this API. When I set direct_num_workers=1, the run takes about as long as with 16 workers:
# direct_num_workers=1
python prep.py 99.27s user 5.84s system 99% cpu 1:45.67 total
# direct_num_workers=16
python prep.py 101.92s user 5.22s system 98% cpu 1:48.44 total
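To read the two timings more closely, here is a small sketch that divides CPU time (user + system) by wall-clock time, assuming the lines above are shell `time` output. The helper names `parse_elapsed` and `cores_used` are made up for this illustration and are not part of TFDV or Beam.

```python
def parse_elapsed(text):
    """Convert a `time`-style elapsed string such as '1:45.67' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cores_used(user_s, system_s, elapsed):
    """Average number of busy cores: CPU time divided by wall-clock time."""
    return (user_s + system_s) / parse_elapsed(elapsed)


# Figures copied from the two runs reported above.
runs = {
    "direct_num_workers=1": (99.27, 5.84, "1:45.67"),
    "direct_num_workers=16": (101.92, 5.22, "1:48.44"),
}

for label, (user_s, system_s, elapsed) in runs.items():
    ratio = cores_used(user_s, system_s, elapsed)
    print(f"{label}: about {ratio:.2f} cores busy on average")
```

Both runs come out just under one busy core, so the 16-worker run does not appear to be doing any work in parallel. If 16 workers were really running, I would expect user time to be well above wall-clock time.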
Could someone help me?