Skip to content

read_hdf crash python process when use it in multithread code #14263

Closed
@xmseraph

Description

@xmseraph

one of my folder contains multiple h5 files, and I tried to load them into dataframes and then concat these df into one.

the python process crashes when the num_tasks>1, if I debug thread by thread, it works, in another, it crashes simply when two threads run at the same time, even though they read different files.

from multiprocessing.pool import ThreadPool
import pandas as pd 

num_tasks=2
def readjob(x):
    path = x
    return pd.read_hdf(path,"df",mode='r')

pool = ThreadPool(num_tasks)
results = pool.map(readjob,files)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Duplicate ReportDuplicate issue or pull requestIO HDF5read_hdf, HDFStore

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions