Description
We want to test interesting dataframe examples, namely in our roundtrip tests, since different aspects of a dataframe could expose distinct bugs in a library's adoption of the interchange protocol.
My current idea is to use Hypothesis to generate interesting dicts that can act as the data arguments to, say, pd.DataFrame() or vaex.from_dict(). It seems like a few hours of work to get the ball rolling with a strategy that generates dicts whose elements cover all the valid dtypes. This won't meet the standard of a first-party strategy, but it will do the job nicely.
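A rough sketch of what such a strategy could look like, assuming a small illustrative set of dtypes. The names data_dicts and ELEMENT_STRATEGIES, the dtype labels, and the {column name: (dtype name, values)} output shape are all made up here for illustration and are not part of Hypothesis or the interchange protocol:

```python
import string

from hypothesis import strategies as st

# Hypothetical dtype labels mapped to element strategies. The real set of
# valid dtypes would come from the interchange protocol, not from this dict.
ELEMENT_STRATEGIES = {
    "bool": st.booleans(),
    "int64": st.integers(min_value=-(2**63), max_value=2**63 - 1),
    "float64": st.floats(width=64),  # includes NaN and infinities by default
    "string": st.text(),
}


@st.composite
def data_dicts(draw, max_columns=5, max_rows=10):
    """Draw a dict of {column name: (dtype label, list of values)}.

    All columns share the same length so the dict can back a dataframe.
    """
    n_rows = draw(st.integers(min_value=0, max_value=max_rows))
    names = draw(
        st.lists(
            st.text(alphabet=string.ascii_lowercase, min_size=1, max_size=8),
            min_size=1,
            max_size=max_columns,
            unique=True,
        )
    )
    data = {}
    for name in names:
        dtype = draw(st.sampled_from(sorted(ELEMENT_STRATEGIES)))
        values = draw(
            st.lists(ELEMENT_STRATEGIES[dtype], min_size=n_rows, max_size=n_rows)
        )
        data[name] = (dtype, values)
    return data
```

Keeping the dtype label next to each column's values means a later step can decide how to express that dtype for a given library.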
Additional work would, however, be needed to map the dtypes to each library's respective dtype "identifiers" (i.e. dtype objects like np.int64 and/or strings) and to pipe the data correctly into each library's dataframe constructor (e.g. pd.DataFrame(), vaex.from_dict()). This seems like a fairly simple problem, maybe just an hour's work.
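One possible shape for that mapping, sketched for pandas only and assuming generated data arrives as {column name: (dtype label, values)}. PANDAS_DTYPES, make_pandas_df, CONSTRUCTORS and make_df are hypothetical names, and the dtype labels are placeholders rather than the protocol's own:

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumed input shape: {column name: (generic dtype label, list of values)}
ColumnData = Dict[str, Tuple[str, List[Any]]]

# Generic dtype labels mapped to pandas dtype identifiers (strings here;
# dtype objects such as np.int64 would work as well).
PANDAS_DTYPES = {
    "bool": "bool",
    "int64": "int64",
    "float64": "float64",
    "string": "object",
}


def make_pandas_df(data: ColumnData) -> Any:
    import pandas as pd  # imported lazily so pandas is only needed when used

    return pd.DataFrame(
        {
            name: pd.Series(values, dtype=PANDAS_DTYPES[dtype])
            for name, (dtype, values) in data.items()
        }
    )


# Registry of per-library constructors; an entry wrapping vaex.from_dict()
# would be added the same way, with its own dtype mapping.
CONSTRUCTORS: Dict[str, Callable[[ColumnData], Any]] = {
    "pandas": make_pandas_df,
}


def make_df(library: str, data: ColumnData) -> Any:
    """Build a dataframe for the named library from generic column data."""
    return CONSTRUCTORS[library](data)
```

For example, make_df("pandas", {"a": ("int64", [1, 2, 3])}) would build a one-column pandas dataframe with an int64 column.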