Description
As of now, read_fwf infers the field positions using only the first 100 rows of the file, and this number is not easily modifiable. However, if a field has values in only a few rare rows, and none of them fall within those first 100 rows, it will be completely missed! So it would be great if pandas used many more rows by default (100 is quite a small number). Why not use something like 10000? Or at least provide a way to increase this number and/or a parameter like infer_using_whole_file=False.
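To make the failure mode concrete, here is a simplified pure-Python sketch of inferring column boundaries from a sample of rows. It is not pandas' actual implementation, just an illustration of the idea; the function name infer_colspecs and the sample data are made up for this example.

```python
def infer_colspecs(lines, n=100):
    """Guess fixed-width column boundaries from the first n lines.

    Simplified illustration (not the pandas code): a character position
    belongs to a field if any sampled line has a non-space character there.
    """
    sample = lines[:n]
    width = max((len(line) for line in sample), default=0)

    # Mark every character position that is non-blank in some sampled line.
    occupied = [False] * width
    for line in sample:
        for i, ch in enumerate(line):
            if ch != " ":
                occupied[i] = True

    # Each contiguous run of occupied positions becomes one (start, end) field.
    colspecs = []
    start = None
    for i, used in enumerate(occupied):
        if used and start is None:
            start = i
        elif not used and start is not None:
            colspecs.append((start, i))
            start = None
    if start is not None:
        colspecs.append((start, width))
    return colspecs


# 150 ordinary rows, then a single row that also fills a rare third field.
rows = ["%-5d%-6s" % (i, "abc") for i in range(150)]
rows.append("%-5d%-6s%s" % (150, "abc", "rare"))

print(infer_colspecs(rows, n=100))        # sampled rows never show the third field
print(infer_colspecs(rows, n=len(rows)))  # whole file: the third field is found
```

With the small sample, the rare third field is never seen, so no column is created for it. The first field also comes out one character too narrow, because no three-digit id appears in the first 100 rows.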
If anyone finds this issue and needs an immediate solution - I personally use monkey-patching:
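import pandas as pd
# Keep a reference to the original method, then wrap it so inference samples 100000 rows instead of 100.
_detect_colspecs = pd.io.parsers.FixedWidthReader.detect_colspecs
pd.io.parsers.FixedWidthReader.detect_colspecs = lambda self, n=100000, skiprows=None: _detect_colspecs(self, n, skiprows)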