Description
Code Sample, a copy-pastable example if possible
import pandas as pd

try:
    table = pd.read_csv(csv_file_name, low_memory=False)
except Exception:
    raise
Problem description
From the stack trace in the core file, pandas appears to throw an exception complaining it is "out of memory" (which it is not: the machine has 64 GB of RAM and the interpreter was using perhaps 5 GB). During the cleanup of that exception, it attempts to double free the self->error_msg pointer (according to glibc's allocator), resulting in a SIGSEGV.
Expected Output
Pandas successfully converts the CSV into a dataframe
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-81-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.25.2
numpy: 1.13.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
boto: 2.47.0
pandas_datareader: None
I can provide the source CSV if necessary, though the crash happens reliably with "large" CSVs: I haven't pinned down exactly where "large" begins, but it is in the multi-GB range. Below is the stack trace: