Using resample() with groupby on this DataFrame causes Segmentation Fault #8573

Closed
@ginzor

Description

When I resample timestamps into 5-minute slots while grouping on an ID column, I get a memory crash, i.e. a segmentation fault. The DataFrame holds time series data, and the crash occurs with both counting and summing as the aggregation in the 'how' parameter.

I reduced the DataFrame as far as I could while still reproducing the crash. I also noticed that the segfault does not occur if I sort the index first (I don't know whether resample() requires a sorted index; I could not find this documented).

import datetime
import pandas as pd

all_wins_and_wagers =\
[(1L, datetime.datetime(2013, 10, 1, 16, 20), 1L, 0L),
 (2L, datetime.datetime(2013, 10, 1, 16, 10), 1L, 0L),
 (2L, datetime.datetime(2013, 10, 1, 18, 15), 1L, 0L),
 (2L, datetime.datetime(2013, 10, 1, 16, 10, 31), 1L, 0L)]

df = pd.DataFrame.from_records(all_wins_and_wagers, columns=("ID", "timestamp", "A", "B")).set_index("timestamp")
df_resampled = df.groupby("ID").resample("5min", "sum")
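
For reference, a minimal sketch of the sorted-index workaround mentioned above. It assumes a recent pandas release, where the aggregation is chained as `.resample("5min").sum()` rather than passed through the `how` argument used in the report, and it uses plain integers instead of Python 2 `1L` literals. The names `records`, `df_sorted` and `resampled` are illustrative only. It shows the call pattern and has not been checked against the 0.13.1 / 0.14.1 setups listed below.

```python
import datetime

import pandas as pd

# Same four rows as in the report, with plain ints instead of Python 2 longs.
records = [
    (1, datetime.datetime(2013, 10, 1, 16, 20), 1, 0),
    (2, datetime.datetime(2013, 10, 1, 16, 10), 1, 0),
    (2, datetime.datetime(2013, 10, 1, 18, 15), 1, 0),
    (2, datetime.datetime(2013, 10, 1, 16, 10, 31), 1, 0),
]

df = pd.DataFrame.from_records(
    records, columns=("ID", "timestamp", "A", "B")
).set_index("timestamp")

# Workaround from the report: sort the DatetimeIndex before grouping,
# so every group sees its timestamps in increasing order.
df_sorted = df.sort_index()

# Assumption: modern method-chaining API instead of resample("5min", "sum").
resampled = df_sorted.groupby("ID").resample("5min").sum()

# Column "A" per (ID, 5-minute slot); empty slots sum to 0.
print(resampled["A"])
```

The result is indexed by `(ID, timestamp)`, so a single slot can be looked up with `resampled["A"].loc[(2, pd.Timestamp("2013-10-01 16:10"))]`.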

Tried on the following pandas setups:

INSTALLED VERSIONS

commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.16-2-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.14.1
nose: 1.3.4
Cython: None
numpy: 1.9.0
scipy: None
statsmodels: None
IPython: 2.3.0
sphinx: None
patsy: None
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.7
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None

INSTALLED VERSIONS

commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.16-2-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.13.1
Cython: None
numpy: 1.8.0
scipy: None
statsmodels: None
IPython: 2.3.0
sphinx: None
patsy: None
scikits.timeseries: None
dateutil: 1.5
pytz: 2012c
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
sqlalchemy: None
lxml: None
bs4: None
html5lib: 0.999
bq: None
apiclient: None

    Labels

    Bug; Groupby; Resample (resample method); Testing (pandas testing functions or related to the test suite)
