Inserting subclass/composition of Series into DataFrame strips 'extra' functions/properties #1713

Closed
@carsonfarmer

When trying to insert/append a subclass (or composition) of a pandas Series into a DataFrame, any and all of the 'extra' functions that come with my subclass (or composition) are stripped and a Series is created:

In [7]: df = read_csv('some/data/from/file.csv')

In [8]: sp = SpatialSeries(df.the_geom) # SpatialSeries is subclass, the_geom is spatial location (WKT)

In [9]: type(sp)
Out[9]: spseries.SpatialSeries

In [10]: type(df)
Out[10]: pandas.core.frame.DataFrame

In [11]: df['geoms'] = sp

In [12]: type(df['geoms'])
Out[12]: pandas.core.series.Series

I suspect that this kind of behaviour is useful in most cases. However, I need the extra functions and classes associated with SpatialSeries, and I'd rather not have to subclass DataFrame just to create a special DataFrame that allows this. The culprit appears to be here in frame.py at lines 1761-1772:

    def _set_item(self, key, value):
        """
        Add series to DataFrame in specified column.

        If series is a numpy-array (not a Series/TimeSeries), it must be the
        same length as the DataFrame's index or an error will be thrown.

        Series/TimeSeries will be conformed to the DataFrame's index to
        ensure homogeneity.
        """
        value = self._sanitize_column(key, value)
        NDFrame._set_item(self, key, value)

In particular, I'm looking at value = self._sanitize_column(key, value), which appears to call np.asarray(value) before it returns the input array (even if the input column is a Series). Is there any way to avoid this behaviour? Or, alternatively, is there a better way to implement this so that useful subclasses can be used within a DataFrame? I hope I'm not missing something simple or vital here.
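
To illustrate the mechanism I suspect, here is a minimal NumPy-only sketch. TaggedArray and describe_geometry are made-up stand-ins for SpatialSeries and its extra methods, and the sketch assumes the column really does pass through np.asarray as described above. It is not pandas' actual code path, only a demonstration of how np.asarray treats ndarray subclasses compared with np.asanyarray.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass standing in for SpatialSeries."""

    def describe_geometry(self):
        # Placeholder for the 'extra' behaviour a subclass would add.
        return "geometry of %d items" % self.size


# View an ordinary array as the subclass (no data is copied).
tagged = np.arange(3).view(TaggedArray)

# np.asarray returns a base-class ndarray, so the extra method is gone.
plain = np.asarray(tagged)

# np.asanyarray passes ndarray subclasses through unchanged.
kept = np.asanyarray(tagged)

print(type(tagged).__name__)
print(type(plain).__name__)
print(type(kept).__name__)
print(hasattr(plain, "describe_geometry"))
```

If the sanitizing step behaves like the np.asarray call here, that alone would explain why the subclass type is lost on insertion. Whether swapping in something like np.asanyarray would be safe inside DataFrame is a separate question that this sketch does not answer.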

FYI:

In [13]: pandas.__version__
Out[13]: '0.8.0b1'
