Setting values on slice of multi-index gives NaNs #10440

Open
@jim22k

Best shown with an example.

import numpy as np, pandas as pd
timestamps = list(map(pd.Timestamp, ['2014-01-01', '2014-02-01']))
categories = ['A', 'B', 'C', 'D']
df = pd.DataFrame(index=pd.MultiIndex.from_product([timestamps, categories], names=['ts', 'cat']),
                  columns=['Col1', 'Col2'])

>>> df
                Col1  Col2
ts         cat            
2014-01-01 A     NaN   NaN
           B     NaN   NaN
           C     NaN   NaN
           D     NaN   NaN
2014-02-01 A     NaN   NaN
           B     NaN   NaN
           C     NaN   NaN
           D     NaN   NaN

I want to set the values for all categories in a single month. These examples work just fine.

df.loc['2014-01-01', 'Col1'] = 5
df.loc['2014-01-01', 'Col2'] = [1,2,3,4]

>>> df
               Col1 Col2
ts         cat          
2014-01-01 A      5    1
           B      5    2
           C      5    3
           D      5    4
2014-02-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN

These examples don't work: they write NaNs instead of the expected values.

df.loc['2014-01-01', 'Col1'] += 1
df.loc['2014-02-01', 'Col2'] = df.loc['2014-01-01', 'Col2']

>>> df
               Col1 Col2
ts         cat          
2014-01-01 A    NaN    1
           B    NaN    2
           C    NaN    3
           D    NaN    4
2014-02-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN

It doesn't seem to be a "setting a value on a copy" issue. Instead, Pandas is writing the NaNs.
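One plausible reading of the failing cases, not confirmed in this report, is index alignment: selecting a single month with `.loc` drops the `ts` level, so the right-hand side is a Series indexed by `cat` alone, which does not line up with the two-level target. Below is a minimal sketch that sidesteps alignment by assigning the raw values via `to_numpy()`. The float-filled frame and the names `jan`, `feb`, and `src` are assumptions made for the illustration.

```python
import numpy as np
import pandas as pd

jan, feb = pd.Timestamp('2014-01-01'), pd.Timestamp('2014-02-01')
index = pd.MultiIndex.from_product([[jan, feb], ['A', 'B', 'C', 'D']],
                                   names=['ts', 'cat'])
# Assumption: start from a float frame of NaNs rather than an empty object frame.
df = pd.DataFrame(np.nan, index=index, columns=['Col1', 'Col2'])

df.loc[jan, 'Col1'] = 5
df.loc[jan, 'Col2'] = [1, 2, 3, 4]

# Selecting one month drops the 'ts' level from the result's index.
src = df.loc[jan, 'Col2']
print(list(src.index.names))

# Assign plain arrays so that no index alignment takes place.
df.loc[jan, 'Col1'] = df.loc[jan, 'Col1'].to_numpy() + 1
df.loc[feb, 'Col2'] = src.to_numpy()
print(df)
```

This relies on both months listing the same categories in the same order, because the values are matched by position rather than by label.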

My current workaround is to unstack each column into a DataFrame with simple indexes. This works, but I have lots of columns to work with. One dataframe is much easier to work with than a pile of dataframes.
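For reference, the unstack workaround described above might look roughly like the following sketch for a single column. The float-filled frame and the names `jan`, `feb`, and `wide` are assumptions for the illustration, not taken from the real code.

```python
import numpy as np
import pandas as pd

jan, feb = pd.Timestamp('2014-01-01'), pd.Timestamp('2014-02-01')
index = pd.MultiIndex.from_product([[jan, feb], ['A', 'B', 'C', 'D']],
                                   names=['ts', 'cat'])
# Assumption: a float frame of NaNs stands in for the real data.
df = pd.DataFrame(np.nan, index=index, columns=['Col1', 'Col2'])
df.loc[jan, 'Col2'] = [1, 2, 3, 4]

# One row per month, one column per category.
wide = df['Col2'].unstack('cat')

# Both sides are now labelled by category, so alignment matches them up.
wide.loc[feb] = wide.loc[jan] + 1
print(wide)
```

Each column needs its own wide frame, which is the "pile of dataframes" drawback mentioned above.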

The computations for each month depend on the values computed in the previous month, which is why they can't be fully vectorized over an entire column.
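A month-by-month loop of this kind could be sketched as follows while staying in a single frame. The recurrence in `next_month` is purely hypothetical and stands in for the real computation. The sketch also assumes every month has the same categories in the same order, since values are carried over by position.

```python
import numpy as np
import pandas as pd

months = [pd.Timestamp(d) for d in ['2014-01-01', '2014-02-01', '2014-03-01']]
index = pd.MultiIndex.from_product([months, ['A', 'B', 'C', 'D']],
                                   names=['ts', 'cat'])
df = pd.DataFrame(np.nan, index=index, columns=['Col1'])
df.loc[months[0], 'Col1'] = [1, 2, 3, 4]


def next_month(prev_values):
    # Hypothetical recurrence: double the previous month's values.
    return prev_values * 2


# Walk consecutive month pairs, passing plain arrays to avoid index alignment.
for prev, cur in zip(months[:-1], months[1:]):
    df.loc[cur, 'Col1'] = next_month(df.loc[prev, 'Col1'].to_numpy())

print(df)
```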

Labels: Enhancement; Indexing (Related to indexing on series/frames, not to indexes themselves); MultiIndex
