Hello the awesome Pandas team!
Consider the example below:
import pandas as pd

data = pd.DataFrame({'mydate': [pd.to_datetime('2016-06-06'),
                                pd.to_datetime('2016-06-08'),
                                pd.to_datetime('2016-06-09'),
                                pd.to_datetime('2016-06-10'),
                                pd.to_datetime('2016-06-12'),
                                pd.to_datetime('2016-06-13')],
                     'myvalue': [1, 2, 3, 4, 5, 6],
                     'group': ['A', 'A', 'A', 'B', 'B', 'B']})
data.set_index('mydate', inplace=True)
data
Out[58]:
           group  myvalue
mydate
2016-06-06     A        1
2016-06-08     A        2
2016-06-09     A        3
2016-06-10     B        4
2016-06-12     B        5
2016-06-13     B        6
Now I need to compute, within each group, the difference between the current value of myvalue
and its lagged value, where by lagged I mean lagged by 1 calendar day (if possible), not by 1 row.
The following returns a result, but it's not what I need:
data['delta_one'] = data.groupby('group').myvalue.transform(lambda x: x - x.shift(1))
data
Out[56]:
           group  myvalue  delta_one
mydate
2016-06-06     A        1        nan
2016-06-08     A        2     1.0000
2016-06-09     A        3     1.0000
2016-06-10     B        4        nan
2016-06-12     B        5     1.0000
2016-06-13     B        6     1.0000
This is what I need, but it raises an error:
data['delta_two'] = data.groupby('group').myvalue.transform(lambda x: x - x.shift(1, pd.Timedelta('1 days')))
File "C:\Users\john\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 120, in __init__
len(self.mgr_locs)))
ValueError: Wrong number of items passed 4, placement implies 3
Any ideas? Is this a bug? I think I am using the correct pandonic syntax here.
Thanks!
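
A possible workaround, offered as a sketch rather than a confirmed fix: shifting with a frequency moves the index labels instead of the values, so `x - x.shift(1, freq)` aligns on the union of the two indexes and comes back longer than the input group, which `transform` may not accept. Reindexing the shifted series to the group's own dates before subtracting keeps the result the same shape as the input. The helper name `one_day_delta` is made up for illustration, and the behaviour is assumed for a reasonably recent pandas version.

```python
import pandas as pd

data = pd.DataFrame(
    {
        "mydate": pd.to_datetime(
            ["2016-06-06", "2016-06-08", "2016-06-09",
             "2016-06-10", "2016-06-12", "2016-06-13"]
        ),
        "myvalue": [1, 2, 3, 4, 5, 6],
        "group": ["A", "A", "A", "B", "B", "B"],
    }
).set_index("mydate")


def one_day_delta(x):
    """Hypothetical helper: value minus the value dated exactly 1 day earlier."""
    # With freq set, shift() adds 1 day to every index label; values stay put.
    lagged = x.shift(1, freq=pd.Timedelta("1 days"))
    # Keep only the dates present in this group; dates with no row one day
    # earlier become NaN, so the result has the same index as x.
    return x - lagged.reindex(x.index)


data["delta_two"] = data.groupby("group")["myvalue"].transform(one_day_delta)
print(data)
```

With the sample data, only 2016-06-09 and 2016-06-13 have a row dated exactly one day earlier in the same group, so those are the only rows expected to get a non-missing delta_two.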