Description
Research
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
Something weird when using groupby.rolling.apply(function)
Question about pandas
My code is:

    def mean_second_derivative_centra(x):
        print(x)  # debug: show the window that rolling().apply() passes in
        sum_value = 0
        for i in range(len(x) - 5):
            sum_value += (x[i + 5] - 2 * x[i + 3] + x[i]) / 2
        return sum_value / (2 * (len(x) - 5))

    df['center_second_derivative_centra'] = (
        df['vwap']
        .groupby(level=0)
        .rolling(20)
        .apply(mean_second_derivative_centra)
        .fillna(0)
        .reset_index(level=0, drop=True)
    )

The original data has a two-level index (ticker, TradeDate), like this:
ticker TradeDate
000006.SZ 20190102 323.068289
20190103 322.534244
20190104 324.741135
20190107 333.052955
20190108 333.284140
688800.SH 20240315 44.268274
20240318 46.026523
20240319 46.574582
20240320 46.519835
20240321 45.813010
Name: vwap, Length: 2240979, dtype: float64
When using df['vwap'].groupby(level=0).rolling(20).apply(mean_second_derivative_centra), the printed output for one window is:
ticker TradeDate
000762.SZ 20190125 119.486283
20190128 123.510591
20190129 130.372216
20190130 137.017194
20190131 144.383060
000767.SZ 20190102 139.268014
20190103 138.143270
20190104 143.540819
20190107 149.846321
20190108 151.003238
20190109 154.993187
20190110 157.060098
20190111 173.592207
20190114 183.498919
20190115 164.814842
20190116 148.880467
20190117 153.782416
20190118 157.737036
20190121 153.820486
20190122 163.393745
dtype: float64
Judging by the printed window, which spans two tickers, it looks as though the data has not been grouped, so the result might be wrong.
However, the result itself is correct: this can be inferred from where the NaN values appear in the output.
I'm really confused by the printed output.
I hope someone can explain what is going on.
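
For reference, here is a minimal sketch of how the NaN reasoning above can be checked. It compares the grouped rolling result with a rolling computation done on each ticker separately. The toy data (tickers "AAA" and "BBB", 25 rows each, random values) is hypothetical and only stands in for the real vwap series. The function is the same as above except that the print is removed, and raw=True is passed so that it receives a plain NumPy array; that is an assumption made for the sketch, not what the original call does.

```python
import numpy as np
import pandas as pd


def mean_second_derivative_centra(x):
    # With raw=True, x is a 1-D NumPy array, so x[i] is purely positional.
    sum_value = 0.0
    for i in range(len(x) - 5):
        sum_value += (x[i + 5] - 2 * x[i + 3] + x[i]) / 2
    return sum_value / (2 * (len(x) - 5))


# Hypothetical toy data: two tickers with 25 rows each.
tickers = ["AAA", "BBB"]
idx = pd.MultiIndex.from_product(
    [tickers, range(25)], names=["ticker", "TradeDate"]
)
rng = np.random.default_rng(0)
vwap = pd.Series(rng.normal(100, 5, size=len(idx)), index=idx, name="vwap")

# Grouped rolling, as in the question (without fillna, to keep the NaNs visible).
grouped = (
    vwap.groupby(level=0)
    .rolling(20)
    .apply(mean_second_derivative_centra, raw=True)
    .reset_index(level=0, drop=True)
)

# Reference: roll over each ticker's rows on their own, then concatenate.
pieces = []
for t in tickers:
    one_ticker = vwap[vwap.index.get_level_values("ticker") == t]
    pieces.append(
        one_ticker.rolling(20).apply(mean_second_derivative_centra, raw=True)
    )
reference = pd.concat(pieces)

# If windows never cross tickers, each ticker starts with 19 NaNs (window 20)
# and the grouped values equal the per-ticker values.
nan_per_ticker = grouped.isna().groupby(level=0).sum()
matches = np.allclose(grouped.to_numpy(), reference.to_numpy(), equal_nan=True)
print(nan_per_ticker)
print("grouped result equals per-ticker result:", matches)
```

If the windows used for the computation really crossed from one ticker into the next, the second ticker would not start with its own run of NaN values and the two results would differ.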