
QST: something weird in using groupby.rolling.apply(function) #58061

Open
@Li-xy-lucky

Research

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

something weird in using groupby.rolling.apply(function)

Question about pandas

My code is:

def mean_second_derivative_centra(x):
    print(x)
    sum_value=0
    for i in range(len(x)-5):
        sum_value+=(x[i+5]-2*x[i+3]+x[i])/2
    return sum_value/(2*(len(x)-5))

df['center_second_derivative_centra'] = df['vwap'].groupby(level=0).rolling(20).apply(mean_second_derivative_centra).fillna(0).reset_index(level=0,drop=True)
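For reference, the loop above can also be written with NumPy slices, which may matter on a series of this size. This is only a sketch: the function names and the test window are made up for illustration, and the formula is kept exactly as posted, including both factors of 2.

```python
import numpy as np


def mean_second_derivative_loop(x):
    # Same arithmetic as the function in the question, without the print.
    total = 0.0
    for i in range(len(x) - 5):
        total += (x[i + 5] - 2 * x[i + 3] + x[i]) / 2
    return total / (2 * (len(x) - 5))


def mean_second_derivative_vectorized(x):
    # Hypothetical vectorized equivalent: the three slices line up
    # x[i+5], x[i+3] and x[i] for i = 0 .. len(x) - 6.
    x = np.asarray(x, dtype=float)
    terms = (x[5:] - 2 * x[3:-2] + x[:-5]) / 2
    return terms.sum() / (2 * (len(x) - 5))


# Made-up 20-element window, only used to compare the two versions.
window = np.arange(20, dtype=float) ** 2
loop_value = mean_second_derivative_loop(window)
vec_value = mean_second_derivative_vectorized(window)
print(np.isclose(loop_value, vec_value))
```

The vectorized version expects positional data, so it would be passed to apply with raw=True, which hands the function a NumPy array instead of a Series.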

The original data has a two-level index (ticker, TradeDate), like this:

ticker     TradeDate
000006.SZ  20190102     323.068289
           20190103     322.534244
           20190104     324.741135
           20190107     333.052955
           20190108     333.284140
   
688800.SH  20240315      44.268274
           20240318      46.026523
           20240319      46.574582
           20240320      46.519835
           20240321      45.813010
Name: vwap, Length: 2240979, dtype: float64

When I run df['vwap'].groupby(level=0).rolling(20).apply(mean_second_derivative_centra), one of the printed windows looks like this:

ticker     TradeDate
000762.SZ  20190125     119.486283
           20190128     123.510591
           20190129     130.372216
           20190130     137.017194
           20190131     144.383060
000767.SZ  20190102     139.268014
           20190103     138.143270
           20190104     143.540819
           20190107     149.846321
           20190108     151.003238
           20190109     154.993187
           20190110     157.060098
           20190111     173.592207
           20190114     183.498919
           20190115     164.814842
           20190116     148.880467
           20190117     153.782416
           20190118     157.737036
           20190121     153.820486
           20190122     163.393745
dtype: float64

It looks as if the data has not been grouped, because the printed window contains rows from two tickers (000762.SZ and 000767.SZ), so the result might be wrong.
However, the final result appears to be correct, judging by where the NaN values appear.
I'm really confused by the printed output.
I hope someone can explain what is going on.
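One way to check the numbers independently of what print shows is to compare the groupby.rolling result against an explicit loop over the groups. The sketch below is not a diagnosis of the printed index. It uses synthetic data (tickers "A", "B", "C" with 30 rows each are made up) and passes raw=True, which differs from the call in the question, so the function receives plain NumPy arrays.

```python
import numpy as np
import pandas as pd


def mean_second_derivative_centra(x):
    # Formula from the question, applied to a NumPy array (raw=True).
    total = 0.0
    for i in range(len(x) - 5):
        total += (x[i + 5] - 2 * x[i + 3] + x[i]) / 2
    return total / (2 * (len(x) - 5))


# Synthetic stand-in for df['vwap']: 3 made-up tickers, 30 rows each.
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], range(30)], names=["ticker", "TradeDate"]
)
vwap = pd.Series(rng.normal(100, 5, len(index)), index=index, name="vwap")

# Same call chain as in the question, except for raw=True and no fillna.
grouped = (
    vwap.groupby(level=0)
    .rolling(20)
    .apply(mean_second_derivative_centra, raw=True)
    .reset_index(level=0, drop=True)
)

# Reference: roll over each ticker separately, then stitch the pieces together.
manual = pd.concat(
    [
        g.rolling(20).apply(mean_second_derivative_centra, raw=True)
        for _, g in vwap.groupby(level=0)
    ]
)

# True if both approaches agree, treating NaN as equal to NaN.
matches = np.allclose(grouped.to_numpy(), manual.to_numpy(), equal_nan=True)

# With window=20, each ticker should start with 19 NaN values.
nan_per_ticker = grouped.isna().groupby(level=0).sum()
print(matches)
print(nan_per_ticker)
```

If the two results agree and every ticker starts with its own run of NaN values, the windows used for the calculation stay within each group.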

Labels

Needs Triage, Usage Question
