Description
Research
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage-related question on StackOverflow.
Link to question on StackOverflow
Question about pandas
For the following functions:
1. def nancorr
2. cdef void add_var
3. cdef void add_skew
4. cdef void add_mean
It appears that both Welford's method and Kahan summation are used across these functions, but not consistently. For second-order statistics such as correlation and variance, only Welford's method seems to be used, with no Kahan compensation applied to the running means (meanx or meany). For third-order statistics such as skewness, only Kahan summation seems to be applied, on top of the naive one-pass algorithm, with no Welford-style update.
My question is: how does the pandas community decide which method to use for numerical stability? If the goal is the highest possible stability, it seems that all of these functions should combine Welford's method with Kahan summation.
Could you please clarify the rationale behind these choices?
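
To make the suggestion concrete, here is a minimal pure-Python sketch of what I mean by combining the two methods for the variance. It is not the pandas implementation. The names `welford_var` and `welford_kahan_var` are hypothetical, and applying the Kahan compensation to the running-mean update is only my assumption about how the two techniques could be combined.

```python
import math


def welford_var(values, ddof=1):
    """Plain Welford update: running mean plus sum of squared deviations."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - ddof) if n > ddof else math.nan


def welford_kahan_var(values, ddof=1):
    """Welford update with a Kahan-compensated running mean (assumed combination)."""
    n = 0
    mean = 0.0
    comp = 0.0  # low-order bits lost in earlier mean updates
    m2 = 0.0
    for x in values:
        n += 1
        delta = x - mean
        # Kahan step: add the increment delta / n to the mean and
        # carry the rounding error forward in comp.
        y = delta / n - comp
        t = mean + y
        comp = (t - mean) - y
        mean = t
        m2 += delta * (x - mean)
    return m2 / (n - ddof) if n > ddof else math.nan


if __name__ == "__main__":
    data = [1e9 + i for i in range(10)]
    print(welford_var(data), welford_kahan_var(data))
```

Both functions compute the same quantity in exact arithmetic. They differ only in how rounding error in the running mean is handled, which is the part I would like to understand the reasoning for.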