DataFrame.corr(method="kendall") calculation is slow #28329

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(1000, 300))

df.corr(method="kendall")
# 21.6 s ± 686 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

`DataFrame.corr(method="kendall")` scales poorly, perhaps because it is the only named correlation method that isn't Cythonized at the moment: we just call `kendalltau` from `scipy` repeatedly, for each pair of columns, in a Python for loop (https://github.com/pandas-dev/pandas/blob/master/pandas/core/frame.py#L7454). It may be worthwhile to implement something more efficient within `_libs/algos.pyx`.
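To make the cost concrete, here is a rough sketch of what the pairwise loop amounts to. It is not the actual pandas source; `kendall_corr_loop` is a hypothetical name used only for illustration, and the handling of missing values is an assumption. For the 300-column frame above, the loop makes 300 * 299 / 2 = 44,850 separate `kendalltau` calls, so Python-level overhead per call may add up quickly.

```python
import numpy as np
import pandas as pd
from scipy.stats import kendalltau


def kendall_corr_loop(df: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Kendall tau via a plain Python double loop (illustrative only)."""
    values = df.to_numpy(dtype=float)
    n_cols = values.shape[1]
    out = np.eye(n_cols)  # assumption: diagonal is fixed at 1.0
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            # Assumption: keep only rows where both columns are finite.
            valid = np.isfinite(values[:, i]) & np.isfinite(values[:, j])
            # One scipy call per column pair; this is the hot spot.
            tau, _ = kendalltau(values[valid, i], values[valid, j])
            out[i, j] = out[j, i] = tau
    return pd.DataFrame(out, index=df.columns, columns=df.columns)


# Small frame so the comparison runs quickly.
small = pd.DataFrame(np.random.default_rng(0).standard_normal((50, 5)))
result = kendall_corr_loop(small)
```

On tie-free data like this, the result should agree with `small.corr(method="kendall")`. A Cython version would presumably keep the same pairwise structure but avoid the per-pair Python call overhead.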

Relevant discussion: https://github.com/pandas-dev/pandas/pull/28151
