Code Sample, a copy-pastable example if possible
import pandas as pd
from scipy.stats import pearsonr

df = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 5, 6]})
# pearsonr returns (correlation, p-value); index 1 selects the p-value
print(df.corr(method=lambda x, y: pearsonr(x, y)[1]))
          A         B
A  1.000000  0.178912
B  0.178912  1.000000
Problem description
I want to use the method argument of corr to compute p-values. However, the diagonal elements are set to 1, whereas I would expect them to be 0. They are set to 1 here:
Lines 7025 to 7026 in cb00deb
Although I can see that 1 is the expected value for a 'normal' correlation, this is not the case in my example. Hence, I would suggest removing these two lines from frame.py.
Expected Output
          A         B
A  0.000000  0.178912
B  0.178912  0.000000
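As a possible workaround while the diagonal is filled with 1, the expected output can be approximated by overwriting the diagonal after the call to corr. This is only a minimal sketch: the helper name pvalue_matrix is hypothetical, and it assumes that a p-value of 0 on the diagonal is the desired convention.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def pvalue_matrix(frame):
    """Pairwise Pearson p-values with the diagonal forced to 0.

    Hypothetical helper, not part of pandas.
    """
    # Let corr() evaluate the callable for every pair of columns.
    result = frame.corr(method=lambda x, y: pearsonr(x, y)[1])

    # Work on a copy of the values so the original result is not mutated,
    # then replace the diagonal (reported as 1) with 0.
    values = result.to_numpy().copy()
    np.fill_diagonal(values, 0.0)

    return pd.DataFrame(values, index=result.index, columns=result.columns)


df = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 5, 6]})
print(pvalue_matrix(df))
```

This leaves the off-diagonal values untouched, so it only changes the entries that corr sets itself rather than computing through the callable.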
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.165-81-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.1.1
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.3
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.3
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: None
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.4.0-b1
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.0
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None