Description
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
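import pandas as pd

df = pd.DataFrame({'A': ['0.09', '0.63', '0.121909', '0.117863']})
df['B'] = pd.to_numeric(df['A'])  # column B: converted with pd.to_numeric
df['C'] = df['A'].astype(float)   # column C: converted with astype(float)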
Problem description
When a DataFrame column holds floats stored as strings, pd.to_numeric converts some of those strings inexactly.
In the above case df['B'] and df['C'] should have exactly the same values, but they differ in the last decimal digits:
df['B'] = (0, 0.09) (1, 0.63) (2, 0.12190899999999999) (3, 0.11786300000000001)
df['C'] = (0, 0.09) (1, 0.63) (2, 0.121909) (3, 0.117863)
df['C'] is correct but df['B'] is incorrect.
The issue does not appear when the same input is passed as a list:
w = pd.to_numeric(['0.09', '0.63', '0.121909', '0.117863'])
[0.09 0.63 0.121909 0.117863]
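Note that NumPy prints arrays with 8 digits of precision by default, so the printed list result could hide a last-digit difference. A minimal sketch for checking this exactly, rather than by printed output, is below. It compares each conversion against Python's built-in float(), which rounds to the nearest double. The helper ulp_distance is hypothetical (not part of pandas or NumPy) and assumes positive, finite values.

```python
import numpy as np
import pandas as pd

strings = ['0.09', '0.63', '0.121909', '0.117863']
s = pd.Series(strings)

# Reference values: Python's float() rounds each string to the nearest double.
reference = np.array([float(x) for x in strings], dtype=np.float64)

def ulp_distance(values, expected):
    """Hypothetical helper: steps between positive float64 values, via their bit patterns."""
    a = np.ascontiguousarray(values, dtype=np.float64).view(np.int64)
    b = np.ascontiguousarray(expected, dtype=np.float64).view(np.int64)
    return np.abs(a - b)

results = {
    "to_numeric(Series)": pd.to_numeric(s).to_numpy(),
    "astype(float)": s.astype(float).to_numpy(),
    "to_numeric(list)": pd.to_numeric(strings),
}

for name, values in results.items():
    # repr() shows the shortest string that round-trips, so nothing is hidden by rounding.
    print(name, [repr(float(v)) for v in values], ulp_distance(values, reference).tolist())
```

A distance of 0 in every position means the conversion matches float() bit for bit. A nonzero entry means that conversion path landed on a neighbouring double.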
Expected Output
df['B'] should hold the same values as df['C']: (0, 0.09) (1, 0.63) (2, 0.121909) (3, 0.117863)
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2a7d332
python : 3.8.5.final.0
python-bits : 32
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.1.2