Unexpected behavior when using the infer_objects function #22212

Open
@rora002

Code Sample

In the code sample below, column "A" consists entirely of numbers formatted as strings.

import pandas as pd

df = pd.DataFrame({"A": ["1", "2", "3"]})
df.convert_objects(convert_numeric=True).dtypes  # deprecated; converts "A" to int64
df.infer_objects().dtypes  # leaves "A" as object

Problem description

At present, I use the convert_objects function to convert any column made up entirely of numbers formatted as strings to a numeric dtype where possible. Since convert_objects is deprecated, I tried to update my code to use infer_objects instead.

However, the infer_objects function appears to work differently: it only converts a column to a numeric type if the values in that column are already numbers that happen to be stored in an object-dtype series. Numbers formatted as strings are left unchanged (as shown in the example).
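To make the difference concrete, here is a minimal sketch contrasting the two cases. The variable names are illustrative only, and it assumes a pandas version in which infer_objects is available (0.21 or later).

```python
import pandas as pd

# Real integers that were stored with object dtype.
ints_as_object = pd.DataFrame({"A": [1, 2, 3]}, dtype=object)

# Numbers formatted as strings, as in the code sample above.
strings = pd.DataFrame({"A": ["1", "2", "3"]})

# infer_objects only re-infers the dtype of values that are already numbers.
inferred_ints = ints_as_object.infer_objects()
inferred_strings = strings.infer_objects()

print(inferred_ints.dtypes)     # "A" is converted to an integer dtype
print(inferred_strings.dtypes)  # "A" is not converted to a numeric dtype
```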

I understand that converting columns consisting entirely of string-formatted numbers to numeric types may not be desirable as the default behavior; however, it would be handy to have an argument that allows either behavior.

Without such an argument, one must loop through each column and attempt the conversion using the to_numeric function.
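A rough sketch of that per-column workaround might look like the following. The helper name convert_numeric_strings is hypothetical (it is not part of pandas), and the sketch assumes unique column names and that a column should be converted only when every value in it parses as a number.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def convert_numeric_strings(df):
    """Return a copy of df where text columns that parse fully as numbers are numeric.

    Hypothetical helper for illustration; not a pandas API.
    """
    out = df.copy()
    for col in out.columns:
        # Only attempt conversion on object/string columns.
        if not (is_object_dtype(out[col]) or is_string_dtype(out[col])):
            continue
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # At least one value is not numeric: leave the column as it is.
            pass
    return out


df = pd.DataFrame({"A": ["1", "2", "3"], "B": ["x", "2", "3"]})
result = convert_numeric_strings(df)
print(result.dtypes)
```

Here column "A" should come back with an integer dtype, while "B" keeps its original dtype because "x" cannot be parsed.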

Expected Output

# output from df.convert_objects(convert_numeric=True).dtypes
A    int64
dtype: object

# output from df.infer_objects().dtypes
A    object
dtype: object

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.23.1
pytest: 3.2.1
pip: 18.0
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.2.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
