BUG/API: Data assignment dtype inference inconsistencies #14001

Closed
@sinhrks

Code Sample, a copy-pastable example if possible

Assume we have an int DataFrame and assign float values to it.

  • assigning a 2-d float array keeps the columns as int where possible.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)
df
#    0    1
#0  1  2.1
#1  2  2.2
#2  0  0.0
  • but assigning a 1-d array coerces the column to float instead of keeping int.
df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0] = np.array([1, 2], dtype=np.float64)
df
#      0  1
#0  1.0  0
#1  2.0  0
#2  0.0  0

Expected Output

I think the first case should give all-float columns, since we're assigning values of float dtype.

df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)
df
#      0    1
#0  1.0  2.1
#1  2.0  2.2
#2  0.0  0.0
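
Until the inference is made consistent, one way to get the expected all-float result regardless of how the assignment path infers dtypes is to cast the frame before assigning. This is only a workaround sketch, not the proposed fix; it assumes casting every column to float64 up front is acceptable for the data.

```python
import numpy as np
import pandas as pd

# Same starting frame as in the report: two int64 columns.
df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))

# Cast explicitly first, so the assignment has nothing left to infer.
df = df.astype(np.float64)
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)

print(df.dtypes)
```

With the explicit cast, both columns are float64 before and after the assignment, so the result does not depend on the shape of the assigned array.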

The inference occurs around here, and it should be something like:

if hasattr(value, 'dtype'):
    if is_dtype_equal(values.dtype, value.dtype):
        dtype = value.dtype
    else:
        dtype = _find_common_type(values.dtype, value.dtype)
elif is_scalar(value):
    dtype, _ = _infer_dtype_from_scalar(value)
else:
    # not sure this branch can be reached...
    dtype = 'infer'
...
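
The rule above can be tried outside pandas with a NumPy-only sketch. The helper name `infer_assignment_dtype` is hypothetical, `np.result_type` stands in for the internal `_find_common_type`, and `np.asarray(value).dtype` stands in for `_infer_dtype_from_scalar`; none of this is the actual pandas implementation.

```python
import numpy as np


def infer_assignment_dtype(existing_dtype, value):
    """Hypothetical helper: pick the dtype for a block after assigning `value`."""
    existing_dtype = np.dtype(existing_dtype)
    if hasattr(value, "dtype"):
        if existing_dtype == value.dtype:
            # Same dtype on both sides: nothing to coerce.
            return value.dtype
        # Different dtypes: promote to a common type (stand-in for _find_common_type).
        return np.result_type(existing_dtype, value.dtype)
    if np.isscalar(value):
        # Scalar: infer the dtype from the scalar alone, as in the snippet above.
        return np.asarray(value).dtype
    # Anything else: signal that the caller should fall back to inference.
    return None


# Assigning a float array into an int64 block promotes to float64,
# independent of whether the assigned array is 1-d or 2-d.
print(infer_assignment_dtype(np.int64, np.array([1, 2], dtype=np.float64)))
print(infer_assignment_dtype(np.int64, np.array([[1, 2.1], [2, 2.2]])))
```

Under this rule both cases from the report would end up as float, because the decision depends only on the two dtypes and not on whether the assigned values happen to be integral.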

output of pd.show_versions()

0.18.1
