Description
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd
test = pd.DataFrame({'a': ['test'], 'b': [np.nan]})
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')
# *** AttributeError: 'IntegerArray' object has no attribute 'size'
Problem description
Assignment of .astype('Int64') sometimes fails with AttributeError: 'IntegerArray' object has no attribute 'size'.
Initially, I thought this was a regression from #25584, but it only manifests under very specific conditions:
- The converted column consists of exactly one value, and that value is null
# works
test = pd.DataFrame({'a': ['test', 'test'], 'b': [np.nan, 1.0]})
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')

# works
test = pd.DataFrame({'a': ['test', 'test'], 'b': [np.nan, np.nan]})
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')

# fails
test = pd.DataFrame({'a': ['test'], 'b': [np.nan]})
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')
# *** AttributeError: 'IntegerArray' object has no attribute 'size'
- There is another column in the DataFrame
# works
test = pd.DataFrame({'b': [np.nan]})
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')
- Assignment happens via df.loc[:, colname]; without assignment, or with assignment to df[colname], things work.
test = pd.DataFrame({'a': ['test'], 'b': [np.nan]})
# fails
test.loc[:, 'b'] = test['b'].astype('Int64')
# *** AttributeError: 'IntegerArray' object has no attribute 'size'
# fails
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64')
# *** AttributeError: 'IntegerArray' object has no attribute 'size'
# works
test.loc[:, 'b'].astype('Int64')
# works
test['b'] = test['b'].astype('Int64')
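The three conditions can be checked in one pass. The following is a minimal sketch rather than part of the original report: the helper name `try_assign`, the `how` labels and the case descriptions are made up for illustration. Which cases raise depends on the installed pandas version, since the failures above were observed on pandas 1.0.1.

```python
import numpy as np
import pandas as pd


def try_assign(data, how):
    """Cast column 'b' to Int64, assign it back, and report what happened.

    `try_assign` and the `how` labels ("loc", "setitem") are hypothetical
    names used only in this sketch.
    """
    if how not in ("loc", "setitem"):
        raise ValueError(f"unknown assignment mode: {how!r}")
    df = pd.DataFrame(data)
    converted = df["b"].astype("Int64")
    try:
        if how == "loc":
            df.loc[:, "b"] = converted  # the form reported as failing
        else:
            df["b"] = converted  # the form reported as working
    except Exception as exc:
        # Record the error instead of stopping, so every case gets run.
        return f"{type(exc).__name__}: {exc}"
    return "ok"


cases = {
    "one null, extra column, .loc": ({"a": ["test"], "b": [np.nan]}, "loc"),
    "one null, extra column, []": ({"a": ["test"], "b": [np.nan]}, "setitem"),
    "one null, no extra column, .loc": ({"b": [np.nan]}, "loc"),
    "two nulls, extra column, .loc": (
        {"a": ["test", "test"], "b": [np.nan, np.nan]},
        "loc",
    ),
    "null and 1.0, extra column, .loc": (
        {"a": ["test", "test"], "b": [np.nan, 1.0]},
        "loc",
    ),
}

results = {label: try_assign(data, how) for label, (data, how) in cases.items()}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

On an affected version only the first case should report the AttributeError. A version where every case prints "ok" most likely no longer has the bug.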
Expected Output
test.loc[:, 'b'] = test.loc[:, 'b'].astype('Int64') shouldn't raise an AttributeError.
print(test) after assignment should read along the lines of
0 <NA>
dtype: Int64
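As a rough way to pin down the expected state, the sketch below goes through `test['b'] = ...`, the assignment form the report lists as working, and inspects the resulting column. The specific checks (dtype name, all values null) are illustrative choices, not something the report prescribes.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"a": ["test"], "b": [np.nan]})

# Assign through df[colname], which the report describes as working.
test["b"] = test["b"].astype("Int64")

# The column should now use the nullable integer dtype and hold one missing value.
print(test["b"].dtype)
print(test["b"].isna().all())
```

The same two checks would be expected to hold after the `.loc[:, 'b']` assignment once it no longer raises.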
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-46-generic
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.1
numpy : 1.16.1
pytz : 2018.9
dateutil : 2.8.0
pip : 19.3.1
setuptools : 41.4.0
Cython : None
pytest : 3.10.1
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.0
html5lib : None
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.10
IPython : 7.2.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.0
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 3.10.1
pyxlsb : None
s3fs : None
scipy : 1.2.0
sqlalchemy : 1.2.17
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None