Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
On a DataFrame with categorical dtype, df.at[x, y] = v sets every non-initialized value in row x to v, not just the cell at (x, y).
Code Sample, a copy-pastable example
$ python
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'1.2.0.dev0+1137.g50b34a4a8'
>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  foo  foo  foo
2  NaN  NaN  NaN
It doesn't overwrite values that have already been set with df.loc:
>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.loc[1,1] = 'bar' # not necessary, just for demo
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  foo  bar  foo
2  NaN  NaN  NaN
Problem description
df.at[x, y] = v on a categorical dtype should behave as it does with other dtypes, and the same as df.loc[x, y] = v.
Expected Output
The same as what happens with a DataFrame initialized with Nones:
>>> df = pd.DataFrame([[None] * 3] * 3, index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.loc[1,1] = 'bar'
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  NaN  bar  foo
2  NaN  NaN  NaN
Or as with dtype=float:
>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=float)
>>> df.loc[1,1] = 1
>>> df.at[1,2] = 27
>>> df
    0    1     2
0 NaN  NaN   NaN
1 NaN  1.0  27.0
2 NaN  NaN   NaN
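To make the expectation checkable rather than eyeballed, here is a minimal sketch of a helper that lists which cells a single assignment actually touched. The name changed_cells is hypothetical (it is not part of pandas), and the example uses the dtype=float frame because that is the case reported above as behaving correctly.

```python
import pandas as pd


def changed_cells(before: pd.DataFrame, after: pd.DataFrame) -> list:
    """Return (row, column) labels of cells that differ, treating NaN as equal to NaN.

    Hypothetical helper for this report; not a pandas API.
    """
    # Cast to object so categorical and numeric frames compare the same way.
    b = before.astype(object)
    a = after.astype(object)
    # NaN != NaN is True elementwise, so mask out cells missing on both sides.
    both_missing = b.isna() & a.isna()
    differs = b.ne(a) & ~both_missing
    return [
        (r, c)
        for r in differs.index
        for c in differs.columns
        if differs.at[r, c]
    ]


# Float frame: a single .at assignment should touch exactly one cell.
df = pd.DataFrame(index=range(3), columns=range(3), dtype=float)
snapshot = df.copy()
df.at[1, 2] = 27
touched = changed_cells(snapshot, df)
print(touched)
```

Pointing the same helper at the categorical frame from the code sample should, per the expected behavior described in this report, also list only the cell (1, 2); on the affected build it would list the whole of row 1.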
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
- commit : 50b34a4
- python : 3.8.0.final.0
- python-bits : 64
- OS : Linux
- OS-release : 5.4.0-52-generic
- Version : #57~18.04.1-Ubuntu SMP Thu Oct 15 14:04:49 UTC 2020
- machine : x86_64
- processor : x86_64
- byteorder : little
- LC_ALL : None
- LANG : en_US.UTF-8
- LOCALE : en_US.UTF-8
- pandas : 1.2.0.dev0+1137.g50b34a4a8
- numpy : 1.19.4
- pytz : 2020.4
- dateutil : 2.8.1
- pip : 20.2.4
- setuptools : 50.3.2
- Cython : None
- pytest : None
- hypothesis : None
- sphinx : None
- blosc : None
- feather : None
- xlsxwriter : None
- lxml.etree : None
- html5lib : None
- pymysql : None
- psycopg2 : None
- jinja2 : None
- IPython : None
- pandas_datareader: None
- bs4 : None
- bottleneck : None
- fsspec : None
- fastparquet : None
- gcsfs : None
- matplotlib : None
- numexpr : None
- odfpy : None
- openpyxl : None
- pandas_gbq : None
- pyarrow : None
- pyxlsb : None
- s3fs : None
- scipy : None
- sqlalchemy : None
- tables : None
- tabulate : None
- xarray : None
- xlrd : None
- xlwt : None
- numba : None