BUG: DataFrame.at setter of categorical DF overwrites entire row #37763

Closed
@treszkai

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


On a DataFrame with a categorical dtype, df.at[x, y] = v sets every uninitialized value in row x to v, not just the cell at (x, y).

Code Sample, a copy-pastable example

$ python
Python 3.8.0 (default, Oct 28 2019, 16:14:01) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'1.2.0.dev0+1137.g50b34a4a8'
>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  foo  foo  foo
2  NaN  NaN  NaN

Values in the row that were already set with df.loc are not overwritten:

>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.loc[1,1] = 'bar'  # not necessary, just for demo
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  foo  bar  foo
2  NaN  NaN  NaN

Problem description

On a DataFrame with a categorical dtype, df.at[x, y] = v should set only the cell at (x, y), as it does with other dtypes and as df.loc[x, y] = v does.

Expected Output

The same result as with a DataFrame explicitly initialized with None values:

>>> df = pd.DataFrame([[None] * 3] * 3, index=range(3), columns=range(3), dtype=pd.CategoricalDtype(['foo', 'bar']))
>>> df.loc[1,1] = 'bar'
>>> df.at[1,2] = 'foo'
>>> df
     0    1    2
0  NaN  NaN  NaN
1  NaN  bar  foo
2  NaN  NaN  NaN

Or as with dtype=float:

>>> df = pd.DataFrame(index=range(3), columns=range(3), dtype=float)
>>> df.loc[1,1] = 1
>>> df.at[1,2] = 27
>>> df
    0    1     2
0 NaN  NaN   NaN
1 NaN  1.0  27.0
2 NaN  NaN   NaN
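
The expectation above can also be written as a small script instead of being read off printed frames. This is only a sketch: make_frame, via_at, via_loc, cells_set_by_at and at_matches_loc are names made up for illustration, and it assumes that df.loc behaves correctly on the empty categorical frame, as the report describes.

```python
import pandas as pd


# Hypothetical helper (not part of pandas): builds the empty categorical
# frame used in the report above.
def make_frame():
    dtype = pd.CategoricalDtype(["foo", "bar"])
    return pd.DataFrame(index=range(3), columns=range(3), dtype=dtype)


# Set the same single cell through each accessor, each on a fresh frame.
via_at = make_frame()
via_at.at[1, 2] = "foo"

via_loc = make_frame()
via_loc.loc[1, 2] = "foo"

# Count the non-missing cells; exactly one is expected if only (1, 2) was set.
cells_set_by_at = int(via_at.notna().sum().sum())

# True when both accessors produced identical frames.
at_matches_loc = via_at.equals(via_loc)

print(cells_set_by_at, at_matches_loc)
```

Going by the output shown in this report, the affected build would count three set cells and the two frames would differ. With the expected behavior the count is one and the frames are equal.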

Output of pd.show_versions()

INSTALLED VERSIONS

  • commit : 50b34a4
  • python : 3.8.0.final.0
  • python-bits : 64
  • OS : Linux
  • OS-release : 5.4.0-52-generic
  • Version : #57~18.04.1-Ubuntu SMP Thu Oct 15 14:04:49 UTC 2020
  • machine : x86_64
  • processor : x86_64
  • byteorder : little
  • LC_ALL : None
  • LANG : en_US.UTF-8
  • LOCALE : en_US.UTF-8
  • pandas : 1.2.0.dev0+1137.g50b34a4a8
  • numpy : 1.19.4
  • pytz : 2020.4
  • dateutil : 2.8.1
  • pip : 20.2.4
  • setuptools : 50.3.2
  • Cython : None
  • pytest : None
  • hypothesis : None
  • sphinx : None
  • blosc : None
  • feather : None
  • xlsxwriter : None
  • lxml.etree : None
  • html5lib : None
  • pymysql : None
  • psycopg2 : None
  • jinja2 : None
  • IPython : None
  • pandas_datareader: None
  • bs4 : None
  • bottleneck : None
  • fsspec : None
  • fastparquet : None
  • gcsfs : None
  • matplotlib : None
  • numexpr : None
  • odfpy : None
  • openpyxl : None
  • pandas_gbq : None
  • pyarrow : None
  • pyxlsb : None
  • s3fs : None
  • scipy : None
  • sqlalchemy : None
  • tables : None
  • tabulate : None
  • xarray : None
  • xlrd : None
  • xlwt : None
  • numba : None

Labels: Bug, Categorical (Categorical Data Type), Indexing (Related to indexing on series/frames, not to indexes themselves)