DEPR: Deprecate ordered=None for CategoricalDtype #26403

Merged
merged 14 commits into from
Jul 3, 2019
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -257,6 +257,7 @@ Deprecations

- Deprecated the ``units=M`` (months) and ``units=Y`` (year) parameters for ``units`` of :func:`pandas.to_timedelta`, :func:`pandas.Timedelta` and :func:`pandas.TimedeltaIndex` (:issue:`16344`)
- The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64` or :meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
- The default value ``ordered=None`` in :class:`~pandas.api.types.CategoricalDtype` has been deprecated in favor of ``ordered=False``. When converting between categorical types ``ordered=True`` must be explicitly passed in order to be preserved. (:issue:`26336`)

.. _whatsnew_0250.prior_deprecations:

7 changes: 7 additions & 0 deletions pandas/core/dtypes/dtypes.py
@@ -550,8 +550,15 @@ def update_dtype(self, dtype):
            new_categories = self.categories

        new_ordered = dtype.ordered

        # TODO(GH26336): remove this if block when ordered=None is removed
        if new_ordered is None:
            new_ordered = self.ordered
            if self.ordered:
                msg = ("ordered=None is deprecated and will default to False "
                       "in a future version; ordered=True must be explicitly "
                       "passed in order to be retained")
                warnings.warn(msg, FutureWarning, stacklevel=2)
Member

Maybe we should set the stacklevel to fit the astype case (although the Series and Index cases might need a different stacklevel), since that seems more common than calling this method directly?

Otherwise, I would certainly try to make the context clearer in the warning message: indicate that a CategoricalDtype was constructed without specifying ``ordered``, etc. As written, the message might be very confusing, because it is not raised when the dtype is created.

Member Author

I've updated the stacklevel so that it works for the Categorical.astype and CategoricalIndex.astype cases; every other case looks like it would require its own stacklevel. I've also updated the message itself to be clearer.
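The stacklevel trade-off discussed in this thread can be illustrated with a small, pandas-free sketch that uses only the standard `warnings` module. The names `update_dtype_sketch`, `astype_sketch`, and `attributed_to` are hypothetical stand-ins, not pandas functions. The point is that a single `stacklevel` value can only be correct for one call depth, so entry points that reach the warning through a different number of frames see it attributed to the wrong place.

```python
import dis
import warnings


def update_dtype_sketch(stacklevel):
    # Hypothetical stand-in for the method that emits the deprecation warning.
    warnings.warn("ordered=None is deprecated", FutureWarning,
                  stacklevel=stacklevel)


def astype_sketch(stacklevel):
    # Hypothetical stand-in for a public entry point that calls the method.
    update_dtype_sketch(stacklevel)


def _line_span(func):
    # First and last source line covered by the function's code object.
    lines = [ln for _, ln in dis.findlinestarts(func.__code__)
             if ln is not None]
    return func.__code__.co_firstlineno, max(lines)


def attributed_to(stacklevel):
    """Return the name of the function the warning is reported against."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        astype_sketch(stacklevel)  # plays the role of the user's call site
    lineno = caught[0].lineno
    for func in (update_dtype_sketch, astype_sketch, attributed_to):
        first, last = _line_span(func)
        if first <= lineno <= last:
            return func.__name__
    return None


for level in (1, 2, 3):
    print(level, attributed_to(level))
```

In this sketch, the level that points at the caller of `astype_sketch` would point one frame too deep for anyone calling `update_dtype_sketch` directly, which is the mismatch the comments above describe.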


        return CategoricalDtype(new_categories, new_ordered)
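The transitional logic in this hunk can be sketched without pandas. `DtypeSketch` below is a hypothetical stand-in that only models `categories` and `ordered`; it is not the real `CategoricalDtype` and skips all validation. It mirrors the branch shown above: when the new dtype leaves `ordered` as `None`, the existing value is kept, and a `FutureWarning` is emitted only if that silently preserves `ordered=True`.

```python
import warnings


class DtypeSketch:
    """Hypothetical, pandas-free stand-in for a categorical dtype."""

    def __init__(self, categories=None, ordered=None):
        self.categories = categories
        self.ordered = ordered

    def update_dtype(self, dtype):
        # Fall back to the current categories when the new dtype has none.
        new_categories = dtype.categories
        if new_categories is None:
            new_categories = self.categories

        new_ordered = dtype.ordered
        if new_ordered is None:
            # Transitional behaviour: keep the current ordering, but warn
            # when that implicitly retains ordered=True.
            new_ordered = self.ordered
            if self.ordered:
                warnings.warn(
                    "ordered=None is deprecated and will default to False "
                    "in a future version; ordered=True must be explicitly "
                    "passed in order to be retained",
                    FutureWarning,
                    stacklevel=2,
                )
        return DtypeSketch(new_categories, new_ordered)


current = DtypeSketch(list("cdab"), ordered=True)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # ordered left unspecified: ordering is retained, with a warning.
    implicit = current.update_dtype(DtypeSketch(list("cedafb")))
    # ordered passed explicitly: ordering is retained, no warning.
    explicit = current.update_dtype(DtypeSketch(list("cedafb"), ordered=True))

print(len(caught), implicit.ordered, explicit.ordered)
```

Passing `ordered=True` explicitly is the forward-compatible spelling: it gives the same result today and does not depend on the fallback that the deprecation is meant to remove.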

8 changes: 8 additions & 0 deletions pandas/tests/arrays/categorical/test_dtypes.py
@@ -160,6 +160,14 @@ def test_astype_category(self, dtype_ordered, cat_ordered):
            expected = cat
        tm.assert_categorical_equal(result, expected)

    def test_astype_category_ordered_none_deprecated(self):
        # GH 26336
        cdt1 = CategoricalDtype(categories=list('cdab'), ordered=True)
        cdt2 = CategoricalDtype(categories=list('cedafb'))
        cat = Categorical(list('abcdaba'), dtype=cdt1)
        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
            cat.astype(cdt2)

    def test_iter_python_types(self):
        # GH-19909
        cat = Categorical([1, 2])
8 changes: 7 additions & 1 deletion pandas/tests/dtypes/test_dtypes.py
@@ -817,7 +817,13 @@ def test_update_dtype(self, ordered_fixture, new_categories, new_ordered):
        if expected_ordered is None:
            expected_ordered = dtype.ordered

        result = dtype.update_dtype(new_dtype)
        # GH 26336
Contributor

Can you explain what you are testing here (the cases)?

        if new_ordered is None and ordered_fixture is True:
            with tm.assert_produces_warning(FutureWarning):
                result = dtype.update_dtype(new_dtype)
        else:
            result = dtype.update_dtype(new_dtype)

        tm.assert_index_equal(result.categories, expected_categories)
        assert result.ordered is expected_ordered
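As a rough answer to the question about which cases this branch covers, the condition can be enumerated as a small matrix. This sketch assumes that both `ordered_fixture` and `new_ordered` range over `True`, `False`, and `None`; that range is an assumption, since neither the fixture nor the parametrization is shown in this diff. `expects_warning` is a hypothetical helper that restates the condition from the test.

```python
import itertools


def expects_warning(current_ordered, new_ordered):
    # A warning is expected only when the new dtype leaves ``ordered``
    # unspecified (None) and the existing dtype is ordered.
    return new_ordered is None and current_ordered is True


# Assumed value range for both parameters: True, False, None.
values = [True, False, None]
cases = {
    (current, new): expects_warning(current, new)
    for current, new in itertools.product(values, repeat=2)
}

for (current, new), warns in cases.items():
    print(f"current={current!s:5} new={new!s:5} warns={warns}")
```

Under that assumption only one of the nine combinations warns, namely an ordered dtype updated with a dtype whose `ordered` is `None`; every other combination takes the `else` branch and must run without a warning.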

8 changes: 8 additions & 0 deletions pandas/tests/indexes/test_category.py
@@ -490,6 +490,14 @@ def test_astype_category(self, name, dtype_ordered, index_ordered):
            expected = index
        tm.assert_index_equal(result, expected)

    def test_astype_category_ordered_none_deprecated(self):
        # GH 26336
        cdt1 = CategoricalDtype(categories=list('cdab'), ordered=True)
        cdt2 = CategoricalDtype(categories=list('cedafb'))
        idx = CategoricalIndex(list('abcdaba'), dtype=cdt1)
        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
            idx.astype(cdt2)

    def test_reindex_base(self):
        # Determined by cat ordering.
        idx = CategoricalIndex(list("cab"), categories=list("cab"))
8 changes: 8 additions & 0 deletions pandas/tests/series/test_dtypes.py
@@ -227,6 +227,14 @@ def test_astype_categories_deprecation(self):
            result = s.astype('category', categories=['a', 'b'], ordered=True)
        tm.assert_series_equal(result, expected)

    def test_astype_category_ordered_none_deprecated(self):
        # GH 26336
        cdt1 = CategoricalDtype(categories=list('cdab'), ordered=True)
        cdt2 = CategoricalDtype(categories=list('cedafb'))
        s = Series(list('abcdaba'), dtype=cdt1)
        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
            s.astype(cdt2)

    def test_astype_from_categorical(self):
        items = ["a", "b", "c", "a"]
        s = Series(items)