Skip to content

CLN/DEPR: removed deprecated as_indexer arg from str.match() #22356

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.24.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -630,7 +630,7 @@ Numeric
Strings
^^^^^^^

-
- Removed as_indexer(deprecated of 0.21.0) keyword completely from str.match() (:issue:`22356`,:issue:`6581`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • You mean 0.20.0 ? You can also just remove that parenthetical comment.
  • This should be in the Removal of prior version deprecations/changes section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, will fix it straight away

-
-

Expand Down
20 changes: 3 additions & 17 deletions pandas/core/strings.py
Original file line number Diff line number Diff line change
Expand Up @@ -709,7 +709,7 @@ def rep(x, r):
return result


def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):
def str_match(arr, pat, case=True, flags=0, na=np.nan):
"""
Determine if each string matches a regular expression.

Expand All @@ -722,8 +722,6 @@ def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):
flags : int, default 0 (no flags)
re module flags, e.g. re.IGNORECASE
na : default NaN, fill value for missing values.
as_indexer
.. deprecated:: 0.21.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is this just wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyunTruth can you see when this was added? if it was in 0.21.0 then we can't remove this yet, if it was a mistake / typo then we could.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll check on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback It seems to be a typo, as in whatsnew v0.20.0, it is stated that

- The default behaviour of ``Series.str.match`` has changed from extracting
  groups to matching the pattern. The extracting behaviour was deprecated
  since pandas version 0.13.0 and can be done with the ``Series.str.extract``
  method (:issue:`5224`). As a consequence, the ``as_indexer`` keyword is
  ignored (no longer needed to specify the new behaviour) and is deprecated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok then


Returns
-------
Expand All @@ -741,17 +739,6 @@ def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):

regex = re.compile(pat, flags=flags)

if (as_indexer is False) and (regex.groups > 0):
raise ValueError("as_indexer=False with a pattern with groups is no "
"longer supported. Use '.str.extract(pat)' instead")
elif as_indexer is not None:
# Previously, this keyword was used for changing the default but
# deprecated behaviour. This keyword is now no longer needed.
warnings.warn("'as_indexer' keyword was specified but is ignored "
"(match now returns a boolean indexer by default), "
"and will be removed in a future version.",
FutureWarning, stacklevel=3)

dtype = bool
f = lambda x: bool(regex.match(x))

Expand Down Expand Up @@ -2469,9 +2456,8 @@ def contains(self, pat, case=True, flags=0, na=np.nan, regex=True):
return self._wrap_result(result)

@copy(str_match)
def match(self, pat, case=True, flags=0, na=np.nan, as_indexer=None):
result = str_match(self._parent, pat, case=case, flags=flags, na=na,
as_indexer=as_indexer)
def match(self, pat, case=True, flags=0, na=np.nan):
result = str_match(self._parent, pat, case=case, flags=flags, na=na)
return self._wrap_result(result)

@copy(str_replace)
Expand Down
15 changes: 4 additions & 11 deletions pandas/tests/test_strings.py
Original file line number Diff line number Diff line change
Expand Up @@ -938,20 +938,13 @@ def test_match(self):
exp = Series([True, NA, False])
tm.assert_series_equal(result, exp)

# test passing as_indexer still works but is ignored
# GH 22316 test the removal of as_indexer from match
values = Series(['fooBAD__barBAD', NA, 'foo'])
exp = Series([True, NA, False])
with tm.assert_produces_warning(FutureWarning):
with tm.assert_raises_regex(TypeError,
"match() got an unexpected "
"keyword argument 'as_indexer'"):
Copy link
Member

@gfyoung gfyoung Aug 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the thought. However, we generally just drop all of the relevant tests when a keyword is getting dropped after a deprecation.

result = values.str.match('.*BAD[_]+.*BAD', as_indexer=True)
tm.assert_series_equal(result, exp)
with tm.assert_produces_warning(FutureWarning):
result = values.str.match('.*BAD[_]+.*BAD', as_indexer=False)
tm.assert_series_equal(result, exp)
with tm.assert_produces_warning(FutureWarning):
result = values.str.match('.*(BAD[_]+).*(BAD)', as_indexer=True)
tm.assert_series_equal(result, exp)
pytest.raises(ValueError, values.str.match, '.*(BAD[_]+).*(BAD)',
as_indexer=False)

# mixed
mixed = Series(['aBAD_BAD', NA, 'BAD_b_BAD', True, datetime.today(),
Expand Down