Series.str.replace() is not actually the same as str.replace() #16808

Closed
@rosnfeld

Code Sample

In [1]: import pandas as pd

In [2]: series = pd.Series(['a', '(b)'])

In [3]: series.str.replace('a', '[a]')
Out[3]: 
0    [a]
1    (b)
dtype: object

In [4]: series.str.replace('(b)', '[b]')  # unexpected behavior
Out[4]: 
0        a
1    ([b])
dtype: object

In [5]: series.str.replace(r'\(b\)', '[b]')  # need to escape the parentheses (raw string avoids an invalid-escape warning)
Out[5]: 
0      a
1    [b]
dtype: object

In [6]: '(b)'.replace('(b)', '[b]')   # Python str.replace is different, uses literal string
Out[6]: '[b]'

Problem description

The documentation for Series.str.replace says that it takes a "string or compiled regex" ... "String can be a character sequence or regular expression." ... "When repl is a string, every pat is replaced as with str.replace()"

However, that's not what is happening: a string pat appears to be interpreted as a regex, so regex metacharacters such as parentheses have to be escaped to match literally.
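To illustrate why In [4] returns `([b])`, here is a minimal sketch using only the standard-library `re` module. It assumes that `Series.str.replace` hands a string pattern to the regex engine, which matches the observed output but is not confirmed here from the pandas source. The variable names are only for illustration.

```python
import re

pattern = '(b)'
text = '(b)'

# As a regex, '(b)' is a capture group that matches the single character 'b',
# so only the 'b' inside the parentheses is replaced.
as_regex = re.sub(pattern, '[b]', text)

# Escaping the pattern makes the parentheses literal characters.
as_escaped = re.sub(re.escape(pattern), '[b]', text)

# Python's built-in str.replace always treats the pattern literally.
as_literal = text.replace(pattern, '[b]')

print(as_regex)    # ([b])
print(as_escaped)  # [b]
print(as_literal)  # [b]
```

The first result reproduces Out[4], and the other two reproduce Out[5] and Out[6].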

Expected Output

I would expect that for vanilla strings, it works like regular Python str.replace() - using literal strings instead of regexes.

Alternatively the documentation could be updated, but I think the Python str.replace() behavior is what most users would expect.
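In the meantime, one possible workaround is to apply Python's own `str.replace` element-wise, which sidesteps regex interpretation entirely. This is only a sketch: `replace_literal` is a hypothetical helper name, not a pandas API, and it leaves non-string values (such as NaN) untouched by assumption.

```python
import pandas as pd


def replace_literal(series, old, new):
    """Replace `old` with `new` literally in every string element.

    Hypothetical helper, not part of pandas. Non-string values
    (for example NaN) are passed through unchanged.
    """
    return series.map(
        lambda value: value.replace(old, new) if isinstance(value, str) else value
    )


series = pd.Series(['a', '(b)'])
result = replace_literal(series, '(b)', '[b]')
print(result.tolist())  # ['a', '[b]']
```

This will be slower than a vectorised string method on large Series, but its behavior does not depend on how pat is interpreted.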

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-83-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.13.0
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Labels: API Design, Strings
