DataFrame regex with dictionary will not work correctly. #25742

Closed
@dragonator4

Code Sample

import pandas as pd

df = pd.DataFrame(
    {'A': [0, 1, 2],
     'B': ['ba\nt', 'foo', 'bait'],
     'C': ['abc', 'ba\nr', 'xyz']}
)
# Pass both patterns in a single dictionary via the regex keyword.
df.replace(regex={'\n': '', r'^fo.$': 'xyz'})  # r'\n' and '\\n' do not work either.

Problem description

The call does not replace the newline character. I also checked other whitespace characters such as spaces and tabs, with the same result. However, the other regex in the dictionary (^fo.$) is applied correctly:

	A	B	C
0	0	ba\nt	abc
1	1	xyz	ba\nr
2	2	bait	xyz

Expected Output

	A	B	C
0	0	bat	abc
1	1	xyz	bar
2	2	bait	xyz
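
As a sanity check on the patterns themselves, the sketch below runs the three spellings of the newline pattern from the code sample through Python's standard-library re module, outside of pandas. The names patterns, samples and results are only illustrative. If all three spellings strip the newline here, that suggests (but does not prove) that the problem lies in how the dictionary is handled rather than in the patterns.

```python
import re

# The three spellings of the newline pattern mentioned in the report.
patterns = ['\n', r'\n', '\\n']

# The two cell values from the example frame that contain a newline.
samples = ['ba\nt', 'ba\nr']

# Apply each pattern to each sample with the plain regex engine.
results = [[re.sub(p, '', s) for s in samples] for p in patterns]

for p, cleaned in zip(patterns, results):
    print(repr(p), cleaned)
```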

Additional information

Working:

df.replace('\n', '', regex=True).replace('foo', 'xyz')
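
The chained call above generalizes to any number of patterns by applying them one at a time. The following is a minimal sketch of that workaround, not a fix for the underlying behaviour; replace_each is a hypothetical helper name, and it assumes that the single-pattern form of replace with regex=True behaves as shown above.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [0, 1, 2],
     'B': ['ba\nt', 'foo', 'bait'],
     'C': ['abc', 'ba\nr', 'xyz']}
)


def replace_each(frame, patterns):
    """Apply each regex pattern in turn (hypothetical helper, not a pandas API)."""
    out = frame
    for pattern, replacement in patterns.items():
        # One pattern per call, which is the form reported as working.
        out = out.replace(pattern, replacement, regex=True)
    return out


result = replace_each(df, {'\n': '', r'^fo.$': 'xyz'})
print(result)
```

Because the patterns are applied sequentially, a later pattern sees the output of an earlier one, so the order of the dictionary entries can matter.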

Related SO question.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.1
pytest: None
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.29.5
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: 0.11.3
IPython: 7.3.0
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: 2.6.0
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.5
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.2.18
pymysql: None
psycopg2: 2.7.6.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
