BUG: memory issues with `string[pyarrow]` after sorted `pd.merge` #61322

### Pandas version checks

- [x] I have checked that this issue has not already been reported.

- [x] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.

- [x] I have confirmed this bug exists on the [main branch](https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas) of pandas.


### Reproducible Example

```python
import random
import string

import pandas as pd
import pyarrow as pa

# Gen random data ----------------------------------------------------------------------

random.seed(42)
txt = "".join(random.choices(string.printable, k=int(1e4)))
num = random.choices(range(int(1e6)), k=int(125e3))

# Gen dataframes -----------------------------------------------------------------------

a = pd.Series(num, dtype="Int64")
b = pd.Series([txt] * int(125e3), dtype="string[pyarrow]")

lhs = pd.DataFrame({"a": a, "b": b})
# Concatenation is necessary to reproduce bug (not sure why)
lhs = pd.concat([lhs, lhs], ignore_index=True, verify_integrity=True)

rhs = pd.DataFrame({"a": a}).drop_duplicates()

# Merge with and without sorting -------------------------------------------------------

df_nosort = pd.merge(left=lhs, right=rhs, on="a", sort=False)
print(df_nosort.memory_usage(deep=True))

df_sort = pd.merge(left=lhs, right=rhs, on="a", sort=True)
print(df_sort.memory_usage(deep=True))

# `b` cols are equal despite memory usage difference
print(df_nosort["b"].equals(df_sort["b"]))

# Write to parquet files ---------------------------------------------------------------

schema = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.string())])

df_nosort.to_parquet("df_nosort.parquet", schema=schema)
df_sort.to_parquet("df_sort.parquet", schema=schema)
```

### Issue Description

These issues occur only when series `b` has dtype `string[pyarrow]`; they do not occur with `string[python]`.

1. `.to_parquet` fails for `df_sort` but succeeds for `df_nosort`.
2. The memory usage of `b` is greater in `df_sort` than in `df_nosort`.
3. Despite differences in memory usage, `df_sort["b"]` is equal to `df_nosort["b"]`.


### Expected Behavior

1. I would expect `.to_parquet` to succeed for both dataframes.
2. I would expect `df_sort["b"]` to have the same memory usage as `df_nosort["b"]`. (I should note, however, that I lack a sophisticated understanding of memory management, so I may be mistaken.)
3. I would expect `df_nosort["b"].equals(df_sort["b"])` to return `False` if the series differ in memory usage. (Same caveat applies.)


### Installed Versions

<details>

INSTALLED VERSIONS
------------------
commit                : 0691c5cf90477d3503834d983f69350f250a6ff7
python                : 3.10.16
python-bits           : 64
OS                    : Darwin
OS-release            : 24.4.0
Version               : Darwin Kernel Version 24.4.0: Wed Mar 19 21:16:34 PDT 2025; root:xnu-11417.101.15~1/RELEASE_ARM64_T6000
machine               : arm64
processor             : arm
byteorder             : little
LC_ALL                : None
LANG                  : en_US.UTF-8
LOCALE                : en_US.UTF-8

pandas                : 2.2.3
numpy                 : 2.2.2
pytz                  : 2025.2
dateutil              : 2.9.0.post0
pip                   : 25.0
Cython                : None
sphinx                : None
IPython               : 8.35.0
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : 4.13.3
blosc                 : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : 2025.3.2
html5lib              : None
hypothesis            : None
gcsfs                 : None
jinja2                : 3.1.6
lxml.etree            : 5.3.1
matplotlib            : None
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : 3.1.5
pandas_gbq            : None
psycopg2              : None
pymysql               : None
pyarrow               : 19.0.1
pyreadstat            : None
pytest                : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : None
sqlalchemy            : 2.0.39
tables                : None
tabulate              : None
xarray                : None
xlrd                  : 2.0.1
xlsxwriter            : None
zstandard             : None
tzdata                : 2025.2
qtpy                  : None
pyqt5                 : None

</details>

