Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
from omegaconf import OmegaConf
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
cfg = OmegaConf.create({"cols": ["a", "b"]})
cols = cfg.cols  # cols is an omegaconf.ListConfig, not a plain list
df[cols] = df[cols] * 2  # Raises ValueError
Error message:
ValueError: Cannot set a DataFrame with multiple columns to the single column ['a', 'b']
Issue Description
When an omegaconf.ListConfig object is used to select columns in a pandas DataFrame, the assignment fails with a ValueError, even though the shapes, columns, and indices of the left-hand side (LHS) and right-hand side (RHS) match exactly. This is unexpected and confusing, because nothing in the error indicates that the type of the column selector is the cause.
Expected Behavior:
The assignment should succeed, as the shapes, columns, and indices of the LHS and RHS match.
Likely Context of Encountering This:
This issue is likely to occur in workflows where omegaconf.ListConfig is used to manage configurations, such as specifying column names for normalization or other data processing tasks. For example:
# Compute min and max for normalization
min_vals = data[target_cols].min()
max_vals = data[target_cols].max()
# Attempt to normalize using ListConfig as column selector
data[target_cols] = (data[target_cols] - min_vals) / (max_vals - min_vals) # This raises the same ValueError
Workaround:
Convert the ListConfig object to a plain Python list before using it as a column selector in pandas:
data[list(target_cols)] = (data[list(target_cols)] - min_vals) / (max_vals - min_vals)
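The same conversion can be wrapped in a small helper so that the selector is normalized once, at the boundary between the configuration and pandas. The sketch below uses only the standard library: `as_column_list` is a hypothetical helper name, and `ConfigSequence` is a stand-in for a ListConfig-like object (a Sequence that is not a list), not the omegaconf class itself.

```python
from collections.abc import Sequence


class ConfigSequence(Sequence):
    """Hypothetical stand-in for a ListConfig-like object: a Sequence, but not a list."""

    def __init__(self, items):
        self._items = tuple(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def as_column_list(cols):
    """Return column labels as a plain list (hypothetical helper).

    A single string is treated as one label rather than split into characters.
    """
    if isinstance(cols, (str, bytes)):
        return [cols]
    return list(cols)


# Normalize once, then reuse the plain list everywhere a selector is needed.
target_cols = as_column_list(ConfigSequence(["a", "b"]))
print(type(target_cols).__name__, target_cols)
```

With the selector normalized up front, the normalization step can be written as data[target_cols] = ... without repeating list(...) at every use.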
Why This Is Confusing:
- The error message suggests that a single column is being assigned multiple columns, which is misleading.
- The shapes, columns, and indices all match, and I could not find any notes on this edge case online.
- The actual issue is the type of the column selector (ListConfig), which behaves like a list in many other contexts.
Proposed Solution:
- Improve the error message to indicate that the column selector type might be incompatible.
- Consider adding support for omegaconf.ListConfig as a valid column selector, since isinstance(cols, Sequence) is True.
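To make the second proposal concrete, here is a simplified model of the difference between a strict list check and a broader Sequence check. It is an illustration only, not pandas' actual dispatch code: `FakeListConfig`, `is_multi_key_strict`, and `is_multi_key_lenient` are hypothetical names, and the stand-in class is assumed to resemble ListConfig only in that it is a Sequence without being a list subclass.

```python
from collections.abc import Sequence


class FakeListConfig(Sequence):
    """Hypothetical stand-in: implements Sequence without subclassing list."""

    def __init__(self, items):
        self._items = tuple(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def is_multi_key_strict(key):
    """Treat only real lists as a collection of column labels."""
    return isinstance(key, list)


def is_multi_key_lenient(key):
    """Treat any non-string, non-tuple Sequence as a collection of labels.

    Strings are single labels, and tuples are excluded because they can be
    single labels too (for example with MultiIndex columns).
    """
    return isinstance(key, Sequence) and not isinstance(key, (str, bytes, tuple))


cols = FakeListConfig(["a", "b"])
print(is_multi_key_strict(cols))         # False: not a list subclass
print(is_multi_key_lenient(cols))        # True: it is a Sequence
print(is_multi_key_lenient(("a", "b")))  # False: tuples stay single labels
```

Under a strict check the selector falls through to a single-label interpretation, which would be consistent with the wording of the error above. A broader check would accept it, at the cost of having to exclude strings and tuples explicitly.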
Expected Behavior
The assignment should succeed, as the shapes, columns, and indices of the LHS and RHS match.
Installed Versions
INSTALLED VERSIONS
commit : 0691c5c
python : 3.11.7
python-bits : 64
OS : Darwin
OS-release : 23.6.0
Version : Darwin Kernel Version 23.6.0: Fri Nov 15 15:13:28 PST 2024; root:xnu-10063.141.1.702.7~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 2.2.3
numpy : 1.26.4
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 24.0
Cython : None
sphinx : None
IPython : 9.1.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.3
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2025.3.2
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 4.9.4
matplotlib : 3.10.1
numba : 0.58.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 19.0.1
pyreadstat : None
pytest : 8.3.5
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None