Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
df = pd.DataFrame((1,2,3), columns=['a#'])
df.query('a# > 2')
-------------------------------
KeyError Traceback (most recent call last)
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\computation\scope.py:231, in Scope.resolve(self, key, is_local)
230 if self.has_resolvers:
--> 231 return self.resolvers[key]
233 # if we're here that means that we have no locals and we also have
234 # no resolvers
File d:\Applications\Python\Python311\Lib\collections\__init__.py:1006, in ChainMap.__getitem__(self, key)
1005 pass
-> 1006 return self.__missing__(key)
File d:\Applications\Python\Python311\Lib\collections\__init__.py:998, in ChainMap.__missing__(self, key)
997 def __missing__(self, key):
--> 998 raise KeyError(key)
KeyError: 'a'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\computation\scope.py:242, in Scope.resolve(self, key, is_local)
238 try:
239 # last ditch effort we look in temporaries
240 # these are created when parsing indexing expressions
...
242 return self.temps[key]
243 except KeyError as err:
--> 244 raise UndefinedVariableError(key, is_local) from err
UndefinedVariableError: name 'a' is not defined
Issue Description
The query function seems to treat the # symbol as the start of a comment, so the expression is evaluated as just a (hence "name 'a' is not defined") and does not work as expected.
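A minimal sketch of why this probably happens, assuming the expression is tokenized with Python's own tokenizer rules (the second traceback below goes through tokenize_string in pandas/core/computation/parsing.py). It uses only the standard-library tokenize module, not pandas internals, so it illustrates the mechanism rather than the exact pandas code path:

```python
import io
import tokenize

# The same expression string that was passed to df.query().
source = "a# > 2"

# Tokenize it the way Python source code is tokenized.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Everything from '#' onward ends up in a single COMMENT token,
# so only the name 'a' remains as part of the expression.
for name, text in tokens:
    print(name, repr(text))
```

Under that assumption, only the name a is left to resolve, which matches the UndefinedVariableError above.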
I also tried quoting the column name with backticks:
df.query('`a#` > 2')
but it still throws an exception:
Traceback (most recent call last):
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\computation\parsing.py:192 in tokenize_string
yield tokenize_backtick_quoted_string(
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\computation\parsing.py:167 in tokenize_backtick_quoted_string
return BACKTICK_QUOTED_STRING, source[string_start:string_end]
UnboundLocalError: cannot access local variable 'string_end' where it is not associated with a value
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File d:\Applications\Python\Python311\Lib\site-packages\IPython\core\interactiveshell.py:3553 in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
Cell In[59], line 1
df.query('`a#` > 2')
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\frame.py:4823 in query
res = self.eval(expr, **kwargs)
File d:\Applications\Python\Python311\Lib\site-packages\pandas\core\frame.py:4949 in eval
...
raise SyntaxError(f"Failed to parse backticks in '{source}'.") from err
File <string>
SyntaxError: Failed to parse backticks in '`a#` > 2'.
Expected Behavior
df.query('`a#` > 2') should return the same rows as df[df['a#'] > 2].
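Until this is fixed, a possible workaround sketch is shown below. It assumes the column can be temporarily renamed to a valid Python identifier; the name a_hash is a hypothetical placeholder chosen for illustration, not anything pandas requires:

```python
import pandas as pd

df = pd.DataFrame({"a#": [1, 2, 3]})

# Expected result, obtained with plain boolean indexing.
expected = df[df["a#"] > 2]

# Workaround: rename the column to a valid identifier ("a_hash" is a
# hypothetical name), run the query, then restore the original name.
renamed = df.rename(columns={"a#": "a_hash"})
result = renamed.query("a_hash > 2").rename(columns={"a_hash": "a#"})

print(result)
```

Boolean indexing is the simpler choice when the condition is known in advance; the rename approach only matters when the filter has to be expressed as a query string.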
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2e
python : 3.11.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 183 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Chinese (Simplified)_China.936
pandas : 2.2.2
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 24.0
Cython : 3.0.8
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.20.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.2
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.4
qtpy : None
pyqt5 : None