-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[llvm-lit] Fix TypeError
string argument expected in lit's internal shell
#105925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-testing-tools Author: None (Harini0924) ChangesThis patch addresses an issue encountered when running the BOLT test suite with the lit internal shell using the command:
During execution, the following error was observed:
This error occurs because the This patch adds This change is relevant for [RFC] Enabling the Lit Internal Shell by Default Full diff: https://github.com/llvm/llvm-project/pull/105925.diff 1 Files Affected:
diff --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
index 2d9af9fbbb3634..625ce009515d5f 100644
--- a/llvm/utils/lit/lit/TestRunner.py
+++ b/llvm/utils/lit/lit/TestRunner.py
@@ -413,7 +413,7 @@ def maybeUnescape(arg):
for arg in args[:-1]:
stdout.write(encode(maybeUnescape(arg)))
stdout.write(encode(" "))
- stdout.write(encode(maybeUnescape(args[-1])))
+ stdout.write(encode(maybeUnescape(str(args[-1]))))
if write_newline:
stdout.write(encode("\n"))
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. IIRC there's a method to expand globs that may also be useful, but this seems straightforward enough of a conversion to just do it here.
My big question is why does this only need to be done for args[-1]
and not the call to stdio.write
above that uses arg
? seems that they would have the same issue, right?
llvm/utils/lit/lit/TestRunner.py
Outdated
@@ -413,7 +413,7 @@ def maybeUnescape(arg): | |||
for arg in args[:-1]: | |||
stdout.write(encode(maybeUnescape(arg))) | |||
stdout.write(encode(" ")) | |||
stdout.write(encode(maybeUnescape(args[-1]))) | |||
stdout.write(encode(maybeUnescape(str(args[-1])))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually, this seems like the wrong place to work around this problem. I would imagine the shell should expand globs before invoking any commands. Only doing it for the last one also seems odd - I guess that't the only case triggered by the current test suite, but surely we can construct inputs that have glob patterns elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that expanding globs should ideally happen before any commands are invoked. The fact that the issue only occurs with the last argument does seem odd, and it might just be that the current test suite only triggers the problem there.
We could address this by placing the line cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)
at the top of the executeBuiltinEcho
function. This would ensure that all arguments are expanded consistently before the command is executed. However, I believe adding str()
to the last argument might be a more targeted and effective solution in this specific case. The problem appears to be that the last argument isn't being expanded correctly, leading to the error. By converting just the last argument to a string, we directly address the failing scenario without introducing broader changes that might impact other parts of the code.
While expanding all arguments at the top is a comprehensive solution, adding str()
to the last argument is a simpler fix that directly resolves the current issue. It keeps the changes minimal and focused on the specific failure encountered, which might be preferable in this context IMO. What are your thoughts on this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much prefer using cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)
to match mkdir and rm, but really the internal shell should expand globs before calling any builtin command. However, that is a more invasive change and should be done in a follow-up.
The fact that it only happens on the last arg indicates that we are missing test coverage and are most likely not doing the right thing: str()
will just convert the glob to a string instead of expanding it.
I also notice we already expand the first argument:
# Ensure args[0] is hashable.
args[0] = expand_glob(args[0], cmd_shenv.cwd)[0]
I missed #101590, but it seems odd just performing glob expansion on the first item - I'd strongly suggest adding some tests for glob expansion and expanding the whole command - this would allow dropping the special handling from rm/mkdir. Also does this mean we don't expand globs when calling external commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arichardson In this PR Reapply "[llvm-lit] Add precommit test to verify current behavior of glob expansion in lit's internal shell" (#106763#) 107169, I have added the necessary tests to verify the current behavior. After analyzing the situation, I believe we should definitely expand args[0]
within the expand_glob_expressions
function itself, so we don't need to handle it separately later on.
By expanding args[0]
directly in this function, we can avoid the need for special handling elsewhere in the code. Once this change is approved, I can go ahead and update shtest-glob.py
to reflect the expected behavior by changing:
# CHECK: UNRESOLVED: shtest-glob :: glob-echo.txt ({{[^)]*}})
# CHECK: TypeError: string argument expected, got 'GlobItem'
to:
# CHECK: PASS: shtest-glob :: glob-echo.txt ({{[^)]*}})
Let me know if this approach looks good to you!
…nsion in lit's internal shell (#106325) This patch introduces a precommit test to verify the current behavior of glob expansion in lit's internal shell. The motivation for this test stems from an issue encountered during the BOLT test suite when running with the lit internal shell using the command: `LIT_USE_INTERNAL_SHELL=1 ninja check-bolt` During execution, the following error was observed: ``` File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho stdout.write(encode(maybeUnescape(args[-1]))) TypeError: string argument expected, got 'GlobItem' ``` The `executeBuiltinEcho` function in the lit testing framework expects a string to be passed to `stdout.write`, but it received a `GlobItem` object instead. This precommit test is designed to check the current behavior where the glob pattern isn't correctly expanded, leading to this `TypeError`. While this patch doesn't fix the issue, it helps in understanding and verifying the current behavior. The feedback I received from this [PR](#105925) suggests using `cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)` to match the behavior of `executeBuiltinMkdir` and `executeBuiltinRm`, but it is recognized that the internal shell should ideally expand globs before calling any built-in command. **Request for Feedback:** I'm looking for feedback on how to improve this precommit test, specifically regarding the handling and expansion of glob patterns for commands like mkdir and rm within the internal shell. Currently, the args are expanded at the beginning of these functions, which should ensure proper glob expansion. However, I'd appreciate guidance on whether I should write additional tests to verify that mkdir and rm are handling glob expansions correctly. If such tests are recommended, I would also appreciate advice on the best approach to implement them, considering the existing framework and the way glob expansion is expected to function in the internal shell. Should these tests confirm that the current implementation passes, or are there specific edge cases I should be aware of? **Next Steps:** In my follow-up PR, I plan to address the UNRESOLVED error by expanding the entire command, ensuring correct and consistent behavior across all commands. The current test checks for an unresolved issue with the glob expansion, specifically looking for a `TypeError` due to an unexpanded `GlobItem`. This will be updated to reflect the correct behavior once the issue is resolved. This change is relevant for [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3)
✅ With the latest revision this PR passed the Python code formatter. |
llvm/utils/lit/lit/TestRunner.py
Outdated
|
||
# Expand all glob expressions | ||
args = expand_glob_expressions(args, cmd_shenv.cwd) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new lit test needs to be updated. Also commit message needs changing.
Code change LGTM
This patch modifies the expand_glob_expressions function in TestRunner.py to expand all arguments instead of just the first one. Previously, only args[0] was expanded for glob expressions, which could lead to unexpected behavior when multiple arguments needed glob expansion. Changes: The loop in expand_glob_expressions now iterates over all arguments instead of starting from the second one. The call to expand_glob_expressions has been moved earlier in the _executeShCmd function to ensure all arguments are expanded before any processing. Removed the previous separate expansion for args[0] as it is now redundant. These changes ensure that all glob expressions are properly expanded across all command arguments, improving the robustness and correctness of the lit internal shell.
125622e
to
665a341
Compare
@arichardson I’ve updated the commit message as suggested and changed the lit test from |
# Ensure args[0] is hashable. | ||
args[0] = expand_glob(args[0], cmd_shenv.cwd)[0] | ||
# Expand all glob expressions | ||
args = expand_glob_expressions(args, cmd_shenv.cwd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change the call to expand_glob_expressions
in executeBuiltinMkdir
and executeBuiltinRm()
can also be removed.
It would probably also be good to add a test that checks something like the following
# RUN: touch %t-first.txt %t-second.txt
# RUN: ls %t-*.txt
# RUN: rm -v %t-*.txt
# Check that both files were deleted.
# RUN: test ! -e %t-first.txt
# RUN: test ! -e %t-second.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially didn’t remove the call to expand_glob_expressions
in executeBuiltinMkdir
and executeBuiltinRm
because removing it was causing an error. Here's the error I encountered:
File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 467, in executeBuiltinRm
opts, args = getopt.gnu_getopt(args, "frR", ["--recursive"])
UnboundLocalError: cannot access local variable 'args' where it is not associated with a value
This error occurs because args
wasn't being initialized correctly after I removed the expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:]
line from these functions. The issue is that the functions still expect args to be expanded and processed, but when I rely solely on the expansion happening in _executeShCmd
, the arguments aren't passed in the correct form to these built-in functions.
Specifically, in the context of commands like mkdir
and rm
, args[0]
represents the command itself (e.g., mkdir or rm), and this should not be expanded. Expanding args[0]
could cause the command to be mistakenly treated as a file or directory to expand, which would break the logic. Therefore, only args[1:]
(the options or arguments) should undergo glob expansion. For example, in a command like mkdir -p folder*, we only want folder* to be expanded, not the command mkdir itself. Correct me if I am wrong, but that is what I was getting.
In short, removing expand_glob_expressions
from these functions without ensuring proper argument handling causes the built-in commands to fail when trying to access args
later in the flow, especially since args[0]
should not be expanded.
I'm also working on writing the other test. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see expanding args[0] being a problem - I just tested with zsh and it just expands all globs:
❯ echo -e '#!/bin/sh\necho script executed' > script.sh && chmod +x script.sh
❯ ./sc*
script executed
❯ sc*
Command script.sh not found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yes you will have to make additional changes to the builtinMkdir and builtinRM:
Instead of args
(which came from the line args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:]
you just use cmd.args
. args
won't magically appear in this function if you remove the line that assigns to the variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I wonder why we restricted the expansion to the first element in the first place?
…nsion in lit's internal shell (#106325) This patch introduces a precommit test to verify the current behavior of glob expansion in lit's internal shell. The motivation for this test stems from an issue encountered during the BOLT test suite when running with the lit internal shell using the command: `LIT_USE_INTERNAL_SHELL=1 ninja check-bolt` During execution, the following error was observed: ``` File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho stdout.write(encode(maybeUnescape(args[-1]))) TypeError: string argument expected, got 'GlobItem' ``` The `executeBuiltinEcho` function in the lit testing framework expects a string to be passed to `stdout.write`, but it received a `GlobItem` object instead. This precommit test is designed to check the current behavior where the glob pattern isn't correctly expanded, leading to this `TypeError`. While this patch doesn't fix the issue, it helps in understanding and verifying the current behavior. The feedback I received from this [PR](llvm/llvm-project#105925) suggests using `cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)` to match the behavior of `executeBuiltinMkdir` and `executeBuiltinRm`, but it is recognized that the internal shell should ideally expand globs before calling any built-in command. **Request for Feedback:** I'm looking for feedback on how to improve this precommit test, specifically regarding the handling and expansion of glob patterns for commands like mkdir and rm within the internal shell. Currently, the args are expanded at the beginning of these functions, which should ensure proper glob expansion. However, I'd appreciate guidance on whether I should write additional tests to verify that mkdir and rm are handling glob expansions correctly. If such tests are recommended, I would also appreciate advice on the best approach to implement them, considering the existing framework and the way glob expansion is expected to function in the internal shell. Should these tests confirm that the current implementation passes, or are there specific edge cases I should be aware of? **Next Steps:** In my follow-up PR, I plan to address the UNRESOLVED error by expanding the entire command, ensuring correct and consistent behavior across all commands. The current test checks for an unresolved issue with the glob expansion, specifically looking for a `TypeError` due to an unexpanded `GlobItem`. This will be updated to reflect the correct behavior once the issue is resolved. This change is relevant for [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3)
This patch addresses an issue encountered when running the BOLT test suite with the lit internal shell using the command:
During execution, the following error was observed:
This error occurs because the
executeBuiltinEcho
function in the lit testing framework expects a string to be passed tostdout.write
, but it received aGlobItem
object instead. TheGlobItem
represents an unexpanded glob pattern that needs to be converted to a string before being processed.This patch modifies the
expand_glob_expressions
function inTestRunner.py
to expand all arguments instead of just the first one. Previously, onlyargs[0]
was expanded for glob expressions, which could lead to unexpected behavior when multiple arguments needed glob expansion.Changes:
expand_glob_expressions
now iterates over all arguments instead of starting from the second one.expand_glob_expressions
has been moved earlier in the_executeShCmd
function to ensure all arguments are expanded before any processing.shtest-glob.py
test to check for the correct behavior, changing the expected output fromUNRESOLVED
toPASS
.These changes ensure that all glob expressions are properly expanded across all command arguments, improving the robustness and correctness of the lit internal shell.
This change is relevant for [RFC] Enabling the Lit Internal Shell by Default
fixes: #102693