[llvm-lit] Fix `TypeError` string argument expected in lit's internal shell #105925

Harini0924 · 2024-08-24T05:46:23Z

This patch addresses an issue encountered when running the BOLT test suite with the lit internal shell using the command:

LIT_USE_INTERNAL_SHELL=1 ninja check-bolt

During execution, the following error was observed:

File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho
    stdout.write(encode(maybeUnescape(args[-1])))
TypeError: string argument expected, got 'GlobItem'

This error occurs because the executeBuiltinEcho function in the lit testing framework expects a string to be passed to stdout.write, but it received a GlobItem object instead. The GlobItem represents an unexpanded glob pattern that needs to be converted to a string before being processed.

This patch modifies the expand_glob_expressions function in TestRunner.py to expand all arguments instead of just the first one. Previously, only args[0] was expanded for glob expressions, which could lead to unexpected behavior when multiple arguments needed glob expansion.

Changes:

The loop in expand_glob_expressions now iterates over all arguments instead of starting from the second one.
The call to expand_glob_expressions has been moved earlier in the _executeShCmd function to ensure all arguments are expanded before any processing.
Removed the previous separate expansion for args[0] as it is now redundant.
Updated the shtest-glob.py test to check for the correct behavior, changing the expected output from UNRESOLVED to PASS.

These changes ensure that all glob expressions are properly expanded across all command arguments, improving the robustness and correctness of the lit internal shell.

This change is relevant for [RFC] Enabling the Lit Internal Shell by Default
fixes: #102693

llvmbot · 2024-08-24T05:46:52Z

@llvm/pr-subscribers-testing-tools

Author: None (Harini0924)

Changes

This patch addresses an issue encountered when running the BOLT test suite with the lit internal shell using the command:

LIT_USE_INTERNAL_SHELL=1 ninja check-bolt

During execution, the following error was observed:

File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho
    stdout.write(encode(maybeUnescape(args[-1])))
TypeError: string argument expected, got 'GlobItem'

This error occurs because the executeBuiltinEcho function in the lit testing framework expects a string to be passed to stdout.write, but it received a GlobItem object instead. The GlobItem represents an unexpanded glob pattern that needs to be converted to a string before being processed.

This patch adds str() to the arguments passed to stdout.write in the executeBuiltinEcho function. By converting the arguments to strings, including any GlobItem objects, we ensure compatibility and prevent the TypeError during test execution.

This change is relevant for [RFC] Enabling the Lit Internal Shell by Default
fixes: #102693

Full diff: https://github.com/llvm/llvm-project/pull/105925.diff

1 Files Affected:

(modified) llvm/utils/lit/lit/TestRunner.py (+1-1)

diff --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
index 2d9af9fbbb3634..625ce009515d5f 100644
--- a/llvm/utils/lit/lit/TestRunner.py
+++ b/llvm/utils/lit/lit/TestRunner.py
@@ -413,7 +413,7 @@ def maybeUnescape(arg):
         for arg in args[:-1]:
             stdout.write(encode(maybeUnescape(arg)))
             stdout.write(encode(" "))
-        stdout.write(encode(maybeUnescape(args[-1])))
+        stdout.write(encode(maybeUnescape(str(args[-1]))))
     if write_newline:
         stdout.write(encode("\n"))

ilovepi

This seems fine. IIRC there's a method to expand globs that may also be useful, but this seems straightforward enough of a conversion to just do it here.

My big question is why does this only need to be done for args[-1] and not the call to stdio.write above that uses arg? seems that they would have the same issue, right?

arichardson · 2024-08-26T23:32:14Z

llvm/utils/lit/lit/TestRunner.py

@@ -413,7 +413,7 @@ def maybeUnescape(arg):
        for arg in args[:-1]:
            stdout.write(encode(maybeUnescape(arg)))
            stdout.write(encode(" "))
-        stdout.write(encode(maybeUnescape(args[-1])))
+        stdout.write(encode(maybeUnescape(str(args[-1]))))


Conceptually, this seems like the wrong place to work around this problem. I would imagine the shell should expand globs before invoking any commands. Only doing it for the last one also seems odd - I guess that't the only case triggered by the current test suite, but surely we can construct inputs that have glob patterns elsewhere?

You're right that expanding globs should ideally happen before any commands are invoked. The fact that the issue only occurs with the last argument does seem odd, and it might just be that the current test suite only triggers the problem there.

We could address this by placing the line cmd.args = expand_glob_expressions(cmd.args, shenv.cwd) at the top of the executeBuiltinEcho function. This would ensure that all arguments are expanded consistently before the command is executed. However, I believe adding str() to the last argument might be a more targeted and effective solution in this specific case. The problem appears to be that the last argument isn't being expanded correctly, leading to the error. By converting just the last argument to a string, we directly address the failing scenario without introducing broader changes that might impact other parts of the code.

While expanding all arguments at the top is a comprehensive solution, adding str() to the last argument is a simpler fix that directly resolves the current issue. It keeps the changes minimal and focused on the specific failure encountered, which might be preferable in this context IMO. What are your thoughts on this approach?

I would much prefer using cmd.args = expand_glob_expressions(cmd.args, shenv.cwd) to match mkdir and rm, but really the internal shell should expand globs before calling any builtin command. However, that is a more invasive change and should be done in a follow-up.

The fact that it only happens on the last arg indicates that we are missing test coverage and are most likely not doing the right thing: str() will just convert the glob to a string instead of expanding it.

I also notice we already expand the first argument:

# Ensure args[0] is hashable. args[0] = expand_glob(args[0], cmd_shenv.cwd)[0]

I missed #101590, but it seems odd just performing glob expansion on the first item - I'd strongly suggest adding some tests for glob expansion and expanding the whole command - this would allow dropping the special handling from rm/mkdir. Also does this mean we don't expand globs when calling external commands?

@arichardson In this PR Reapply "[llvm-lit] Add precommit test to verify current behavior of glob expansion in lit's internal shell" (#106763#) 107169, I have added the necessary tests to verify the current behavior. After analyzing the situation, I believe we should definitely expand args[0] within the expand_glob_expressions function itself, so we don't need to handle it separately later on.

By expanding args[0] directly in this function, we can avoid the need for special handling elsewhere in the code. Once this change is approved, I can go ahead and update shtest-glob.py to reflect the expected behavior by changing:

# CHECK: UNRESOLVED: shtest-glob :: glob-echo.txt ({{[^)]*}}) # CHECK: TypeError: string argument expected, got 'GlobItem'

to:

# CHECK: PASS: shtest-glob :: glob-echo.txt ({{[^)]*}})

Let me know if this approach looks good to you!

…nsion in lit's internal shell (#106325) This patch introduces a precommit test to verify the current behavior of glob expansion in lit's internal shell. The motivation for this test stems from an issue encountered during the BOLT test suite when running with the lit internal shell using the command: `LIT_USE_INTERNAL_SHELL=1 ninja check-bolt` During execution, the following error was observed: ``` File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho stdout.write(encode(maybeUnescape(args[-1]))) TypeError: string argument expected, got 'GlobItem' ``` The `executeBuiltinEcho` function in the lit testing framework expects a string to be passed to `stdout.write`, but it received a `GlobItem` object instead. This precommit test is designed to check the current behavior where the glob pattern isn't correctly expanded, leading to this `TypeError`. While this patch doesn't fix the issue, it helps in understanding and verifying the current behavior. The feedback I received from this [PR](#105925) suggests using `cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)` to match the behavior of `executeBuiltinMkdir` and `executeBuiltinRm`, but it is recognized that the internal shell should ideally expand globs before calling any built-in command. **Request for Feedback:** I'm looking for feedback on how to improve this precommit test, specifically regarding the handling and expansion of glob patterns for commands like mkdir and rm within the internal shell. Currently, the args are expanded at the beginning of these functions, which should ensure proper glob expansion. However, I'd appreciate guidance on whether I should write additional tests to verify that mkdir and rm are handling glob expansions correctly. If such tests are recommended, I would also appreciate advice on the best approach to implement them, considering the existing framework and the way glob expansion is expected to function in the internal shell. Should these tests confirm that the current implementation passes, or are there specific edge cases I should be aware of? **Next Steps:** In my follow-up PR, I plan to address the UNRESOLVED error by expanding the entire command, ensuring correct and consistent behavior across all commands. The current test checks for an unresolved issue with the glob expansion, specifically looking for a `TypeError` due to an unexpanded `GlobItem`. This will be updated to reflect the correct behavior once the issue is resolved. This change is relevant for [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3)

github-actions · 2024-09-05T08:15:15Z

✅ With the latest revision this PR passed the Python code formatter.

arichardson · 2024-09-05T15:09:01Z

llvm/utils/lit/lit/TestRunner.py

-
+        # Expand all glob expressions
+        args = expand_glob_expressions(args, cmd_shenv.cwd)
+        


trailing space

arichardson

The new lit test needs to be updated. Also commit message needs changing.

Code change LGTM

This patch modifies the expand_glob_expressions function in TestRunner.py to expand all arguments instead of just the first one. Previously, only args[0] was expanded for glob expressions, which could lead to unexpected behavior when multiple arguments needed glob expansion. Changes: The loop in expand_glob_expressions now iterates over all arguments instead of starting from the second one. The call to expand_glob_expressions has been moved earlier in the _executeShCmd function to ensure all arguments are expanded before any processing. Removed the previous separate expansion for args[0] as it is now redundant. These changes ensure that all glob expressions are properly expanded across all command arguments, improving the robustness and correctness of the lit internal shell.

Harini0924 · 2024-09-05T16:48:04Z

@arichardson I’ve updated the commit message as suggested and changed the lit test from UNRESOLVED to PASS. Let me know if there are any other changes you'd like me to make.

arichardson · 2024-09-05T17:28:00Z

llvm/utils/lit/lit/TestRunner.py

-        # Ensure args[0] is hashable.
-        args[0] = expand_glob(args[0], cmd_shenv.cwd)[0]
+        # Expand all glob expressions
+        args = expand_glob_expressions(args, cmd_shenv.cwd)


With this change the call to expand_glob_expressions in executeBuiltinMkdir and executeBuiltinRm() can also be removed.

It would probably also be good to add a test that checks something like the following

# RUN: touch %t-first.txt %t-second.txt # RUN: ls %t-*.txt # RUN: rm -v %t-*.txt # Check that both files were deleted. # RUN: test ! -e %t-first.txt # RUN: test ! -e %t-second.txt

I initially didn’t remove the call to expand_glob_expressions in executeBuiltinMkdir and executeBuiltinRm because removing it was causing an error. Here's the error I encountered:

File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 467, in executeBuiltinRm opts, args = getopt.gnu_getopt(args, "frR", ["--recursive"]) UnboundLocalError: cannot access local variable 'args' where it is not associated with a value

This error occurs because args wasn't being initialized correctly after I removed the expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] line from these functions. The issue is that the functions still expect args to be expanded and processed, but when I rely solely on the expansion happening in _executeShCmd, the arguments aren't passed in the correct form to these built-in functions.

Specifically, in the context of commands like mkdir and rm, args[0] represents the command itself (e.g., mkdir or rm), and this should not be expanded. Expanding args[0] could cause the command to be mistakenly treated as a file or directory to expand, which would break the logic. Therefore, only args[1:] (the options or arguments) should undergo glob expansion. For example, in a command like mkdir -p folder*, we only want folder* to be expanded, not the command mkdir itself. Correct me if I am wrong, but that is what I was getting.

In short, removing expand_glob_expressions from these functions without ensuring proper argument handling causes the built-in commands to fail when trying to access args later in the flow, especially since args[0] should not be expanded.

I'm also working on writing the other test. Let me know what you think.

I don't see expanding args[0] being a problem - I just tested with zsh and it just expands all globs:

❯ echo -e '#!/bin/sh\necho script executed' > script.sh && chmod +x script.sh ❯ ./sc* script executed ❯ sc* Command script.sh not found

And yes you will have to make additional changes to the builtinMkdir and builtinRM:
Instead of args (which came from the line args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] you just use cmd.args. args won't magically appear in this function if you remove the line that assigns to the variable.

fmayer

LGTM. I wonder why we restricted the expansion to the first element in the first place?

…nsion in lit's internal shell (#106325) This patch introduces a precommit test to verify the current behavior of glob expansion in lit's internal shell. The motivation for this test stems from an issue encountered during the BOLT test suite when running with the lit internal shell using the command: `LIT_USE_INTERNAL_SHELL=1 ninja check-bolt` During execution, the following error was observed: ``` File "/usr/local/google/home/harinidonthula/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 416, in executeBuiltinEcho stdout.write(encode(maybeUnescape(args[-1]))) TypeError: string argument expected, got 'GlobItem' ``` The `executeBuiltinEcho` function in the lit testing framework expects a string to be passed to `stdout.write`, but it received a `GlobItem` object instead. This precommit test is designed to check the current behavior where the glob pattern isn't correctly expanded, leading to this `TypeError`. While this patch doesn't fix the issue, it helps in understanding and verifying the current behavior. The feedback I received from this [PR](llvm/llvm-project#105925) suggests using `cmd.args = expand_glob_expressions(cmd.args, shenv.cwd)` to match the behavior of `executeBuiltinMkdir` and `executeBuiltinRm`, but it is recognized that the internal shell should ideally expand globs before calling any built-in command. **Request for Feedback:** I'm looking for feedback on how to improve this precommit test, specifically regarding the handling and expansion of glob patterns for commands like mkdir and rm within the internal shell. Currently, the args are expanded at the beginning of these functions, which should ensure proper glob expansion. However, I'd appreciate guidance on whether I should write additional tests to verify that mkdir and rm are handling glob expansions correctly. If such tests are recommended, I would also appreciate advice on the best approach to implement them, considering the existing framework and the way glob expansion is expected to function in the internal shell. Should these tests confirm that the current implementation passes, or are there specific edge cases I should be aware of? **Next Steps:** In my follow-up PR, I plan to address the UNRESOLVED error by expanding the entire command, ensuring correct and consistent behavior across all commands. The current test checks for an unresolved issue with the glob expansion, specifically looking for a `TypeError` due to an unexpanded `GlobItem`. This will be updated to reflect the correct behavior once the issue is resolved. This change is relevant for [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3)

Harini0924 requested a review from arichardson August 24, 2024 05:46

llvmbot added llvm-lit testing-tools labels Aug 24, 2024

Harini0924 requested review from petrhosek and ilovepi August 26, 2024 13:31

ilovepi reviewed Aug 26, 2024

View reviewed changes

arichardson reviewed Aug 26, 2024

View reviewed changes

Harini0924 mentioned this pull request Aug 28, 2024

[llvm-lit] Add precommit test to verify current behavior of glob expansion in lit's internal shell #106325

Merged

arichardson reviewed Sep 5, 2024

View reviewed changes

arichardson requested changes Sep 5, 2024

View reviewed changes

Harini0924 force-pushed the bolt_typeError branch from 125622e to 665a341 Compare September 5, 2024 16:41

Harini0924 requested a review from fmayer September 5, 2024 16:56

arichardson reviewed Sep 5, 2024

View reviewed changes

fmayer approved these changes Sep 5, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[llvm-lit] Fix `TypeError` string argument expected in lit's internal shell #105925

[llvm-lit] Fix `TypeError` string argument expected in lit's internal shell #105925

Harini0924 commented Aug 24, 2024 •

edited

Loading

llvmbot commented Aug 24, 2024

ilovepi left a comment

arichardson Aug 26, 2024

Harini0924 Aug 27, 2024

arichardson Aug 27, 2024

Harini0924 Sep 5, 2024

github-actions bot commented Sep 5, 2024 •

edited

Loading

arichardson Sep 5, 2024

arichardson left a comment

Harini0924 commented Sep 5, 2024

arichardson Sep 5, 2024

Harini0924 Sep 5, 2024

arichardson Sep 5, 2024

arichardson Sep 5, 2024

fmayer left a comment


		# Expand all glob expressions
		args = expand_glob_expressions(args, cmd_shenv.cwd)

[llvm-lit] Fix TypeError string argument expected in lit's internal shell #105925

Are you sure you want to change the base?

[llvm-lit] Fix TypeError string argument expected in lit's internal shell #105925

Conversation

Harini0924 commented Aug 24, 2024 • edited Loading

llvmbot commented Aug 24, 2024

ilovepi left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Sep 5, 2024 • edited Loading

Choose a reason for hiding this comment

arichardson left a comment

Choose a reason for hiding this comment

Harini0924 commented Sep 5, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fmayer left a comment

Choose a reason for hiding this comment

[llvm-lit] Fix `TypeError` string argument expected in lit's internal shell #105925

[llvm-lit] Fix `TypeError` string argument expected in lit's internal shell #105925

Harini0924 commented Aug 24, 2024 •

edited

Loading

github-actions bot commented Sep 5, 2024 •

edited

Loading