-
-
Notifications
You must be signed in to change notification settings - Fork 346
Fix CI regression where most Windows failures passed CI #1559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This runs `cargo nextest ...` and `cargo test --doc` in separate steps in the `test-fast` job, so that the job fails when `cargo nextest ...` fails. Otherwise, with `pwsh` on Windows, test failures (other than doctests) are masked. Background: Since 89a0567 (GitoxideLabs#1556), doctests are run on all three major platforms, and not only on the full test job done on Ubuntu. But the way this was done relied on a script failing as soon as (or, at least, whenever) any command in the script failed. That works on Ubuntu and macOS, where a `bash` shell is used by default, with `-e` passed. But on Windows, GitHub Actions uses `pwsh` as the default shell. `pwsh` is not run in a way that causes it to stop at the first failing command. So, on Windows, when the `cargo nextest` command failed but the `cargo test --doc` command that followed it in the same script step passed, the step passed, thus allowing the job and workflow to pass. This was observed in GitoxideLabs#1429 after a rebase (see comments). Note that this is not related to the increased use of `nextest`. While that was also done in GitoxideLabs#1556, it did not affect the `test-fast` job where the bug was introduced, which was already using `nextest`. This fixes the problem by putting the two commands in separate steps. This is simpler than doing anything in PowerShell to make the script stop, such as using `&&` or attempting to produce `-e`-like behavior. Another option could be to use `bash` as the shell, which is a Git Bash environment suitable for running the tests. The reason I didn't do that is that I think it is valuable to see the results when the tests are run from a PowerShell environment. In particular, continuing to use PowerShell here should help in investigating GitoxideLabs#1359 (and shows that the claim I made is overly strong, since CI on Windows with `pwsh` not itself started from a Unix-style shell is not "Git Bash or a similar environment").
Although the new CI job in #1429 after rebasing that PR's branch onto this has not finished yet as of this writing, the expected failures can already be observed, so I've marked this PR as ready for review. Update: The job there has failed, as it should, confirming that the change here is effective. |
I didn't think to mention this in the commit message (which I don't plan to amend unless that has to be done for some other reason), but another reason for doing it this way is that CI output is more readable with these two test commands in separate steps, as well as it being immediately obvious whether the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much, I am very glad we caught this!
This has also led to a refinement and explanation for #1359, detailed in #1359 (comment), which I now expect to be able to fix soon. |
The `test-fast` job has been intended to run doctests since 89a0567 (GitoxideLabs#1556). But because there are no doctests in the top-level project and neither `--workspace` nor its `--all` alias were passed, the effect has been: Compiling ... Finished `test` profile [unoptimized + debuginfo] target(s) in 28.55s Doc-tests gitoxide running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s (Where ... stands for details in various "Compiling lines. That output is copied from https://github.com/GitoxideLabs/gitoxide/actions/runs/11690609999/job/32555922512#step:9:70 though that log will eventually become available, and only the build time changes.) Note that zero tests are run and the status reports zero of each possible kind of result. There are (at least currently) no doctests in the top-level package, and `--workspace` is not implied. This adds `--workspace` to the command that runs the doctests, so it will collect and run doctests throughout the entire workspace. For now, this is not done on the corresponding command in the `unit-tests` rule in `justfile`; it may make sense to do that, but if it is done, then this step should probably be omitted on the `ubuntu-latest` run of `test-fast` since the CI job that runs `just unit-tests` is `test` which itself runs on `ubuntu-latest`. (The changes in GitoxideLabs#1556 were revised in GitoxideLabs#1559, but that only fixed a problem with reporting results from non-doctest tests. I had not noticed the problem of not running any doctests at that time.)
The `test-fast` job has been intended to run doctests since 89a0567 (GitoxideLabs#1556). But because there are no doctests in the top-level project and neither `--workspace` nor its `--all` alias were passed, the effect has been: Compiling ... Finished `test` profile [unoptimized + debuginfo] target(s) in 28.55s Doc-tests gitoxide running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s (Where ... stands for details in various "Compiling lines. That output is copied from https://github.com/GitoxideLabs/gitoxide/actions/runs/11690609999/job/32555922512#step:9:70 though that log will eventually become unavailable, and only the build time changes.) Note that zero tests are run and the status reports zero of each possible kind of result. There are (at least currently) no doctests in the top-level package, and `--workspace` is not implied. This adds `--workspace` to the command that runs the doctests, so it will collect and run doctests throughout the entire workspace. For now, this is not done on the corresponding command in the `unit-tests` rule in `justfile`; it may make sense to do that, but if it is done, then this step should probably be omitted on the `ubuntu-latest` run of `test-fast` since the CI job that runs `just unit-tests` is `test` which itself runs on `ubuntu-latest`. (The changes in GitoxideLabs#1556 were revised in GitoxideLabs#1559, but that only fixed a problem with reporting results from non-doctest tests. I had not noticed the problem of not running any doctests at that time.)
This seeks to make improvements in four overlapping areas: - The CI `test-fast` job for `windows-latest` had been taking the longest, and integrating PRs would be more efficient if it could be sped up. If it didn't have to build and run doctests, then it would run markedly faster. `test-fast` runs doctests because... - The `unit-tests` recipe in the `justfile`, which is one of the recipes the "full" CI `test` job runs, runs a number of `nextest` commands on individual crates (with `-p`, except for testing the top-level `gitoxide` crate, and not with `--workspace`). It also ran doctests, but only on the `gitoxide` top-level crate. But the `gitoxide` crate currently has no doctests, while some `gix-*` crates do. - On CI, we usually prefer `--no-fail-fast`. But it is not always used, because the commands in the `unit-tests` justfile recipe do not use it. `--no-fail-fast` is not always preferred when running tests locally, but... - Both locally and on CI, in most cases that a test fails in a commmand in the `unit-tests` justfile recipe, the effect is that tests have run and results have been reported for multiple `nextest` commands, yet not all the tests specified in the most recent `nextest` command to run. Thus, omitting `--no-fail-fast` may not have the most intuitive effect in `just unit-tests`, even when run locally (and even if the user would omit `--workspace` in individual `cargo nextest` runs if carried out manually). This commit makes the following changes: 1. Add `--no-fail-fast` to each of the commands in the `unit-tests` recipe in the `justfile`: the numerous `cargo nextest` commands, as well as the `cargo test` command used to run doctests. 2. Add `--workspace` to the `cargo test` command used to run doctests in the `unit-tests` recipe in the `justfile`, and move it to the end of the recipe. 3. Not to be confused with that `cargo test` command, move the other `cargo test` command used to run doctests in the `ci.yml` workflow (which alredy passed `--workspace`, as its purpose was to run all doctests in all crates) from the `tests-fast` job definition into the `test-32bit-windows-size` job, and rename that latter job `test-32bit-windows-size-doc` accordingly. The rationale for (3) may not be obvious. The idea is: - Running the doctests on only one Unix-like system should be enough, so long as they are run for all crates in the workspace. So the change in the `unit-tests` recipe in the `justfile` makes it so the CI `test` job (which includes a `just unit-tests` run) covers doctests sufficiently, *except* for Windows. - Although we should probably keep running doctests regularly on Windows, removing it from `test-fast`, including on Windows, is the simplest way to make the Windows `test-fast` job run faster. (It also makes the job definition clearer, since some of the other steps relate to each other more closely than they do to the step that ran the doctests.) - It should be sufficient to run the doctests in any Windows environment. And it is best to avoid adding a new Windows job just for this, since various other Windows jobs might be added sometime soon (such as for ARM64, native Windows containers, the Git for Windows SDK, MinGit, BusyBox MinGit, and possibly others; some of these may be possible to combine, but likely a few more Windows jobs may be introduced for these, so avoiding adding extra Windows jobs now may make it easier to avoid having too many Windows jobs, in terms of queuing, GHA cache usage, energy usage, and other resources). So if this can be added to another Windows job without causing problems, that is preferable. - The Windows job that took the least amount of time, usually by several minutes, was the `test-32bit-windows-size` job. It is hope that this keeps the benefits of GitoxideLabs#1556, GitoxideLabs#1559, and GitoxideLabs#1654, while improving CI testing performance most of the time.
This seeks to make improvements in four overlapping areas: - The CI `test-fast` job for `windows-latest` had been taking the longest, and integrating PRs would be more efficient if it could be sped up. If it didn't have to build and run doctests, then it would run markedly faster. `test-fast` runs doctests because... - The `unit-tests` recipe in the `justfile`, which is one of the recipes the "full" CI `test` job runs via the `ci-test` recipe, runs a number of `nextest` commands on individual crates (with `-p`, except for testing the top-level `gitoxide` crate, and not with `--workspace`). It also ran doctests, but only on the `gitoxide` top-level crate. But the `gitoxide` crate currently has no doctests, while some `gix-*` crates do. - On CI, we usually prefer `--no-fail-fast`. But it is not always used, because the commands in the `unit-tests` justfile recipe do not use it. `--no-fail-fast` is not always preferred when running tests locally, but... - Both locally and on CI, in most cases that a test fails in a commmand in the `unit-tests` justfile recipe, the effect is that tests have run and results have been reported for multiple `nextest` commands, yet not all the tests specified in the most recent `nextest` command to run. Thus, omitting `--no-fail-fast` may not have the most intuitive effect in `just unit-tests`, even when run locally (and even if the user would omit `--workspace` in individual `cargo nextest` runs if carried out manually). This commit makes the following changes: 1. Add `--no-fail-fast` to each of the commands in the `unit-tests` recipe in the `justfile`: the numerous `cargo nextest` commands, as well as the `cargo test` command used to run doctests. 2. Add `--workspace` to the `cargo test` command used to run doctests in the `unit-tests` recipe in the `justfile`, and move it to the end of the recipe. 3. Not to be confused with that `cargo test` command, move the other `cargo test` command used to run doctests in the `ci.yml` workflow (which alredy passed `--workspace`, as its purpose was to run all doctests in all crates) from the `tests-fast` job definition into the `test-32bit-windows-size` job, and rename that latter job `test-32bit-windows-size-doc` accordingly. The rationale for (3) may not be obvious. The idea is: - Running the doctests on only one Unix-like system should be enough, so long as they are run for all crates in the workspace. So the change in the `unit-tests` recipe in the `justfile` makes it so the CI `test` job (which includes a `unit-tests` run) covers doctests sufficiently, *except* for Windows. - Although we should probably keep running doctests regularly on Windows, removing it from `test-fast`, including on Windows, is the simplest way to make the Windows `test-fast` job run faster. (It also makes the job definition clearer, since some of the other steps relate to each other more closely than they do to the step that ran the doctests.) - It should be sufficient to run the doctests in any Windows environment. And it is best to avoid adding a new Windows job just for this, since various other Windows jobs might be added sometime soon (such as for ARM64, native Windows containers, the Git for Windows SDK, MinGit, BusyBox MinGit, and possibly others; some of these may be possible to combine, but likely a few more Windows jobs may be introduced for these, so avoiding adding extra Windows jobs now may make it easier to avoid having too many Windows jobs, in terms of queuing, GHA cache usage, energy usage, and other resources). So if this can be added to another Windows job without causing problems, that is preferable. - The Windows job that took the least amount of time, usually by several minutes, was the `test-32bit-windows-size` job. It is hope that this keeps the benefits of GitoxideLabs#1556, GitoxideLabs#1559, and GitoxideLabs#1654, while improving CI testing performance most of the time.
This seeks to make improvements in four overlapping areas: - The CI `test-fast` job for `windows-latest` had been taking the longest, and integrating PRs would be more efficient if it could be sped up. If it didn't have to build and run doctests, then it would run markedly faster. `test-fast` runs doctests because... - The `unit-tests` recipe in the `justfile`, which is one of the recipes the "full" CI `test` job runs via the `ci-test` recipe, runs a number of `nextest` commands on individual crates (with `-p`, except for testing the top-level `gitoxide` crate, and not with `--workspace`). It also ran doctests, but only on the `gitoxide` top-level crate. But the `gitoxide` crate currently has no doctests, while some `gix-*` crates do. - On CI, we usually prefer `--no-fail-fast`. But it is not always used, because the commands in the `unit-tests` justfile recipe do not use it. `--no-fail-fast` is not always preferred when running tests locally, but... - Both locally and on CI, in most cases that a test fails in a commmand in the `unit-tests` justfile recipe, the effect is that tests have run and results have been reported for multiple `nextest` commands, yet not all the tests specified in the most recent `nextest` command to run. Thus, omitting `--no-fail-fast` may not have the most intuitive effect in `just unit-tests`, even when run locally (even if the user would omit `--no-fail-fast` in individual `cargo nextest` runs carried out manually). This commit makes the following changes: 1. Add `--no-fail-fast` to each of the commands in the `unit-tests` recipe in the `justfile`: the numerous `cargo nextest` commands, as well as the `cargo test` command used to run doctests. 2. Add `--workspace` to the `cargo test` command used to run doctests in the `unit-tests` recipe in the `justfile`, and move it to the end of the recipe. 3. Not to be confused with that `cargo test` command, move the other `cargo test` command used to run doctests in the `ci.yml` workflow (which alredy passed `--workspace`, as its purpose was to run all doctests in all crates) from the `tests-fast` job definition into the `test-32bit-windows-size` job, and rename that latter job `test-32bit-windows-size-doc` accordingly. The rationale for (3) may not be obvious. The idea is: - Running the doctests on only one Unix-like system should be enough, so long as they are run for all crates in the workspace. So the change in the `unit-tests` recipe in the `justfile` makes it so the CI `test` job (which includes a `unit-tests` run) covers doctests sufficiently, *except* for Windows. - Although we should probably keep running doctests regularly on Windows, removing it from `test-fast`, including on Windows, is the simplest way to make the Windows `test-fast` job run faster. (It also makes the job definition clearer, since some of the other steps relate to each other more closely than they do to the step that ran the doctests.) - It should be sufficient to run the doctests in any Windows environment. And it is best to avoid adding a new Windows job just for this, since various other Windows jobs might be added sometime soon (such as for ARM64, native Windows containers, the Git for Windows SDK, MinGit, BusyBox MinGit, and possibly others; some of these may be possible to combine, but likely a few more Windows jobs may be introduced for these, so avoiding adding extra Windows jobs now may make it easier to avoid having too many Windows jobs, in terms of queuing, GHA cache usage, energy usage, and other resources). So if this can be added to another Windows job without causing problems, that is preferable. - The Windows job that took the least amount of time, usually by several minutes, was the `test-32bit-windows-size` job. It is hope that this keeps the benefits of GitoxideLabs#1556, GitoxideLabs#1559, and GitoxideLabs#1654, while improving CI testing performance most of the time.
This fixes an edge case in the Windows MSRV job, by using `bash` for script steps on all platforms. The MSRV workflow had alredy implicitly used `bash` on Ubuntu, which implicitly passed `-e` (as also happens when `shell: bash` is used explicitly). But it had implicitly used `pwsh` on Windows. Although GHA runners make an effort to fail `pwsh` script steps when a command has failed, PowerShell (whether `pwsh` or `powershell`) does not have an analogue of POSIX `-e`. Script steps with `pwsh` do not always fail fast, nor even always fail at all, when a command fails that is not the last command in the script. Recent experiments confirm that running an implicit `pwsh` script step with two commands, each of which run external programs, where the first program exits normally but indicating failure (such as with an exit code of 1) but the second program does not fail, will run both commands, then report success for the whole step. This is to say that the kind of bug that 4f2ab5b (GitoxideLabs#1559) fixed in the Windows `test-fast` job in `ci.yml` also existed in the Windows `check-msrv` job in `msrv.yml` (which GitoxideLabs#1559 did not fix). Fortunately, this version of the bug is far less severe, in that the circumstnaces under which a failure would be concealed appear to have been unlikely. However, at least one such concealed-failure mode was plausible. If the `rustup toolchain install ...` command failed, thereby causing `rustup default...` to fail, then the cargo update command could still succeed, masking those two failures. Error messages for the failures would still be shown in the log, but the step would have reported success. Then the `gix check` commands run via `just ci-check-msrv` would check using the wrong toolchain, i.e. a typically newer toolchain than the MSRV, present in the `windows-2022` runner image. This commit fixes that bug by setting a default of `shell: bash` for all script steps not specifying `shell:`. This implicitly passes `-e` to the `bash` interpreter. Although `-e` has its own intricacies, for simple commands such as those shown here, it has the intuitive fail-fast behavior. (The alternative of splitting the steps up so each step directly runs only a single command, as was done in GitoxideLabs#1559, was considered. But the code is less readable if written that way. It may make sense to organize the commands here into more steps, but most likely some steps will continue to have two or more commands.) This commit also rewords a conceptually related comment in `ci.yml` for clarity, and so that differences in wording between comments on `shell: bash` in various workflows reflect differences in circumstances.
This fixes an edge case in the Windows MSRV job, by using `bash` for script steps on all platforms. The MSRV workflow had alredy implicitly used `bash` on Ubuntu, which implicitly passed `-e` (as also happens when `shell: bash` is used explicitly). But it had implicitly used `pwsh` on Windows. Although GHA runners make an effort to fail `pwsh` script steps when a command has failed, PowerShell (whether `pwsh` or `powershell`) does not have an analogue of POSIX `-e`. Script steps with `pwsh` do not always fail fast, nor even always fail at all, when a command fails that is not the last command in the script. Recent experiments confirm that running an implicit `pwsh` script step with two commands, each of which run external programs, where the first program exits normally but indicating failure (such as with an exit code of 1) but the second program does not fail, will run both commands, then report success for the whole step. This is to say that the kind of bug that 4f2ab5b (GitoxideLabs#1559) fixed in the Windows `test-fast` job in `ci.yml` also existed in the Windows `check-msrv` job in `msrv.yml` (which GitoxideLabs#1559 did not fix). Fortunately, this version of the bug is far less severe, in that the circumstnaces under which a failure would be concealed appear to have been unlikely. However, at least one such concealed-failure mode was plausible. If the `rustup toolchain install ...` command failed, thereby causing `rustup default...` to fail, then the cargo update command could still succeed, masking those two failures. Error messages for the failures would still be shown in the log, but the step would have reported success. Then the `gix check` commands run via `just ci-check-msrv` would check using the wrong toolchain, i.e. a typically newer toolchain than the MSRV, present in the `windows-2022` runner image. This commit fixes that bug by setting a default of `shell: bash` for all script steps not specifying `shell:`. This implicitly passes `-e` to the `bash` interpreter. Although `-e` has its own intricacies, for simple commands such as those shown here, it has the intuitive fail-fast behavior. (The alternative of splitting the steps up so each step directly runs only a single command, as was done in GitoxideLabs#1559, was considered. But the code is less readable if written that way. It may make sense to organize the commands here into more steps, but most likely some steps will continue to have two or more commands.) This commit also rewords a conceptually related comment in `ci.yml` for clarity, and so that differences in wording between comments on `shell: bash` in various workflows reflect differences in circumstances.
This fixes an edge case in the Windows MSRV job, by using `bash` for script steps on all platforms. The MSRV workflow had alredy implicitly used `bash` on Ubuntu, which implicitly passed `-e` (as also happens when `shell: bash` is used explicitly). But it had implicitly used `pwsh` on Windows. Although GHA runners make an effort to fail `pwsh` script steps when a command has failed, PowerShell (whether `pwsh` or `powershell`) does not have an analogue of POSIX `-e`. Script steps with `pwsh` do not always fail fast, nor even always fail at all, when a command fails that is not the last command in the script. Recent experiments confirm that running an implicit `pwsh` script step with two commands, each of which run external programs, where the first program exits normally but indicating failure (such as with an exit code of 1) but the second program does not fail, will run both commands, then report success for the whole step. This is to say that the kind of bug that 4f2ab5b (GitoxideLabs#1559) fixed in the Windows `test-fast` job in `ci.yml` also existed in the Windows `check-msrv` job in `msrv.yml` (which GitoxideLabs#1559 did not fix). Fortunately, this version of the bug is far less severe, in that the circumstnaces under which a failure would be concealed appear to have been unlikely. However, at least one such concealed-failure mode was plausible. If the `rustup toolchain install ...` command failed, thereby causing `rustup default...` to fail, then the cargo update command could still succeed, masking those two failures. Error messages for the failures would still be shown in the log, but the step would have reported success. Then the `gix check` commands run via `just ci-check-msrv` would check using the wrong toolchain, i.e. a typically newer toolchain than the MSRV, present in the `windows-2022` runner image. This commit fixes that bug by setting a default of `shell: bash` for all script steps not specifying `shell:`. This implicitly passes `-e` to the `bash` interpreter. Although `-e` has its own intricacies, for simple commands such as those shown here, it has the intuitive fail-fast behavior. (The alternative of splitting the steps up so each step directly runs only a single command, as was done in GitoxideLabs#1559, was considered. But the code seems like it will be less readable if written that way. It may make sense to organize the commands here into more steps, but likely with some steps having two or more commands.) This commit also rewords a conceptually related comment in `ci.yml` for clarity, and so that differences in wording between comments on `shell: bash` in various workflows reflect differences in circumstances.
This commit splits the commands in the steps of the `check-msrv` job definition to be one per step, documenting each step. In 510847b I predicted that it would not make sense to split the steps to one command per step in `msrv.yml`. The change made there of using `bash` even on Windows allowed for experimenting with how the workflow logs look when reorganized in various ways. But in view of other changes -- and to better clarify that two toolchains were installed, but the MSRV toolchain is set default, so it is used by all subsequent operations *except* the `-Zminimal-versions` dependency downgrade that currently requires `nightly` -- it now looks like having one command per step is better after all. When running only one (simple) command per step, the main change in 510847b is no longer needed. That is, this now avoids that problem in the same way it has been avoided in `test-fast` in `ci.yml` since 4f2ab5b (GitoxideLabs#1559). So this commit also removes `shell: bash` in `msrv.yml` (but keeps the comment clarification in `ci.yml`).
This runs
cargo nextest ...
andcargo test --doc
in separate steps in thetest-fast
job, so that the job fails whencargo nextest ...
fails. Otherwise, withpwsh
on Windows, test failures (other than doctests) are masked.Background: Since 89a0567 (#1556), doctests are run on all three major platforms, and not only on the full test job done on Ubuntu. But the way this was done relied on a script failing as soon as (or, at least, whenever) any command in the script failed. That works on Ubuntu and macOS, where a
bash
shell is used by default, with-e
passed. But on Windows, GitHub Actions usespwsh
as the default shell.pwsh
is not run in a way that causes it to stop at the first failing command.So, on Windows, when the
cargo nextest
command failed but thecargo test --doc
command that followed it in the same script step passed, the step passed, thus allowing the job and workflow to pass. This was observed in #1429 after a rebase (see comments).Note that this is not related to the increased use of
nextest
. While that was also done in #1556, it did not affect thetest-fast
job where the bug was introduced, which was already usingnextest
.This fixes the problem by putting the two commands in separate steps.
This is simpler than doing anything in PowerShell to make the script stop, such as using
&&
or attempting to produce-e
-like behavior.Another option could be to use
bash
as the shell, which is a Git Bash environment suitable for running the tests. The reason I didn't do that is that I think it is valuable to see the results when the tests are run from a PowerShell environment.In particular, continuing to use PowerShell here should help in investigating #1359 (and shows that the claim I made is overly strong, since CI on Windows with
pwsh
not itself started from a Unix-style shell is not "Git Bash or a similar environment").This is a draft because I'm going to rebase #1359 onto this or otherwise test that this really does catch failures. That should not take long.