bootstrap: retry cargo invocations if stderr contains a known pattern #134472

Open
@marcoieni

Why

Our CI auto builds sometimes fail for known reasons that are not related to the PRs we are trying to merge.

Most of the time, these errors are hard to understand and fix (or can't be fixed at all), which lowers the success rate of the auto builds for several weeks or months.
As a result, we lose days of parallel compute time and hours of maintainers' time, because maintainers need to analyze the error message and reschedule the PRs in the merge queue.

Feature

We want to list the stderr of the known issues we are aware of in the config.toml file that bootstrap uses. These patterns can be expressed as regexes.
We want bootstrap to retry cargo invocations up to two times if stderr matches one of the listed patterns.

This would help reduce the failure rate of our CI because it would significantly reduce the percentage of jobs failing due to spurious errors.

The error messages need to be precise enough to avoid retrying cargo invocations over genuine problems.
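
The retry behavior described above could look roughly like the following sketch. It is not bootstrap's actual code: `Outcome`, `is_known_spurious`, and `run_with_retries` are hypothetical names, the cargo invocation is abstracted as a closure, plain substring matching stands in for regex matching, and "up to two times" is read as two retries after the initial attempt.

```rust
/// Result of one command invocation (hypothetical type for this sketch).
struct Outcome {
    success: bool,
    stderr: String,
}

/// Returns true if stderr contains any of the known spurious-failure patterns.
/// Assumption: substring matching is used here in place of regex matching.
fn is_known_spurious(stderr: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| stderr.contains(p))
}

/// Runs `invoke` once, then retries up to `max_retries` more times, but only
/// while the invocation fails AND its stderr matches a known pattern.
/// Returns the final outcome and the total number of attempts made.
fn run_with_retries<F>(mut invoke: F, patterns: &[&str], max_retries: u32) -> (Outcome, u32)
where
    F: FnMut() -> Outcome,
{
    let mut attempts = 0;
    loop {
        let outcome = invoke();
        attempts += 1;
        let retryable = !outcome.success && is_known_spurious(&outcome.stderr, patterns);
        // Stop on success, on a genuine (unlisted) failure, or when retries run out.
        if !retryable || attempts > max_retries {
            return (outcome, attempts);
        }
    }
}
```

A failure whose stderr matches no listed pattern is returned immediately, so genuine problems are never retried.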

Known error patterns can be found here. Not all of them can be listed.

As a start, we could just have 1 stderr string in the list (this one doesn't need to be a regex):

Questions

  • Is config.toml the right place to put the known stderr patterns? In Zulip, Jieyou proposed introducing another file: retry-patterns.toml. I'll leave it to the bootstrap team to decide.
  • Which format do we use to write the stderr patterns in the config.toml file? For example, it can be an array of strings. It could also be an "object" if we want to customize how many times to retry per error message. I'll leave it to the bootstrap team to decide.
  • How do we make sure these patterns are present in the config.toml used for CI? I'm not familiar with how the config.toml for the CI is generated.
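
To make the format question more concrete, here is one possible in-memory shape that covers both options mentioned above (a bare string, or an "object" with its own retry count). This is only an illustration: `RetryPattern` and its variants are hypothetical names, and the sketch says nothing about how the entries would be parsed from TOML.

```rust
/// Hypothetical representation of one retry entry.
#[derive(Debug, Clone, PartialEq)]
enum RetryPattern {
    /// A bare string entry, as in an array of strings.
    Simple(String),
    /// An "object" entry that overrides the retry count for this pattern.
    WithLimit { pattern: String, max_retries: u32 },
}

impl RetryPattern {
    /// The stderr pattern to match against.
    fn pattern(&self) -> &str {
        match self {
            RetryPattern::Simple(p) => p.as_str(),
            RetryPattern::WithLimit { pattern, .. } => pattern.as_str(),
        }
    }

    /// The retry count for this entry, falling back to `default`
    /// when the entry does not specify its own.
    fn max_retries(&self, default: u32) -> u32 {
        match self {
            RetryPattern::Simple(_) => default,
            RetryPattern::WithLimit { max_retries, .. } => *max_retries,
        }
    }
}
```

With this shape, a plain array of strings stays valid, and per-message retry counts can be added later without breaking existing entries.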

Zulip links

  • idea proposed here
  • agreement reached here

Metadata

Labels

  • A-CI: Area: Our Github Actions CI
  • A-bootstrap-config: Area: bootstrap `config.toml` and the config system
  • C-enhancement: Category: An issue proposing an enhancement or a PR with one.
  • E-hard: Call for participation: Hard difficulty. Experience needed to fix: A lot.
  • T-bootstrap: Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
  • T-infra: Relevant to the infrastructure team, which will review and decide on the PR/issue.
