-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[GitHub] Add workflows to manage merging of PRs from authors without commit access #124910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-github-workflow Author: David Spickett (DavidSpickett) ChangesNote: This process could be simplified if we were able to write to the repository in response to the event generated when an approval is given. Due to security restrictions in GitHub, we cannot do this, see: https://github.com/orgs/community/discussions/26651 https://github.com/orgs/community/discussions/55940 I am introducing a pair of workflows that together do the following:
Due to the limitations mentioned above, the workflow is actually 2 workflows. The first checks each opened PR and adds a "no-commit-access" label if the author has no comit access. I've implemented this as an addition to the existing workflow that runs on PR open. A second workflow runs periodically and inspects all PRs that are open and have this label. If the PR has approvals, a comment is added to tell the author how to make the PR ready for merging. (content from https://llvm.org/docs/Contributing.html#how-to-submit-a-patch) The author can take as much time as they like to do this, and when they are done, they reply with a comment with specific text to say so. Note:
When the next workflow run sees this comment it will add another comment, asking the approvers to merge the PR, and then it will remove the no-commit-access label. Removing the label makes the PR invisible to future runs of the workflow. I have a demo PR on my fork: DavidSpickett#105 (note that that ran with testing hacks because I am not a new user there) This leaves the PR approved, ready to merge, with no extra labels. At this point approvers can just click the button, or if more approvals are required, they can wait and at least we know that it's in a decent state. There is a corner case where you get 1 approval, and go through this process, but another reviewer vetos that approval. In this case, the label can be manually added and the bot comments and the user's magic comment removed, reseting the situation. Maybe we can make this situation less manual, but it's going to get very complex very quickly so I haven't attempted to in this first version. Ultimately, there'll always be some cases where we want to wait on more approvers. Even if we say "here is label you can add for a final approval", someone could add that and someone else could disagree. I think practical experience will be more informative than theory crafting here. Full diff: https://github.com/llvm/llvm-project/pull/124910.diff 3 Files Affected:
diff --git a/.github/workflows/check-prs-need-merge.yml b/.github/workflows/check-prs-need-merge.yml
new file mode 100644
index 00000000000000..eeba4cbd3b43a5
--- /dev/null
+++ b/.github/workflows/check-prs-need-merge.yml
@@ -0,0 +1,38 @@
+name: "Find PRs That Need Merging on the Author's Behalf"
+
+permissions:
+ contents: read
+
+on:
+ workflow_dispatch:
+ schedule:
+ # * is a special character in YAML so you have to quote this string
+ # Run once an hour
+ - cron: '5 * * * *'
+
+jobs:
+ check_needs_merge:
+ runs-on: ubuntu-latest
+ permissions:
+ # May change labels and add a comment.
+ pull-requests: write
+ if: >-
+ (github.repository == 'llvm/llvm-project')
+ steps:
+ - name: Checkout Automation Script
+ uses: actions/checkout@v4
+ with:
+ sparse-checkout: llvm/utils/git/
+ ref: main
+
+ - name: Setup Automation Script
+ working-directory: ./llvm/utils/git/
+ run: |
+ pip install --require-hashes -r requirements.txt
+
+ - name: Check Open PRs
+ working-directory: ./llvm/utils/git/
+ run: |
+ python3 ./github-automation.py \
+ --token '${{ secrets.GITHUB_TOKEN }}' \
+ check-prs-need-merge
diff --git a/.github/workflows/new-prs.yml b/.github/workflows/new-prs.yml
index 88175d6f8d64d4..bd8fe91c1e8ecc 100644
--- a/.github/workflows/new-prs.yml
+++ b/.github/workflows/new-prs.yml
@@ -73,3 +73,31 @@ jobs:
# workaround for https://github.com/actions/labeler/issues/112
sync-labels: ''
repo-token: ${{ secrets.ISSUE_SUBSCRIBER_TOKEN }}
+
+ check-commit-access:
+ runs-on: ubuntu-latest
+ permissions:
+ pull-requests: write
+ if: >-
+ (github.repository == 'llvm/llvm-project') &&
+ (github.event.action == 'opened')
+ steps:
+ - name: Checkout Automation Script
+ uses: actions/checkout@v4
+ with:
+ sparse-checkout: llvm/utils/git/
+ ref: main
+
+ - name: Setup Automation Script
+ working-directory: ./llvm/utils/git/
+ run: |
+ pip install --require-hashes -r requirements.txt
+
+ - name: Check Commit Access
+ working-directory: ./llvm/utils/git/
+ run: |
+ python3 ./github-automation.py \
+ --token '${{ secrets.GITHUB_TOKEN }}' \
+ check-commit-access \
+ --issue-number "${{ github.event.pull_request.number }}" \
+ --author "${{ github.event.pull_request.user.login }}"
diff --git a/llvm/utils/git/github-automation.py b/llvm/utils/git/github-automation.py
index da467f46b4dd31..20f9979c25fc77 100755
--- a/llvm/utils/git/github-automation.py
+++ b/llvm/utils/git/github-automation.py
@@ -290,6 +290,152 @@ def run(self) -> bool:
return True
+"""
+CheckCommitAccess and CheckPRsNeedMerge work together to implement a flow that
+notifies approvers of PRs from authors without commit access, when an approved
+PR is ready to merge. The steps are as follows:
+
+* New PRs from users without commit access are labelled to indicate this, as
+ they are opened.
+* Periodically, PRs with this tag are checked. If they have approvals, the author
+ is prompted to make the PR ready for merging.
+* The author then replies with a specific comment to say this has been done.
+* This comment is picked up by the next workflow run. We then add a comment prompting
+ the approvers of the PR to merge it, and remove the label we added earlier.
+
+We do allow for the author to post the magic comment earlier, and will skip prompting
+them if they have done that.
+
+We do this call and response so that:
+* Authors do not have to rush to update the PR.
+* The subsequent comment to approvers causes a notification to them which they
+ can immediately act upon.
+
+If we were able to write to the repo in response to a pull_request_review event,
+we could run the second part in response to the review submitted event. See:
+https://github.com/orgs/community/discussions/26651
+https://github.com/orgs/community/discussions/55940
+
+We cannot, so the second part runs on a timer instead.
+"""
+
+
+def user_can_merge(user: str, repo):
+ try:
+ return repo.get_collaborator_permission(user) in ["admin", "write"]
+ # There is a UnknownObjectException for this scenario, but this method
+ # does not use it.
+ except github.GithubException as e:
+ # 404 means the author was not found in the collaborator list, so we
+ # know they don't have push permissions. Anything else is a real API
+ # issue, raise it so it is visible.
+ if e.status != 404:
+ raise e
+ return False
+
+
+NO_COMMIT_ACCESS_LABEL = "no-commit-access"
+
+
+class CheckCommitAccess:
+ def __init__(self, token: str, repo: str, pr_number: int, author: str):
+ self.repo = github.Github(token).get_repo(repo)
+ self.pr = self.repo.get_issue(pr_number).as_pull_request()
+ self.author = author
+
+ def run(self) -> bool:
+ if not user_can_merge(self.author, self.repo):
+ self.pr.as_issue().add_to_labels(NO_COMMIT_ACCESS_LABEL)
+
+ return True
+
+
+class CheckPRsNeedMerge:
+ PROMPT_AUTHOR_COMMENT_TAG = "<!--NEEDS MERGE PROMPT AUTHOR-->\n"
+ PR_READY_COMMENT = "! This PR is ready to merge !"
+
+ def __init__(self, token: str, repo: str):
+ self.repo = github.Github(token).get_repo(repo)
+
+ @staticmethod
+ def at_users(users):
+ return ", ".join([f"@{user.login}" for user in users])
+
+ def prompt_author(self, pull) -> bool:
+ # Tell the PR author to prepare the PR for merge.
+ pull.as_issue().create_comment(
+ f"""\
+{self.PROMPT_AUTHOR_COMMENT_TAG}
+{self.at_users([pull.user])} please ensure that this PR is ready to be merged. Make sure that:
+* The PR title and description describe the final changes. These will be used as the title and message of the final squashed commit. The titles and messages of commits in the PR will **not** be used.
+* You have set a valid [email address](https://llvm.org/docs/DeveloperPolicy.html#github-email-address) in your GitHub account. This will be associated with this contribution.
+
+When the PR is ready to be merged please reply with a comment that is exactly "{self.PR_READY_COMMENT}"."""
+ )
+
+ return True
+
+ def prompt_approvers(self, pull, approvers) -> bool:
+ # Even approvers may not have commit access.
+ can_merge = [a for a in approvers if user_can_merge(a, self.repo)]
+
+ if not can_merge:
+ # If no one can merge, find someone who can.
+ to_approvers = f"{self.at_users(approvers)}, please find someone who can merge this PR on behalf of {at_users([pull.user])}."
+ elif len(can_merge) == 1:
+ # Ask this specific approver to merge.
+ to_approvers = f"{self.at_users(can_merge)}, please merge this PR on behalf of {self.at_users([pull.user])}."
+ else:
+ # Ask all who can merge to do so.
+ to_approvers = f"{self.at_users(can_merge)}, one of you should merge this PR on behalf of {self.at_users([pull.user])}."
+
+ pull.as_issue().create_comment(to_approvers)
+
+ return True
+
+ def run(self) -> bool:
+ # "Either open, closed, or all to filter by state." - no "approved"
+ # unfortunately.
+ for pull in self.repo.get_pulls(state="open"):
+ for label in pull.as_issue().get_labels():
+ if label.name == NO_COMMIT_ACCESS_LABEL:
+ break
+ else:
+ # PR is from someone with commit access.
+ continue
+
+ approvers = [r.user for r in pull.get_reviews() if r.state == "APPROVED"]
+ if not approvers:
+ # Can't do anything until there is at least one approval.
+ continue
+
+ found_prompt_author_comment = False
+ found_author_comment = False
+ for comment in pull.get_issue_comments():
+ # If the PR author wrote the magic response comment.
+ if (
+ comment.user.login == pull.user.login
+ and self.PR_READY_COMMENT in comment.body
+ ):
+ found_author_comment = True
+ # Either they responded to our prompting, or knew ahead of time
+ # what to do, either is fine.
+ break
+ elif self.PROMPT_AUTHOR_COMMENT_TAG in comment.body:
+ found_prompt_author_comment = True
+
+ if found_author_comment:
+ self.prompt_approvers(pull, approvers)
+ pull.as_issue().remove_from_labels(NO_COMMIT_ACCESS_LABEL)
+ elif not found_prompt_author_comment:
+ self.prompt_author(pull)
+ elif found_prompt_author_comment:
+ # Waiting for a response from the author.
+ pass
+
+ return True
+
+
def setup_llvmbot_git(git_dir="."):
"""
Configure the git repo in `git_dir` with the llvmbot account so
@@ -680,6 +826,12 @@ def request_release_note(token: str, repo_name: str, pr_number: int):
pr_buildbot_information_parser.add_argument("--issue-number", type=int, required=True)
pr_buildbot_information_parser.add_argument("--author", type=str, required=True)
+check_commit_access_parser = subparsers.add_parser("check-commit-access")
+check_commit_access_parser.add_argument("--issue-number", type=int, required=True)
+check_commit_access_parser.add_argument("--author", type=str, required=True)
+
+check_pr_needs_merge_parser = subparsers.add_parser("check-prs-need-merge")
+
release_workflow_parser = subparsers.add_parser("release-workflow")
release_workflow_parser.add_argument(
"--llvm-project-dir",
@@ -751,6 +903,14 @@ def request_release_note(token: str, repo_name: str, pr_number: int):
args.token, args.repo, args.issue_number, args.author
)
pr_buildbot_information.run()
+elif args.command == "check-commit-access":
+ check_commit_access = CheckCommitAccess(
+ args.token, args.repo, args.issue_number, args.author
+ )
+ check_commit_access.run()
+elif args.command == "check-prs-need-merge":
+ check_needs_merge = CheckPRsNeedMerge(args.token, args.repo)
+ check_needs_merge.run()
elif args.command == "release-workflow":
release_workflow = ReleaseWorkflow(
args.token,
|
As noted in the PR description, this "state machine" is somewhat fragile and will always have some weird scenarios. But I don't say that to excuse that, please come up with as many weird situations as you can think of to trip this up! That way we can at least look at each one and weigh the complexity of handling it vs. the impact of it happening. |
This is reasonable. Many Github repos have actions in response to keywords in comments (run tests, benchmarks, trigger external CI, etc).
Manually adding requires authors and/or reviewers to know they have to do it, and documentation is hard in this case because it's "elsewhere". Another corner case is removing the tag and warning me but I miss the ping (lost in emails, crash browser) and then the state is lost. How about we have two tags: Workflow:
Note that the user's "ready to merge" comments will also be sent to the reviewers, so we don't lose that tracking either. |
Yes, if this went into production I think it would have to be with documentation on the website about how it works. For approvers mostly, an author should just know that there is a process that will help them.
Yes, or the approvers could just disappear, as do some authors. So a second label would let us proactively query PRs to find something in this state. The idea with removing the tag at the end was so that we do not look at that PR again while it remains open, but I can do the same thing with 2 tags. If you have no-commit-access, then we check the PR, if you have both tags, we ignore the PR, on the assumption that everything is in hand.
I should have written it out in pseudocode too :) I'll do that for the next update. I think formatting jobs trigger on branch update, so I can extend that to remove a label if needed. |
Slight problem here where you comment then the label is added. Then you later update the branch which removes the label but the next time we look at the PR, the original comment is there and we'd then put back the label. Despite their being no "new" ready to merge comment. In theory we could automate the removal of that comment. Or we could use a new comment from the bot as the boundary below which are possible "new" ready to merge comments. Would get rather spammy if you update the branch a lot post approval, so maybe there are better ways. |
I think the proposed flow here is too complicated. It should be enough to just add comment asking for someone to commit the PR after it's been reviewed. |
This seems to just get us back to the original spot where the PR author needs to know our processes in order to get something merge? It's not too different from just asking them to ping the reviewers asking them to merge (from what I'm understanding). |
It's different in that the PR author doesn't have to do it. Automation will remind the reviewer that the PR needs merging after it has been approved. And I agree with @tstellar that I'd prefer that over the more involved process proposed here. I think the only thing I'd suggest is to delay that comment until all CI workflows on the PR have completed (either success or fail). Otherwise I have to remember to come back to the PR after CI is finished. |
Something like:
This ping also include the list of things to check, which I am currently sending to the author. Then if the approver judges it not to be ready then they can reference that list when saying so.
I'll work on a version like this next week. |
6f2f61a
to
f2cb434
Compare
The new version:
The PR description has been updated to explain the way it works. I still have a feeling this might need more to it, but no use doing that without proof it would help anyone. Also I will almost certainly have to patch some part of this post-commit, so it's best to keep it small after all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to me. Thank You.
55e6e24
to
f2cb434
Compare
Force pushed to remove a commit I meant to push to my fork instead, nothing actually changed. |
I realised a few things.
libcxx documents that you should have 2 approvals from the reviewers group, but does not seem to use any GitHub feature to enforce that. Is that the case @ldionne ? Related to that, it should be easy for use to opt out any given sub-project from this workflow by ignoring PRs with certain labels. So there is a quick workaround if it does mess up an existing review process. |
Our documentation is a bit outdated, this was in the Phabricator times. We haven't revisited that, but the current de facto process is that people are free to merge when the General comment about the workflow: It seems surprising to me that we're using a label named From my perspective, what would be the most helpful (to libc++ at least) is a workflow that checks PRs that have been open for longer than "some amount of time" and which have been approved, and pings them. In that comment, you can also mention that the reviewers should be the ones to merge if the PR creator doesn't have commit access. Furthermore, I looked at the code and I didn't see how we were handling the case where:
Do we currently comment with |
Since it's only used for this workflow and removing it is a cheap way to remove PRs from the search results, that's why I remove it. I agree that there are uses where you might want to use it for queries. For instance, we could figure out how many PRs came from people without commit access in the last N months (but this is a different concern, I'm not using that to justify its use here). I decided not to make it "needs-merge" or anything that implied that it should be merged just because it has the label. But yes it could be something specific to the workflow. We can leave the label on and search the existing comments to see if we have pinged already, it's just a bit less neat in the implementation. We can ditch the label completely and check the author's access each time we look at the PR, it's just more API calls. I preferred the label solution as it also added a hint visually, that might help inform reviewers should the workflow not cover some scenario.
Yes we could do that. You mean generally for any PR, with a unique message if they don't have commit access. We can either continue to react to the first time we see approvals there, then delay pings after that, or we can delay the initial ping as well.
The The version in the PR now is simplified and does not require this. You are correct that I have not handled that case. Of all the most recent reviews from each reviewer, I need to check that they are all approvals or comments, and none are requesting changes. Thanks, I wouldn't have spotted that, not until production anyway :) We could go further and require that all open comments are resolved, but this is less intuitive if you are the author. Might need the workflow itself to comment "this could be merged but for X and Y". This is also not part of the general llvm flow at this time, plenty of PRs get merged with unresolved comments. They may have actually been addressed in the code, but no one clicked the button on GitHub. Which is maybe something to address later but I'm trying not to change policy as part of this work. I think reframing this as "ping approved but unmerged PRs after N days", later followed by "customise that ping for those without commit access" could be a path forward. As I do like the idea of extending this to all PRs. |
One of the reasons why I'm mentioning this is that it would also be nice to have in mind the closing of stale PRs. We may not want to implement that right now, and we may never want to do that, but I'm thinking that with the proposed solution the notion of a stale PR becomes just a minor variation on the check you're already doing. |
…commit access These changes aim to make contributors and reviewers aware when a PR should be merged on the contributor's behalf. It happens in two parts: * Newly opened PRs will be labelled with "no-commit-access" if the author does not have commit access. * A new workflow will periodically check all open PRs with this label to see if: * There is at least one approval * No one is requesting changes * All checks are finished (not passed just finished, some failures can be explained) * If they do, it will remove the label and add a comment: * Instructing the author to make it merge ready, if needed. * Instructing approvers to merge it themselves, or to find someone who can. **Note:** This process could be simplified if we were able to write to the repository in response to the event generated when an approval is given. Due to security restrictions in GitHub, we cannot do this, see: https://github.com/orgs/community/discussions/26651 https://github.com/orgs/community/discussions/55940 Instead, we run the new workflow periodically. Checking Check status is done using PyGitHub's version of: https://docs.github.com/en/rest/checks/runs?apiVersion=2022-11-28#list-check-runs-for-a-git-reference There are some potential corner cases here, but the idea is that this is enough to make both author and reviewers aware that merge on behalf is an option. From there, I hope that they can communicate directly on the PR. If this does not happen in practice, we can revisit this and add more automation where it makes sense.
f2cb434
to
3d3429f
Compare
Addressed the following so I don't forget them later:
I will be back with a refactored version based on the discussion above. |
These changes aim to make contributors and reviewers aware when a PR
should be merged on the contributor's behalf.
It happens in two parts:
author does not have commit access.
label to see if they have approvals, and all checks have finished
(not passed, just finished, some failures can be explained).
someone who can.
Note: This process could be simplified if we were able to write to the
repository in response to the event generated when an approval is given.
Due to security restrictions in GitHub, we cannot do this, see:
https://github.com/orgs/community/discussions/26651
https://github.com/orgs/community/discussions/55940
Instead, we run the new workflow periodically.
Checking Check status is done using PyGitHub's version of:
https://docs.github.com/en/rest/checks/runs?apiVersion=2022-11-28#list-check-runs-for-a-git-reference
There are some potential corner cases here, but the idea is that
this is enough to make both author and reviewers aware that
merge on behalf is an option. From there, I hope that they can
communicate directly on the PR.
If this does not happen in practice, we can revisit this and
add more automation where it makes sense.