Skip to content

[GitHub] Add workflow to check author's commit access on new PRs #123593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

DavidSpickett
Copy link
Collaborator

@DavidSpickett DavidSpickett commented Jan 20, 2025

This workflow will run on every opened PR and add a label if the author does not have the required permissions to merge their own PR.

The permission check is based on code from #81142, which tried to do this when a review was approved. That had to be reverted in #81722 because the event that it was triggered by did not have permissions to write to the PR.

So we have a slight disadvantage that a label could be wrong by the time the review is approved but ok, the author can click the button themselves then anyway.

Plus, you could search by the label to find anything waiting for someone to merge on behalf.

https://docs.github.com/en/rest/collaborators/collaborators?apiVersion=2022-11-28#get-repository-permissions-for-a-user is the GitHub API we end up calling and lists the possible values as "admin, write, read, and none".

This workflow will run on every opened PR and add a label if the
author does not have the required permissions to merge their own
PR.

The permission check is based on code from llvm#81142,
which tried to do this when a review was approved. That had to be
reverted in llvm#81722 because
the event that it was triggered by did not have permissions to write
to the PR.

So we have a slight disadvantage that a label could be wrong by
the time the review is approved but ok, the author can click
the button themselves then anyway.

Plus, you could search by the label to find anything waiting
for someone to merge on behalf.
@llvmbot
Copy link
Member

llvmbot commented Jan 20, 2025

@llvm/pr-subscribers-github-workflow

Author: David Spickett (DavidSpickett)

Changes

This workflow will run on every opened PR and add a label if the author does not have the required permissions to merge their own PR.

The permission check is based on code from #81142, which tried to do this when a review was approved. That had to be reverted in #81722 because the event that it was triggered by did not have permissions to write to the PR.

So we have a slight disadvantage that a label could be wrong by the time the review is approved but ok, the author can click the button themselves then anyway.

Plus, you could search by the label to find anything waiting for someone to merge on behalf.


Full diff: https://github.com/llvm/llvm-project/pull/123593.diff

2 Files Affected:

  • (modified) .github/workflows/new-prs.yml (+28)
  • (modified) llvm/utils/git/github-automation.py (+36)
diff --git a/.github/workflows/new-prs.yml b/.github/workflows/new-prs.yml
index 88175d6f8d64d4..344976b6be92c1 100644
--- a/.github/workflows/new-prs.yml
+++ b/.github/workflows/new-prs.yml
@@ -73,3 +73,31 @@ jobs:
           # workaround for https://github.com/actions/labeler/issues/112
           sync-labels: ''
           repo-token: ${{ secrets.ISSUE_SUBSCRIBER_TOKEN }}
+
+  check-commit-access:
+      runs-on: ubuntu-latest
+      permissions:
+        pull-requests: write
+      if: >-
+        (github.repository == 'llvm/llvm-project') &&
+        (github.event.action == 'opened')
+      steps:
+        - name: Checkout Automation Script
+          uses: actions/checkout@v4
+          with:
+            sparse-checkout: llvm/utils/git/
+            ref: main
+
+        - name: Setup Automation Script
+          working-directory: ./llvm/utils/git/
+          run: |
+            pip install --require-hashes -r requirements.txt
+
+        - name: Check Commit Access
+          working-directory: ./llvm/utils/git/
+          run: |
+            python3 ./github-automation.py \
+              --token '${{ secrets.GITHUB_TOKEN }}' \
+              check-commit-access \
+              --issue-number "${{ github.event.pull_request.number }}" \
+              --author "${{ github.event.pull_request.user.login }}"
\ No newline at end of file
diff --git a/llvm/utils/git/github-automation.py b/llvm/utils/git/github-automation.py
index da467f46b4dd31..43d81e644e2024 100755
--- a/llvm/utils/git/github-automation.py
+++ b/llvm/utils/git/github-automation.py
@@ -290,6 +290,33 @@ def run(self) -> bool:
         return True
 
 
+class CheckCommitAccess:
+    def __init__(
+        self, token: str, repo: str, pr_number: int, author: str):
+        self.repo = github.Github(token).get_repo(repo)
+        self.pr = self.repo.get_issue(pr_number).as_pull_request()
+        self.author = author
+
+    def can_merge(self, user: str) -> bool:
+        try:
+            return self.repo.get_collaborator_permission(user) in ["admin", "write"]
+        # There is a UnknownObjectException for this scenario, but this method
+        # does not use it.
+        except github.GithubException as e:
+            # 404 means the author was not found in the collaborator list, so we
+            # know they don't have push permissions. Anything else is a real API
+            # issue, raise it so it is visible.
+            if e.status != 404:
+                raise e
+            return False
+
+    def run(self) -> bool:
+        if not self.can_merge(self.author):
+            self.pr.as_issue().add_to_labels("no-commit-access")
+
+        return True
+
+
 def setup_llvmbot_git(git_dir="."):
     """
     Configure the git repo in `git_dir` with the llvmbot account so
@@ -680,6 +707,10 @@ def request_release_note(token: str, repo_name: str, pr_number: int):
 pr_buildbot_information_parser.add_argument("--issue-number", type=int, required=True)
 pr_buildbot_information_parser.add_argument("--author", type=str, required=True)
 
+check_commit_access_parser = subparsers.add_parser("check-commit-access")
+check_commit_access_parser.add_argument("--issue-number", type=int, required=True)
+check_commit_access_parser.add_argument("--author", type=str, required=True)
+
 release_workflow_parser = subparsers.add_parser("release-workflow")
 release_workflow_parser.add_argument(
     "--llvm-project-dir",
@@ -751,6 +782,11 @@ def request_release_note(token: str, repo_name: str, pr_number: int):
         args.token, args.repo, args.issue_number, args.author
     )
     pr_buildbot_information.run()
+elif args.command == "check-commit-access":
+    check_commit_access = CheckCommitAccess(
+        args.token, args.repo, args.issue_number, args.author
+    )
+    check_commit_access.run()
 elif args.command == "release-workflow":
     release_workflow = ReleaseWorkflow(
         args.token,

@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented Jan 20, 2025

Tested on my fork, where I added the label no-commit-access. We can bikeshed the name, I went with that because I wanted something that did not imply the PR is ready to merge, like "merge-on-behalf" might.

The label will need to be added by an admin before this can be merged.

Edit: actually it appears I can add labels but regardless, we decide on a name first.

@DavidSpickett
Copy link
Collaborator Author

I checked, and the labeller action we use for general labels cannot do this sort of check. Just file paths and branch names - https://github.com/actions/labeler?tab=readme-ov-file#match-object.

Copy link

github-actions bot commented Jan 20, 2025

✅ With the latest revision this PR passed the Python code formatter.

@jplehr
Copy link
Contributor

jplehr commented Jan 20, 2025

Thank you for putting this together. This is amazing.

@rengolin
Copy link
Member

Not an expert in github actions, but this looks ok to me.


def can_merge(self, user: str) -> bool:
try:
return self.repo.get_collaborator_permission(user) in ["admin", "write"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably plenty sufficient, but there's also the collaborators endpoint that enumerates explicit permissions like push: https://docs.github.com/en/rest/collaborators/collaborators?apiVersion=2022-11-28#list-repository-collaborators

It does however require paginating through the list so this is probably kinder to the rate limits

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not know about this, however I don't think we can use it here.https://pygithub.readthedocs.io/en/stable/github_objects/Repository.html?highlight=get_collaborator_permission#github.Repository.Repository.get_collaborators returns a list of all collaborators in the form of https://pygithub.readthedocs.io/en/stable/github_objects/NamedUser.html#github.NamedUser.NamedUser, which doesn't have a way to get to the permissions.

I see existing code on GitHub mainly using get_collaborators to get a list of usernames, then passing those one by one to get_collaborator_permission. I presume in workflows that cover the whole project not just one PR, here we just have them to check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that pygithub truncates so much of the response 🤔

I would say the current solution is good to go then.

Copy link
Contributor

@jroelofs jroelofs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this!

Copy link
Contributor

@jplehr jplehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a GH workflow expert, afaict this looks good though.

@DavidSpickett
Copy link
Collaborator Author

I will leave this up for a few days because A: there's no rush and B: it has been suggested to have a separate discussion about reviewers merging on behalf, and that might result in a different solution to this.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked in depth at the code yet, but would it make more sense to split this into a separate workflow that triggers on pull_request_review?

I think labelling/writing a comment pinging the reviewer who approved the PR (maybe after checking the reviewer is a member of the LLVM org) that the author needs someone else to merge their PR could help streamline this even more.

@DavidSpickett
Copy link
Collaborator Author

Haven't looked in depth at the code yet, but would it make more sense to split this into a separate workflow that triggers on pull_request_review?

I think labelling/writing a comment pinging the reviewer who approved the PR (maybe after checking the reviewer is a member of the LLVM org) that the author needs someone else to merge their PR could help streamline this even more.

I also think this, but the reality of GitHub disagrees with this. The event that would cause that workflow can only read from the repository, see https://github.com/orgs/community/discussions/26651 and https://github.com/orgs/community/discussions/55940. This is why my original attempt was reverted.

However, I agree that a notification to the author as close as possible to the approval is best.

As suggested on the RFC, we may be able to work around this with a label on open, then a timed sweep of PRs where we comment on any labelled and approved PR, then remove the label.

This workflow can check things like whether the author has set an email address, whether one of the approvers has permissions, anything we need to make either party aware of.

I'm not going to be surprised if GitHub has somehow made this impossible too - but I have hope that it can work :).

In other words: thanks to folks who reviewed this PR, but it will likely become one component of a pair of workflows instead.

@boomanaiden154
Copy link
Contributor

I also think this, but the reality of GitHub disagrees with this. The event that would cause that workflow can only read from the repository, see https://github.com/orgs/community/discussions/26651 and https://github.com/orgs/community/discussions/55940. This is why my original attempt was reverted.

Ah. I figured it could have been something like this. Sorry for pointing out something you already tried.

In other words: thanks to folks who reviewed this PR, but it will likely become one component of a pair of workflows instead.

Thanks for the effort you are putting into this problem. Happy to review anything you end up coming up with.

@DavidSpickett
Copy link
Collaborator Author

Closing in favour of #124910.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants