Skip to content

[workflows] Split pr-code-format into two parts to make it more secure #78216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 89 additions & 0 deletions .github/workflows/issue-write.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
name: Comment on an issue

on:
workflow_run:
workflows: ["Check code formatting"]
types:
- completed

permissions:
contents: read

jobs:
pr-comment:
runs-on: ubuntu-latest
permissions:
pull-requests: write
if: >
github.event.workflow_run.event == 'pull_request'
steps:
- name: 'Download artifact'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i you use the newer upload-artifacts and download-artifacts packages, then you don't need to deal with all the listing/downloading/unzip code -- it'll do it for you just based on a name and run_id.

# v7.0.1
uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea
with:
script: |
let allArtifacts = await github.rest.actions.listWorkflowRunArtifacts({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: context.payload.workflow_run.id,
});
let matchArtifact = allArtifacts.data.artifacts.filter((artifact) => {
return artifact.name == "workflow-args"
})[0];
let download = await github.rest.actions.downloadArtifact({
owner: context.repo.owner,
repo: context.repo.repo,
artifact_id: matchArtifact.id,
archive_format: 'zip',
});
let fs = require('fs');
fs.writeFileSync(`${process.env.GITHUB_WORKSPACE}/workflow-args.zip`, Buffer.from(download.data));
- run: unzip workflow-args.zip

- name: 'Comment on PR'
uses: actions/github-script@v3
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
var fs = require('fs');
const comments = JSON.parse(fs.readFileSync('./comments'));
if (!comments) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we handle errors from reading the json here and print the problems? so that we can debug that later if something goes wrong?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a step to always dump the comment file (pass or fail) at the end of the job.

return;
}
let runInfo = await github.actions.getWorkflowRun({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: context.payload.workflow_run.id
});
if (runInfo.data.pull_requests.length != 1) {
console.log("Error retrieving pull request number");
console.log(runInfo);
return;
}
const prNumber = runInfo.data.pull_requests[0].number;
await comments.forEach(function (comment) {
if (comment.id) {
github.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
comment_id: comment.id,
body: comment.body
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure input validation is that necessary here, but this change means that the JSON artifact sent over should be completely untrusted as anyone can theoretically modify the workflow and make it send whatever. I don't think this should happen much in practice (depending upon the settings for whether or not new contributors need approval for workflow runs), but it theoretically gives the ability for anyone to post a comment on any PR/Issue under the Github moniker.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to figure out how to get the PR number from the event payload, so that will prevent someone from posting a comment on any issue in the project. So now the only untrusted inputs are the comment id and the comment body.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should ensure that the provided commit_id was previously created by this script, rather than a random other user's comment?

});
} else {
github.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: comment.body
});
}
});
- name: Dump comments file
if: always()
run: cat comments
30 changes: 12 additions & 18 deletions .github/workflows/pr-code-format.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,5 @@
name: "Check code formatting"
on: pull_request_target
permissions:
pull-requests: write
on: pull_request

jobs:
code_formatter:
Expand All @@ -27,18 +25,6 @@ jobs:
separator: ","
skip_initial_fetch: true

# We need to make sure that we aren't executing/using any code from the
# PR for security reasons as we're using pull_request_target. Checkout
# the target branch with the necessary files.
- name: Fetch code formatting utils
uses: actions/checkout@v4
with:
sparse-checkout: |
llvm/utils/git/requirements_formatting.txt
llvm/utils/git/code-format-helper.py
sparse-checkout-cone-mode: false
path: code-format-tools

- name: "Listed files"
env:
CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
Expand All @@ -56,10 +42,10 @@ jobs:
with:
python-version: '3.11'
cache: 'pip'
cache-dependency-path: 'code-format-tools/llvm/utils/git/requirements_formatting.txt'
cache-dependency-path: 'llvm/utils/git/requirements_formatting.txt'

- name: Install python dependencies
run: pip install -r code-format-tools/llvm/utils/git/requirements_formatting.txt
run: pip install -r llvm/utils/git/requirements_formatting.txt

- name: Run code formatter
env:
Expand All @@ -72,9 +58,17 @@ jobs:
# explicitly in code-format-helper.py and not have to diff starting at
# the merge base.
run: |
python ./code-format-tools/llvm/utils/git/code-format-helper.py \
python ./llvm/utils/git/code-format-helper.py \
--write-comment-to-file \
--token ${{ secrets.GITHUB_TOKEN }} \
--issue-number $GITHUB_PR_NUMBER \
--start-rev $(git merge-base $START_REV $END_REV) \
--end-rev $END_REV \
--changed-files "$CHANGED_FILES"
- uses: actions/upload-artifact@v2
if: always()
with:
name: workflow-args
path: |
comments
26 changes: 26 additions & 0 deletions llvm/utils/git/code-format-helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ class FormatArgs:
token: str = None
verbose: bool = True
issue_number: int = 0
write_comment_to_file: bool = False

def __init__(self, args: argparse.Namespace = None) -> None:
if not args is None:
Expand All @@ -53,12 +54,14 @@ def __init__(self, args: argparse.Namespace = None) -> None:
self.token = args.token
self.changed_files = args.changed_files
self.issue_number = args.issue_number
self.write_comment_to_file = args.write_comment_to_file


class FormatHelper:
COMMENT_TAG = "<!--LLVM CODE FORMAT COMMENT: {fmt}-->"
name: str
friendly_name: str
comment: dict = None

@property
def comment_tag(self) -> str:
Expand Down Expand Up @@ -119,6 +122,13 @@ def update_pr(self, comment_text: str, args: FormatArgs, create_new: bool) -> No
comment_text = self.comment_tag + "\n\n" + comment_text

existing_comment = self.find_comment(pr)

if args.write_comment_to_file:
self.comment = {"body": comment_text}
if existing_comment:
self.comment["id"] = existing_comment.id
return

if existing_comment:
existing_comment.edit(comment_text)
elif create_new:
Expand Down Expand Up @@ -309,6 +319,8 @@ def hook_main():
if fmt.has_tool():
if not fmt.run(args.changed_files, args):
failed_fmts.append(fmt.name)
if fmt.comment:
comments.append(fmt.comment)
else:
print(f"Couldn't find {fmt.name}, can't check " + fmt.friendly_name.lower())

Expand Down Expand Up @@ -349,6 +361,11 @@ def hook_main():
type=str,
help="Comma separated list of files that has been changed",
)
parser.add_argument(
"--write-comment-to-file",
action="store_true",
help="Don't post comments on the PR, instead write the comments and metadata a file called 'comment'",
)

args = FormatArgs(parser.parse_args())

Expand All @@ -357,9 +374,18 @@ def hook_main():
changed_files = args.changed_files.split(",")

failed_formatters = []
comments = []
for fmt in ALL_FORMATTERS:
if not fmt.run(changed_files, args):
failed_formatters.append(fmt.name)
if fmt.comment:
comments.append(fmt.comment)

if len(comments):
with open("comments", "w") as f:
import json

json.dump(comments, f)

if len(failed_formatters) > 0:
print(f"error: some formatters failed: {' '.join(failed_formatters)}")
Expand Down