Backup your GitHub repositories using Bash script and GitHub CLI defining a simple GraphQl query

Aniello Musella - Jan 10 '21 - - Dev Community

I wanna share a bash script that I wrote to backup all repositories hosted in GitHub. The script uses the GitHub CLI to inquiry the GitHub APIs using a simple GraphQl query.

Why this script

The goal of the script is to backup on your workstation all repositories of specific owner hosted on GitHub (e.g. your company or other companies like Facebook, Google, Microsoft, etc.). The script will download all owner's repositories (even those privates if you have been granted to access).

Before to use the script

You must have a valid account on GitHub and you have to download and install the GitHub CLI (if you need more info about this cli tool follow this link). After the installation of the GitHub CLI you can check the installation running the following command:

animus@host:~$gh --version
Enter fullscreen mode Exit fullscreen mode

You should see the version of the cli like shown below (at the time of writing the last version is 1.4.0 )

gh version 1.4.0 (2020-12-15)
https://github.com/cli/cli/releases/latest
Enter fullscreen mode Exit fullscreen mode

After you need to authenticate the cli running the following command to start interactive setup of your credentials:

animus@host:~$gh auth login
Enter fullscreen mode Exit fullscreen mode

Follow the directions on the screen to use your GitHub credentials in the GitHub CLI (you can use either the browser or an authentication token to authenticate you).

How to use the script

Download the script.

Run the script without filtering

Run the script defining the only two mandatory parameters owner and the backup directory downloading all repositories and all branches.

animus@host:~$repos.github.backup.sh -ow OWNER -dr DIRECTORY
Enter fullscreen mode Exit fullscreen mode

Run this script with filtering

You can run the script filtering with custom regular expressions repositories names (e.g. ^react*.) or branches name (e.g. ^((master)|(main)|(develop))$).

animus@host:~$repos.github.backup.sh -ow OWNER -dr DIRECTORY -rr "VALID REGEX" -br "VALID REGEX" 
Enter fullscreen mode Exit fullscreen mode

Run this script in help mode

Display the help running:

animus@host:~$repos.github.backup.sh --help
Enter fullscreen mode Exit fullscreen mode

You'll get help

usage: repos.github.backup.sh -ow OWNER -dr DIRECTORY
Backup one or more repositories hosted on GitHub in a local directory.

Options:
-ow, --owner          Owner of repository
-dr, --directory      Backup Directory 
-rr, --reposRegex     Regular expression to match repositories names to backup (default any)
-br, --branchesRegex  Regular expression to match repositories branches to backup (default any)
-ps, --pageSize       Page size of api paging (default 10)
--help                Display this help and exit

Enter fullscreen mode Exit fullscreen mode

Examples of use

Download all react repositories owned by Facebook:

repos.github.backup.sh --owner facebook -dr . -rr "^react.*" -br "^((master)|(main)|(develop))$"
Enter fullscreen mode Exit fullscreen mode

Download all vscode repositories owned by Microsoft:

repos.github.backup.sh --owner microsoft -dr . -rr "^vscode..*" -br "^((master)|(main)|(develop))$"
Enter fullscreen mode Exit fullscreen mode

Download all dart repositories owned by Google:

repos.github.backup.sh --owner google -dr . -rr "^dart..*" -br "^((master)|(main)|(develop))$"
Enter fullscreen mode Exit fullscreen mode

Where is GraphQl?

GraphQl is used inside the script as parameter to the following command of the GitHub CLI:

gh api graphql -F  owner=$owner -F pageSize=$pageSize -F afterCursor=$endCursor  -f query="$graphQlQuery"
Enter fullscreen mode Exit fullscreen mode

Query is parameterized and it's reported below:

query ($owner: String!, $pageSize: Int!, $afterCursor: String) {
  repositoryOwner(login: $owner) {
    repositories(first: $pageSize, after: $afterCursor) {                
      edges {
        node {
           name
        }        
      }
      pageInfo{
        hasNextPage
        endCursor
      }
    }
  }
}

Enter fullscreen mode Exit fullscreen mode

In this query the script asks all repositories name (just that, we don't need nothing more to reach the goal) owned by a owner using a cursor for pagination (with a specific page size).

Conclusions

In this script we have seen the usefulness of GitHub Apis, the practicality of GitHub CLI, the simplicity of GraphQl and versatility of Bash, put togheter we get a simple solution to backup a list of repositories hosted on GitHub.

Points of interest and possible evolutions

  • Exploring GitHub APIs ecosystem.
  • Practice with GitHub APIs and GraphQl using GraphQL API Explorer based on GraphiQL.
  • Explore all possible commands of the GitHub CLI
  • Make a more complex GraphQl query to get more information with goal to evolve toward somenthing different.
  • Improve the script making a more user-friendly experience.
  • Try to use a GraphQl client based on a language of your preference to reach the same goal.

Suggestions and corrections are welcome.

See you soon!

AM

. . . . . . . . . . . . . . .
Terabox Video Player