Git Submodules Revisited

Dave Cridland - May 12 '18 - Dev Community

Git's submodules are so universally derided that there's practically an entire industry devoted to providing alternatives for managing dependencies.

But like anything in git, it's often worth giving the man-pages a good going-over to figure out whether there are options that do what you want, or whether things have improved lately.

What I want

So, Metre is my exemplar project. It's got a slew of submodules, in part because some of our customers run (really) ancient versions of Linux and so we're going to need to statically link. Yay, fun!

But that means managing and shipping our own build of OpenSSL, for example - and that's a terrifying prospect for our Security Guy (lovely chap called Simon). It's pretty terrifying for me, too, actually.

In practical terms, then, our release cycle involves advancing along a stable branch on all the submodules, such that we're confident that we've picked up any bugfixes. This needs to be as simple as possible - really, a single command we can run as we need to.

But, we want to have high confidence that checking out a particular commit hash of Metre will give us the same dependencies we built with.

Git Submodule Add

Initially, I went for git submodule and a lot of manual work. I (lead dev) wasn't happy with this. Simon The Security Guy wasn't happy with this. Pete, one of our senior devs, conducted a full review of the project and highlighted it too.

The problem is that it only takes one slip for a dependency to be left carrying a serious security issue. And Metre is meant to be all about security.

The plus-side of git submodule is that it tracks the commit hashes of submodules, and you can check them all out at the right hash with either a git clone --recursive or a git submodule update --init --recursive.
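To see that pinning for yourself, here is a small sketch. The function name pinned_commits is made up for illustration; the only thing it relies on is that git stores each submodule in the parent's tree as a "gitlink" entry with mode 160000.

```shell
# pinned_commits [revision]
# Hypothetical helper: list the commit each submodule is pinned to in the
# given revision of the parent repository (defaults to HEAD).
pinned_commits() {
    rev="${1:-HEAD}"
    # Submodules appear in the tree as entries with mode 160000; the object
    # name on that entry is the pinned commit. Prints "<commit> <path>".
    # (Paths containing whitespace would be cut short by this awk split.)
    git ls-tree -r "$rev" | awk '$1 == "160000" { print $3, $4 }'
}
```

Run it from inside the parent repository, for example pinned_commits for the current commit or pinned_commits some-tag to check what an older release was built against.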

We considered switching to something else, but then we'd lose much of the built-in smarts of git submodule, and that's also a pain.

Oh, look - branches!

A deep dive into man git-submodule and man 7 gitsubmodules, however, found me gold.

First, there's a -b <branch> switch to git submodule add. That adds the submodule at a specific branch, and moreover sets the "tracking branch" - the one git normally pulls from - to the matching branch on origin, just as you'd normally do.
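As a rough sketch of that first option, the wrapper below adds a submodule on a named branch and then prints the branch git recorded for it. The function name add_tracked_submodule and its argument order are invented here; the git commands themselves are the stock ones.

```shell
# add_tracked_submodule URL PATH BRANCH
# Hypothetical helper: add a submodule that follows BRANCH, then show the
# branch that was recorded for it.
add_tracked_submodule() {
    url="$1"
    sm_path="$2"
    branch="$3"
    # -b checks the submodule out on BRANCH rather than the remote's default.
    git submodule add -b "$branch" -- "$url" "$sm_path" &&
    # The submodule's name defaults to its path, so that is the key to query.
    git config -f .gitmodules --get "submodule.$sm_path.branch"
}
```

For the OpenSSL case in this article that would be something like add_tracked_submodule git://git.openssl.org/openssl.git deps/openssl OpenSSL_1_1_0-stable, run from the top of the parent repository.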

Second, I found a config option of submodule.{submodule name}.branch, which stores this. This isn't quite as great as you'd think, though, because while you can set this in git config for the repository, that setting lives in your local .git/config and isn't versioned along with the code.

Fear My Editor Skillz

However, submodule configuration is also stored in the repository itself, in the .gitmodules file at the top level, and that file is versioned. So you can edit it, find the section for the submodule, and simply add a branch key right there:

[submodule "deps/spiffing"]
        path = deps/spiffing
        url = http://github.com/surevine/spiffing
[submodule "deps/openssl"]
        path = deps/openssl
        url = git://git.openssl.org/openssl.git
        branch = OpenSSL_1_1_0-stable

The default used to be master (newer versions of Git follow the remote's HEAD instead), so if that's all you wanted, you've got that already.
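If you'd rather not open an editor, the same key can be written with git config pointed at the .gitmodules file. This is only a sketch, and set_submodule_branch is a made-up name. Git 2.22 and later also ship git submodule set-branch, which does much the same job.

```shell
# set_submodule_branch NAME BRANCH
# Hypothetical helper: record BRANCH as the tracking branch for the submodule
# called NAME in the versioned .gitmodules file, and stage the change.
set_submodule_branch() {
    name="$1"
    branch="$2"
    # -f targets .gitmodules instead of the unversioned .git/config.
    git config -f .gitmodules "submodule.$name.branch" "$branch" &&
    git add .gitmodules
}
```

With the listing above, set_submodule_branch deps/openssl OpenSSL_1_1_0-stable would produce the branch line shown, ready to commit.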

Updating Branches

The normal command for updating submodules is git submodule update. There are three flags of interest:

--init performs a git submodule init if the submodule isn't already cloned into place, and nothing otherwise - so it's always safe to use.

--recursive recurses through each submodule, running the same git submodule update command in each.

--remote is the magic - instead of checking out the commit recorded in the parent, it fetches in each submodule and checks out the tip of the remote tracking branch. It's this that we want.
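To make the difference between the two modes concrete, here is a minimal sketch with two invented function names, advance_submodules and reset_submodules. Both are thin wrappers around the flags just described.

```shell
# Hypothetical helper: move every submodule to the tip of its tracking
# branch, then list where each one ended up.
advance_submodules() {
    git submodule update --init --recursive --remote &&
    # A leading '+' marks a submodule whose checked-out commit now differs
    # from the one recorded in the parent repository.
    git submodule status --recursive
}

# Hypothetical helper: put every submodule back on the commit the parent
# has recorded, discarding the effect of advance_submodules.
reset_submodules() {
    git submodule update --init --recursive
}
```

Nothing is recorded in the parent until you git add the submodule paths and commit, so advance_submodules followed by reset_submodules is a safe way to peek at what a bump would pick up.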

Workflow Summary

So now the workflow looks like this:

git clone --recursive git@github.com:surevine/metre - clones the repository and checks out the HEAD of master.

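git checkout foo - checks out the foo branch or commit. The submodule working directories only follow along if you pass --recurse-submodules or have submodule.recurse set (see below); otherwise, run git submodule update --init --recursive afterwards to switch them to the commits recorded for foo.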

git submodule update --init --recursive --remote - updates all submodules recursively along their tracking branches. Without the --remote, it'll reset the submodule working directories to the "right" commit for the parent.

Finally, you can:

git config submodule.recurse true - tells git that most commands should act recursively, in particular git pull.
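Putting the release step into the "single command" asked for at the start might look something like the sketch below. The function name release_bump and the commit message are invented for illustration, and it only records the top-level submodule pointers listed in .gitmodules.

```shell
# Hypothetical helper: advance all submodules along their tracking branches
# and commit the new pointers in the parent repository.
release_bump() {
    # Move every submodule to the tip of its tracking branch.
    git submodule update --init --recursive --remote || return 1
    # Stage the new pointer for each submodule path listed in .gitmodules.
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r _key sm_path; do
        git add -- "$sm_path" || exit 1
    done || return 1
    # Commit only if something is staged (normally: a pointer moved).
    if git diff --cached --quiet; then
        echo "submodules already at their branch tips"
    else
        git commit -q -m "Advance submodules along their tracking branches"
    fi
}
```

Anything else you had staged beforehand would be swept into the same commit, so this assumes a clean index. Checking out the resulting commit later and running git submodule update --init --recursive, without --remote, gets you back exactly these dependency versions.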
