The business is growing, but so is the team and the code base. The team starts reaching for linters, formatters, and other tools to keep the collaboration sane and the code base consistent. Naturally, the policy that accompanies these tools is to format only the lines you touch, to avoid burying the git history under formatting changes.

Normally, you would push a new commit and wait 30 minutes for CI to bless your PR, only for the build to fail because of a stray whitespace character. The git police at your company have disabled force pushes, so you can't amend the commit. It sure is nice browsing a git history full of "Formatting", "Do the lint dance", and "Please the CI gods". This should have been caught locally, you think.

While there is no shortage of IDE tools and integrations, they merely point out the mistakes in your code. It is up to you (or your coworkers) to follow their advice, which can be lost within a sea of warnings, usually because of already existing legacy code that no one wants to touch.

Let's see how we can improve our local workflow by making use of git hooks.

Git hooks

A git hook is a script that is triggered by a certain git event. These reside in the .git/hooks/ directory of your project. Some sample hooks are included by default:

$ ls .git/hooks/
applypatch-msg.sample
commit-msg.sample
fsmonitor-watchman.sample
post-update.sample
pre-applypatch.sample
pre-commit.sample
prepare-commit-msg.sample
pre-push.sample
pre-rebase.sample
pre-receive.sample
update.sample

To activate one of these hooks, simply remove the .sample suffix from the file name. Likewise, you can introduce a new hook by creating a new script that has the name of the git event. You can browse the manpage to find all the available hooks:

$ man githooks
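
As a rough sketch of what introducing a hook by hand looks like: the install_hook helper below is hypothetical (it is not a git command), and it assumes the default hooks directory, .git/hooks/. Git only runs hook files that are executable, hence the chmod.

```shell
#!/usr/bin/env sh
# install_hook is a hypothetical helper, not a git command.
# Usage: install_hook <repo-dir> <hook-name> <script>
install_hook() {
  hook_dir="$1/.git/hooks"
  # Assumes the default hooks location (core.hooksPath is not set).
  mkdir -p "$hook_dir" || return 1
  # The file name must match the git event, e.g. pre-commit.
  cp "$3" "$hook_dir/$2" || return 1
  # Git skips hook files that are not executable.
  chmod +x "$hook_dir/$2"
}
```

For example, install_hook . pre-commit scripts/check.sh would register a script as the pre-commit hook of the current repository; the scripts/check.sh path is made up for illustration.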

We're interested in the pre-commit, post-checkout, and post-merge hooks. Here are the relevant parts of the git manual:

pre-commit

This hook is invoked by git-commit(1), and can be bypassed with the --no-verify option. It takes no parameters, and is invoked before obtaining the proposed commit log message and making a commit.

Exiting with a non-zero status from this script causes the git commit command to abort before creating a commit.

post-checkout

This hook is invoked when a git-checkout(1) is run after having updated the worktree.

The hook is given three parameters: the ref of the previous HEAD, the ref of the new HEAD (which may or may not have changed), and a flag indicating whether the checkout was a branch checkout (changing branches, flag=1) or a file checkout (retrieving a file from the index, flag=0). This hook cannot affect the outcome of git checkout.

It is also run after git-clone(1), unless the --no-checkout (-n) option is used. The first parameter given to the hook is the null-ref, the second the ref of the new HEAD and the flag is always 1. Likewise for git worktree add unless --no-checkout is used.

post-merge

This hook is invoked by git-merge(1), which happens when a git pull is done on a local repository. The hook takes a single parameter, a status flag specifying whether or not the merge being done was a squash merge. This hook cannot affect the outcome of git merge and is not executed, if the merge failed due to conflicts.

pre-commit is usually used to make sure the code is up to standards before it is committed. This is where you run your linters, formatters, vulnerability checkers, and critical unit tests.
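
To make the exit-status contract concrete, here is a minimal sketch of a check that a pre-commit hook could run. The check_trailing_whitespace function is hypothetical, and it assumes the files to inspect are passed in as arguments (for example by a hooks manager). Any non-zero return is what ends up aborting the commit.

```shell
#!/usr/bin/env sh
# check_trailing_whitespace is a hypothetical check. It expects the files to
# inspect as arguments and returns non-zero if any line ends in whitespace.
check_trailing_whitespace() {
  status=0
  for file in "$@"; do
    # The extra /dev/null argument makes grep prefix matches with the file
    # name; -n adds the line number.
    if grep -n '[[:space:]]$' "$file" /dev/null; then
      status=1
    fi
  done
  return "$status"
}
```

A hook script would call it as check_trailing_whitespace "$@" || exit 1, so that a single offending line is enough to stop the commit.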

post-checkout and post-merge are usually used to make sure your local environment (dependencies, migrations, etc.) is up to date.
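
The three post-checkout parameters described in the manual can be interpreted along these lines. This is only a sketch: classify_checkout and its labels are hypothetical, and the null ref is assumed to be 40 zeros, which holds for SHA-1 repositories.

```shell
#!/usr/bin/env sh
# classify_checkout is a hypothetical helper that interprets the three
# post-checkout parameters: previous HEAD, new HEAD, and the branch flag.

# Assumption: a SHA-1 repository, where the null ref is 40 zeros.
NULL_REF=$(printf '%040d' 0)

classify_checkout() {
  prev_head=$1
  new_head=$2
  branch_flag=$3

  if [ "$branch_flag" != "1" ]; then
    echo "file"       # file checkout: nothing to sync
  elif [ "$prev_head" = "$NULL_REF" ]; then
    echo "clone"      # fresh clone: there is no previous HEAD to compare with
  elif [ "$prev_head" = "$new_head" ]; then
    echo "unchanged"  # branch checkout that left HEAD where it was
  else
    echo "branch"     # HEAD moved: the environment may need syncing
  fi
}
```

A post-checkout script could then decide what to do with something like case "$(classify_checkout "$1" "$2" "$3")" in branch|clone) ... ;; esac.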

Managing your hooks

There are many tools to manage your git hooks for you.

I'll be using lefthook for the examples below, but you can certainly adjust them to a different manager that better suits your taste. You can install it from here.

To initialize it for your project:

$ cd /path/to/your/git/repo
$ lefthook install

An empty lefthook.yml file will be created. To test out the execution of a hook:

$ lefthook run <hook-name>

We will be using pre-commit, post-checkout, and post-merge so go ahead and register them:

$ lefthook add pre-commit post-checkout post-merge

If you had an existing hook (e.g. from a different tool), it will be backed up as .git/hooks/<hook-name>.old


Recipes

The following are some recipes that I use when working with Python projects, but it should be easy to adapt them to a different language.

Lint your code

pre-commit

Let's lint our Python code using flake8. Edit your lefthook.yml file:

pre-commit:
  commands:
    lint:
      glob: "*.py"
      run: flake8

This lints all the files, and usually you want to lint the changed files only. Let's change that:

pre-commit:
  commands:
    lint:
      glob: "*.py"
      run: flake8 {staged_files}

If your linter supports diffs:

pre-commit:
  commands:
    lint:
      glob: "*.py"
      run: git diff --cached -U0 {staged_files} | flake8 --diff

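One thing to note here is that flake8 --diff doesn't actually lint the diff content. Instead, it uses the line numbers from the diff as pointers into the changed files, and then lints the files on disk. This means unstaged changes within the same line numbers might fail your commit. Also note that flake8 deprecated --diff in version 5.0 and removed it in 6.0, so this recipe needs an older flake8 or a different linter that accepts diffs.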

To avoid this, we'll need to:

  1. Save our unstaged changes as a patch
  2. Discard the unstaged changes from the working tree
  3. Run the linter
  4. Restore the unstaged changes

pre-commit:
  commands:
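    01-save:
      # git apply rejects an empty patch, so skip it when nothing is unstaged
      run: git diff > unstaged.diff && if [ -s unstaged.diff ]; then git apply -R unstaged.diff; fi
    02-lint:
      glob: "*.py"
      run: git diff --cached -U0 {staged_files} | flake8 --diff
    03-restore:
      # re-apply the saved patch (if any), then clean up
      run: if [ -s unstaged.diff ]; then git apply unstaged.diff; fi && rm unstaged.diff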

I had to give each command a number to run them in order. Depending on your hooks manager, you might not need to do this.
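
If you'd rather not depend on command ordering, the same four steps can be folded into a single script. The sketch below is one way to do it, not something lefthook provides: with_staged_only is a hypothetical wrapper, it assumes it runs from the repository root, and like the recipe above it ignores untracked files. Its main benefit is that the unstaged changes are restored even when the linter fails.

```shell
#!/usr/bin/env sh
# with_staged_only is a hypothetical wrapper: it hides unstaged changes,
# runs the given command, then restores them even if the command failed.
# Assumes it is run from the repository root.
with_staged_only() {
  patch_file=$(mktemp) || return 1
  # --binary keeps the patch re-applicable when binary files changed
  git diff --binary > "$patch_file" || return 1

  if [ -s "$patch_file" ]; then
    # Drop the unstaged changes from the working tree
    git apply -R "$patch_file" || { rm -f "$patch_file"; return 1; }
  fi

  # Run the wrapped command and remember its exit status
  "$@"
  cmd_status=$?

  if [ -s "$patch_file" ]; then
    # Bring the unstaged changes back; keep the patch around if that fails
    git apply "$patch_file" || {
      echo "could not restore unstaged changes, patch kept at $patch_file" >&2
      return 1
    }
  fi
  rm -f "$patch_file"
  return "$cmd_status"
}
```

Usage would look like with_staged_only flake8 app.py, where app.py stands in for the staged files.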

Sync your environment between branches

post-checkout post-merge

If you're working in a fast-paced environment, chances are you'll be switching/merging branches constantly, and syncing your local environment can get tedious.

I use poetry to manage Python dependencies, but that's mostly irrelevant for this task, as most package managers offer a similar set of commands.

We will need to use shell scripts for this task, so let's create the directories to hold these scripts:

$ lefthook add -d post-checkout
$ lefthook add -d post-merge

Add the following scripts with their respective paths:

.lefthook/post-checkout/sync-dependencies.sh

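#!/usr/bin/env sh
# Arguments from git: $1 = previous HEAD, $2 = new HEAD, $3 = checkout type
is_branch_checkout=$3

# exit if not a branch checkout
# 1 -> branch checkout
# 0 -> file checkout
# ([ ] instead of [[ ]], since [[ ]] is a bash extension that plain sh may lack)
if [ "$is_branch_checkout" != '1' ]; then
  exit 0
fi

path="poetry.lock"

# blob SHA of the lock file before and after the checkout
prev_sha=$(git rev-parse "$1:$path")
current_sha=$(git rev-parse "$2:$path")

if [ "$prev_sha" != "$current_sha" ]; then
  poetry install --no-interaction
fi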

.lefthook/post-merge/sync-dependencies.sh

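#!/usr/bin/env sh
path="poetry.lock"

# HEAD@{1} is where HEAD pointed before the merge moved it
revision_before_merge=$(git rev-parse "HEAD@{1}")
# blob SHA of the lock file before and after the merge
prev_sha=$(git rev-parse "$revision_before_merge:$path")
current_sha=$(git rev-parse "HEAD:$path")

# ([ ] instead of [[ ]], since [[ ]] is a bash extension that plain sh may lack)
if [ "$prev_sha" != "$current_sha" ]; then
  poetry install --no-interaction
fi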

The gist of these scripts is to check whether the lock file differs between two revisions, by comparing it before and after a checkout or a merge operation. git rev-parse "<revision>:<path>" resolves the SHA of the file's content (its blob) at that revision, so two different SHAs mean the file changed.
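
The comparison can be pulled into a small reusable helper. This is a sketch with hypothetical names (blob_sha, file_changed); it uses git rev-parse --verify --quiet so that a missing file, or the null ref that post-checkout receives after a clone, resolves to an empty string instead of an error.

```shell
#!/usr/bin/env sh
# blob_sha and file_changed are hypothetical helpers.

# blob_sha <rev> <path>
# Prints the blob SHA of <path> at <rev>, or nothing if it cannot be
# resolved (missing file, or the null ref passed after a clone).
blob_sha() {
  git rev-parse --verify --quiet "$1:$2" 2>/dev/null || true
}

# file_changed <old-rev> <new-rev> <path>
# Succeeds when the file's content differs between the two revisions.
file_changed() {
  [ "$(blob_sha "$1" "$3")" != "$(blob_sha "$2" "$3")" ]
}
```

In a post-checkout script this would read file_changed "$1" "$2" poetry.lock && poetry install --no-interaction. A file that exists on only one side counts as changed, which also covers the clone case.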


Wrapping up

You can extend the recipes here to also sync database migrations between branches, or to prevent commits that add a new dependency without regenerating the lock file (i.e. prevent the lock file from getting out of sync). There's no shortage of ideas for productive workflows that simplify your daily job.
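
As a starting point for the lock file idea, here is a deliberately simple heuristic. The lock_file_staged function is hypothetical, and it assumes a poetry project with pyproject.toml and poetry.lock at the repository root. It only looks at file names, so it will also complain when pyproject.toml changes for reasons unrelated to dependencies.

```shell
#!/usr/bin/env sh
# lock_file_staged is a hypothetical check. It takes the staged file names as
# arguments and fails when pyproject.toml is staged without poetry.lock.
lock_file_staged() {
  manifest_staged=0
  lock_staged=0
  for file in "$@"; do
    case "$file" in
      pyproject.toml) manifest_staged=1 ;;
      poetry.lock)    lock_staged=1 ;;
    esac
  done
  # Fine unless the manifest changed and the lock file did not
  [ "$manifest_staged" -eq 0 ] || [ "$lock_staged" -eq 1 ]
}
```

In a pre-commit hook it could be fed with lock_file_staged $(git diff --cached --name-only) || exit 1.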

It's also important to keep things simple to ease adoption within the workplace. These hooks are scripts that run automatically, on different systems, on behalf of users from various technical backgrounds. Engineers, product owners, designers, and others use the code base every day, and brittle scripts should not get in the way of their work.

Lefthook can use a local-only config file (lefthook-local.yml), so you can experiment locally with creative or complex hooks while limiting the shared config to the simple ones.