Git is a fantastic tool to version control code. And the advantages of
version controlling my work are so evident that I want to version control
everything else I do besides code. Most of my work consists of editing text
files, and I have even forced my workflow into text files just to be able
to version control more of it. Thus, I keep track of my source code,
org-mode notes, and some \(\LaTeX\) files.
However, while working on a report, I realized that I generate figures and
other data plots that also need to be version controlled. I do commit
those images, binary blobs, into git. But I don’t really have a good way to
track their changes. GitHub provides a great tool for checking the diff of
image files. But I want to do that locally on my machine. So I decided to
solve this and found a solution on these websites 1, 2, 3. From now on I
can diff image files thanks to imagemagick and git difftool.
The configuration
First, create a .gitattributes file. It can be specific to a project
or global to the user. For the global case, a copy in your home directory
only takes effect once core.attributesFile points at it, since by default
git looks for user-wide attributes in ~/.config/git/attributes. This file
tells git how to treat files during version control. In this case I’ll
define svg and png image files as binaries, so that they never show a text
diff representation in git. pdf files, on the other hand, will be handled
by a custom diff driver named pdf.
    *.svg binary
    *.png binary
    *.pdf diff=pdf
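To check that git resolves these attributes as intended, `git check-attr` can be asked directly. The following is a minimal sketch in a scratch repository; the file names `figure.png` and `report.pdf` are hypothetical and the files do not need to exist.

```shell
# Scratch repository so the experiment does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Same three rules as in the .gitattributes file.
printf '%s\n' '*.svg binary' '*.png binary' '*.pdf diff=pdf' > .gitattributes

# Ask git which value of the "diff" attribute applies to each path.
png_attr=$(git check-attr diff -- figure.png)
pdf_attr=$(git check-attr diff -- report.pdf)
echo "$png_attr"   # figure.png: diff: unset
echo "$pdf_attr"   # report.pdf: diff: pdf
```

`unset` is what the `binary` macro does to the `diff` attribute, which is why no text diff is shown for those files.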
Next, in ~/.gitconfig, my global configuration, I set up how to treat pdf
files. Their text representation is the information pdfinfo can give me
about them. I only need to add these 2 lines to the .gitconfig file.
    [diff "pdf"]
        textconv = pdfinfo
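The same entry can also be written from the command line instead of editing the file by hand. This is a small sketch that writes to a scratch file standing in for ~/.gitconfig, which is an assumption made only to keep the experiment harmless.

```shell
# Scratch config file standing in for ~/.gitconfig (assumption, for safe experimenting).
cfg=$(mktemp)

# Register pdfinfo as the text converter of the "pdf" diff driver.
git config --file "$cfg" diff.pdf.textconv pdfinfo

# Read the value back and show the section git wrote.
textconv=$(git config --file "$cfg" --get diff.pdf.textconv)
echo "$textconv"
cat "$cfg"
```

Replacing `--file "$cfg"` with `--global` applies the setting to the real user configuration.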
More exciting now is to add the difftool configuration. I call it
image_diff and then declare the command cmd that will perform the diff
action.
$LOCAL and $REMOTE are variables that git difftool sets before running cmd;
they correspond to the old/staged file and the current/unstaged file.
compare takes the 2 files, which can be anything imagemagick can read, and
creates a comparison png stream (defined by png:-). The output is piped to
montage to create a more informative 3 column image with the reference file
on the left, the diff in the middle and the current file on the right.
-geometry 400x sets the width of each tile in pixels, feel free to scale
it. -font Liberation-Sans is the font of the labels; I set it because
montage seems to default to Helvetica, which I don’t care to install on my
system.
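Assembled from the description above, a difftool entry could look like the following sketch. It is not necessarily the exact command line: the label texts, `-tile 3x1`, the final `display -` step that opens the result in a window, and the scratch config file are all assumptions.

```shell
# Scratch config file standing in for ~/.gitconfig (assumption, for safe experimenting).
cfg=$(mktemp)

# Single quotes keep $LOCAL and $REMOTE literal; git difftool fills them in later.
git config --file "$cfg" difftool.image_diff.cmd \
  'compare "$LOCAL" "$REMOTE" png:- | montage -geometry 400x -font Liberation-Sans -tile 3x1 -label reference "$LOCAL" -label diff - -label current "$REMOTE" png:- | display -'

# Read the entry back to confirm what git stored.
cmd=$(git config --file "$cfg" --get difftool.image_diff.cmd)
echo "$cmd"
```

With `--global` instead of `--file "$cfg"`, the entry lands in ~/.gitconfig under a `[difftool "image_diff"]` section. The lone `-` in the montage arguments is the comparison stream coming from compare.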
The workflow
When I’m working on my code or any text file I can review/stage and commit
all my changes from the shell or with any other tool I use to communicate
with git.
I review and stage all my changes for text files in the usual way. But
for image files I can now review the changes using the git difftool that I
just defined.
    git difftool -t image_diff
This will ask me, for every unstaged file, whether I want to launch
image_diff to evaluate its diff. When it comes to an image file I accept,
and the diff image is displayed immediately.
After reviewing the changes and understanding why they happened, I stage
the modified image and make a new commit.
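Since the prompt appears for every unstaged file, a pathspec can narrow the run to images only. The sketch below shows the idea in a scratch repository; the file names and contents are hypothetical, and the difftool call itself is left as a comment because it needs imagemagick and the image_diff entry to be configured.

```shell
# Scratch repository with one tracked "image" and one text file.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
printf 'old pixels\n' > figure.png
printf 'old notes\n' > notes.org
git add figure.png notes.org
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m 'first version'

# Modify both files, as happens while working on a report.
printf 'new pixels\n' > figure.png
printf 'new notes\n' > notes.org

# The pathspec limits the listing to unstaged image changes.
changed=$(git diff --name-only -- '*.png' '*.svg')
echo "$changed"   # figure.png

# The same pathspec limits the difftool run, and -y (--no-prompt)
# skips the per-file question:
#   git difftool -t image_diff -y -- '*.png' '*.svg'
```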