Versioning of OpenOffice/LibreOffice documents using git
Update: Since LibreOffice can save documents directly in Flat XML format
(i.e. *.fodt instead of *.odt), this article is obsolete.
Flat XML files receive the best git versioning you can expect. The images in your
documents are embedded in the XML file as well. Thanks to Simon for this helpful
comment. However, in another comment it was suggested to
still process the document to get rid of cursor markers that produce noise in
the diff.
At the moment I’m preparing my job application documents after finishing my travels. Of course, I track changes to my LaTeX documents with the git versioning system. This is pretty straightforward, because LaTeX documents are plain text.
Besides these, I also use some LibreOffice (OpenOffice-compatible) documents, whose changes are not visible in a diff because they are stored in a binary format. Git can handle binary files, of course, but what it reports is not really helpful:
Binary files "a/overview.ods" and "b/overview.ods" differ
Since git version 1.6.1, you can specify dedicated diff filters per file extension, so that these files can be diffed just like plain text files.
Since OpenDocument files (*.ods, *.odt or *.odp) are just zipped folders of XML
files, the obvious approach is to extract the files before comparison. This way,
a plain text diff can be done. I’m using odt2txt, which is available in the
openSUSE software archive.
Unfortunately, the whole file content is presented as one single, very long line of text. As git tracks changes line by line, this is not very helpful. So I decided to additionally use an XML formatter (xmllint is already installed) to solve this problem.
#!/usr/bin/env sh
# save this file as ~/bin/odf2prettytxt, make it executable (chmod +x), and have ~/bin in $PATH
set -o errexit
/usr/bin/odt2txt --raw "$@" | /usr/bin/xmllint --format -
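If odt2txt is not packaged for your distribution, roughly the same result can be sketched with unzip, since the document body of an OpenDocument file lives in the archive member content.xml. This is only an illustration under assumptions: the function name odf_content_pretty is made up for this example, and unzip and xmllint are assumed to be found in $PATH.

```shell
#!/usr/bin/env sh
# Hypothetical alternative to the odt2txt based filter (not the setup used in this article).
# Assumes unzip and xmllint are installed and found in $PATH.

odf_content_pretty() {
    # unzip -p writes the named archive member to stdout instead of to disk;
    # xmllint --format re-indents the XML so that it spans many short lines.
    unzip -p "$1" content.xml | xmllint --format -
}

# git calls a textconv filter with the path of the file to convert as argument.
if [ "$#" -gt 0 ]; then
    odf_content_pretty "$1"
fi
```

Saved as an executable script somewhere in $PATH, this could be used in place of odf2prettytxt. The rest of the setup stays the same.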
The odf2prettytxt filter must then be registered with your git installation:
git config --global diff.odf.textconv "odf2prettytxt"
Lastly, you can enable this filter per project/folder by adding a line to your
.gitattributes file. For the *.ods files, this is done with the following one-liner:
echo '*.ods diff=odf' >> .gitattributes
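The same driver can be enabled for text documents and presentations as well. The following is a minimal sketch, not part of the original setup: the function name enable_odf_diff is made up for this example, and it appends each line only if it is not already present, so running it twice does not duplicate entries.

```shell
# Hypothetical helper: register the odf diff driver for *.ods, *.odt and *.odp.
# The optional first argument is the attributes file (default: ./.gitattributes).
enable_odf_diff() {
    attrs="${1:-.gitattributes}"
    for ext in ods odt odp; do
        line="*.${ext} diff=odf"
        # -x matches whole lines, -F treats the pattern (including the *) literally
        grep -qxF -- "$line" "$attrs" 2>/dev/null || printf '%s\n' "$line" >> "$attrs"
    done
}
```

Calling enable_odf_diff without arguments in the root of the repository updates the .gitattributes file there. Afterwards, git check-attr diff -- overview.ods should report odf as the diff driver for that file.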