This series is written by a representative of the latter group, which is composed mostly of what might be called "productivity users" (perhaps "tinkerly productivity users?"). Though my lack of training precludes me from writing code or improving anyone else's, I can, nonetheless, try to figure out creative ways of utilizing open source programs. And again, because of my lack of expertise, though I may be capable of deploying open source programs in creative ways, my modest technical acumen hinders me from utilizing those programs in what may be the most effective ways. The open-source character, then, of this series, consists in my presentation to the community of open source users and programmers of my own crude and halting attempts at accomplishing computing tasks, in the hope that those who are more knowledgeable than I am can offer advice, alternatives, and corrections. The desired end result is the discovery, through a communal process, of optimal and/or alternate ways of accomplishing the sorts of tasks that I and other open source productivity users need to perform.

Wednesday, December 19, 2012

Eighth Installment: compress and encrypt/decrypt a directory

I recently visited a relative who is studying in the natural sciences and who, surprisingly, is even less capable in certain technical aspects of computing than I am. He was trying to create, on his Mac, a script that would run as a cron job, and asked me for some pointers. Though I know the basics about cron and was willing to pitch in, I wasn't so sure about the script: you see, calling my bash skills rudimentary would be high praise. Nonetheless I decided that, with some web searching, I might be able to assist with that, too. Sure enough, I was able to find just the sort of information that would help us create a script that would tar and compress, then encrypt, a target directory. Details--shamelessly lifted from various locales on the web--are included below.

Over the years that I've been using Linux I have, of course, read more than a few articles that describe methods of encrypting files or partitions. Most recently, for example, there appeared on LXer an article that described a clever way of encrypting a local directory that then gets backed up to a cloud storage service like Dropbox. I've bumped up against the issue of encryption when doing fresh installations as well, since many Linux distros have for some time now offered the option, at installation, of encrypting, for example, the /home directory.

Despite reading at least some of those articles with interest, I did not feel the need to implement such encryption on my own systems. So it was not until someone else asked my assistance in doing something like this that I actually tried it myself. As you will see, it was actually fairly simple to implement. But first, a few caveats.

I'll skip any details in the following description regarding the cron aspect of this project--not that I could provide a whole lot of enlightenment anyway--other than to say that it's a handy way to make programs or processes run on a set schedule on computers that run *nix. One way I've used it is to cause a weather map, which I've set up as the desktop background on one of my computers, to update every 10 minutes--look for a future entry in this blog on how I managed that.

I'll also not speak in any depth about another ingredient in this recipe--tar--other than to say that it is an abbreviation for "tape archive." I myself do not understand its workings terribly well, though I've used it on several occasions. I will mention on a related note, however, that, in my research, I ran across articles that used another utility--dd (often glossed as "disk dump")--to create compressed and encrypted archives. But I did not follow up on the dd option and so cannot post any further information about how that was done.

Finally, I can't speak in any depth about the program I used for doing the encryption--openssl--or about another program with which I experimented and which also does encryption--gpg. But I promise, despite those rather glaring deficits, that I will describe something I managed to accomplish and which you, too, should be able to accomplish by following the steps outlined.

Perhaps in some future entry for this blog I'll be able to further explore tar, dd, and/or cron. But for now I'm going to focus my attention mainly on the option we ended up using, which involved mainly tar and openssl.

The relative in question, as I mentioned, works in the natural sciences. He has a directory of his ongoing work that he wants to back up regularly, but to which he does not want anyone else to have access. His choice for backing up beyond his own PC is to use Dropbox. So the task was, as mentioned, to compress and encrypt the target directory. Moving the result to the location on the local machine where Dropbox would find it and back it up will not be covered in this write-up, though that step did end up being part of his final resolution.

So, what's left? It was quite easy to find directions on the web for doing all this. I pretty much went with the first workable solution I found, which came from the linuxquestions forum (the relevant thread can be found here).

The incantation we used was as follows:

tar -cj target-dir | openssl enc -aes128 -salt -out target-dir.tar.bz2.enc -e -a -k password

What that line does may be evident to most, but I will offer a bit of review nonetheless. The target directory is first tar'red and compressed with bzip2 (the c option stands for "create" and the j option specifies that the created archive should be compressed with bzip2), then it is piped to openssl for encryption. On the openssl side, -aes128 selects the cipher, -salt adds a random salt, -e means encrypt, -a base64-encodes the output, and -k supplies the password. The word "password" is, obviously, to be replaced by whatever password the user chooses.
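For anyone who wants to try the pipeline without touching real data, here is a minimal sketch. It assumes a scratch directory made with mktemp and a made-up file called notes.txt, neither of which comes from the original thread. It also writes -f - so that tar sends the archive to standard output explicitly, since some tar builds default to a tape device when -f is left off.

```shell
#!/bin/sh
# Sketch only: archive, compress, and encrypt a throwaway directory.
set -eu

workdir=$(mktemp -d)            # scratch area (assumption, not from the article)
cd "$workdir"
mkdir target-dir
echo "lab notes" > target-dir/notes.txt   # hypothetical sample file

# -c create, -j bzip2, -f - write the archive to standard output.
# "password" is a placeholder, exactly as in the command above.
tar -cjf - target-dir \
  | openssl enc -aes128 -salt -e -a -k password -out target-dir.tar.bz2.enc

# Show the resulting encrypted (base64 text) file.
ls -l target-dir.tar.bz2.enc
```

Newer openssl releases may print a warning about the key derivation that -k uses; the file is still written.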

One possible drawback to this method, as pointed out in the thread from which I lifted it, is that the encryption password gets entered, in plain text, right on the command line (which is slightly less of an issue with a cron script such as we were creating). Thus, anyone who can gain access to the machine can, by using the command line history, see what the encryption password was. Since someone gaining access to his computer and viewing the command line history was not a concern for the fellow I was helping, this is the solution we implemented. But that potential concern can be easily remedied when running the command by hand: simply leave off the -k password switch at the end, which has the effect of prompting the user for a password that does not get echoed to the screen. A prompt is, of course, of little use in an unattended cron job.
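Another way around the plain-text password, and one that still works unattended, is openssl's -pass option, which can read the password from the first line of a file. What follows is only a sketch, and not what we actually implemented: the file name backup.pass and the passphrase are invented for illustration, and in real use the password file would be created once by hand and kept somewhere private rather than written by the script.

```shell
#!/bin/sh
# Sketch only: read the encryption password from a file instead of the command line.
set -eu

umask 077                       # files created below are readable by their owner only
workdir=$(mktemp -d)            # scratch area (assumption)
cd "$workdir"
mkdir target-dir
echo "lab notes" > target-dir/notes.txt   # hypothetical sample file

# Hypothetical password file, written here purely so the demo is self-contained.
passfile="$workdir/backup.pass"
printf '%s\n' 'example-passphrase' > "$passfile"

# -pass file:PATH tells openssl to use the first line of PATH as the password.
tar -cjf - target-dir \
  | openssl enc -aes128 -salt -e -a -pass "file:$passfile" -out target-dir.tar.bz2.enc

ls -l target-dir.tar.bz2.enc
```

The password then never appears in the shell history or in the command itself, though anyone who can read the file can still read the password.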

To decrypt the file, the following command--which prompts for the password--is used:

openssl enc -aes128 -in target-dir.tar.bz2.enc -out target-dir.tar.bz2 -d -a

The file can then be uncompressed and untar'red. This part of the process could likely be reduced from two steps (decryption, then uncompression/untar'ing) to one by using a pipe, but since it was presumed, for purposes of this project, that the file would act simply as insurance against loss--the need for ever actually recovering the content being very unlikely--I did not pursue streamlining that aspect.
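For the curious, the one-step version with a pipe might look like the sketch below. It is untested against my relative's actual setup. It builds its own small encrypted archive first so that it is self-contained; the directory name restored is an assumption, and -k password is used only so that the example runs without a prompt.

```shell
#!/bin/sh
# Sketch only: decrypt and unpack in a single pipeline.
set -eu

workdir=$(mktemp -d)            # scratch area (assumption)
cd "$workdir"
mkdir target-dir
echo "lab notes" > target-dir/notes.txt   # hypothetical sample file

# Make an encrypted archive to practice on (same command as earlier, plus -f -).
tar -cjf - target-dir \
  | openssl enc -aes128 -salt -e -a -k password -out target-dir.tar.bz2.enc

mkdir restored
# With no -out, openssl writes the decrypted bytes to standard output;
# tar reads the archive from standard input (-f -) and unpacks into restored/.
openssl enc -aes128 -d -a -k password -in target-dir.tar.bz2.enc \
  | tar -xjf - -C restored

cat restored/target-dir/notes.txt
```

Leaving off -k password should make openssl prompt for the password instead, as in the two-step command above.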

I did manage to find and test a couple of other variants which I will offer here as well. The second variant was found here, and follows:

tar -cj target-dir | openssl enc -e -a -salt -bf -out target-dir.blowfish

It is much the same as the first variant, though it uses a different encryption method called blowfish. I am uncertain which of these two encryption schemes is considered better. To decrypt the compressed directory, the following command is used:

openssl enc -d -a -bf -in target-dir.blowfish -out target-dir-decrypt.tar.bz2

Finally, I discovered yet another variant, details about which can be found here. A sample of how to use this one is as follows:

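tar -cjf target-dir.tar.bz2 target-dir/ && gpg -r user -e target-dir.tar.bz2

As will be noted, this variant uses gpg to encrypt the directory. The two commands are joined with && rather than a pipe, because gpg reads the finished file from disk and so should start only after tar has completed successfully; the result is written to target-dir.tar.bz2.gpg. Of course user must be replaced by the name of someone who has a valid gpg key on the system, usually the primary user of said machine or account.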

An interesting feature I discovered about this method is that a time-sensitive gpg key can be created, i.e., one that expires after a certain interval. If I understand correctly how this works, once the key expires, the directory can no longer be decrypted.* This feature should, obviously, be used with care.

Decrypting the directory can be done in the following way:

gpg --output target-dir.tar.bz2 --decrypt target-dir.tar.bz2.gpg

The same two-step process of decrypting, then untar'ing/uncompressing, applies to these two methods as well.

This sums up what I have to offer in this entry. Now that winter is upon us northern-hemispherers, I may be able to post more frequent entries. There are a few things I've been wanting to document for some time now.

* Correction: regarding my claim that decrypting the file is no longer possible once the key expires, an anonymous commenter writes: "Unfortunately not. The expired key is no longer trusted but is still functional."


  1. "If I understand correctly how this works, once the key expires, the directory can no longer be decrypted."

    Unfortunately not. The expired key is no longer trusted but is still functional.

    (If this were not the case, DRM would be trivial to implement.)

    1. Ok. Thanks for the correction. I'll revise the article accordingly.