Skip to content

Mark Embling

Introducing Storelet

Backups: we all know that they're vital. Right? Backups are really really important. To that end, I've written storelet, a tiny little library for easily writing simple backup scripts using Python.

I run a few Ubuntu servers such as the one that served you this blog, and each one has had at least one or two cobbled-together scripts for running backup tasks. These varied from simple shell scripts to Ruby and Python scripts. Broadly they were all doing exactly the same tasks, but for some unknown reason I'd written each one from scratch, generally going about it in a slightly different way.

Madness.

Starting now, all my backup scripts have been replaced with nice little Python versions using storelet, my new library.

Right now, storelet supports the following:

  • Creating a compressed ZIP file
  • Including any number of directories, from which files/directories are recursively included
  • Creating new directories and doing anything else required...
  • ... such as running database backup commands
  • Uploading the entire thing to Amazon S3

A Simple Example

A simple script using storelet looks like this:

import storelet

with storelet.ZipBackup("example") as backup:
    backup.include_directory("/home/mark/some/directory")
    backup.include_directory("/some/other/directory")

    backup.save_to_s3("my-bucket", "<access_key_goes_here>", "<secret_key_goes_here>")

What this will do is take everything inside the /home/mark/some/directory and /some/other/directory directories and put them in a ZIP file called example, which will then be uploaded to S3 and put inside the my-bucket bucket.

Change Behaviour

Storelet can also preserve the entire path inside the file, so the /home/mark/some/directory structure remains inside the final ZIP. To do this, simply use the preserve_paths argument:

backup.include_directory("/home/mark/some/directory", preserve_paths=True)

Sometimes it might be useful to put everything inside another (new) directory in the ZIP file. Storelet makes this easy as well via the name argument:

backup.include_directory("/home/mark/some/directory", name="files")

Of course both arguments can be provided, in which case the files would be stored inside their original heirarchy which itself would be nested inside a new directory named using the value of the name argument, like this:

example.zip
|
+- files
   |
   + home
     |
     + mark
       |
       + ...    

Custom Commands or Code

Storelet realises that it isn't always just files you need to back up. In my case, I also need to run database backups and for that, storelet provides the following:

from subprocess import call
import storelet

with storelet.ZipBackup("example") as backup:
    with backup.include_new_dir("generated_directory") as d:
        call(["touch", "%s/touched.file" % d])
        # shell out to anything you like and put the output at str(d)
        # alternatively, write any Python code you like and put files into str(d)

In the above example, the final ZIP file will contain a directory named generated_directory, which in turn will contain an empty file called touched.file, created by calling the touch command. Any commands can be run here using Python's built-in ability to call other processes. Alternatively, there is nothing stopping you adding any normal Python code and writing out files to the location specified by str(d).

The way this works is that when with backup.include_new_dir("whatever") as d: is called, a new temporary directory is created on disk, normally under /tmp. The string representation of the resulting variable (d) contains the location on disk (e.g. /tmp/whatever). You can then place files into this directory. At the end of the with block, the contents of that directory are then included in the generated archive, in a directory with your given name.

Saving the Backup

Right now, the only backup process is uploading to Amazon S3:

backup.save_to_s3("my-bucket", "<access_key_goes_here>", "<secret_key_goes_here>")

In the future, it is my intention to add more methods of preserving the backups, such as other storage providers and the ability to copy to a location (perhaps on another disk/partition, for example).

Storelet's Future

Right now, the only type of backup is a ZIP file, using ZipBackup. In the future, I may add others such as tar files and so on. I imagine there might be plenty of other use-cases out there which I've not thought of yet which storelet might be able to help with.

Feel free to fork it, make changes and send a pull request. Storelet is brand new and I'm sure there's a lot to improve. Also, beware that it is very early days and as such, storelet has not really been battle-tested yet. Right now, there aren't tests of any kind (I know...). This is something I'd like to address in the future. And as with all backup strategies, no matter what you do with storelet, please make sure you check and test your backups.