Introduction
Recently the hosting company for website started supporting SSH access. This meant I could ditch the unsecure FTP transfers and do everything though SFTP and rsync over ssh. Beside making editing of files much easier this also allowed me to implement a rolling/rotating backup of the website. While it can be argued that such backup would never be needed as the hosting company surely has a safe storage solution I have personally experienced the loss of data from a server breakdown at the hosting company.
The python script
Below I have written a python script to automate the backup and keep the last 12 weeks of changes in separate folders with hard links in between. This means for website like mine (with low amount of changes) that the backup does not take up much more than the size of the websize + size of changes (which are small). The script defaults to 12 copies backups and I run the script through cron every week on my home Linux server. The script can also be run on the command line if needed with the syntax:
rsync-backup-websites.py user@host:/www/ /home/tjansson/backup/websites/host/
A cron line to run the script monthly on the first day of the month at 4:05 in the morning.
5 4 1 * * /home/tjansson/bin/rsync-backup-websites.py user@host:/www/ /home/tjansson/backup/websites/host/
On a final note it is assumed for this script to work through cron, that the ssh access is setup using keys and perhaps ssh-agent for passwordless access to the server.
#!/usr/bin/env python import os import argparse import shutil if __name__ == '__main__': parser = argparse.ArgumentParser(description='This script does rotating backup using rsync ') parser.add_argument('source', type=str, help='The source. Example: user@host:/www/') parser.add_argument('backup_path', type=str, help='The backup path template. Example: /home/tjansson/backup/websites/host/') parser.add_argument('-c', '--copies', type=str, default=12, help='The maximum number of copies to save in the rotation. Default=12') parser.add_argument('-d', '--debug', dest='debug', action='store_true', help='Turn on verbose debugging') args = parser.parse_args() # Folder template folder = '{}backup{}'.format(args.backup_path, '{}') # Delete the oldest folder folder_old = folder.format(args.copies) if os.path.isdir(folder_old): if args.debug: print 'Removing the oldest folder: {}'.format(folder_old) shutil.rmtree(folder_old) # Rotating backups if args.debug: print 'Rotating backups' for i in range(args.copies-1,-1,-1): folder_0 = folder.format(i) folder_1 = folder.format(i+1) if os.path.isdir(folder_0): if args.debug: print 'mv {} {}'.format(folder_0, folder_1) os.system('mv {} {}'.format(folder_0, folder_1)) #Execute the RSYNC target = folder.format(0) link = folder.format(1) if not os.path.isdir(target): os.mkdir(target) if not os.path.isdir(link): cmd = 'rsync -ah --delete -e ssh {source} {target}'.format(link=link, source=args.source, target=target) else: cmd = 'rsync -ah --delete -e ssh --link-dest="{link}" {source} {target}'.format(link=link, source=args.source, target=target) if args.debug: print 'Rsyncing the latests changes' print cmd os.system(cmd) os.system('touch {}'.format(target))
Further reading and inspiration to this post
www.mikerubel.org/computers/rsync_snapshots/
en.wikipedia.org/wiki/Backup_rotation_scheme
firstsiteguide.com