An archive is a complete mirror of a set of files at a certain point in time. An archive of a site includes all of the data in that site at the time the archive is created; because archives store a relatively large amount of information, they are often “compressed” into a single file (called a zip file) to reduce their total file size and make them faster to upload or download.
Onsite and Offsite Backups:
An onsite backup copies files, folders, or even the entire server to a destination medium in the same physical location as the source, for example a USB drive or an external hard drive.
An offsite backup copies files to an entirely different physical location, for example a cloud backup. The best approach is to configure both onsite and offsite backups for your servers. Offsite backups protect the data in case of a physical disaster such as a fire or an earthquake, because the copy is stored somewhere else entirely. So, as a disaster recovery method, we should always configure offsite backups. Onsite backups, on the other hand, allow quick recovery when files or a database get corrupted.
There are mainly two types of backups.
1. Image Backup
2. Snapshot Backup.
In an Image Backup, the volume being backed up is not available to other applications. The backup archive client is the only process with access to the volume, so we can consider an Image Backup an Offline Backup. The main disadvantage of an Image Backup is that we have to take the volume offline or lock it before starting the backup. Also, it does not provide per-file backups.
In a Snapshot Backup, the file system or raw logical volume stays active and remains available for read and write operations during the backup, so this is considered an Online Backup. The main disadvantage of a Snapshot Backup is that it requires additional software to be installed.
A Snapshot Backup records changes to files at a certain point in time. Once Snapshot Backups are enabled, a snapshot of your web site files is taken every few hours. Each snapshot records only the changes that have been made to your site files; it does not copy your entire web site again. So, it acts like an incremental backup. Because snapshots are taken every few hours, say every four hours, changes you make to your site are not recorded immediately, but at the end of the four-hour window. In other words, a file must exist for four hours to guarantee its inclusion in a snapshot. Snapshot backups also “roll over” after a few weeks (as configured), which means that you cannot use this utility to restore a version of a file that is older than those few weeks.
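The roll-over described above can be sketched in a few lines of shell. This is only an illustration under assumptions: the function name rotate_snapshots, the hourly.N directory layout, and the retention count of 3 are chosen for the example and do not show how any particular utility implements rotation.

```shell
# Illustrative sketch: keep at most $2 numbered snapshot directories under $1.
# rotate_snapshots and the hourly.N layout are hypothetical names for this demo.
rotate_snapshots() {
    rs_root=$1
    rs_keep=$2
    rs_i=$((rs_keep - 1))

    # The oldest snapshot falls out of the window and is deleted.
    rm -rf "$rs_root/hourly.$rs_i"

    # Shift every remaining snapshot one slot older: N-1 -> N, ..., 0 -> 1.
    while [ "$rs_i" -gt 0 ]; do
        rs_prev=$((rs_i - 1))
        if [ -d "$rs_root/hourly.$rs_prev" ]; then
            mv "$rs_root/hourly.$rs_prev" "$rs_root/hourly.$rs_i"
        fi
        rs_i=$rs_prev
    done

    # Slot 0 is now free for the newest snapshot.
    mkdir -p "$rs_root/hourly.0"
}

# Demo in a scratch directory: take four snapshots while keeping only three.
demo_root=$(mktemp -d)
for run in 1 2 3 4; do
    rotate_snapshots "$demo_root" 3
    echo "$run" > "$demo_root/hourly.0/run"
done

newest=$(cat "$demo_root/hourly.0/run")
oldest_kept=$(cat "$demo_root/hourly.2/run")
echo "newest snapshot: run $newest, oldest kept: run $oldest_kept"
rm -rf "$demo_root"
```

After the fourth run, the snapshot from the first run has been deleted, which is why a file version older than the retention window cannot be restored.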
I have been using the rsnapshot backup utility to configure backups on Linux servers. This utility combines hard links and rsync to manage full and incremental backups. Rsnapshot is very easy to configure, and once it is set up it takes care of rotating and deleting old backups, so little user intervention is required afterwards. It also uses very little disk space: the space required is just a little more than that of one full backup, plus the incrementals. This matters when your drive lacks enough free space to hold 3 full copies of the backup.
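To see why hard links keep disk usage low, here is a minimal sketch using plain ln in a scratch directory. It does not reproduce the commands rsnapshot itself runs; the directory and file names are made up for the illustration.

```shell
# Illustrative sketch of hard-link based snapshots (directory names are made up).
work=$(mktemp -d)
mkdir -p "$work/hourly.1" "$work/hourly.0"

# The older snapshot holds two files.
echo "unchanged content" > "$work/hourly.1/a.txt"
echo "old version"       > "$work/hourly.1/b.txt"

# In the newer snapshot, a.txt has not changed, so it is only hard-linked:
# both directory entries point at the same data on disk.
ln "$work/hourly.1/a.txt" "$work/hourly.0/a.txt"

# b.txt did change, so the newer snapshot stores a separate copy of it.
echo "new version" > "$work/hourly.0/b.txt"

# -ef is true when two paths refer to the same file (same device and inode).
if [ "$work/hourly.0/a.txt" -ef "$work/hourly.1/a.txt" ]; then
    a_shared=yes
else
    a_shared=no
fi
if [ "$work/hourly.0/b.txt" -ef "$work/hourly.1/b.txt" ]; then
    b_shared=yes
else
    b_shared=no
fi

echo "a.txt shared: $a_shared, b.txt shared: $b_shared"
rm -rf "$work"
```

Each snapshot directory looks like a full copy, but only the changed file consumes additional space.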
Prerequisites: You require the following package installed on your Linux distro:
Download the latest source tarball from: http://www.rsnapshot.org/downloads.html
Now, untar the source code package.
# tar xzvf rsnapshot-1.3.1.tar.gz
Change to the source directory and run the configure script:
# cd rsnapshot-1.3.1/
# ./configure --sysconfdir=/etc
# make install
Now rsnapshot is installed under /usr/local, with the config file in /etc
Rsnapshot Configuration – Specify the destination media / partition:
The major configuration changes are to specify the backup destination / media and to specify what to back up every few hours.
A sample copy of the rsnapshot config file is provided with the package. We just need to copy the file.
# cp /etc/rsnapshot.conf.default /etc/rsnapshot.conf
# vi /etc/rsnapshot.conf
The main directive that requires changes is as follows:
snapshot_root	/backupdrive/.snapshots/
Here the backup destination is a different partition, /backupdrive, and ‘.snapshots’ is the folder where all the backups are stored.
Now, check the paths to the various programs such as rm (for removing files), rsync, and ssh. Usually, you won’t need to update anything here unless you have customized the paths to these utilities.
By default, the backup intervals are set as follows:
interval hourly 6
interval daily 7
interval weekly 4
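These numbers set how many snapshots of each interval rsnapshot keeps; how often each interval actually runs is decided by cron. With the hourly job running every four hours, six hourly snapshots cover one day, seven daily snapshots cover one week, and four weekly snapshots cover one month (4 weeks). You don’t need to update anything here unless you want to change how many hourly backups are kept.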
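The coverage of these retention settings can be worked out with some quick shell arithmetic. The sketch assumes the hourly job runs every four hours, as in the cron setup used in this post; the variable names are only for illustration.

```shell
# Retention counts from the interval lines above.
hourly_kept=6
daily_kept=7
weekly_kept=4

# Assumption: cron starts the hourly job every 4 hours.
hours_between_runs=4

hourly_span_hours=$((hourly_kept * hours_between_runs))  # time covered by hourly snapshots
daily_span_days=$daily_kept                              # one daily snapshot per day
weekly_span_days=$((weekly_kept * 7))                    # one weekly snapshot per week

echo "hourly snapshots cover $hourly_span_hours hours"
echo "daily snapshots cover $daily_span_days days"
echo "weekly snapshots cover $weekly_span_days days"
```

If you change the cron schedule, for example to every two hours, the same six hourly snapshots would cover only half a day unless the retention count is raised as well.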
Now, configure the backup points:
backup /home/soj/ localhost/
backup /etc/ localhost/
backup /usr/local/ localhost/
In the first line, the “backup” parameter marks a backup point. It is followed by “/home/soj/”, the home folder of user “soj”, which is backed up to the destination given in the third column. The destination is relative to snapshot_root, in our case “/backupdrive/.snapshots/”.
NOTE: In the backup points above, the columns are separated by tabs, not spaces.
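Because tabs and spaces look the same in most editors, a quick check can help before running rsnapshot. The helper below is a hypothetical sketch, not part of rsnapshot: it flags any “backup” line that does not have at least three tab-separated columns. The demo writes to a temporary file rather than to the real /etc/rsnapshot.conf.

```shell
# Hypothetical helper: report "backup" lines whose columns are not tab-separated.
check_backup_tabs() {
    # With tab as the field separator, a valid backup line has at least
    # three fields: the keyword, the source, and the destination.
    awk -F'\t' '
        /^backup[ \t]/ && NF < 3 { printf "line %d: %s\n", NR, $0; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Demo on a temporary file with one correct line and one faulty line.
demo_conf=$(mktemp)
printf 'backup\t/etc/\tlocalhost/\n'     >  "$demo_conf"   # tabs: accepted
printf 'backup /usr/local/ localhost/\n' >> "$demo_conf"   # spaces: flagged

if check_backup_tabs "$demo_conf"; then
    result=ok
else
    result=bad
fi
echo "check result: $result"
rm -f "$demo_conf"
```

To check a real configuration, you would call check_backup_tabs with the path to your config file, for example /etc/rsnapshot.conf.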
You can now test the snapshot configuration with the following command:
# rsnapshot configtest
You can verify the hourly backup configuration as follows:
# rsnapshot -t hourly
The above command simulates an hourly backup and prints out the commands that will be executed when it is run for real.
Now, you can edit the cron job to automate the rsnapshot process:
# crontab -e
Add the following entries:
0 */4 * * * /usr/local/bin/rsnapshot hourly
30 23 * * * /usr/local/bin/rsnapshot daily
The cron jobs should be timed so that the hourly backup has finished before the daily backup starts.
Also, rsnapshot can be used to perform remote backups. Check the configuration file for more information on it.
This is a video I created when I set up a new partition for backups and then configured rsnapshot to back up files and folders every four hours.