Tar, rsync and other backup tools

Revision as of 08:43, 3 April 2020


Motivation

To keep a copy of your data in a different place, you do not always need complicated backup software. Often the small tools that come with any Linux distribution are simple to use and do the job. Here you will see a few examples of how to use them.

The Problem of Backup

When you have not dealt with backups before, your first thought will be: I need a copy of my data. But it is not that simple. Besides a copy of your data you will also want the metadata: timestamps (when a file was created or last changed), special files (like symbolic links), and the permissions, that is, who is allowed to access what. This is especially true if you back up, for example, the file server of your company. Just imagine that you lose your data and restore from backup, only to find out that all 500 people in your company can now access everyone else's files. Some of them will certainly not be happy about this.

Or, if you want a backup of your whole system and later restore it, the restored system will not work if the file permissions are wrong and special files like device files and symbolic links are missing.

Furthermore, you often want more than one version of your files. Anna from the accounting department overwrote her file with garbage last month but only found out today. A backup that only holds last week's version of the files does not help her much.

You also want your backup to be in a different place than your computer. If someone steals your computer, they will also steal the external drive sitting next to it. The same goes for fire and similar disasters.

tar

tar (short for Tape ARchive) was originally used to write data to magnetic tapes. While it can still be used for that, today it mostly serves as a file archiver similar to ZIP. tar by itself bundles many files and directories into one file without compression. You can then compress this file with a tool like gzip or bzip2, which can compress a single file but cannot bundle several files into an archive. Modern tar versions can call these compressors themselves, so you can create a file that is both tarred and compressed in one step. Such a file is usually named .tar.gz or .tar.bz2; often .tgz is used instead of .tar.gz. In fact the files can have any extension. Using the ones mentioned here is just a convention.
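The difference between bundling and compressing can be tried out in a scratch directory. The following is only a sketch: the directory demo/ and all file names are made up for illustration, and it assumes that tar and gzip are installed.

```shell
# Work in a scratch directory so nothing existing is touched.
work=$(mktemp -d)
cd "$work"
mkdir demo
echo "hello" > demo/a.txt
echo "world" > demo/b.txt

# Two steps: bundle with tar, then compress the bundle with gzip.
tar cf two-step.tar demo/
gzip two-step.tar          # replaces two-step.tar with two-step.tar.gz

# One step: let tar do the compression itself (option z).
tar cfz one-step.tar.gz demo/

# Both archives should list the same members.
tar tfz two-step.tar.gz > list-two.txt
tar tfz one-step.tar.gz > list-one.txt
cat list-one.txt
```

Both ways should give an archive with the same content; the one-step form is simply less typing.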

Here are a few examples of using tar:

$ tar cfvz dip.tgz Diplomarbeit/ 
$ tar cfvz bilder.tgz  *.png *.jpg 

This creates an archive file named dip.tgz with all the content of the Diplomarbeit/ folder (including subfolders) and an archive bilder.tgz with all .png and .jpg files from the current directory.

The option c says: create. The f means: write to the file named next on the command line instead of to a tape. The v is for verbose - tell us what you are doing - and the z means: compress it with gzip.

$ tar tfvz bilder.tgz

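This lists the content of the archive (t stands for table of contents). Since tar has to read the whole file to do so, it also works as a quick test that the archive is readable.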

$ mkdir test
$ cd test
$ tar xfvz dip.tgz

The above extracts the files into the current directory (x stands for extract). It is always a good idea to extract an archive in a directory of its own so that you do not overwrite other things. Some archives (like dip.tgz) are built so that they extract into their own subdirectory, but others (e.g. bilder.tgz) will clutter your current directory with a lot of files.
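If you are not sure how an archive was built, you can list it first and then extract it into a fresh directory. The sketch below builds a small stand-in for bilder.tgz; the file names one.png and two.jpg and the directory extracted/ are made up. It uses the option -C, which tells tar to change into the given directory before extracting; the directory has to exist already.

```shell
# Build a small archive without a top-level directory (stand-in for bilder.tgz).
work=$(mktemp -d)
cd "$work"
echo "x" > one.png
echo "y" > two.jpg
tar cfz bilder.tgz one.png two.jpg

# Look before you extract: no common directory prefix means the files
# would land directly in the current directory.
tar tfz bilder.tgz

# Extract into a directory of its own instead.
mkdir extracted
tar xfz bilder.tgz -C extracted
ls extracted
```

This has the same effect as the mkdir, cd and tar sequence above, but you stay in your current directory.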

Incremental Backups

When you have a lot of files to back up, you may not have enough space to make a full copy every day, and it would also take a long time. So you might choose to make incremental backups during the week and a full backup only on the weekend. You can tell tar to only take files newer than a given date or newer than some other file. You can also tell tar to read the list of files that should go into the backup from another file. You would then first create the list with some other tool and tell tar: take this list and back up everything in it.
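Both approaches can be sketched as follows. This assumes GNU tar (the --newer-mtime option is a GNU extension; other tar implementations may use different option names) and uses made-up names: the directory data/, the marker file last-full.stamp and the list changed.list. The dates are set by hand with touch -t only so that the example behaves the same every time.

```shell
# Scratch directory with one old file.
work=$(mktemp -d)
cd "$work"
mkdir data
echo "old" > data/old.txt
touch -t 202001010000 data/old.txt      # pretend: last changed on 1 Jan 2020

# Full backup on the weekend, plus a marker file remembering when it ran.
tar cfz full.tgz data/
touch -t 202002010000 last-full.stamp   # pretend: full backup ran on 1 Feb 2020

# A file that is created during the week.
echo "new" > data/new.txt

# Variant 1 (GNU tar): only take files modified after the marker file.
# A value starting with . or / is read as a file name; anything else is
# read as a date, e.g. --newer-mtime='2020-02-01'.
tar cfz incr-newer.tgz --newer-mtime=./last-full.stamp data/

# Variant 2: build the list with another tool (here find), then give it to tar with -T.
find data -type f -newer last-full.stamp > changed.list
tar cfz incr-list.tgz -T changed.list

# Show what ended up in each incremental archive.
tar tfz incr-newer.tgz
tar tfz incr-list.tgz
```

In both variants only data/new.txt should be stored, not data/old.txt. The list file holds one name per line, so file names containing line breaks would need extra care (GNU tar has a --null option to go with find -print0 for this).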