SquashFu - an alternative backup solution
  Inspired by http://forums.gentoo.org/viewtopic-p-4732709.html

Requirements: aufs, aufs2-util, squashfs-tools, rsync, bc

Goal: To create a backup solution which provides incremental backups and compression,
      and which also provides an easy way to roll back.

Design:
  A directory structure is created as follows (with some terminology included):
      backup_root/
       |- seed.sfs  <-- squash, or seed
       |- .bin.list <-- bin inventory list (or binventory)
       |- ro/       <-- squash mount point
       |- rw/       <-- union mount point
       |- .bins/    <-- incrementals
           |-1/
           | .....
           | .....
           | .....
           | .....
           | .....
           |-n/

  seed.sfs is created from an initial backup and compressed using SquashFS, a read-only
  filesystem designed for compression. It's mounted, using a loopback device, on ro/.
  Using aufs2, a union mount is formed from ro/ and the numbered bins, each of which
  corresponds to an incremental backup.
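
  As a rough illustration of the seed step, the sketch below builds the two commands
  involved. It assumes the initial backup already sits in a directory (the hypothetical
  initial_backup argument) and that a plain mksquashfs call and a loop mount are enough;
  the options SquashFu actually passes are not described here.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def seed_commands(initial_backup: str, backup_root: str) -> list[list[str]]:
    """Return argv lists that would build the seed and loop-mount it on ro/.

    Both arguments are hypothetical example paths; nothing is executed here.
    """
    root = PurePosixPath(backup_root)
    seed = root / "seed.sfs"
    return [
        # Compress the initial backup into a read-only SquashFS image.
        ["mksquashfs", initial_backup, str(seed)],
        # Mount the image read-only through a loopback device on ro/.
        ["mount", "-t", "squashfs", "-o", "loop,ro", str(seed), str(root / "ro")],
    ]


if __name__ == "__main__":
    for argv in seed_commands("/srv/initial", "/mnt/backup"):
        print(" ".join(argv))
```

  The commands are only printed, so the sketch can be tried without root privileges.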

  At the time of the backup, the next available bin is created and logged to the
  binventory with a timestamp. A union is then created on rw/ from all the available
  bins, stacked on top of the seed in reverse chronological order (newest on top). At
  this point, the union represents the state of your files at the end of the last
  backup. The newest branch is marked as read/write, and rsync is called. Because this
  top branch is the only writable location in the union, the files rsync generates with
  the -u (update) flag are placed into this branch. When the backup finishes, the union
  and seed are unmounted.
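
  The stacking order described above can be sketched as follows. This is a minimal
  sketch, not SquashFu's own code: next_bin and union_branches are hypothetical helpers,
  "next available" is assumed to mean the lowest unused number, and the result uses the
  aufs branch syntax (br:dir=rw:dir=ro, topmost branch first).

```python
from __future__ import annotations


def next_bin(existing: list[int]) -> int:
    """Return the lowest positive bin number not yet in use (an assumption)."""
    used = set(existing)
    n = 1
    while n in used:
        n += 1
    return n


def union_branches(backup_root: str, bins_oldest_first: list[int]) -> str:
    """Build an aufs branch string: newest bin on top (rw), seed at the bottom.

    bins_oldest_first must be ordered by backup time, oldest first.
    """
    parts = []
    for position, bin_number in enumerate(reversed(bins_oldest_first)):
        # Only the topmost (newest) branch is writable.
        mode = "rw" if position == 0 else "ro"
        parts.append(f"{backup_root}/.bins/{bin_number}={mode}")
    # The mounted seed is always the lowest, read-only branch.
    parts.append(f"{backup_root}/ro=ro")
    return "br:" + ":".join(parts)


if __name__ == "__main__":
    bins = [1, 2]
    bins.append(next_bin(bins))  # create bin 3 for this backup
    print(union_branches("/mnt/backup", bins))
```

  The resulting string would be handed to something like
  "mount -t aufs -o <branches> none backup_root/rw" before rsync is run against rw/.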

  At this point, SquashFu enforces the user's MAX_BINS setting. If the number of bins
  in use exceeds this value, a new seed is generated. The number of old incrementals
  merged into the new seed is determined by the difference between MAX_BINS and
  MIN_BINS in the config file. In this way, you always have at least MIN_BINS bins
  available to roll back to, but you're not forced to recompress your seed at every
  backup, an operation that may take a long time depending on how big your backup
  source is.
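
  One possible reading of this rule, as a sketch: once the bin count exceeds MAX_BINS,
  the oldest (MAX_BINS - MIN_BINS) bins are folded into the new seed. The function name
  and the exact arithmetic are assumptions drawn from the paragraph above, not taken
  from SquashFu's code.

```python
from __future__ import annotations


def split_for_merge(
    bins_oldest_first: list[int], max_bins: int, min_bins: int
) -> tuple[list[int], list[int]]:
    """Return (bins to merge into the new seed, bins to keep).

    Assumed rule: nothing happens until the bin count exceeds max_bins;
    then the oldest (max_bins - min_bins) bins are merged.
    """
    if not 0 <= min_bins <= max_bins:
        raise ValueError("expected 0 <= MIN_BINS <= MAX_BINS")
    if len(bins_oldest_first) <= max_bins:
        return [], list(bins_oldest_first)
    count = max_bins - min_bins
    return bins_oldest_first[:count], bins_oldest_first[count:]


if __name__ == "__main__":
    merge, keep = split_for_merge(list(range(1, 12)), max_bins=10, min_bins=3)
    print("merge into seed:", merge)
    print("keep as bins:   ", keep)
```

  With 11 bins, MAX_BINS=10 and MIN_BINS=3, this merges the 7 oldest bins and keeps
  the 4 newest, so at least MIN_BINS rollback points remain.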

  If and when you want to roll back, execute SquashFu with the -R action, and supply
  the number of bins you want to roll back. The bins are ordered chronologically,
  and the oldest "number_of_bins - bins_to_rollback" bins are mounted on the union
  mount point.
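
  The selection rule for a rollback can be sketched as a simple slice. The helper name
  bins_for_rollback is hypothetical; the arithmetic follows the
  "number_of_bins - bins_to_rollback" expression above.

```python
from __future__ import annotations


def bins_for_rollback(bins_oldest_first: list[int], bins_to_rollback: int) -> list[int]:
    """Return the bins that stay in the union after rolling back.

    bins_oldest_first must be ordered by backup time, oldest first.
    """
    number_of_bins = len(bins_oldest_first)
    if not 0 <= bins_to_rollback <= number_of_bins:
        raise ValueError("cannot roll back more bins than exist")
    # Keep the oldest (number_of_bins - bins_to_rollback) bins.
    return bins_oldest_first[: number_of_bins - bins_to_rollback]


if __name__ == "__main__":
    # Five backups exist; undo the two most recent ones.
    print(bins_for_rollback([1, 2, 3, 4, 5], 2))
```

  Rolling back every bin leaves an empty list, in which case only the seed would
  remain to be mounted.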

WARNING:
  You should not, under any circumstances, add or remove files contained in the bins,
  nor should you alter your binventory's timestamps. Doing so can cause unrecoverable
  damage to the integrity of the backups.

Further reading:
  http://en.wikipedia.org/wiki/Aufs
  http://en.wikipedia.org/wiki/UnionFS
  http://aufs.sourceforge.net/
  http://en.wikipedia.org/wiki/SquashFS
  http://en.wikipedia.org/wiki/Rsync