SquashFu - a combination of fun, squashfs, joy, happiness, aufs and rsync
===================================================================
REQUIREMENTS: aufs, aufs2-util, squashfs-tools, rsync

SquashFu is currently beta quality. While there may be some undiscovered bugs,
the major components and structures in place will not change unless absolutely
necessary. This means that backups created today will still be useful 6 months
from now (if your config supports this).

Mini FAQ:

What currently works?
 - Regular incrementals. Execute with sudo and the -B option
 - Automatic merging of incrementals down to MIN_BINS once MAX_BINS has been exceeded
 - Rolling back. Execute with sudo, the -R option, and a number of backups to roll back
 - Reporting, executed with -Q, shows approx. disk usage and bin information

What's not yet included?
 - Options (don't expect a lot of them)
 - A lot of error catching. While rsync takes care of itself and you'll suffer no
   damage by aborting rsync in the middle of a backup, you could easily destroy
   things by, for example, setting MIN_BINS greater than MAX_BINS.
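
The kind of check that is still missing could look roughly like the sketch
below. It is written in Python purely for illustration; check_bin_limits is a
hypothetical helper, not part of SquashFu, and it covers only the one mistake
named above (MIN_BINS greater than MAX_BINS). Equal values are let through,
since nothing here says they are harmful.

```python
def check_bin_limits(min_bins, max_bins):
    """Return a list of problems with the MIN_BINS / MAX_BINS pair.

    Hypothetical helper for illustration only. An empty list means the
    pair passed the single check described in the FAQ.
    """
    problems = []
    if min_bins > max_bins:
        problems.append(
            "MIN_BINS (%d) must not be greater than MAX_BINS (%d)"
            % (min_bins, max_bins)
        )
    return problems


if __name__ == "__main__":
    # A sane pair produces no complaints; a swapped pair produces one.
    print(check_bin_limits(3, 10))
    print(check_bin_limits(10, 3))
```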

What not to expect?
 - Elephants
 - Salvation

If you don't care about your cat...
===================================================================

Goal: To create a backup solution which provides incremental backups and compression,
      and which also provides an easy way to roll back.

Design:
  A directory structure is created as follows (with some terminology included):
      backup_root/
       |- seed.sfs  <-- squash, or seed
       |- .bin.list <-- bin inventory list (or binventory)
       |- ro/       <-- squash mount point
       |- rw/       <-- union mount point
       |- .bins/    <-- incrementals
           |-1/
           | .....
           | .....
           | .....
           | .....
           | .....
           |-n/

  seed.sfs is created from an initial backup and compressed using SquashFS, a
  read-only filesystem that focuses on compression. It's mounted, using a
  loopback device, on ro/.

  At the time of the backup, the next available bin is determined, created, and logged
  to an inventory sheet with a timestamp. A union is created with all the available bins,
  mounted in reverse chronological order on top of the seed (newest to oldest) on rw/.
  At this point, the union represents the state of your files at the end of the last
  backup. The newest branch is marked as read/write, and rsync is called. Because this
  top branch is the only writable location in the union, the files rsync generates with
  the -u (update) flag are placed into this branch. The backup finishes, and the union
  and seed are unmounted.
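
  The branch ordering described above can be sketched as follows. This is an
  illustration in Python, not SquashFu's own code: aufs_branch_option and the
  example paths are hypothetical. It only builds the "br:" option string that
  aufs accepts at mount time, where the leftmost branch is the topmost one.

```python
def aufs_branch_option(bin_dirs_oldest_first, seed_mount):
    """Build an aufs 'br:' mount option for a backup run.

    Hypothetical helper. Bins are stacked newest first, only the newest
    bin is writable, and the mounted seed sits at the bottom, read-only.
    """
    newest_first = list(reversed(bin_dirs_oldest_first))
    branches = []
    for position, path in enumerate(newest_first):
        # Only the top (newest) branch receives the files rsync writes.
        mode = "rw" if position == 0 else "ro"
        branches.append("%s=%s" % (path, mode))
    branches.append("%s=ro" % seed_mount)
    return "br:" + ":".join(branches)


if __name__ == "__main__":
    bins = ["/backup/.bins/1", "/backup/.bins/2", "/backup/.bins/3"]
    print(aufs_branch_option(bins, "/backup/ro"))
    # br:/backup/.bins/3=rw:/backup/.bins/2=ro:/backup/.bins/1=ro:/backup/ro=ro
```

  The resulting string would be handed to something like
  "mount -t aufs -o <option> none /backup/rw"; the exact invocation SquashFu
  uses is not shown in this README.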

  At this point, SquashFu enforces the user's MAX_BINS setting. If
  the current number of used bins exceeds this value, a new seed is generated. The 
  number of old incrementals merged into the new seed is determined by the difference
  between MAX_BINS and MIN_BINS in the config file. In this way, you always have
  MIN_BINS available to roll back to, but you're not forced to recompress your seed
  at every backup -- an operation that may take a long time depending on how big
  your backup source is.
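
  A small worked example of that arithmetic, again as a Python sketch.
  bins_to_merge is a hypothetical name, and the sketch assumes the merge
  brings the bin count back down to exactly MIN_BINS, as the FAQ puts it.
  Whether the script merges exactly MAX_BINS - MIN_BINS bins or one more is
  not spelled out here.

```python
def bins_to_merge(used_bins, min_bins, max_bins):
    """Return how many of the oldest bins get folded into a new seed.

    Hypothetical helper. Nothing is merged until MAX_BINS is exceeded;
    after that, enough bins are merged to leave MIN_BINS behind.
    """
    if used_bins <= max_bins:
        return 0
    return used_bins - min_bins


if __name__ == "__main__":
    # With MIN_BINS=3 and MAX_BINS=10, the 11th backup triggers a merge
    # of the 8 oldest bins, leaving 3 to roll back to.
    print(bins_to_merge(10, 3, 10))  # 0
    print(bins_to_merge(11, 3, 10))  # 8
```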

  If and when you want to roll back, execute SquashFu with the -R action, and supply
  the number of bins you want to roll back. The bins are ordered chronologically,
  and the oldest "number_of_bins - bins_to_rollback" are mounted on the union mount
  point.
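
  The bin selection for a rollback can be sketched like this. It is a Python
  illustration under assumptions: bins_for_rollback is hypothetical, and the
  inventory is modelled as (bin number, timestamp) pairs because the real
  layout of .bin.list is not described in this README.

```python
def bins_for_rollback(inventory, bins_to_rollback):
    """Pick the bins to mount for a rollback, oldest first.

    Hypothetical helper. `inventory` is a list of (bin_number, timestamp)
    pairs. The bins are ordered by timestamp and the oldest
    "number_of_bins - bins_to_rollback" are returned.
    """
    if bins_to_rollback < 0 or bins_to_rollback > len(inventory):
        raise ValueError("cannot roll back %d of %d bins"
                         % (bins_to_rollback, len(inventory)))
    ordered = sorted(inventory, key=lambda entry: entry[1])
    keep = len(ordered) - bins_to_rollback
    return [bin_number for bin_number, _ in ordered[:keep]]


if __name__ == "__main__":
    inventory = [(1, 100), (2, 200), (3, 300), (4, 400)]
    print(bins_for_rollback(inventory, 1))  # [1, 2, 3]
    print(bins_for_rollback(inventory, 4))  # [] (seed only)
```

  Sorting by timestamp rather than by bin number keeps the order chronological
  even if bin numbers do not happen to increase with time.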

WARNING:
  You should not, under any circumstances, add or remove files contained in the bins,
  nor should you alter your binventory's timestamps. Doing so can cause unrecoverable
  damage to the integrity of the backups.

Further reading:
  http://en.wikipedia.org/wiki/Aufs
  http://en.wikipedia.org/wiki/UnionFS
  http://aufs.sourceforge.net/
  http://en.wikipedia.org/wiki/SquashFS
  http://en.wikipedia.org/wiki/Rsync


INSTALL (if you're not on Arch Linux)
-------------------------------------
-Copy squashfu.conf as /etc/squashfu.conf
-Make a backup directory somewhere where you want your files
-READ OVER /etc/squashfu.conf and set it the way you want
-Run manually with 'squashfu -B'
-If everything goes well, make a cron job to run this bad boy at the time you want
-Hang onto ya nuts. As this is still beta quality, I take no responsibility for
 loss of data, and I likely don't care much about bug reports.


KNOWN BUGS
------------------------------------
1/11/2010 (Kernel: 2.6.32.3)
    Issue: Aufs takes a long time to unmount.
    Fix:   If using ext4, mount with the 'nobarrier' option. No resolution known
           at this time for other filesystems.


TODO
---------------------------------------
In no particular order....

- Slim down calls to unmount seed. No harm in leaving it mounted mid-run.
- Create options parser (and determine options)
    - specify alternate config file
    - report usage after backup completes
- Fix output funcs for non-color scenario
- Add semantic error checking for config
- Create man page