CdBk Manual
v 0.6.x - 2401
http://cdbk.sourceforge.net/
Licensed via GPL-v.2
Abdullah Ramazanoglu <aramazan@users.sourceforge.net>
CdBk is a CD-RW backup utility aiming at convenience and efficiency :
It needs minimal CD intervention, fills CDs 100% with individual .bz2
files, and saves all special files and attributes. It works online or batched :
No realtime CD change is needed, even for multi-CD backups.
1. Introduction :
There are quite a few backup utilities out there on the net. So it can
be a time consuming search to find the most suitable one for specific needs.
In this intro I will try my best to describe CdBk as plainly and clearly as
possible, so that you can see whether it is suitable for your needs without
having to study it further, or without having to download and try it yourself.
1.1 A Quick Glance :
This is CdBk, a server class CD based backup utility to manage gigantic
amounts of backup data with the least administration hassle possible. It
is written in bash and uses bzip2 to compress files individually. It fills
CDs to full capacity, uses a postponed-work scheme (phasing) for excess data
that doesn't fit on the CD, and doesn't need an online CD change. It backs up in
ISO-9660 (RockRidge) format, so you can access the backup as a regular
file-system. It also backs up special files (devices, sockets, symlinks etc.)
and full attributes (dates, ownerships, full permissions including SUID/SGID/Sticky
bits etc.), so you can back up system as well as user data. It can
run in the background as a cron job, or online from a terminal. It
is very simple to use.
While it was originally written to make the best use of CDs in server backups,
it turned out to be equally well suited to personal use.
It is open source, and for developers, the code is well commented in
case you need to customize it for your own specific needs.
Among other features, CdBk has several major features that distinguish
it from other CD based backup utilities :
- It fills up the CDs to full capacity with individually compressed files
: Having both of these features together is unique to CdBk, as far as I
know.
- If a backup run cannot fill the last CD to full capacity, it continues
on the same CD in following incremental runs, and doesn't require a new
CD unless current one gets filled up : No partially filled "last CD in
the set" per backup run. And no unnecessary intervention (CD change) per
backup run.
- It doesn't need CD change on the fly (uses no more than one CD per run)
: No administrator intervention is needed while it is running. (However
admin intervention is "loosely" required (see 5) to change the CD between
two runs of CdBk when the CD is filled up in previous run.)
- While it postpones excess data to following runs, it adapts itself to the then-current
state of the system at each run. So, assuming an average of 1.5GB of data per CD after compression, it
is perfectly OK to back up a 75GB live system in working hours, 10 CDs' worth
of data a day throughout the week, without having to freeze the system,
and without any discrepancy in the resulting backup set. (See "Advanced
Usage" chapter for live system and database backups.)
- Once full backup is done, you can throw in the last (partially filled)
CDRW, schedule CdBk to run periodically at night for incremental backups,
and just forget the CDRW there in the tray. CdBk will not ask you to change
the CD unless it gets full, and even when it does, CdBk will only log the
request in the message log and postpone excess data for the next run. From then
on, CdBk will not take further backups on that CDRW unless you
change it, protecting the filled-up and "closed" CDRW from being overwritten
if you neglect to change it. So
if you forget to change CD for several days (or several months for that
matter) it is no problem : The postponed incremental to-do list will just grow
bigger and bigger. When the CDRW is changed, all accumulated data gets flushed
to CDRW at the next run.
These were also the main reasons why I wrote CdBk. For a more detailed
discussion read 1.3 Why CdBk.
See also the full features and to-do list, as well as the known
limitations and bugs.
These features have some side effects though, which are:
- It doesn't generate a CD-set that contains everything to be backed up in
each run. Instead, it backs up at most one CD-full of data, and postpones
the rest of the work for the next run. The next run can be started right
after the previous one, or the next day, or next week, etc. What does it
boil down to? Consider a file "save.me" that is to be backed up, which
was last modified on the 20th day of the month. We run CdBk on the 21st. Because
the CD gets full before this "save.me" file can be backed up, it is postponed
to the next run of CdBk, which is due one week later (28th). Now this
postponed "save.me" file is modified again on the 23rd. What happened to the 20th-day
copy? It has not been backed up, so it is lost. But here is some relief:
Normally CdBk should be scheduled with such a frequency that the average
amount of modified files between two runs of CdBk does not exceed a CD's
capacity. Then "postponing" will seldom happen, and "losing a previous
version due to postponing" will be even less likely (but still
possible). Note that, since CdBk currently creates single-session CDs (replacing
older versions of a file on CD with the newest one), this effect is hidden
for the time being. But it will surface when CdBk starts supporting
multi-session CDs in future. See [3] for that.
- Backup levels don't reflect a snapshot in time. They simply refer to the
number of a CD in the set. "Master backup" (Level 1) simply means CD-1, even
if there were 10 CDs' worth of data eligible for backup. In this sense,
there is no "master-backup" CD-set and no "incremental backup" CD-sets.
Instead, there is one backup set that keeps growing. In other backup utilities,
one usually takes a master backup and gets e.g. a 10-CD backup set that
snapshots a point in time. Next week an incremental backup is run and a 2-CD
incremental backup set is produced. And so on: each backup run gets an absolute
or relative snapshot of the current time. In contrast with those utilities,
with CdBk one runs it initially and gets one and only one CD (i.e. CD-1)
filled to full capacity (the so called master backup in CdBk terms). The remaining
9 CDs' worth of data is postponed to the next run. The next run, which is "incremental"
in CdBk terms, again produces one CD (CD-2), leaving 8 CDs' worth of data
postponed. And so on, until after 10 runs there is no more data to be backed up. These 10
runs can take 10 hours if CdBk is run back to back, or 10 days if it is
scheduled every day, or 10 weeks if scheduled each week.
- While CdBk utilizes the same CDRW in repeated runs
(until the CDRW is filled up), backup operations on partly filled CDRWs will
cause the previous version of a file to be replaced by the newest version. Consider
the "save.me" file that gets modified all the time (say, twice a day).
Let us run CdBk every day, filling a CD in 5 days on average with incremental
backups. The "save.me" file will get backed up on day 1. Then a newer version
will be saved on day 2, replacing the previous day's version. This way, when
the CD is filled up on day 5, all the previous versions of "save.me" will
be lost except the latest one. If "save.me" has been deleted from the system
by the 4th day, then no save.me file will exist on the backup CD. This is because
the CD is blanked and its content is recreated each time CdBk is run (which
has the benefit of using up less CD space). This, in turn, is because the CD
is created as single-session. In the future a multi-session option will
probably be added, which will keep each incremental backup in a separate
session on the same CD. Then the user will be able to choose the preferred
behavior (single or multi session). Until then, the multi-session effect can
be obtained (in a somewhat inefficient manner) by using a fresh CD each
time cdbk is run.
- CdBk is currently geared towards CD-RW media, because of this single-session
capability and blank-recreate cycles. CD-R media is also OK, but either
CdBk must be run less frequently so that CD is (mostly) filled up with
incremental data, or some CD space wasting will result. This will also
change in future releases with multi-session feature.
- So, if you want to be able to revert (a file or system) to a specific
date, CdBk is not for you : With CdBk you can only recover the last version
of a file on a CD. However, when the CD gets full and a new CD is
started, the previous CD's content is frozen. In other words,
new versions of a file will only affect the last CD in the set. That is,
some files (with relatively higher modification frequency) will be duplicated
on various CDs of the backup set, enabling you to recover the version of
your choice. With multi-session this will be mostly solved, but there will
still be a possibility of "losing a file version due to postponing".
- If you must have all the files (that are eligible for an incremental backup)
backed up in a single run, i.e. if you can't afford to have today's work postponed
to tomorrow, then CdBk is not for you : CdBk will back up until the CD gets
full. Then it will log a message asking for a CD change and exit,
delaying the rest of the "backup work" to the next run.
- It cannot be used to create an emergency system recovery CD : It needs
at least a minimal system up and running for restoral. For creating a bootable
backup CD see Mondo.
1.2 Typical Uses :
Typically you will use CD-RW discs, and use cron to schedule CdBk daily
or weekly past midnight. (You can also use it online for personal use.)
CdBk will grow incremental backups on the same CD until it gets full. When
the CD fills up, it will not prompt you with a "Please Change CD"
sort of message and wait for you to change the CD. Instead, it will log
a message (if running online from a terminal, it will also issue the same
message on screen) requesting you to change the CD and simply exit, postponing
the remaining part of the backup for the next run. Unless you change the CD,
it will not take further backups at following runs over the filled-up CD
(protecting the filled up and "closed" CD against delayed admin intervention).
This is where CdBk shows one of its strengths: It will not require you
to change the CD at 5:00 in the morning! Instead, it keeps an exhaustive
to-do list (list of files to be backed up). At each run CdBk grows this
list incrementally with new entries (if any), and writes the listed files to CD
until the CD gets full (with the rest of the to-do list postponed to the next run) or the
whole list has been backed up (fitted) on the CD. As long as your
average daily incremental backup size is less than the compressed capacity of a
CD (avg. ~1.4G), daily fluctuations in backup size, be it 5 MB or 5 GB,
won't affect your usual working pattern : Each morning (or each Monday)
you just look at the message log, and change the CD if so requested.
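The to-do list mechanism can be sketched in a few lines of shell. This is a simplified illustration, not CdBk's actual code: the function name fill_cd, the "SIZE PATH" input format, and the rule of stopping at the first file that does not fit are all assumptions made for the example.

```shell
# fill_cd CAPACITY_MB
# Reads the to-do list from stdin, one "SIZE_MB PATH" entry per line.
# Entries are taken in order until one no longer fits; that entry and
# everything after it are postponed to the next run (assumed policy).
fill_cd() {
    fc_capacity=$1
    fc_used=0
    fc_full=0
    while read -r fc_size fc_path; do
        if [ "$fc_full" -eq 0 ] && [ $((fc_used + fc_size)) -le "$fc_capacity" ]; then
            fc_used=$((fc_used + fc_size))
            echo "burn $fc_path"
        else
            fc_full=1
            echo "postpone $fc_path"
        fi
    done
}

# Example: a 650M CD and four pending files (sizes after compression).
printf '%s\n' '400 /home/a.tar' '200 /var/log/b' '100 /etc/c' '30 /etc/d' | fill_cd 650
```

In this sketch the last entry (30M) would still fit, but it is postponed because the example stops at the first misfit. A real packing strategy may well use the remaining space more tightly.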
Here is how I use it for personal machines and servers :
On my personal PC at home, for a full backup I run it online (with
-m flag for the very first run), and if it issues a "Change CD" message,
I change CD and run it again (without -m flag), until it doesn't
issue a CD change request. After the full backup is done, I run cdbk
(without -m) once a week, online from console, putting the last CD in tray.
If/when it issues a CD change request, I put a new CD in tray and run it
again (no -m). In short, I don't do scheduled backups on my home PC.
On servers, for a full backup I get a sufficient number of blank CDRWs,
insert the first one, and run CdBk online (with -m flag for the
very first run) in the morning of a working day. It issues a CD change request
while exiting. I insert the second CDRW and re-run CdBk (no -m)
and go on like this. Is it evening before the whole backup is done?
No problem. I leave for home, and continue the next morning from
where I left off. I go on like this until no more CD-change requests are issued,
i.e. the full backup is done. Then, I keep the last CDRW in tray, and schedule
CdBk to run every day at 4 am. Then I check the CdBk message log once or twice
a week and put in a new CDRW if so requested. That's it.
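A daily 4 am schedule like this could be expressed as a crontab entry. The snippet below is only a sketch: it assumes CdBk is installed in /root/cdbk-06x (the example path used in this manual) and that the entry goes into root's crontab. The variable names are made up for the example.

```shell
# Assumed installation directory (the example path used in this manual).
CDBK_HOME=/root/cdbk-06x

# crontab fields: minute hour day-of-month month day-of-week command.
# Run CdBk every day at 04:00 from its own directory, under its real name
# and without the -m flag.
cron_line="0 4 * * * cd $CDBK_HOME && ./cdbk"
printf '%s\n' "$cron_line"

# To append the entry to root's crontab (run manually, as root):
#   ( crontab -l 2>/dev/null; printf '%s\n' "$cron_line" ) | crontab -
```

Changing into the installation directory first keeps the invocation in the same ./cdbk form used elsewhere in this manual.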
As an extreme example, you can even build up an initial backup over a
very extended period of time: Assume that you have a system with 65 G of
backup-worthy data. (With a conservative average of 2:1 overall compression
ratio, this should fit in at most 50 CDs of 650M each). And assume that
each day your system produces an average of an additional 100M of incremental
data to be backed up : You just insert a blank CD/RW in tray during working
hours, clear history of CdBk (to force a master backup), cron cdbk
to run every day at 4:00 AM, and leave for home. The next morning you will
find the first CD filled up, and a message in CdBk message log asking you
to change the CD. You change CD, and the next morning you will find it
filled up too. Going on like this, in 55 days or so you will have your
whole system backed up (including incremental newcomer files). Now that
you have everything backed up, you put the final CD-RW in tray, and just
forget it there for two weeks, until it gets full. Now your administrative
burden is simply looking at the message log once or twice a week, and changing
the CD approximately twice a month. By the way, you can leave work for 2 months
for a trip, and when you return, you will find that the CD you left
in tray has been filled up correctly, and that 2 months' worth of to-do list
(4 CDs in our example) has accumulated, waiting for a CD-change. Changing
a CD a day, in 4 days you are back on track.
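The arithmetic behind this example can be checked with a small shell sketch. The function names are hypothetical, and 65 G is treated as 65000M, as the example itself does.

```shell
# cds_needed RAW_MB RATIO CD_MB
# Number of CDs needed for RAW_MB of data at RATIO:1 compression,
# rounded up to whole CDs.
cds_needed() {
    cn_compressed=$(( ($1 + $2 - 1) / $2 ))
    echo $(( (cn_compressed + $3 - 1) / $3 ))
}

# days_per_cd DAILY_RAW_MB RATIO CD_MB
# Whole days of daily increments that fit on one CD.
days_per_cd() {
    echo $(( $3 * $2 / $1 ))
}

cds_needed 65000 2 650    # 65 G of backup-worthy data
days_per_cd 100 2 650     # 100M of new data per day
```

This prints 50 and 13: fifty CDs for the initial backup, and roughly two weeks of increments per CD afterwards.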
Starting with CdBk-0.6.2 it is also possible to sync to an external
backup. With this scheme, you would first take a full backup of your system
using another backup manager (probably a tape backup). Then, with CdBk
you would take master backup synced to the full backup on tape. Such a
synced master backup is actually an incremental backup relative to the
full backup on tape. From that point on, you go on with usual incremental
backup cycles using CdBk. The net result is, you will have a full backup
on tape, and all the increments on CDs. See "-M" option for details.
Alternatively, since there are ways to consistently
back up a system in working hours (Never live databases! We will discuss
database backups later) you may start the master backup on Monday morning, run
cdbk back to back and fill up 9 CDs during working hours on Monday, and
leave the 10th for the night run. Continuing like this, 10 CDs a day, by Friday
evening you will have the full backup set. And with the last backup run
on Friday night, you will have synced your backup set back to a consistent
state. From then on, since each incremental backup run will (presumably)
be scheduled at night, you will always have a consistent backup set.
This scheme has some peculiar characteristics too. For one, backup levels
do not necessarily represent a specific date, as in regular incremental
backups. Instead, each backup level refers to a CD, which can have files
with widely different time-stamps. In this sense, "master backup" simply means
CD number-1. All the other CDs are "incremental" from CdBk's point of view.
In other incremental backup schemes, incremental means "since the last
time a backup was taken". In CdBk it means "since the time previous
CD has been filled up and archived" (which is equivalent to "since
the last time a backup was taken on previous CD" or, "since the freeze
time of previous CD").
BTW, while it is primarily designed to run in background for servers,
it is equally at home for online foreground running on a personal machine,
(or on an attended backup of a server) as long as you are willing to wait
till it finishes, change the CD, run it again, wait again... until it doesn't
request CD-change anymore. Once everything on your home machine is backed
up, insert the last (partly filled) CD/RW in tray every other week and
just run cdbk : It will add the last two weeks' incremental backup
onto the CD.
Lastly, there are two additional utilities provided, one for listing
CD-set (or a specific CD) contents, and one for restoral. Since backup
CDs are in ISO-9660 RockRidge format, you can simply mount them and copy
back whichever files you want. But, both to ensure that files are restored
exactly with their original attributes, and to automatically decompress
only those files that were compressed on-the-fly while backing up,
a special restore utility (cdrest) comes in handy, especially when restoring
high-level or crowded directories. See cdlist
and cdrest for more information on them.
1.3 Why CdBk :
I have a server and limited budget, which translates into using CD/RW
for backup. Apparently there are a lot of backup utilities out there on
the net, so I studied most of them for a week, downloading and trying more
promising ones. Unfortunately all of them had at least one of the following
drawbacks :
- Some were ex-tape-backup tools modified as an afterthought to back up on
CD too, with clumsy CD handling capabilities and most of the drawbacks
below.
- Some use CD to its full capacity, at the cost of chopped tarballs : A few
use tar, most use afio, with the "multivolume" flag turned on. With afio's
"compress on the fly" feature it is possible to
produce compressed backups. And since afio can also be instructed to produce
650M chunks with its multivolume feature, it is a fairly easy and efficient
way both to have compressed CDs and to fill them up to full capacity.
Here come the drawbacks: Since the output is actually a chopped-up set of
one giant monolithic chunk, afio (or tar) needs many CDs in sequence
for a simple file restoral. Also, a lost CD or sometimes a corruption makes
the whole CD set useless. Also, since it generates all the CDs in one
run, an afio/tar based utility stops and asks (and waits for) you to change the CD every
once in a while : You are effectively tied to the machine until the
whole backup operation finishes. Also, in most cases the last CD is utilized to only
about 50% of capacity on average: When afio is done with a backup, the remaining space on the last
CD is wasted. I would still have gone for an afio based utility, if only
afio allowed closing and opening an archive at multivolume boundaries,
i.e. producing multiple afio archives, one per volume, instead of -alas-
chopping them : I could then modify the utility so that I would get an
afio archive set consisting of self-contained CDs. Then I could try to live
with other drawbacks, instead of writing CdBk.
- At least one of them (as I recall) was filling CD to full capacity with
individual files, but without compression.
- They invariably require you to cater for their CD change needs online.
They stop (without exiting) and request you to change the CD - now!
If you have a rather large amount of data to back up (say, tens of CDs) this can be
a "weekmare".
- In most of them, each incremental backup run needs its own CD set, with
the last CD having an average of 50% utilization. In the end, for e.g. 30
daily increments, an average of 15 CDs are wasted. Also, since a busy file
will be included in most/all incremental backups, there is a high level
of duplication (which is also a blessing sometimes). Yes, CDs are cheap, but juggling
them is not: For each incremental backup, the user is expected to change at least
one CD. It may still be OK to feed another CD for each run, regardless
of whether the previous one is fully utilized or not. But what if an incremental
backup doesn't fit on a single CD and you are asked to change CD in the
middle of the night? Perhaps I could use 2 CD/RW drives, preload them with
blank CDRWs, and tell the backup utility to switch to the next drive when the
first one is filled? Well, I don't know of any CD based backup utility with
such a feature. Even if one exists, what if a specific incremental
backup needs 5 CDs to complete? (E.g. several users "saved" most of their
local disk to a net-drive that day, and you are incrementally backing up
that net-drive at 4:00 in the morning, by a scheduled / unattended run
of the backup tool.)
In the server world, people are used to DLT tapes, magazines and other fancy stuff:
There was no really usable CD/RW solution for a server on the net. Existing
ones were rather geared towards personal use, to be run online and to handle
at most 10 CDs or so, with some intervention requirements and CD space
wasting. This is why I wrote CdBk: It efficiently manages backing-up of
very big amounts (relative to CD capacity) of live data onto CD/RW discs,
requiring next to no administration (again relative to CD capacity).
Additionally, this server side approach doesn't affect its usability
for online personal use at all. Perhaps a GUI and some context sensitive
help could have been added (ideas welcome). But CdBk is so simple to use,
a GUI is hardly needed (perhaps for initial setup). Currently there is
no setup utility : Just untar it, specify which directories are to be backed
up in cdbk.include file, and just run it. While there is a cdbk.conf
file for configuration, and there are command
line options to override its settings on a per-run basis, most parameters (e.g.
CD drive's emulated SCSI address, its device file, mount point, writing/blanking
speed etc) are defaulted to such values that it will work out of the box
80% of the time. A typical use for master backup (i.e. CD #1):
# ./cdbk -m
which is equivalent to
# ./cdbk -l 1   (hyphen ell one)
or, if you just installed it, (empty history will force master backup)
# ./cdbk
A typical use for following (incremental) backups:
# ./cdbk
For each incremental backup just issue that command: It will automatically
decide which level you were at, whether a CD change was due, whether it
has actually been changed, etc. and act accordingly. These auto-decisions
are made based on cumulative history records kept in the "cdset" subdirectory
of CdBk. Specifying the "-m" option, or just deleting (or corrupting
the contents of) this directory forces a master backup (not without user's
confirmation!), which also clears and recreates "cdset"
from scratch.
It is that simple.
So, this concludes the introduction part of the manual. If you are still
unsure whether CdBk is for you or not, or if you decide to give it a try
and later on see that it is not really what you expected, that means
this intro didn't live up to its promise (to give a
full perspective of the utility). In this case please drop me
a mail about it, so that I can either update the manual, include the
feature, or fix the bug. Thank you.
2. Installation and Initial Setup :
2.0 Requirements :
[To be completed.]
CdBk was written on Gelecek-Linux 1.1 (a RedHat 7.1 derivative) and
initially tested on Gelecek-1.1 and RedHat-7.1 on Intel architecture only.
However, it is processor independent bash code. Currently known platform
status is listed in Table-1. If you run it on other versions / systems
/ architectures please let me know about it (you can use Table-1 as a template)
so that I can add your platform in the next release of this manual.
Table-1 : Currently known platforms that CdBk is reported to run
out of the box.
Operating System | Version   | Tested Architecture
Gelecek Linux    | 1.1       | x86
RedHat Linux     | 7.1 , 7.2 | x86
Here are the components used, with their versions on the original development
platform (Gelecek-1.1). They are normally already installed on most Linux
systems with a CD-RW drive. Older versions may work, and newer versions
should work, though neither is tested. Not all programs
that come with these components are actually used by CdBk. The ones used are
given in parentheses :
kernel-2.4.2-2
bash-2.04-21
cdrecord-1.9-6 (cdrecord readcd)
mkisofs-1.9-6 (mkisofs)
bzip2-1.0.1-3 (bzip2)
diffutils-2.7-21 (cmp diff)
sh-utils-2.0-11 (date nice)
textutils-2.0e-7 (comm cut sort tail tr uniq ...)
fileutils-4.0w-3 (cp dd df ln ...)
findutils-4.1.5-4 (find)
losetup-2.10r-5 (losetup)
mount-2.10r-5 (mount)
e2fsprogs-1.19-3 (mke2fs)
sed-3.02-8 (sed)
grep-2.4.2-3 (grep)
Other requirements:
- Loopback device support must be available / enabled in kernel. (/dev/loop*)
- At least [CD capacity + 10M] of free space on hard disk. (Double
it for buffered CD burning)
- 1000 BogoMips or higher CPU power is recommended for quick compression : The more CPU power,
the less time it takes to prepare CD content. 1000 BogoMips should prepare
a 650M CD in roughly an hour. (Add to this the burning time, and optionally
the verification time, to process a full CD). Some real-world examples :
  - Celeron-475 (950 BogoMips) with 64M RAM, 4X blanking and 4X unbuffered
  burning, including post-verification, takes about 2 hours to process a
  full 650M CDRW.
  - PIII-866 (???? BogoMips) with 384M RAM, the rest is same as above, takes
  about ?.? hours.
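The free space and loopback requirements can be checked before the first run with a sketch like the following. The function names are hypothetical, the 650M capacity and the /tmp location are assumptions taken from the defaults mentioned in this manual, and the check is not part of CdBk itself.

```shell
# required_mb CD_MB BUFFERED
# Free space needed: CD capacity + 10M, doubled for buffered burning
# (BUFFERED is 1 for buffered, 0 for unbuffered).
required_mb() {
    if [ "$2" -eq 1 ]; then
        echo $(( ($1 + 10) * 2 ))
    else
        echo $(( $1 + 10 ))
    fi
}

# free_mb DIR
# Free space on the filesystem holding DIR, in megabytes (POSIX df output).
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

need=$(required_mb 650 1)
have=$(free_mb /tmp)
if [ "$have" -ge "$need" ]; then
    echo "OK: /tmp has ${have}M free, ${need}M needed"
else
    echo "Not enough space in /tmp: ${have}M free, ${need}M needed"
fi

# Loopback support: at least one /dev/loop* node should exist.
ls /dev/loop* >/dev/null 2>&1 || echo "No /dev/loop* device found"
```

For a 650M CD this asks for 660M in unbuffered mode and 1320M in buffered mode.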
2.1 Quick Start :
- Login as root user.
- Download cdbk-0.6.x.tar.bz2
and put it in a directory where only root has access (e.g. /root),
and change to that directory.
- Uncompress and untar it with :
# tar -jxvf cdbk-0.6.x.tar.bz2
Or...
# tar -Ixvf cdbk-0.6.x.tar.bz2
depending on your version of tar.
If your version of tar doesn't support either of them, then
you can also do it in two steps, with :
# bunzip2 cdbk-0.6.x.tar.bz2
# tar -xvf cdbk-0.6.x.tar
- Change to CdBk home directory :
- Edit following files (at least cdbk.include) for your specific
needs:
- cdbk.conf : This is the main configuration
file for CdBk. All the documentation needed for configuration is also included
in this file. Just edit it with your favorite editor, read the instructions
for each parameter and change parameter values as needed. Note that if
you have a fairly standard system configuration, then in most cases leaving
cdbk.conf as shipped should be sufficient (and sometimes more efficient) for proper
operation. Specifically, you need NOT touch cdbk.conf
if all the statements below are true for your system.
  - Your CD-RW drive's mount point is /mnt/cdrom
  - It is defined in the system (/etc/fstab) so that it is mountable
  by just mentioning its mount point (i.e. it can be mounted with "mount
  /mnt/cdrom" command).
  - Its pseudo SCSI address (as reported by cdrecord -scanbus) is
  0,0,0 which is the case if you have no SCSI device/controller on your machine,
  and you have only one CD-Writer of ATAPI (IDE) type.
  - You will write at 4X speed. (Either you use 4X CDRW discs or your CD-Writer's
  max write speed is 4X)
  - Your /tmp directory has approximately 1.4 GB of free space. (Otherwise,
  you must specify another directory with that much free space, or run cdbk
  with "-i" option (equivalent to specifying ISO_IMAGE=0) to reduce the free space
  requirement to 700M, or both)
  - You prefer to burn in buffered mode. That is, prepare the ISO image of the CD first,
  and then burn it from the ISO image, which helps prevent buffer underruns
  in stressed conditions (e.g. 20X or faster CD-R media, high I/O load on
  system, etc.). Buffered burning has a second advantage: fast verification (with less
  strain on the CDRW drive). However, this doubles the free
  space requirement in the /tmp directory, compared to unbuffered burning.
  Note that this buffered burning has nothing to do
  with on-the-fly compression. A CD-shadow tree on hard-disk, representing
  CD content with individually compressed files, is prepared separately before
  creating (and burning) the ISO image.
- cdbk.include : This file contains the
list of directories to be included in backup (recursively with subdirectories),
one directory name per line. Directory names must be given here as absolute
(complete) paths. Automatic "TEMP" directory exclusion mentioned below
also overrides cdbk.include file. Editing this file is usually
necessary. If you don't, then following directories will be backed up by
default :
- cdbk.exclude : Contains a regexp list
of file and subdirectory names to be excluded from the backup, one entry
per line. If an exclusion list entry is not already covered by the inclusion
list, that exclusion has no effect. For instance, the default inclusion list
doesn't cover the /mnt directory, but the default exclusion list excludes
/mnt/* elements. Such exclusion of "unincluded" directories has no
effect. Note that the "TEMP" directory (which is /tmp by default)
given in cdbk.conf will be automatically excluded, regardless of what you
specify in cdbk.exclude or cdbk.include files. Please
also note that regexp notation is somewhat different from shell wild cards.
See 2.3 Specifying Regexp Lists for details. By default,
any core dumps, any lost+found directories, any tmp directories,
/proc and /mnt directories, as well as files that end
in ".bak" or "~" will be excluded.
- cdbk.nocompr : Contains a regexp list
of file names (case ignored) that are not to be compressed in backup, one
name per line. This is to avoid compressing already-compressed
files in vain: Doing so hardly produces a smaller size, while extending CdBk's run
time. Examples are jpeg, mp3, all sorts of zip-like formats, rpm, etc.
The default cdbk.nocompr file, as shipped, should normally suffice.
If you come across other compressed file formats that are missing
from cdbk.nocompr please drop me
a mail about them. Thank you.
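To illustrate how regexp notation differs from shell wild cards, here is a small sketch that filters a list of paths through a few exclusion patterns. It uses grep -E (POSIX extended regexps) as a stand-in. The exact regexp dialect and matching rules CdBk applies are described in 2.3 and may differ, and the patterns below are hypothetical examples, not the contents of the shipped cdbk.exclude.

```shell
# Hypothetical exclusion list in regexp notation, one pattern per line.
exclude_patterns='\.bak$
~$
/lost\+found(/|$)
^/proc(/|$)'
#   \.bak$              shell wild card equivalent: *.bak
#   ~$                  shell wild card equivalent: *~
#   /lost\+found(/|$)   any lost+found directory and its contents
#   ^/proc(/|$)         /proc and everything below it

# apply_excludes: reads candidate paths on stdin, prints those not excluded.
apply_excludes() {
    grep -E -v -e "$exclude_patterns"
}

printf '%s\n' /home/a/report.txt /home/a/report.bak /home/a/notes~ \
    /home/lost+found/x /proc/cpuinfo /home/a/bakery.txt | apply_excludes
```

Only /home/a/report.txt and /home/a/bakery.txt survive. Note that the dot and the plus sign must be escaped in a regexp, and that "$" anchors the match to the end of the name, which is why bakery.txt is not caught by the ".bak" pattern.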
- Setup is done. Note that the cdbk executable must not be moved into
another directory. Everything that comes with the package should reside
in the same installation directory. There is no setup option to arrange
directories, and CdBk will assume the directory in which it resides as
its working directory. Also note that you must never run
cdbk through a hard or symbolic link! Run it using its real name.
However, you may use either an absolute or a relative path-name.
- Now take an initial full backup online : Label a CD-RW disc as "Backup
CD #1", insert it in the tray and issue the
# ./cdbk -m
command. (Use "-m" flag only for CD #1) If everything fits on CD #1,
then CdBk takes the backup and exits without requesting you to change CD.
Otherwise, a message similar to the one below is issued at terminal (as
well as message log) after filling CD #1, just before CdBk exits :
CdBk: Please change the CDR/W,
CdBk: Label the old one as 1,
CdBk: And label the new one as 2
CdBk: SUCCESSFULLY DONE. Please see the log with following command :
bzcat <InstallDir>/cdset/<YYMMDD>-<Level>.log.bz2 | less
"Successfully Done" message is always given if there were no errors. Here,
<InstallDir> is the directory where you installed (copied) CdBk,
<YYMMDD> is today's date in YYMMDD format, <Level>
is the CD number CdBk has just filled up. For instance, assuming that you
installed CdBk in /root/cdbk-06x , for CD #5 taken on January
03, 2002, this message would look like :
CdBk: SUCCESSFULLY DONE. Please see the log with following command :
bzcat /root/cdbk-06x/cdset/020103-5.log.bz2 | less
- Change CD, and re-run CdBk with the
# ./cdbk
command (don't use "-m" flag). This will take the remaining backup onto CD
#2 and prompt you again to change CD if there is still more to back up.
Going on like this, you will have backed up everything onto a CD-set.
Alternatively, you can complete this procedure over an extended period
of time, backing up in working hours, taking several CDs' worth of data
every day. In an extreme case, you can just take CD #1 online with the ./cdbk -m
command, and proceed to the "server style" incremental backup method described
below.
-
Now that full backup is done, you must decide on the method to take incremental
backups. First method, which is preferred for servers, is that you just
leave that last (partly filled) CD of the set in the tray, schedule
cdbk to run (without "-m" flag!) daily or weekly at night time, and
just forget about it. When CD-RW gets filled up, CdBk will log a similar
CD-change request into the message log. Depending on your backup scheduling
frequency, you must check the message log once in several days or weeks.
You are not immediately required to change the CD though : Don't worry
about CdBk writing over the previous CD at next scheduled run. When a CD
change is due, CdBk checks to see whether CD-RW in the tray is actually
changed, and if not, aborts immediately with an "Oops!" message similar
to below one :
CdBk: OOPS!
CdBk: This is the old CD that we just filled up and 'closed' in previous run.
CdBk: You must have forgotten to change the CD.
CdBk: Please take this one out, label it 5,
CdBk: and put in a new CD labelled 6.
-
The second method, which is preferred for personal use, is to insert the
last CD-RW of the set in the tray once in a while and just run CdBk via the
./cdbk command. If it issues a CD-change request while exiting, change the
CD and run CdBk again via the ./cdbk command. Or, at your option,
you can defer this second run with a fresh CD-RW to a later date.
-
To see the contents of your complete CD-set, either insert the last CD
and issue :
# ./cdlist 0 /mnt/cdrom
which will take history records from the actual CD-RW inserted.
Or, no matter what is inserted in the tray, just issue :
# ./cdlist
which will take history records from the cdset directory where
CdBk is installed, which in turn is exactly the same history contained
in the last CD-RW.
-
To see the contents of just CD #3, either insert CD #3 (or any higher numbered
CD) in the set and issue :
# ./cdlist 3 /mnt/cdrom
Or, to get history records from the cdset directory on hard-disk,
issue :
# ./cdlist 3
For more on this, see cdlist documentation.
-
While there is no need for a restore utility, one is provided, mainly to
automatically "bunzip2" only those *.bz2 files which were bzipped
on the fly when they were backed up. In other words, this utility leaves
non-bz2 files as they are, as well as *.bz2 files that were
originally in .bz2 format, and uncompresses the other .bz2 files
as they are restored. To restore a single file (/etc/foo/bar.txt) to its
original location, insert the CD-RW containing that file and issue :
# ./cdrest /mnt/cdrom /etc/foo/bar.txt
Or, to restore the "/etc/foo" directory to its original location :
# ./cdrest /mnt/cdrom /etc/foo
Note that the contents of a single directory may be spread over several CD-RW
discs. To restore such a directory, put in the CDs one after another, oldest
one first and latest one last, and issue the same command above
for each CD.
To restore a complete CD, insert the CD and issue :
# ./cdrest /mnt/cdrom
For more on this, see cdrest documentation.
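The oldest-first ordering for a directory spread over several CDs can be sketched as a small dry-run helper. This is only an illustration: the function name restore_plan and the idea of printing the steps instead of running them are assumptions, not part of CdBk; /mnt/cdrom is the mount point used in the examples above.

```shell
# Hypothetical helper (not part of CdBk): print, in the required order,
# the steps for restoring one directory that is spread over several CDs.
# Usage: restore_plan DIR FIRST_CD LAST_CD [MOUNT_POINT]
restore_plan() (
    dir=$1
    n=$2
    last=$3
    mnt=${4:-/mnt/cdrom}
    while [ "$n" -le "$last" ]; do
        # Oldest CD first, latest CD last, so newer versions overwrite older ones.
        printf 'Insert CD #%d, then run: ./cdrest %s %s\n' "$n" "$mnt" "$dir"
        n=$((n + 1))
    done
)

restore_plan /etc/foo 3 5
```

Printing the plan rather than executing it keeps the sketch safe to try; the printed cdrest commands are the same ones shown above.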
2.2 Detailed Setup :
Setup is mainly done in the cdbk.conf file.
You can edit it using your favorite text editor. Some of the parameters
given in this file can also be overridden by command line options. However,
certain parameters can only be given in cdbk.conf (while others
can only be given from the command line). Here is a list of all parameters
and command line options. For detailed explanations of these parameters,
please see cdbk.conf itself.
(Command-line options always override configuration parameters.)
Cmd line | cdbk.conf parameter | What it does | Factory default
-b / -B | BURN_CD=0 / BURN_CD=1 | Burn CDRW or just generate ISO image file. | -B : 1 : Burn CDRW.
-c / -C | CD_SANITY_CHECK=0 / CD_SANITY_CHECK=1 | Check CD of previous run. | -C : 1 : Check it.
-e / -E | EJECT=0 / EJECT=1 | Eject CD when filled up. | -e : 0 : Don't eject.
-h | N/A | Help. | N/A
-i / -I | ISO_IMAGE=0 / ISO_IMAGE=1 | Burn CD on-the-fly while generating ISO image (unbuffered). Or, generate ISO image first, and then burn from it in two steps (buffered). | -I : 1 : Burn in buffered mode (Prepare ISO image beforehand)
-l LEVEL | N/A | Take back-level LEVEL backup. That is, return back to LEVEL (Be careful!) | Calculate due current level dynamically.
-m | N/A | Take an absolute master backup. Use this flag once and only once per CD-set, for CD #1 | N/A
-M DATE | N/A | Take a synced master backup that is relative to another backup started at DATE (in YYYYMMDDhhmm format). Use only once per CD-set, for CD #1 | N/A
-s SIZE | SHADOWSIZE=SIZE | Size of CD shadow partition. | Calculate dynamically.
-v / -V | VERIFY=0 / VERIFY=1 | Verify after burning. | -V : 1 : Verify it.
N/A | CDRWDEV=Device_File | CD-RW device file | If not given (Null): /etc/fstab entry for CDRWMNT consulted. If invalidly given : /dev/cdrom
N/A | CDRWMNT=Mount_Point | CD-RW mount point | /mnt/cdrom
N/A | CDSCSI=SCSI_Address | CD-RW SCSI address as reported by cdrecord -scanbus command | 0,0,0
N/A | BLANKSPD=Speed | Blanking speed. | 4
N/A | WRITESPD=Speed | Writing speed. | 4
N/A | TEMP=Directory | Directory for temporary files. | /tmp
N/A | CLEANUP=[0|1] | Clean up upon exit. For debugging purposes. | 1 : Do clean-up.
N/A | DEL_CD_SHADOW=[0|1] | Delete CD shadow upon exit. | 1 : Delete it.
2.3 Specifying Regexp Lists :
Regexp (Regular Expression) lists of file/directory names, or of file
name extensions, are used in the cdbk.exclude and cdbk.nocompr
files.
For a detailed explanation of advanced regexp usage you may refer to the man
pages of the awk or grep programs, but here is a simple explanation that is
sufficient for our purposes. Note that the analogies
drawn between regexp and shell pattern matching are not exact.
2.3.1 Some Background :
-
A plain regexp is any substring in a full-path file name. E.g. regexp /tmp
will match any "/tmp" string in file's full path name, which is
analogous to "*/tmp*" in shell pattern matching. For instance,
regexp /tmp will match :
-
/tmp
-
/tmp/somedir/somefile
-
/usr/tmp
-
/home/abdullah/tmp/somefile
-
/somedir/tmpdir/somefile
-
/somedir/otherdir/tmpfile
-
To specify a substring that matches from the beginning of full path name
of a file, place a "^" as the first character of regexp. For instance,
regexp ^/tmp will match any file name that begins with "/tmp".
This is analogous to "/tmp*" in shell pattern matching. Regexp ^/tmp
will match :
-
/tmp
-
/tmp/somedir
-
/tmp/everything/otherdir
But it won't match :
-
/usr/tmp
-
/home/abdullah/tmp/somefile
-
/somedir/tmpdir/somefile
-
/somedir/otherdir/tmpfile
-
To specify a substring that matches the end of the full path name of a
file, place a "$" as the last character of regexp. For example,
regexp log$ will match any file name that ends in "log".
This is analogous to "*log" in shell pattern matching. Regexp log$
will match :
-
/monolog
-
/home/someone/somedir/some.log
-
/usr/bin/dialog
But it won't match :
-
/monologs
-
/home/someone/somedir/some.log.gz
-
/var/log/messages
-
To specify an exact path name, begin the regexp with ^ character,
and end it with $ character. For example, regexp ^/var/log/messages$
will exactly match one and only one file. This is analogous to
"/var/log/messages" in shell pattern matching.
-
Dot "." character means "any character" in regexp. This is analogous
to "?" character in shell pattern matching. For example, regexp .log$
will match /somedir/some.log as well as /somedir/otherlog or
/usr/bin/dialog files. If you want to match the dot character
itself, you must escape it with \ (backslash). For example, regexp \.log$
will match /somedir/some.log but will not match /somedir/otherlog
or /usr/bin/dialog files.
-
Asterisk "*" character means "any number (including zero) of the previous
character" in regexp. It is quite different from "*" in shell
pattern matching, which means "any number of any characters". For example
regexp /tmp* will match :
-
/tmp
-
/tmpppp
-
/tmpp/somedir/somefile
-
/usr/tmp
-
/usr/tmpppppppppppppppppppppp
-
/home/abdullah/tmp/somefile
-
/home/abdullah/tmppppppppp/somefile
-
/somedir/tmpppppdir/somefile
-
/somedir/otherdir/tmpfile
-
/somedir/otherdir/tmppfile
Which is not very useful, as you see. To get the effect that shell pattern
matching has, use the .* characters together in a regexp, which
means "any number of any characters". Since a regexp is already
a substring with an implied "*" at the start and end, the .*
notation is only needed occasionally. For example, regexp /home/.*\.gz is
analogous to "*/home/*.gz*" in shell pattern matching. Regexp /home/.*\.gz
will match :
-
/home/abdullah/somefile.gz
-
/home/zippie-whip.gzip
-
/home/gizli.gz
-
/tmp/come/home/please.gz
-
/mome/home/some.gz/femme
On the other hand, regexp ^/home/.*\.gz$ (note the "^" and "$")
is analogous to "/home/*.gz" in shell pattern
matching. For the examples above, this will only match :
-
/home/abdullah/somefile.gz
-
/home/gizli.gz
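The rules above can be tried out from a shell before putting a regexp into a list file. The sketch below uses grep for the matching; the helper name matches is an assumption made for illustration, and CdBk's own matching code may differ in detail.

```shell
# Hypothetical helper: succeed if PATHNAME matches REGEXP (basic regexp, via grep).
# Usage: matches REGEXP PATHNAME
matches() {
    printf '%s\n' "$2" | grep -q -e "$1"
}

# Try the anchored regexp from the text against three of the example names.
for path in /home/abdullah/somefile.gz /home/zippie-whip.gzip /tmp/come/home/please.gz; do
    if matches '^/home/.*\.gz$' "$path"; then
        echo "match:    $path"
    else
        echo "no match: $path"
    fi
done
```

Only the first name is reported as a match here, in line with the list above.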
2.3.2 Regexp Examples for cdbk.exclude :
-
/core$ : Exclude any file with name "core" in any directory.
Analogous to "*/core" in shell pattern matching.
-
/lost+found/ : Exclude any lost+found directory anywhere.
Analogous to "*/lost+found/*" in shell pattern matching.
-
/tmp/ : Exclude any tmp directory anywhere. Analogous
to "*/tmp/*" in shell pattern matching.
-
^/proc/ : Exclude /proc directory. Analogous to "/proc/*"
in shell pattern matching.
-
~$ : Exclude any file that ends with "~" character (i.e.
backup copy). Analogous to "*~" in shell pattern matching.
-
\.bak$ : Exclude any file that ends with ".bak" (backup
copy). Analogous to "*.bak" in shell pattern matching.
-
/\.cache/ : Exclude any directory named ".cache"
anywhere. Analogous to "*/.cache/*" in shell pattern matching.
-
/\.netscap.*/cache/ : Exclude any cache directory under
any directory tree that begins with the .netscap string. Useful to
exclude both .netscape/cache and .netscape6/cache directories.
Analogous to "*/.netscap*/cache/*" in shell pattern matching.
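An exclude list can be checked against a few sample path names before it is used for a real backup. The sketch below emulates the filtering with grep -v -f; the temporary file and the function name filter_excluded are assumptions made for illustration, not how CdBk itself reads cdbk.exclude. Note that grep's basic regexps treat "+" literally, while extended regexps (grep -E, awk) treat it as a quantifier, so a pattern such as /lost+found/ behaves differently between the two; which flavor CdBk uses is not stated here.

```shell
# Sketch only: emulate an exclude list with grep; CdBk's own matching may differ.
exclude_list=$(mktemp)
cat > "$exclude_list" <<'EOF'
/core$
^/proc/
~$
\.bak$
/\.netscap.*/cache/
EOF

# Read path names on stdin, print only those that no pattern in the list matches.
filter_excluded() {
    grep -v -f "$exclude_list"
}

printf '%s\n' \
    /home/abdullah/core \
    /home/abdullah/score \
    /proc/cpuinfo \
    /etc/fstab~ \
    /home/abdullah/notes.bak \
    /home/abdullah/.netscape6/cache/index \
    /home/abdullah/notes.txt | filter_excluded
```

Of the sample names, only /home/abdullah/score and /home/abdullah/notes.txt pass through the filter; the rest are matched by one of the patterns.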
2.3.3 Regexp Examples for cdbk.nocompr :
-
Don't compress files with arc, bz, bz2, deb, gif, gz, jpeg, jpg, lha,
mp3, rpm, taz, tgz, tpz, tzg, z, zip, zoo extensions :
\.arc$ : Analogous to "*.arc" in shell pattern matching : Don't compress .arc files.
\.bz$ : Analogous to "*.bz"
\.bz2$ : Analogous to "*.bz2"
\.deb$ : Analogous to "*.deb"
\.gif$ : Analogous to "*.gif"
\.gz$ : Analogous to "*.gz"
\.jpeg$ : Analogous to "*.jpeg"
\.jpg$ : Analogous to "*.jpg"
\.lha$ : Analogous to "*.lha"
\.mp3$ : Analogous to "*.mp3"
\.rpm$ : Analogous to "*.rpm"
\.taz$ : Analogous to "*.taz"
\.tgz$ : Analogous to "*.tgz"
\.tpz$ : Analogous to "*.tpz"
\.tzg$ : Analogous to "*.tzg"
\.z$ : Analogous to "*.z"
\.zip$ : Analogous to "*.zip"
\.zoo$ : Analogous to "*.zoo"
-
^/root/downloads/proprietary/ : Don't compress files under
/root/downloads/proprietary directory or subdirectories of it. Analogous
to "/root/downloads/proprietary/*" in shell pattern matching.
Collecting proprietary-format compressed files into a single directory
and excluding this directory from compression is a good idea, because there
is no other way to tell whether such files are already compressed or not.
Examples are self-extracting *.exe files of Windows, or downloaded
*.bin files of StarOffice etc.
-
/zipped/ : Don't compress files under zipped directories
anywhere. Analogous to "*/zipped/*" in shell pattern matching.
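Since the extension entries above all follow the same shape, they can be generated rather than typed by hand. The helper below is a hypothetical convenience (the name nocompr_patterns is not part of CdBk); it only prints the lines, so that the result can be reviewed before being placed in cdbk.nocompr.

```shell
# Hypothetical helper: turn a list of extensions into end-anchored regexps,
# one per line, in the same form as the entries listed above.
# Usage: nocompr_patterns EXT...
nocompr_patterns() (
    for ext in "$@"; do
        # "\\." prints an escaped dot, "$" anchors the match to the end of the name.
        printf '\\.%s$\n' "$ext"
    done
)

nocompr_patterns arc bz bz2 deb gif gz jpeg jpg lha mp3 rpm taz tgz tpz tzg z zip zoo
```

The sketch assumes the extensions themselves contain no regexp special characters.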
3. Using CdBk :
3.1 Basic Usage :
This is the full syntax of the cdbk command. You can use a separate "-"
for each individual option, or combine options together in a single string.
Options that need a parameter (i.e. -l , -M and -s) must be given separately,
or as the last option in combined options. Most of these options override
the corresponding parameters in cdbk.conf , the
main configuration file of CdBk. The exceptions are -l , -m , -M and -h ,
which can only be given as command line options (no equivalent parameter
in cdbk.conf). Options that turn a flag
on or off are provided as capital and small letters: the capital letter always
turns the option On, and the small letter turns it Off. If an option is not
specified, the corresponding configuration parameter in cdbk.conf
is used. If it is not given there either (or an invalid value is given),
then the factory default is used.
cdbk [-b|-B] [-c|-C] [-e|-E] [-h] [-i|-I] [-l LEVEL|-m|-M DATE] [-s SIZE] [-v|-V]
Where,
-b | -B : Whether to actually burn the CDRW or not. With -b , CdBk
will write the ISO image file into "$TEMP/cdbk-iso" instead of
burning the CDRW. This comes in handy if your system does not have a CD burner,
but you have access to another machine on your LAN with a burner. Overrides BLANKSPD
, EJECT , ISO_IMAGE , VERIFY , CD_SANITY_CHECK configuration parameters.
(Presets them to certain values, regardless of what you specify for them.)
-c | -C : Whether to run consistency checks on the mounted CD
in case CdBk continues on a partially filled CD. -c skips checking,
and -C does checking. Overrides CD_SANITY_CHECK configuration
parameter.
-e | -E : Whether to eject the CD when it gets filled up. -e
doesn't eject, -E does eject. Overrides EJECT configuration
parameter.
-h : Display help and exit without
doing anything. Can only be given from command line.
-i | -I : Whether to burn CD on the fly while preparing ISO
image in one go (-i), or prepare an ISO image before burning it
onto CD in two steps (-I). Overrides ISO_IMAGE configuration
parameter.
-l LEVEL | -m | -M DATE : Backup level to be taken. Use one of these
options only once to reset the backup cycle in some way, and then do not use
any of them again: Without these options, the current level will be
calculated automatically. (Which is the normal way for everyday usage.)
These options can only be given from command line. Additionally, the
-m or -M option can be given once and only once per CD-Set,
for CD #1.
-
To start a master backup use -m only once (for CD number 1).
-
To invalidate CDs later than level LEVEL, return to
LEVEL and continue from there on, use the -l LEVEL option.
Note that all your history records later than LEVEL will be
cleared, and CdBk will begin creating the level LEVEL CD and up. This
can be used to re-create CDs later than a specific level, either to handle
a corrupted or lost CD situation, or to combine several CDs in order to
clean up duplicate files in CDs (which is equivalent to taking a differential
backup relative to LEVEL-1 backup set.)
-
If you use the -M DATE
option, where DATE is in YYYYMMDDhhmm format, CdBk will take an incremental
backup relative to DATE, and will call it "master backup", constituting
a base for all the following incremental backups. This option is provided
for using another means for full backup, and using CdBk solely for incremental
backups. To use this option, you must
first take a full backup, probably onto tape, by other means. Then, run
CdBk with the -M option, supplying a DATE parameter that points to one minute
earlier than the time the tape backup started. For instance, if you
started the tape backup on 2002/01/13 at 17:49, then run CdBk with the
./cdbk -M 200201131748 command.
-s SIZE : Size of virtual partition to hold CD shadow. It is
recommended not to supply this parameter at all, which lets CdBk calculate
it adaptively depending on the actual CDRW capacity in tray. Overrides
SHADOWSIZE configuration parameter.
-v | -V : Whether to verify CDRW against its shadow on hard
disk, after burning CD. Overrides VERIFY configuration parameter.
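The "one minute earlier" DATE value for the -M option described above can be computed rather than worked out by hand. The sketch below assumes GNU date (its -d option is a GNU extension and is not available in every date implementation); the function name master_date is hypothetical and not part of CdBk.

```shell
# Hypothetical helper, assuming GNU date (the -d option is a GNU extension).
# Usage: master_date "YYYY-MM-DD hh:mm"   (start time of the other full backup)
master_date() (
    start=$(date -d "$1" +%s) || exit 1      # start time as seconds since the epoch
    date -d "@$((start - 60))" +%Y%m%d%H%M   # one minute earlier, as YYYYMMDDhhmm
)

master_date "2002-01-13 17:49"
```

For the tape backup example above, this prints 200201131748, which is the value to pass to ./cdbk -M .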
Normally, you must run CdBk as ./cdbk -m to start a master
backup. (Don't forget that a master backup is just CD#1, in CdBk terms.)
This clears all history kept on hard disk by CdBk, and starts a fresh backup
cycle with CD#1. After this first run, no matter whether CD#1 is partially
filled or not, you must run CdBk without the -m , -M DATE or -l LEVEL options.
This is all there is to using CdBk. The other options can be given different
values at different runs in a backup cycle, without affecting the cycle.
-l LEVEL option is provided for freeing up CDs from level
LEVEL up to current level, and recreating them again. It effectively
reunifies different versions of same files on different CDs into a single
(latest) version, and thus, compacts CD-set. This is not an option to be
used everyday. It effectively takes a semi-master backup, depending on
to which CD number (level) you are returning back. So be careful with this
option.
3.2 Advanced Usage :
[ To be completed. ]
3.2.x Backing-up Live Inter-related Files :
A set of inter-related files normally cannot be backed up while the files are
live (in business hours); otherwise there is a high risk of ending up with a
backup set that holds these files in an inconsistent state. This restriction
also applies to single active files, though the corruption risk may not be as
high as with file-sets. So, do we have to stay up all night just to do a full
backup, or hire a night-shift "backup operator"? Even so, what if the full
backup doesn't finish by morning (quite possible with CD based backups): lock
the doors? Here are a couple of simple and easy ways to deal with long lasting
full backups. They are applicable to both CdBk and other backup managers.
(Ideas welcome)
-
The most common approach is to schedule a disk-to-disk snapshot
or, if you mirror, to stop mirroring (freezing the mirror) at night time on
a still system. Then you can back up that snapshot or mirror in
working hours over an extended period of time. For most situations this is
a proven solution, but for a CD based backup it depends: If you have tens
of CDs' worth of backup data, then the snapshot-to-CD backup will take
several days to complete. In this case you can either night-schedule
additional disk-to-disk incremental snapshots in the meantime, or you can
simply wait until the full backup CD-set is completed before going on to
incremental night runs. In either case you need lots of disk space for the
snapshot(s). And if you opt for the latter, then you might also consider the
second method :
-
Do your full backup in the daytime. As an added convenience, with CdBk, you
can take your time, getting 5 CDs' worth of backup a day and finishing
your full backup over the week, in working hours. Don't worry about
the consistency of live files or file-sets for the time being. Note that,
since CdBk adapts itself to the then-current state of all files at each and
every run, the end of the full backup guarantees that the last-moment copy
of every file is contained in the resulting CD-set. After the full backup
finishes, schedule your incremental backups for the night, making sure that
none of the live files or file-sets are active at that time.
If any of the files in the set has been changed during or after the creation
of the last CD of the full backup, then that file will go into the incremental
backup. When you need your file-set back, just restore the latest
versions of all files in the set.
Up to this point there was an assumption that an incremental backup taken
at night would bring everything into sync. In reality though, with the
postponing feature of CdBk, there is a problem with incremental backups of
inter-related file sets :
If one or more files of a file-set are left out because of postponing,
then the others on the CD will be of no value. That is, effectively, you will
not have backed up your file-set at all. Nevertheless, see below for a
workaround until this is fixed in a future release.
In a future release
there will be an option to use more than one drive, and CdBk will switch
between them in a round-robin fashion. Using 2 drives, you will be able
to change one CD-RW while the other one is being gradually filled up day
by day. Then there will be no postponing because of boundary effects, i.e.
because the CD-RW was already on its last legs when you started an incremental
backup.
(OOPS! : E.g. in a file-set of 5 files, the first 3 get onto CD#1, then
drive switching takes place, and the remaining 2 get
onto CD#2. Since CD#1 content is frozen while CD#2 is not, the next
incremental backup will put all 5 files onto CD#2, squashing the last 2
files from the previous backup. This renders the first 3 files on CD#1 useless.
This is a big waste, especially if we are talking about really big files.)
Of course, this will not prevent postponing if an incremental backup
takes more than the free space on the CDs combined. Such postponing can
only be prevented by using a sufficient number of drives. So, since there is
always a possibility of postponing even with multiple drives, and given
the "Oops" above about drive switching, the workaround
below is suggested even when the multi-drive option becomes available :
-
Put all your file-sets under such top level directories that, when all
the backup eligible files are sorted by their full path names, the file-sets
will be at the top of the list : At each incremental backup, CdBk first
generates an exhaustive list of files that need to be backed up. Then it
sorts them by their full-path names, and starts backing them up onto CD
one by one, from the top of the list downwards. By placing the file-sets
under top level directories with such first-sorting names, you guarantee
that they will be backed up before anything else. This will prevent the
risk of the CDRW getting full with part of the set on the CD while the
other part is left out.
A note to the future :
While it is early yet,
be warned that when multisession and multidrive options both become available
in the future, this work-around will not always work with "single-drive,
multi-session" combination. Because, if the CD was already
almost full before you start an incremental backup on top of it, then it is
possible that CdBk can only save -e.g.- 5MB of data before CD gets full,
causing level (CD) change and postponing with part of a file-set left out.
With single-session there is no such risk, because CD is blanked and then
filled up again, with first-sorting directories going onto CD first.
So, you must either use single-session (regardless of drive count), or use
multi-session with multi-drive option (2 drives).
Below are some "first-sorting" top level directory examples to put the
file-sets into :
/.Sorts/first/set1/*
/An/example/set2/*
/Sorts/third/set3/*
/_fileset4/*
/_save/me/fifth/filesets/set5/*
/not/recommended/set6/*
/.Naive/symlink/attempt -> /real/fileset/
(only the symlink itself will be backed up first!)
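Whether a candidate directory name really is "first-sorting" can be checked with sort before any files are moved. The sketch below assumes byte-wise (C locale) ordering; the manual does not state which collation CdBk's own sort uses, so treat the result as an approximation. The function name sort_paths is hypothetical.

```shell
# Sketch: byte-wise (C locale) sort. This collation is an assumption about how
# the backup list is ordered, not something taken from the CdBk source.
sort_paths() {
    LC_ALL=C sort
}

printf '%s\n' \
    /not/recommended/set6/data \
    /home/abdullah/notes.txt \
    /_fileset4/data \
    /An/example/set2/data \
    /.Sorts/first/set1/data \
    /etc/fstab | sort_paths
```

In this ordering names starting with ".", an upper-case letter or "_" come before the usual lower-case system directories such as /etc and /home, while /not/recommended/set6 ends up after them, which is why it is a poor choice for a file-set.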
3.2.y Database Backup Strategies :
Short one : Never incrementally back up database volumes. Otherwise,
you will end up backing up the whole database every day. Instead, first
use the backup utility provided with your database to take a full / differential
/ incremental backup of the database onto disk. Then, use CdBk (or any
other backup utility) to incrementally back up your disk. Make sure that
the directory containing the output of the incremental database backup is
included in the backup list, and that the directory containing the database
volumes themselves is excluded from it.
4. Listing CD Contents - cdlist :
Syntax : cdlist [N [CDRW_Mount_Point [CDRW_Dev]]]
List the contents of CD number N, or if N is zero or not given, list the
contents of all CDs combined. For this, consult the CD in the tray (which must
have level-number N or more), or consult the backup records kept on hard-disk
(which is equivalent to using the latest CD in the set). CDRW_Dev is
normally not given. It is only needed if the CDRW mount point is not defined
in /etc/fstab.
If "CDRW_Dev" is given, which is the device file for the CDRW drive,
then "CDRW_Mount_Point" must also be given.
If "CDRW_Mount_Point" is given, which must be an existing
directory, then "N" must also be given (even if it is 0).
CDRW_Dev defaults to what is defined in /etc/fstab for
CDRW_Mount_Point.
CDRW_Mount_Point defaults to the home dir of CdBk on hard disk
(which means CdBk will consult the records kept on hard-disk, instead of
the CD).
N defaults to 0 (which means "contents of all CDs").
Without N (or with N=0) it lists all CDs combined, in which case the CD
number and backup date are reported for each entry. With a nonzero N (between
1 and the number of the CD in the tray) it lists the specified CD's content,
in which case the backup date of the CD is reported only once.
If CDRW_Mount_Point is not given, then the history records kept
in the [CdBk_Base_Dir]/cdset directory are consulted, which is equivalent
to consulting the latest CD in the set.
Typically you would either run it as,
# ./cdlist
to list everything in the whole CD-set, or you would use,
# ./cdlist 5
to list everything in CD #5. Since the records kept on hard-disk by CdBk
cover all CDs, there is no need to consult the CD itself. But if you suspect
corruption in the [CdBk_Base_Dir]/cdset directory (where the records
are kept) then you may prefer consulting the actual CD itself. Another
reason for consulting the CD is that you may want to browse the CD-set of
another machine, or an old CD-set of the same machine. For example,
you have been maintaining CD-set-1 for a while, and then you archived those
CDs, bought brand new CDRW discs, and started a new master backup cycle
on them (CD-set-2) via the "./cdbk -m" command. Now your history on
hard-disk is in sync with CD-set-2. If you ever want to browse CD-set-1,
you must first insert the highest numbered CD of CD-Set-1 in the tray, then
run cdlist like this :
# ./cdlist 0 /mnt/cdrom
to browse the whole CD-Set-1. Alternatively, if you only want to browse CD
#12 in the set, then use :
# ./cdlist 12 /mnt/cdrom
Note that you don't have to insert the highest numbered CD in this case.
Any CD (of CD-Set-1) numbered 12 or higher would suffice.
A note about directory scattering: Since CdBk produces scattered directories,
you must browse the whole CD-Set to guarantee a correct listing of complete
directories. There may be duplicate entries in some CDs, due to a modification
to a file after a CD level switch occurred : Contents of older CDs are frozen,
and all modified files are saved onto the latest CD, creating duplicate
entries between CDs. In such a case, all duplicate files are listed together
with their backup date and the number of the CD they reside on. From there
on, you can restore whichever version you want. When restoring scattered
directories from multiple CDs, to restore the latest version of each file,
make sure that you first restore from the oldest CD, and last from the
newest one! We will revisit this topic in the cdrest chapter.
The user is expected to either pipe the result to grep or less and
search with the "/" or "?" etc. commands, or redirect the output
to a file and process it from there. Currently no search facility comes with
the CdBk package, and none is really needed either. Unix has all the bells
and whistles for that. See the examples below. For more advanced uses see
grep(1), as well as the brief regexp background
in this manual.
4.1 Examples on cdlist Usage :
You don't need to sort the output list, for it is already sorted by
path-name.
The first set of examples aims to demonstrate how cdlist works :
List the contents of the complete CD-set, consulting records kept by CdBk
on hard-disk (in cdset directory) :
# ./cdlist
List only the contents of CD #5, consulting records on hard-disk :
# ./cdlist 5
List the contents of all CDs, starting from CD #1 through to the actual
CD inserted in the drive represented by "/mnt/cdrom", consulting
the records added by CdBk on CD. The CD-drive must already be defined in
/etc/fstab (must be mountable by the "mount /mnt/cdrom" command).
Note that if the CD inserted is the highest numbered one in the CD-Set,
then it is equivalent to consulting hard-disk records :
# ./cdlist 0 /mnt/cdrom
List only the contents of CD #17, consulting records on the actual CD inserted,
which must be at least CD #17 (or a greater numbered CD) :
# ./cdlist 17 /mnt/cdrom
List the contents of all CDs, starting from CD #1 through to the actual
CD inserted in the drive accessible via the /dev/hdb device, consulting
the CD itself :
# ./cdlist 0 /my/mounts/cd /dev/hdb
Here, /dev/hdb must be working (e.g. you must be able to eject the
CD-drive's tray via the eject /dev/hdb command), and the /my/mounts/cd
directory must exist. The only benefit of this notation is that it works
even if the CD-drive mount point is not defined in the system. However, this
notation adds little value, while spoiling simplicity. Most probably you will
never need it, so you may forget about it and use "cdlist 0 /mnt/cdrom"
instead.
List only the contents of CD #3, consulting records on the actual CD inserted
(at least CD #3 or higher), which is accessible through the /dev/hdd device :
# ./cdlist 3 /other/cdrw /dev/hdd
Similar to the one above. Use this format if you have to. Otherwise, use
"cdlist 3 /mnt/cdrom" instead.
The second set of examples aims to demonstrate how cdlist can be
used :
Look for home directory elements of user "betul" in the whole
set :
# ./cdlist | grep '^/home/betul/' | less
# ./cdlist | grep '^/home/betul/' | tee /tmp/betuls.list | less
Look for any "works" directory in CD #9 :
# ./cdlist 9 | grep '/works/' | less
Look for any "core" file in CD #10 :
# ./cdlist 10 | grep '/core$' > corefiles.list
Look for any file or directory name starting with "ISO-8859-9" or
"ISO_8859-9" ignoring the case (ISO or iso or Iso, ...) in CD #4 :
# ./cdlist 4 | grep -i '/iso[_-]8859-9' | less
Look for any directory name ending in ".d" in the whole set :
# ./cdlist | grep '\.d/' | less
Look for any file name ending in ".dba" in the whole set :
# ./cdlist | grep '\.dba$' | less
You are looking for various versions of "/home/abdullah/cdbk-manual.html"
in the whole CD-set.
# ./cdlist | grep '^/home/abdullah/cdbk-manual\.html$' | less
You don't like regexp and/or you don't need the power of grep.
You want to search using the find feature of your favorite text
editor. To search in CD #9, do this first...
# ./cdlist 9 > editme.txt
...and then edit it using your favorite text editor :
# vi editme.txt
# kedit editme.txt
# kwrite editme.txt
Beware that in some cases (e.g. unfiltered listing of a full CD-set) the output
list may take tens of megabytes!
5. Restoral - cdrest :
Syntax : cdrest CDRW_MountPoint [CD_DIR|CD_FILE] [HD_DestDir]
NOTE: Parameter positioning (order) is important. Don't change it.
CDRW_MountPoint : Where CDR/W is mounted. Must be given.
CD_DIR / CD_FILE : Source directory/file on CDR/W to be restored.
If not given, defaults to "/" , which causes the whole CD to be restored. Optional.
HD_DestDir : Destination directory on hard disk. If given,
restores relative to the given (alternate) location. Good for testing and prudence.
If not given, defaults to "CD_DIR / CD_FILE" , which restores the
files to exactly their original locations. Optional.
-
Restores given directory/file on CD, back to hard-disk.
-
All parameters must be given as absolute paths with leading "/".
Enforced silently.
-
All files are restored with their paths relative to HD_DestDir.
-
Destination (HD_DestDir) is defaulted to source (CD_DIR or CD_FILE). So
without HD_DestDir parameter, files are restored ditto to their
original locations.
-
Files/directories on hard-disk are not deleted if they don't exist on CD.
-
Files on hard-disk are overwritten by those on CD without confirmation.
-
For these reasons, it is recommended to remove/rename existing destination
directories before restore.
-
Only files that were bzipped on the fly while being backed up are bunzipped.
So, if a file's original name was foo.bz2 before backup, it is left untouched
with the same name (foo.bz2). This is the main reason this script was written.
Otherwise, restoration could simply be done as :
# cd $CDRW_MountPoint
# cp -afv { ./$CD_DIR | ./$CD_FILE } $HD_DestDir
# find $HD_DestDir -name "*.bz2" -exec bunzip2 {} \;
-
Also files without ".bz2" suffix (zip, gif, tgz etc.) are not touched.
-
Perhaps it is better not to use this script for user data directories,
but to just copy them back and leave them as-is (compressed). That way, junk
user files accumulated over time won't be a burden on disk space. Needed files
can always be uncompressed by the user anyway.
-
The list of files actually restored is reported in the /tmp/cdrest.log file.
Run "tail -f /tmp/cdrest.log" in parallel with
cdrest to monitor its progress.
5.1 Examples on cdrest
Usage
:
Restore complete CD to its original location (both are equivalent) :
# ./cdrest /mnt/cdrom /
# ./cdrest /mnt/cdrom |
Restore /mnt/cdrom/home/abdullah/ directory to its original
location /home/abdullah/ (both are equivalent) :
# ./cdrest /mnt/cdrom /home/abdullah /home
# ./cdrest /mnt/cdrom /home/abdullah |
Restore /mnt/cdrom/root/sistem/notes.txt file to its original
location (/root/sistem/notes.txt) :
# ./cdrest /mnt/cdrom /root/sistem/notes.txt |
Now look at the example below.
Restore /mnt/cdrom/sistem/notes.txt.bz2 file as above, except
that do not decompress automatically after restoral (not very useful) :
# ./cdrest /mnt/cdrom /root/sistem/notes.txt.bz2 |
This applies only to single-file restores (i.e. not to directories).
Directories are always restored recursively and auto-decompressed
recursively.
(By the way, cdrest is intelligent enough not to decompress ".bz2" files
that were already in .bz2 format on the hard disk before backup.)
This feature can also be used when both the compressed and the uncompressed
version of a file exist in the same directory on the CD. This is possible
(depending on CdBk setup) if they also existed that way on the hard disk
when they were backed up. In that case, the version you name is restored
as-is. If only one notes.txt.bz2 file exists on the CD (as it should), then
this feature only determines whether it is auto-decompressed or not.
Restore /mnt/cdrom/usr/share/wallpaper.jpg to its original
location :
# ./cdrest /mnt/cdrom /usr/share/wallpaper.jpg
Restore /mnt/cdrom1/abdullah/sistem/ directory to an alternate
location (/tmp/abdullah/sistem/) :
# ./cdrest /mnt/cdrom1 /abdullah/sistem /tmp/abdullah
Restore /mnt/cdrom1/etc/inittab file to an alternate location
(/tmp/etc/inittab) :
# ./cdrest /mnt/cdrom1 /etc/inittab /tmp/etc
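The examples so far suggest a simple rule for where an item ends up: it is
placed, under its own name, inside the destination directory, and when no
destination is given, its original parent directory is used. The function
below, restore_target, is a hypothetical helper (it is not part of CdBk and
is not taken from the cdrest source). It only reproduces this rule as
inferred from the examples, and may be handy for checking a command before
running it.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of CdBk: predicts where an item would land,
# based only on the usage examples in this section.
#   $1 = path of the item relative to the CD root (e.g. /etc/inittab)
#   $2 = optional destination directory
restore_target() {
    src=${1:-/}
    if [ "$src" = "/" ]; then
        # Whole CD: goes back to / unless a destination is given.
        printf '%s\n' "${2:-/}"
        return
    fi
    # Default destination is the item's original parent directory.
    dest=${2:-$(dirname "$src")}
    printf '%s/%s\n' "${dest%/}" "$(basename "$src")"
}

restore_target /home/abdullah              # same as giving /home as destination
restore_target /abdullah/sistem /tmp/abdullah
restore_target /etc/inittab /tmp/etc
```

For the three calls above it prints /home/abdullah, /tmp/abdullah/sistem and
/tmp/etc/inittab, matching the locations named in the examples.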
Restore a device file to its original location (both are equivalent) :
# ./cdrest /mnt/cdrom /dev/loop15 /dev
# ./cdrest /mnt/cdrom /dev/loop15
Device files are not (and can not be) compressed on the fly, so no .bz2
suffix here.
Restore a symlink (both are equivalent) :
# ./cdrest /mnt/cdrom /etc/X11/X
# ./cdrest /mnt/cdrom /etc/X11/X /etc/X11
Symbolic links are not (and can not be) compressed on the fly, so no .bz2
suffix here.
Restore a symlink to an alternate location (/tmp/dir1/dir2/dir3/mouse) :
# ./cdrest /mnt/cdrom /dev/mouse /tmp/dir1/dir2/dir3
Note that, because symlinks are correctly restored as-is (with no-dereference),
a relative symlink restored to an alternate location will be broken.
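The effect described above can be reproduced without a CD. In the sketch
below, "cp -a" stands in for cdrest's no-dereference restore (an assumption
made for the demonstration, it is not how cdrest is invoked), and the psaux
target and the temporary directories are made-up names.

```shell
#!/bin/sh
# Demonstration only: cp -a copies the link itself, not the file it points to.
set -e
work=$(mktemp -d)
mkdir -p "$work/cd/dev" "$work/alt"
: > "$work/cd/dev/psaux"                 # stand-in for the link target
ln -s psaux "$work/cd/dev/mouse"         # relative symlink, like /dev/mouse
cp -a "$work/cd/dev/mouse" "$work/alt/"  # "restore" to an alternate location

# [ -e ] follows the link, so it fails when the target cannot be found.
if [ -e "$work/cd/dev/mouse" ]; then orig_state=ok; else orig_state=broken; fi
if [ -e "$work/alt/mouse" ];    then alt_state=ok;  else alt_state=broken;  fi
echo "original: $orig_state, alternate: $alt_state"

rm -rf "$work"
```

This prints "original: ok, alternate: broken", because the copied link still
points to "psaux" relative to its new directory, where no such file exists.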
6. Features & To-Do List :
[ To be completed. ]
6.1 Features :
[ To be completed. ]
6.2 To Do :
[ To be completed. ]
7. Limitations & Bugs :
[ To be completed. ]
7.1 Limitations :
[ To be completed. ]
7.2 Known Bugs :
[ To be completed. ]
8. Changelog :
9. License & No-Warranty :
Copyright © 2001 Abdullah Ramazanoglu <ar018@yahoo.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License - Version 2 as
published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.