CdBk Manual


v 0.6.x - 2401
http://cdbk.sourceforge.net/
Licensed via GPL-v.2
Abdullah Ramazanoglu <aramazan@users.sourceforge.net>


CdBk is a CD-RW backup utility aiming at convenience and efficiency : It needs minimal CD intervention, fills CDs 100% with individual .bz2 files, and saves all special files and attributes. It works online or batched : No realtime CD change is needed, even for multi-CD backups.


 

    1. Introduction

    2. Installation And Initial Setup

    3. Using CdBk

    4. Listing CD Contents - cdlist

    5. Restoral - cdrest

    6. Features & To-Do List

    7. Limitations & Bugs

    8. Changelog

    9. License & No-Warranty



 

1. Introduction :

There are quite a few backup utilities out there on the net, so finding the most suitable one for specific needs can be a time consuming search. In this intro I will try my best to describe CdBk as plainly and clearly as possible, so that you can see whether it suits your needs without having to study it further, and without having to download and try it yourself.


1.1 A Quick Glance :

This is CdBk, a server class CD based backup utility that manages gigantic amounts of backup data with the least administration hassle possible. It is written in bash and uses bzip2 to compress files individually. It fills up CDs to full capacity, and uses a postponed-work scheme (phasing) for excess data that doesn't fit on the CD, so it doesn't need online CD change. It backs up in ISO-9660 (RockRidge) format, so that you can access the backup as a regular file-system. It also backs up special files (devices, sockets, symlinks etc.) and full attributes (dates, ownerships, full permissions including SUID/SGID/Sticky bits etc.), so that you can back up system data as well as user data. It can be run in the background as a cron job, as well as online from a terminal. It is very simple to use.

Although it was originally written for best CD exploitation in server backups, it has turned out to be equally well suited for personal use.

It is open sourced, and for developers, the code is well commented in case you need to customize it for your own specific needs.

Among other features, CdBk has several major features that distinguish it from other CD based backup utilities :

  1. It fills up the CDs to full capacity with individually compressed files : Having both of these features together is unique to CdBk, so far as I know.
  2. If a backup run cannot fill the last CD to full capacity, it continues on the same CD in following incremental runs, and doesn't require a new CD unless current one gets filled up : No partially filled "last CD in the set" per backup run. And no unnecessary intervention (CD change) per backup run.
  3. It doesn't need CD change on the fly (uses no more than one CD per run) : No administrator intervention is needed while it is running. (However admin intervention is "loosely" required (see 5) to change the CD between two runs of CdBk when the CD is filled up in previous run.)
  4. While it postpones excess data to following runs, it adapts itself to then-current state of system at each run. So, assuming average 1.5GB per CD via compression, it is perfectly OK to back up a 75GB live system in working hours, 10 CDs-worth of data a day throughout the week, without having to freeze the system, and without any discrepancy in the resulting backup set. (See "Advanced Usage" chapter for live system and database backups.)
  5. Once the full backup is done, you can throw in the last (partially filled) CDRW, schedule CdBk to run periodically at night for incremental backups, and just forget the CDRW there in the tray. CdBk will not ask you to change the CD unless it gets full, and even when it does, CdBk will only request the change in the message log and postpone the excess data for the next run. From then on, CdBk will not take further backups on that CDRW unless you change it, protecting the filled-up and "closed" CDRW against being overwritten through oversight. So if you forget to change the CD for several days (or several months for that matter) it is no problem : The postponed incremental to-do list will just grow bigger and bigger. When the CDRW is changed, all the accumulated data gets flushed to CDRW at the next run.
These were also the main reasons why I wrote CdBk. For a more detailed discussion, read section 1.3, "Why CdBk".
See section 6 for the full features and to-do list, and section 7 for known limitations and bugs.

These features have some side effects though, which are:

  1. It doesn't generate a CD-set that contains everything to be backed up in each run. Instead, it backs up at most one CD-full of data, and postpones the rest of the work for the next run. The next run can be started right after the previous one, or the next day, or next week, etc. What does it boil down to? Consider a file "save.me" that is to be backed up, which was last modified on the 20th day of the month. We run CdBk on the 21st. Because the CD got full before this "save.me" file could be backed up, it is postponed for the next run of CdBk, which is due one week later (28th). Now this postponed "save.me" file is modified again on the 23rd. What happened to the 20th-day copy? It has not been backed up, so it is lost. But here is some relief: Normally CdBk should be scheduled with such a frequency that the average amount of modified files between two runs of CdBk does not exceed a CD's capacity. Then "postponing" will seldom happen, and "losing a previous version due to postponing" should be even less probable (though still possible). Note that, since CdBk currently creates single-session CDs (replacing older versions of files on CD with the newest one), this effect is hidden for the time being. But it will surface when CdBk starts supporting multi-session CDs in the future. See item 3 below for that.
  2. Backup levels don't reflect a snapshot in time. They simply refer to the number of CD in a set. "Master backup" (Level 1) simply means CD-1, even though there were 10-CDs worth of data eligible for backup. In this sense, there is no "master-backup" CD-set and no "incremental backup" CD-sets. Instead, there is one backup set that grows up. In other backup utilities, one usually takes a master backup and gets e.g. a 10-CD backup set that snapshots a point in time. Next week an incremental backup is run and 2-CD incremental backup set is produced. So on, each backup run gets an absolute or relative snapshot of current time. But in contrast with those utilities, in CdBk, one runs CdBk initially and gets one and only one CD (i.e. CD-1) filled to full capacity (so called master backup in CdBk terms). Remaining 9-CDs' worth of data postponed to next run. Next run, which is "incremental" in CdBk terms, produces again one CD (CD-2) with 8-CDs' worth of postponed data. So on, in 10 runs there is no more data to be backed up. These 10 runs can take 10 hours if CdBk is run back to back, or 10 days if it is scheduled everyday, or 10 weeks if scheduled each week.
  3. While CdBk utilizes the same CDRW in repeated runs (until CDRW is filled up), backup operations on partly filled CDRWs will cause previous version of a file be replaced by the newest version. Consider the "save.me" file that gets modified all the time (say, twice a day). Let us run CdBk everyday, which fills a CD in 5 days on average, with incremental backups. The "save.me" file will get backed up in day-1. Then newer version will be saved on day-2 replacing previous day's version. This way, when the CD is filled up in day-5, all the previous versions of "save.me" will be lost except the latest one. If "save.me" has been deleted from system by the 4th day, then no save.me file will exist in backup CD. This is because CD is blanked and content is recreated each time CdBk is run (it has the benefit of using up less CD space). This, in turn, is because CD is created as single-session. In the future a multi-session option will probably be added, which will keep each incremental backup in a separate session on the same CD. Then the user will be able to choose preferred behavior (single or multi session). Until then, multi session effect can be obtained (in a somewhat inefficient manner) by using a fresh CD each time cdbk is run.
  4. CdBk is currently geared towards CD-RW media, because of this single-session capability and blank-recreate cycles. CD-R media is also OK, but either CdBk must be run less frequently so that CD is (mostly) filled up with incremental data, or some CD space wasting will result. This will also change in future releases with multi-session feature.



1.2 Typical Uses :

Typically you will use CD-RW discs, and use cron to schedule CdBk daily or weekly past midnight. (You can also use it online for personal use.) CdBk will grow incremental backups on the same CD until it gets full. When the CD fills up, it will not prompt you with a "Please Change CD" sort of message and wait for you to change the CD. Instead, it will log a message requesting you to change the CD (if running online from a terminal, it will also issue the same message on screen) and simply exit, postponing the remaining part of the backup for the next run. Unless you change the CD, it will not take further backups over the filled-up CD in following runs (protecting the filled-up and "closed" CD against delayed admin intervention).

This is where CdBk shows one of its strengths: It will not require you to change the CD at 5:00 am in the morning! Instead, it keeps an exhaustive to-do list (list of files to be backed up). At each run CdBk grows this list incrementally with new entries (if any), and writes this list on CD till it gets full (with rest of to-do list postponed to next run) or the list is successfully backed up (fitted) completely on CD. As long as your average daily incremental backup size is less than the compressed capacity of a CD (avg. ~1.4G), daily fluctuations in backup size, be it 5 MB or 5 GB, won't affect your usual working pattern : Each morning (or each Monday) you just look at the message log, and change the CD if so requested.
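
The fill-and-postpone behaviour described above can be pictured with a small simulation. This is only an illustrative sketch, not code from CdBk: the function name fill_cd, the "SIZE PATH" input format, the sample file names, and the rule of skipping an entry that no longer fits while still trying later, smaller ones are all assumptions made for the example.

```shell
# Illustrative sketch only; not taken from CdBk's source.
# Input on stdin: one to-do entry per line, as "SIZE_IN_MB PATH".
# An entry is burned if it still fits in the remaining capacity;
# otherwise it stays on the to-do list for the next run (assumed rule).
fill_cd() {
    capacity=$1
    used=0
    while read -r size path; do
        if [ $((used + size)) -le "$capacity" ]; then
            used=$((used + size))
            echo "BURN $path"
        else
            echo "POSTPONE $path"
        fi
    done
}

# Three hypothetical compressed files against a 650 MB CD.
printf '%s\n' '400 /home/a.tar' '300 /home/b.img' '200 /var/c.db' | fill_cd 650
```

With these sample sizes the first and third entries are burned and the second is postponed, which mirrors how one oversized night simply leaves a longer to-do list for the following run.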

Here is how I use it for personal machines and servers :

On my personal PC at home, for a full backup I run it online (with -m flag for the very first run), and if it issues a "Change CD" message, I change CD and run it again (without -m flag), until it doesn't issue a CD change request. After the full backup is done, I run cdbk (without -m) once a week, online from console, putting the last CD in tray. If/when it issues a CD change request, I put a new CD in tray and run it again (no -m). In short, I don't do scheduled backups on home PC.

On servers, for a full backup I get sufficient number of blank CDRWs, insert the first one, and run CdBk online (with -m flag for the very first run) in the morning of a working day. It issues CD change request while exiting. I insert the second CDRW and re-run CdBk (no -m) and go on like this. Is it evening before doing the whole backup work? No problem. I leave for home, and continue in the morning of next day from where I left off. I go on like this until no more CD-change request issued, i.e. full backup is done. Then, I keep the last CDRW in tray, and schedule CdBk to run everyday at 4 am. Then I check CdBk message log once or twice a week and put in a new CDRW if so requested. That's it.

As an extreme example, you can even build up an initial backup in a very extended period of time: Assume that you have a system with 65 G of backup-worthy data. (With a conservative average of 2:1 overall compression ratio, this should fit in at most 50 CD's of 650M each). And assume that each day your system produces an average of an additional 100M incremental data to be backed up : You just insert a blank CD/RW in tray at working hours, clear history of CdBk (to force a master backup), and cron cdbk to run every day at 4:00 AM and leave for home. Tomorrow morning you will find the first CD filled up, and a message in CdBk message log asking you to change the CD. You change CD, and the next morning you will find it filled up too. Going on like this, in 55 days or so you will have your whole system backed up (including incremental newcomer files). Now that you have everything backed up, you put the final CD-RW in tray, and just forget it there for two weeks, until it gets full. Now your administrative burden is simply looking at the message log once or twice a week, and change CD approximately twice a month. By the way, you can leave work for 2 months for a trip, and when you return back, you will find that the CD you left in tray has been filled up correctly, and 2 months' worth of to-do list (4 CDs in our example) has been grown, waiting for a CD-change. Changing a CD a day, in 4 days you're again on the track.
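
The figures in this example can be checked with a little shell arithmetic. The sketch below is a back-of-envelope calculation only; it assumes that "65 G" means 65000 MB, that the 2:1 ratio applies uniformly, and that exactly one CD is written per day.

```shell
# Back-of-envelope check of the example above. All sizes in MB.
total_raw=65000      # backup-worthy data ("65 G", taken as 65000 MB)
daily_raw=100        # new data produced per day
cd_size=650          # CD capacity
ratio=2              # assumed overall compression ratio (2:1)

raw_per_cd=$((cd_size * ratio))             # raw data that fits on one CD
cds_initial=$((total_raw / raw_per_cd))     # CDs needed for the initial data
net_per_day=$((raw_per_cd - daily_raw))     # backlog reduction per daily run
days=$(( (total_raw + net_per_day - 1) / net_per_day ))  # rounded up
days_per_cd=$((raw_per_cd / daily_raw))     # days to fill one CD afterwards

echo "initial CDs: $cds_initial"
echo "days until caught up: $days"
echo "days per CD afterwards: $days_per_cd"
```

Under these assumptions the script reports 50 CDs, 55 days and 13 days per CD, in line with the "50 CD's", "55 days or so" and "two weeks" quoted in the text.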

Starting with CdBk-0.6.2 it is also possible to sync to an external backup. With this scheme, you would first take a full backup of your system using another backup manager (probably a tape backup). Then, with CdBk you would take master backup synced to the full backup on tape. Such a synced master backup is actually an incremental backup relative to the full backup on tape. From that point on, you go on with usual incremental backup cycles using CdBk. The net result is, you will have a full backup on tape, and all the increments on CDs. See "-M" option for details.

Alternatively, since there are ways to consistently back up a system during working hours (never live databases! We will discuss database backups later), you may start the master backup on Monday morning, run cdbk back to back to fill up 9 CDs during working hours on Monday, and leave the 10th for the night run. Continuing like this, 10 CDs a day, by Friday evening you will have the full backup set. And with the last backup run on Friday night, you will have synced your backup set back to a consistent state. From then on, since each incremental backup run will (presumably) be scheduled at night, you will always have a consistent backup set.

This scheme has some peculiar characteristics too. For one, backup levels do not necessarily represent a specific date, as they do in regular incremental backups. Instead, each backup level refers to a CD, which can hold files with widely different time-stamps. In this sense, "master backup" simply means CD number-1. All the other CDs are "incremental" from CdBk's point of view. In other incremental backup schemes, incremental means "since the last time a backup was taken". In CdBk it means "since the time the previous CD was filled up and archived" (which is equivalent to "since the last time a backup was taken on the previous CD" or "since the freeze time of the previous CD").

BTW, while it is primarily designed to run in background for servers, it is equally at home for online foreground running on a personal machine, (or on an attended backup of a server) as long as you are willing to wait till it finishes, change the CD, run it again, wait again... until it doesn't request CD-change anymore. Once everything on your home machine is backed up, insert the last (partly filled) CD/RW in tray every other week and just run cdbk : It will add last two weeks' incremental backup onto CD.

Lastly, there are two additional utilities provided, one for listing the contents of a CD-set (or of a specific CD), and one for restoral. Since backup CDs are in ISO-9660 RockRidge format, you can simply mount them and copy back whichever files you want. But a special restore utility (cdrest) comes in handy, especially when restoring high-level or crowded directories, both to guarantee that files are restored exactly with their original attributes, and to automatically decompress only those files that were compressed on-the-fly while backing up. See cdlist and cdrest for more information on them.


1.3 Why CdBk :

I have a server and limited budget, which translates into using CD/RW for backup. Apparently there are a lot of backup utilities out there on the net, so I studied most of them for a week, downloading and trying more promising ones. Unfortunately all of them had at least one of the following drawbacks :

  1. Some were ex-tape-backup tools modified as an afterthought to backup on CD too: With clumsy CD handling capabilities and most of the drawbacks below.
  2. Some use CD to its full capacity, at the cost of chopped tarballs : Few use tar, most use afio, with the "multivolume" flag turned on. With afio's "compress on the fly" feature it is possible to produce compressed backups. And since afio can also be instructed to produce 650M chunks with its multivolume feature, it is a fairly easy and efficient way both to have compressed CDs and to fill them up to full capacity. Here come the drawbacks: Since the output is actually a chopped-up set of one giant monolithic archive, afio (or tar) needs many CDs together, in sequence, for a simple file restoral. Also, a lost CD, or sometimes a corruption, makes the whole CD set useless. Also, since it generates all the CDs in one run, an afio/tar based utility stops short and asks (and waits for) you to change the CD every once in a while : You are effectively tied to the machine until the whole backup operation finishes. Also, in most cases the last CD is utilized to only about 50% of its capacity on average: When afio is done with a backup, the remaining space on the last CD is wasted. I would still have gone for an afio based utility, if only afio allowed closing and reopening an archive at multivolume boundaries, i.e. producing multiple afio archives, one per volume, instead of (alas) chopping one archive up : I could then have modified the utility so that I would get an archive set consisting of self-contained CDs. Then I could have tried to live with the other drawbacks, instead of writing CdBk.
  3. At least one of them (as I recall) was filling CD to full capacity with individual files, but without compression.
  4. They invariably require you to cater for their CD change needs online. They stop short, (don't exit) requesting you to change the CD - now!.. If you have a rather big data to backup (say, tens of CDs) this can be a "weekmare".
  5. In most of them, each incremental backup run needs its own CD set, with the last CD having an average of 50% utilization. In the end, for e.g. 30 daily increments, an average of 15 CDs wasted. Also, since a busy file will be included in most/all incremental backups, there is a high level of duplication (which is also a blessing sometimes). Yes, CD is cheap, but juggling is not: For each incremental backup, user is expected to change at least one CD. It may still be OK to feed another CD for each run, regardless of whether previous one is fully utilized or not. But what if an incremental backup didn't fit on a single CD and you are asked to change CD in the middle of the night? Perhaps I could use 2 CD/RW drives, preload them with blank CDRWs, and tell the backup utility to switch to next drive when the first one is filled? Well, I don't know any CD based backup utility with such a feature. Even if it does exist, then what if a specific incremental backup needs 5 CDs to complete? (E.g. several users "saved" most of their local disk to net-drive that day, and you are incrementally backing up that net-drive at 4:00 am in the morning, by scheduled / unattended run of the backup tool.)
In server world, people are used to DLT tapes, magazines etc. fancy stuff: There was no really usable CD/RW solution for a server on the net. Existing ones were rather geared towards personal use, to be run online and to handle at most 10 CDs or so, with some intervention requirements and CD space wasting. This is why I wrote CdBk: It efficiently manages backing-up of very big amounts (relative to CD capacity) of live data onto CD/RW discs, requiring next to no administration (again relative to CD capacity).

Additionally, this server side approach doesn't affect its usability for online personal use at all. Perhaps a GUI and some context sensitive help could have been added (ideas welcome). But CdBk is so simple to use, a GUI is hardly needed (perhaps for initial setup). Currently there is no setup utility : Just untar it, specify which directories to be backed up in cdbk.include file, and just run it. While there is cdbk.conf file for configuration , and there are command line options to override them on per-run basis, most parameters, (e.g. CD drive's emulated SCSI address, its device file, mount point, writing/blanking speed etc) are defaulted to such values that it will work out of the box 80% of the time. A typical use for master backup (i.e. CD #1):

    # ./cdbk -m

which is equivalent to ./cdbk -l 1 (hyphen, ell, one),
or, if you have just installed it (an empty history forces a master backup):

    # ./cdbk

A typical use for the following (incremental) backups:

    # ./cdbk

For each incremental backup just issue that command: It will automatically decide which level you were at, whether a CD change was due, whether it has actually been changed, etc. and act accordingly. These auto-decisions are done depending on cumulative history records kept in "cdset" subdirectory of CdBk. Specifying "-m" option, or just deleting (or corrupting the contents of) this directory forces a master backup (not without user's confirmation!), which also clears and recreates "cdset" from scratch. It is that simple.


So, this concludes introduction part of the manual. If you are still unsure whether CdBk is for you or not, or if you decide to give it a try and later on see that it is not really what you have expected, that means this intro didn't live up to its promise (to be sufficient enough to give full perspective of the utility). In this case please drop me a mail about it, so that I could either update the manual or include the feature or fix the bug. Thank you.



 

2. Installation and Initial Setup :


2.0 Requirements :

[To be completed.]

CdBk is written on Gelecek-Linux 1.1 (a RedHat 7.1 derivative) and initially tested on Gelecek-1.1 and RedHat-7.1 on Intel architecture only. However, it is a processor independent bash code. Currently known platform status is listed in Table-1. If you run it on other versions / systems / architectures please let me know about it (you can use Table-1 as a template) so that I can add your platform in next release of this manual.
 
Table-1 : Currently known platforms that CdBk is reported to run out of the box.

Operating System    Version     Tested Architecture
Gelecek Linux       1.1         x86
RedHat Linux        7.1, 7.2    x86

Here are the components used, with their versions on the original development platform (Gelecek-1.1). These components are normally already installed on most Linux systems with a CD-RW drive. Older versions may work, and newer versions should work, though neither has been tested. Not all programs that come with these components are actually used by CdBk; the ones used are given in parentheses :

kernel-2.4.2-2
bash-2.04-21
cdrecord-1.9-6 (cdrecord readcd)
mkisofs-1.9-6 (mkisofs)
bzip2-1.0.1-3 (bzip2)
diffutils-2.7-21 (cmp diff)
sh-utils-2.0-11 (date nice)
textutils-2.0e-7 (comm cut sort tail tr uniq ...)
fileutils-4.0w-3 (cp dd df ln ...)
findutils-4.1.5-4 (find)
losetup-2.10r-5 (losetup)
mount-2.10r-5 (mount)
e2fsprogs-1.19-3 (mke2fs)
sed-3.02-8 (sed)
grep-2.4.2-3 (grep)

Other requirements:



2.1 Quick Start :



2.2 Detailed Setup :

Setup is mainly done in cdbk.conf file. You can edit it using your favorite text editor. Some of the parameters given in this file can also be overridden by command line options. However, certain parameters can only be given in cdbk.conf (while others can only be given from command line). Here is a list of all parameters and command line options. For detailed explanations of these parameters please see cdbk.conf itself.
 
(Command-line options always override configuration parameters.)

Cmd line  cdbk.conf parameter    What it does                              Factory default
--------  ---------------------  ----------------------------------------  ------------------------------
-b        BURN_CD=0              Burn CDRW or just generate ISO image      -B : 1 : Burn CDRW.
-B        BURN_CD=1              file.

-c        CD_SANITY_CHECK=0      Check CD of previous run.                 -C : 1 : Check it.
-C        CD_SANITY_CHECK=1

-e        EJECT=0                Eject CD when filled up.                  -e : 0 : Don't eject.
-E        EJECT=1

-h        N/A                    Help.                                     N/A

-i        ISO_IMAGE=0            Burn CD on-the-fly while generating ISO   -I : 1 : Burn in buffered mode
-I        ISO_IMAGE=1            image (unbuffered). Or, generate ISO      (prepare ISO image beforehand).
                                 image first, and then burn from it in
                                 two steps (buffered).

-l LEVEL  N/A                    Take back-level LEVEL backup. That is,    Calculate due current level
                                 return back to LEVEL (Be careful!)        dynamically.

-m        N/A                    Take an absolute master backup. Use this  N/A
                                 flag once and only once per CD-set, for
                                 CD #1.

-M DATE   N/A                    Take a synced master backup that is       N/A
                                 relative to another backup started at
                                 DATE (in YYYYMMDDhhmm format). Use only
                                 once per CD-set, for CD #1.

-s SIZE   SHADOWSIZE=SIZE        Size of CD shadow partition.              Calculate dynamically.

-v        VERIFY=0               Verify after burning.                     -V : 1 : Verify it.
-V        VERIFY=1

N/A       CDRWDEV=Device_File    CD-RW device file                         If not given (null): /etc/fstab
                                                                           entry for CDRWMNT is consulted.
                                                                           If invalidly given : /dev/cdrom

N/A       CDRWMNT=Mount_Point    CD-RW mount point                         /mnt/cdrom

N/A       CDSCSI=SCSI_Address    CD-RW SCSI address as reported by         0,0,0
                                 cdrecord -scanbus command

N/A       BLANKSPD=Speed         Blanking speed.                           4

N/A       WRITESPD=Speed         Writing speed.                            4

N/A       TEMP=Directory         Directory for temporary files.            /tmp

N/A       CLEANUP=[0|1]          Clean up upon exit. For debugging         1 : Do clean-up.
                                 purposes.

N/A       DEL_CD_SHADOW=[0|1]    Delete CD shadow upon exit.               1 : Delete it.


2.3 Specifying Regexp Lists :

Regexp (Regular Expression) list of file/directory names, or file name extensions are used in cdbk.exclude and cdbk.nocompr files. For detailed explanation of advanced regexp usage you may refer to man pages of awk or grep programs, but here is a simple and sufficient explanation on them to cover our scope. Note that the analogies drawn between regexp and shell pattern matching are not accurate.


2.3.1 Some Background :

  1. A plain regexp is any substring in a full-path file name. E.g. regexp /tmp will match any "/tmp" string in file's full path name, which is analogous to "*/tmp*" in shell pattern matching.  For instance, regexp /tmp will match :
  2. To specify a substring that matches from the beginning of full path name of a file, place a "^" as the first character of regexp. For instance, regexp ^/tmp will match any file name that begins with "/tmp". This is analogous to "/tmp*" in shell pattern matching. Regexp ^/tmp will match :
  3. But it won't match :
  4. To specify a substring that matches the end of the full path name of a file, place a "$" as the last character of regexp. For example, regexp log$ will match any file name that ends in "log". This is analogous to "*log" in shell pattern matching. Regexp log$ will match :
  5. But it won't match :
  6. To specify an exact path name, begin the regexp with ^ character, and end it with $ character. For example, regexp ^/var/log/messages$ will exactly match one and only one file. This is analogous to "/var/log/messages" in shell pattern matching.
  7. Dot "." character means "any character" in regexp. This is analogous to "?" character in shell pattern matching. For example, regexp .log$ will match /somedir/some.log as well as /somedir/otherlog or /usr/bin/dialog files. If you want to match the dot character itself, you must escape it with \ (backslash). For example, regexp \.log$ will match /somedir/some.log but will not match /somedir/otherlog or /usr/bin/dialog files.
  8. Asterisk "*" character means "any number of previous character" in regexp. It is quite different from "*" in shell pattern matching, which means "any number of any characters". For example regexp /tmp* will match :
  9. Which is not very useful as you see. To have the effect that shell pattern matching has, you use .* characters together in a regexp, which means "any number of any characters". Remembering that a regexp is already a substring with implied "*" at the start and end, .* notation is seldom needed for special needs. For example, regexp /home/.*\.gz is analogous to "*/home/*.gz*" in shell pattern matching. Regexp /home/.*\.gz will match : On the other hand, regexp ^/home/.*\.gz$ (note the "^" and "$") is analogous to "/home/*.gz" in shell pattern matching. For the examples above, this will only match :
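
The rules above can be tried out directly with grep before a regexp goes into a list file. The following is a minimal sketch: the helper names matches and show and all the path names are made up for illustration, and it assumes the lists are interpreted as grep-style basic regexps (the manual only points to the awk and grep man pages, so check against your CdBk version).

```shell
# Illustrative helpers; not part of CdBk. All paths below are made up.
# matches REGEXP PATH : succeeds if PATH matches REGEXP (grep basic regexp).
matches() {
    printf '%s\n' "$2" | grep -q -- "$1"
}

# show REGEXP PATH : print the verdict for one regexp/path pair.
show() {
    if matches "$1" "$2"; then
        echo "match    : $1   $2"
    else
        echo "no match : $1   $2"
    fi
}

show '/tmp'            '/var/tmp/report.txt'   # match: substring anywhere
show '^/tmp'           '/tmp/report.txt'       # match: anchored at start
show '^/tmp'           '/var/tmp/report.txt'   # no match
show 'log$'            '/usr/bin/dialog'       # match: anchored at end
show 'log$'            '/var/log/messages'     # no match
show '\.log$'          '/somedir/otherlog'     # no match: literal dot needed
show '/tmp*'           '/home/tm/file'         # match: "p" may occur zero times
show '^/home/.*\.gz$'  '/home/ann/a.gz.bak'    # no match: must end in .gz
```

Running a candidate regexp against a handful of real path names this way is a cheap check that it excludes exactly what is intended.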



2.3.2 Regexp Examples for cdbk.exclude:



2.3.3 Regexp Examples for cdbk.nocompr :

\.arc$ : Analogous to "*.arc" in shell pattern matching : Don't compress .arc files.
\.bz$ : Analogous to "*.bz"
\.bz2$ : Analogous to "*.bz2"
\.deb$ : Analogous to "*.deb"
\.gif$ : Analogous to "*.gif"
\.gz$ : Analogous to "*.gz"
\.jpeg$ : Analogous to "*.jpeg"
\.jpg$ : Analogous to "*.jpg"
\.lha$ : Analogous to "*.lha"
\.mp3$ : Analogous to "*.mp3"
\.rpm$ : Analogous to "*.rpm"
\.taz$ : Analogous to "*.taz"
\.tgz$ : Analogous to "*.tgz"
\.tpz$ : Analogous to "*.tpz"
\.tzg$ : Analogous to "*.tzg"
\.z$ : Analogous to "*.z"
\.zip$ : Analogous to "*.zip"
\.zoo$ : Analogous to "*.zoo"
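
A list like the one above can be checked against sample file names with grep's -f option, which reads one regexp per line from a file. This is a sketch under assumptions: the helper name listed_in, the temporary list file and the sample paths are hypothetical, and CdBk's own matching code may differ from plain grep.

```shell
# Illustrative sketch; CdBk's own matching code may differ.
# listed_in LIST_FILE PATH : succeeds if PATH matches any regexp in LIST_FILE.
listed_in() {
    printf '%s\n' "$2" | grep -q -f "$1"
}

# Try a few made-up paths against a throwaway list file.
list=$(mktemp)
printf '%s\n' '\.gz$' '\.jpg$' '\.zip$' > "$list"
for f in /home/ann/photo.jpg /home/ann/notes.txt /srv/dump.gz.old; do
    if listed_in "$list" "$f"; then
        echo "store as-is : $f"
    else
        echo "compress    : $f"
    fi
done
rm -f "$list"
```

Two properties of grep are worth keeping in mind when testing this way: matching is case-sensitive, so \.jpg$ does not cover photo.JPG, and an empty line in a pattern file matches every path.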




 

3. Using CdBk :


3.1 Basic Usage :

This is the full syntax of the cdbk command. You can use a separate "-" for each individual option, or combine options together in a single string. Options that need a parameter (i.e. -l , -M and -s) must be given separately, or as the last option in a combined string. Most of these options override the corresponding parameters in cdbk.conf , the main configuration file of CdBk. The exceptions are -l , -m , -M and -h , which can only be given as command line options (there is no equivalent parameter in cdbk.conf). Options that turn a flag on or off are provided as capital and small letters: The capital letter always turns the option On, and the small letter turns it Off. If an option is not specified, the corresponding configuration parameter in cdbk.conf is used. If it is not given there either (or an invalid value is given), then the factory default is used.

cdbk [-b|-B] [-c|-C] [-e|-E] [-h] [-i|-I] [-l LEVEL|-m|-M DATE] [-s SIZE] [-v|-V]

Where,

-b | -B : Whether to actually burn the CDRW or not. -B burns the CDRW; with -b, CdBk writes the ISO image file into "$TEMP/cdbk-iso" instead of burning. This comes in handy if your system does not have a CD burner, but you have access to another machine on your LAN with a burner. Overrides the BURN_CD configuration parameter, and also the BLANKSPD , EJECT , ISO_IMAGE , VERIFY and CD_SANITY_CHECK configuration parameters. (It presets them to certain values, regardless of what you specify for them.)

-c | -C : Whether to employ consistency checks on mounted CD in case CdBk continues on a partially filled CD. -c skips checking, and -C does checking. Overrides CD_SANITY_CHECK configuration parameter.

-e | -E : Whether to eject CD when it gets filled up. -e doesn't eject, -E does eject. Overrides EJECT  configuration parameter.

-h      : Display help and exit without doing anything. Can only be given from command line.

-i | -I : Whether to burn CD on the fly while preparing ISO image in one go (-i), or prepare an ISO image before burning it onto CD in two steps (-I). Overrides ISO_IMAGE configuration parameter.

-l LEVEL | -m | -M DATE : Backup level to be taken. Use one of these options only once to reset the backup cycle in some way, and then never use one of these options again: Without one of these options, current level will be calculated automatically. (Which is the normal way for everyday usage.)
These options can only be given from command line. Additionally, -m or -M option can be given once and only once per CD-Set, for CD #1.

-s SIZE : Size of virtual partition to hold CD shadow. It is recommended not to supply this parameter at all, which lets CdBk calculate it adaptively depending on the actual CDRW capacity in tray. Overrides SHADOWSIZE configuration parameter.

-v | -V : Whether to verify CDRW against its shadow on hard disk, after burning CD. Overrides VERIFY configuration parameter.

Normally, you must run CdBk as ./cdbk -m to start a master backup. (Don't forget that a master backup is just CD#1, in CdBk terms.) This clears all history kept on hard disk by CdBk, and starts a fresh backup cycle with CD#1. After this first run, no matter whether CD#1 is partially filled or not, you must run CdBk without -m , -M DATE or -l LEVEL options. This is all there is to use CdBk. Other options can be specified with different values at different runs in a backup cycle, without affecting the cycle.
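
For a synced master backup, the DATE argument of -M has to be in YYYYMMDDhhmm form. A minimal sketch for producing and sanity-checking such a stamp follows; the helper names stamp_now and is_stamp are hypothetical, and the check only verifies the shape (12 digits), not that the digits form a valid date.

```shell
# Illustrative helpers; not part of CdBk.
# Print the current time in YYYYMMDDhhmm form, e.g. 200201240400.
stamp_now() {
    date +%Y%m%d%H%M
}

# Succeed only if the argument consists of exactly 12 digits.
is_stamp() {
    case $1 in
        *[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -eq 12 ]
}

# Possible use: note the stamp when the external (e.g. tape) backup starts,
#     tape_start=$(stamp_now)
# and later start the synced master backup with it:
#     is_stamp "$tape_start" && ./cdbk -M "$tape_start"
```

Recording the stamp at the moment the external backup starts, rather than typing it from memory afterwards, avoids leaving a gap between the two backups.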

-l LEVEL option is provided for freeing up CDs from level LEVEL up to current level, and recreating them again. It effectively reunifies different versions of same files on different CDs into a single (latest) version, and thus, compacts CD-set. This is not an option to be used everyday. It effectively takes a semi-master backup, depending on to which CD number (level) you are returning back. So be careful with this option.
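
The run sequence described above can be sketched roughly as follows. This is only an illustration: run_cdbk and DRY_RUN are made-up helper names, not part of CdBk, and with DRY_RUN left at its default the commands are printed rather than executed. Only options documented above are used.

```shell
#!/bin/bash
# Sketch of one backup cycle. run_cdbk and DRY_RUN are illustrative names,
# not part of CdBk. With DRY_RUN=1 (the default here) commands are only printed.
run_cdbk() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "./cdbk${*:+ $*}"
    else
        ./cdbk "$@"
    fi
}

# First run of a new cycle: master backup, clears history and starts at CD #1.
run_cdbk -m

# Every later run: no -m, -M DATE or -l LEVEL; the level is calculated automatically.
run_cdbk

# Other options may differ from run to run without affecting the cycle,
# e.g. check the mounted CD (-C) and eject it when it is full (-E).
run_cdbk -C -E
```

Set DRY_RUN=0 only once the printed commands look right for your setup.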


3.2 Advanced Usage :

[ To be completed. ]


3.2.x Backing-up Live Inter-related Files :

An inter-related set of files normally cannot be backed up while the files are live (in business hours). Otherwise there is a high risk of ending up with a backup set that holds an inconsistent state of these files. This restriction also applies to singly active files, though the corruption risk may not be as high as for file-sets. So, do we have to stay up all night just to do a full backup, or hire a night-shift "backup operator"? Even so, what if the full backup doesn't finish till morning (quite possible with CD based backups), lock the doors? Here are a couple of simple and easy ways to deal with long lasting full backups. They are applicable to both CdBk and other backup managers. (Ideas welcome)

  1. The most common approach is to schedule a disk-to-disk snapshot or, if you mirror, to stop mirroring (freezing the mirror) at night on a still system. Then you can back up that snapshot or mirror during working hours, over an extended period of time. For most situations this is a proven solution, but for a CD based backup it depends: If you have tens of CDs' worth of backup data, then the snapshot-to-CD backup will take several days to complete. In this case you can either night-schedule additional disk-to-disk incremental snapshots in the meantime, or you can simply wait until the full backup CD-set is completed before going on to incremental night runs. In either case you need lots of disk space for the snapshot(s). And if you opt for the latter, then you might also consider the second method :
  2. Do your full backup in the daytime. As an added convenience, with CdBk, you can take your time, getting 5 CDs' worth of backup a day and finishing your full backup over the week, in working hours. Don't worry about the consistency of live files or file-sets for the time being. Note that, since CdBk adapts itself to the then-current state of all files at each and every run, the end of the full backup guarantees that the last-moment copy of every file is contained in the resulting CD-set. After the full backup finishes, schedule your incremental backups at night, making sure that none of the live files or file-sets are active at that time. If any of the files in the set has changed during or after the creation of the last CD of the full backup, then that file will go into the incremental backup. When you need your file-set back, just restore the latest versions of all files in the set.
Up to this point the assumption was that an incremental backup taken at night would bring everything into sync. In reality though, with the postponing feature of CdBk, there is a problem with incremental backups of inter-related file sets : If one or more files of a file-set are left out because of postponing, then the others on the CD will be of no value. That is, effectively, you will not have backed up your file-set at all. Nevertheless, see below for a workaround until this is fixed in a future release.

In a future release there will be an option to use more than one drive, and CdBk will switch between them in a round-robin fashion. Using 2 drives, you will be able to change one CD-RW while the other one is being gradually filled up day by day. Then there will be no postponing because of boundary effects, i.e. because the CD-RW was already nearly full when you started an incremental backup. (OOPS! : E.g. in a file-set of 5 files, the first 3 get on CD#1, then drive switching takes place, and the remaining 2 get on CD#2. Since the content of CD#1 is frozen while that of CD#2 is not, the next incremental backup will put all 5 files onto CD#2, squashing the last 2 files from the previous backup. This renders the first 3 files on CD#1 useless. This is a big waste, especially if we are talking about really big files.)
Of course, this will not prevent postponing if an incremental backup needs more than the free space on the CDs combined. Such postponing can only be prevented by using a sufficient number of drives. So, since there is always a possibility of postponing even with multiple drives, and given the "Oops" above about drive switching, the circumvention below is suggested even when the multi-drive option becomes available :


3.2.y Database Backup Strategies :

Short one : Never incrementally back up database volumes. Otherwise, you will end up backing up the whole database every day. Instead, first use the backup utility provided with your database to take a full / differential / incremental backup of the database onto disk. Then, use CdBk (or any other backup utility) to incrementally back up your disk. Make sure that the directory containing the output of the database backup is included in the backup list, and that the directory containing the database volumes themselves is excluded from it.
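
A minimal sketch of such a two-step nightly job might look like the following. DB_DUMP_CMD, DUMP_DIR, CDBK_CMD and db_then_cdbk are all hypothetical names introduced for illustration; substitute the real backup tool of your database for DB_DUMP_CMD. The sketch only shows the ordering: dump to disk first, and run CdBk only if the dump succeeded.

```shell
#!/bin/bash
# Sketch: dump the database to disk first, then let CdBk pick up the dump.
# DB_DUMP_CMD, DUMP_DIR and CDBK_CMD are hypothetical placeholders.
DUMP_DIR="${DUMP_DIR:-/var/backups/db-dumps}"   # must be in CdBk's backup list
DB_DUMP_CMD="${DB_DUMP_CMD:-my_db_dump}"        # your database's own backup tool
CDBK_CMD="${CDBK_CMD:-./cdbk}"

db_then_cdbk() {
    mkdir -p "$DUMP_DIR" || return 1
    # Stop here if the dump fails, so that a broken dump is never backed up.
    if ! "$DB_DUMP_CMD" "$DUMP_DIR"; then
        echo "database dump failed; skipping CdBk run" >&2
        return 1
    fi
    # Normal incremental run: no -m, -M or -l options.
    "$CDBK_CMD"
}
```

Remember that the directory holding the live database volumes still has to be excluded from the backup list; this sketch does not do that for you.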
 



 

4. Listing CD Contents - cdlist :

Syntax : cdlist  [N  [CDRW_Mount_Point  [CDRW_Dev]]]

List the contents of CD number N or, if N is zero or not given, the contents of all CDs combined. For this, cdlist consults either the CD in the tray (which must have level-number N or more) or the backup records kept on hard-disk (which is equivalent to using the latest CD in the set). CDRW_Dev is normally not given. It is only needed if the CDRW mount point is not defined in /etc/fstab.

If "CDRW_Dev" is given, which is the device file for CDRW drive, then "CDRW_Mount_Point" must also be given.
If "CDRW_Mount_Point" is given, which must be an existing directory, then "N" must also be given (even if it is 0).

CDRW_Dev defaults to what is defined in /etc/fstab for CDRW_Mount_Point.
CDRW_Mount_Point defaults to home dir of CdBk on hard disk. (which means CdBk will consult the records kept on hard-disk, instead of CD)
N defaults to 0 (which means "contents of all CDs").

Without N (or with N=0) it lists all CDs combined, in which case the CD number and backup date are reported for each entry. With a nonzero N (between 1 and the number of the CD in the tray) it lists the specified CD's content, in which case the backup date of the CD is reported only once.
If CDRW_Mount_Point is not given, then the history records kept in the [CdBk_Base_Dir]/cdset directory are consulted, which is equivalent to consulting the latest CD in the set.
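
The defaulting rules above can be summarized in a small sketch. The function describe_cdlist_call is hypothetical and is not how cdlist is implemented; it merely restates, for a given argument list, what is listed and which records are consulted.

```shell
#!/bin/bash
# describe_cdlist_call [N [CDRW_Mount_Point [CDRW_Dev]]]
# Hypothetical helper: restates the documented defaults, it does not run cdlist.
describe_cdlist_call() {
    local n="${1:-0}" mnt="${2:-}" dev="${3:-}"
    local what where
    case "$n" in
        *[!0-9]*) echo "N must be a non-negative integer" >&2; return 1 ;;
    esac
    # N = 0 (or missing) means the whole CD-set.
    if [ "$n" -eq 0 ]; then what="all CDs combined"; else what="CD #$n"; fi
    # No mount point means the records kept on hard-disk are consulted.
    if [ -z "$mnt" ]; then
        where="records on hard-disk"
    elif [ -z "$dev" ]; then
        where="CD at $mnt (device taken from /etc/fstab)"
    else
        where="CD at $mnt (device $dev)"
    fi
    echo "$what, from $where"
}
```

Because the parameters are positional, a device can only be passed together with a mount point, and a mount point only together with N.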

Typically you would either run it as,
# ./cdlist
to list everything in the whole CD-set, or you would use,
# ./cdlist  5
to list everything in CD #5. Since the records kept on hard-disk by CdBk cover all CDs, there is no need to consult the CD itself. But if you suspect corruption in the [CdBk_Base_Dir]/cdset directory (where the records are kept) then you may prefer to consult the actual CD. Another reason for consulting the CD is that you may want to browse the CD-set of another machine, or an old CD-set of the same machine. For example, you have been maintaining CD-set-1 for a while, and then you archived those CDs, bought brand new CDRW discs, and started a new master backup cycle on them (CD-set-2) via the "./cdbk -m" command. Now your history on hard-disk is in sync with CD-set-2. If you ever want to browse CD-set-1, you must first insert the highest numbered CD of CD-set-1 in the tray, then run cdlist like this :
# ./cdlist  0  /mnt/cdrom
to browse whole CD-Set-1. Alternatively, if you only want to browse CD #12 in the set, then use :
# ./cdlist  12  /mnt/cdrom
Note that, you don't have to insert the highest numbered CD in this case. Any CD (of CD-Set-1) numbered 12 or higher would suffice.

A note about directory scattering: Since CdBk produces scattered directories, you must browse the whole CD-Set to guarantee a correct listing of complete directories. There may be duplicate entries on some CDs, due to a modification to a file after a CD level switch occurred : Contents of older CDs are frozen, and all modified files are saved onto the latest CD, creating duplicate entries between CDs. In such a case, all duplicate files are listed together with their backup date and the number of the CD they reside on. From there on, you can restore whichever version you want. When restoring scattered directories from multiple CDs, to end up with the latest version of each file, make sure that you restore from the oldest CD first, and from the newest one last! We will revisit this topic in the cdrest chapter.
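
The oldest-first rule can be sketched as a small planning helper. restore_order and restore_plan are hypothetical names, /mnt/cdrom is just the mount point used elsewhere in this manual, and the sketch only prints the steps; it assumes you already know (e.g. from cdlist) which CDs hold parts of the directory.

```shell
#!/bin/bash
# restore_order CD...: print the given CD numbers, oldest (lowest) first.
restore_order() {
    printf '%s\n' "$@" | sort -n
}

# restore_plan DIR CD...: print the cdrest calls in the order they should run,
# so that newer versions overwrite older ones. Nothing is executed.
restore_plan() {
    local dir="$1"; shift
    local cd
    for cd in $(restore_order "$@"); do
        echo "insert CD #$cd, then: ./cdrest /mnt/cdrom $dir"
    done
}
```

For example, "restore_plan /home/betul 12 3 7" prints the steps for CD #3 first, then CD #7, then CD #12.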

The user is expected either to pipe the result to grep or less (searching within less by the "/" or "?" commands), or to redirect the output to a file and process it from there. Currently no search facility comes with the CdBk package, and none is really needed either: Unix has all the bells and whistles for that. See the examples below. For more advanced uses see grep(1), as well as the brief regexp background in this manual.


4.1 Examples on cdlist Usage :

You don't need to sort the output list, for it is already sorted by path-name.

The first set of examples aims to demonstrate how cdlist works :

List the contents of complete CD-set, consulting records kept by CdBk on hard-disk (in cdset directory) :
# ./cdlist

# ./cdlist  0

List only the contents of CD #5, consulting records on hard-disk :
# ./cdlist  5

List the contents of all CDs, starting from CD #1 through to the actual CD inserted in the drive represented by "/mnt/cdrom", consulting the records added by CdBk on CD. CD-drive must already be defined in /etc/fstab (must be mountable by "mount /mnt/cdrom" command). Note that if the CD inserted is the highest numbered one in the CD-Set, then it is equivalent to consulting hard-disk records :
# ./cdlist  0  /mnt/cdrom

List only contents of CD #17, consulting records on actual CD inserted, which must be at least CD #17 (or greater numbered CD) :
# ./cdlist  17  /mnt/cdrom

List the contents of all CDs, starting from CD #1 through to the actual CD inserted in the drive accessible via /dev/hdb device, consulting the CD itself :
# ./cdlist  0  /my/mounts/cd  /dev/hdb
Here, /dev/hdb must be working (e.g. you must be able to eject CD-drive's tray via eject /dev/hdb command), and /my/mounts/cd directory must exist. The only benefit of this notation is that, it works even if CD-drive mount point is not defined in the system. However this notation adds little value, while spoiling simplicity. Most probably you will never need it, so you may forget about it and use "cdlist  0  /mnt/cdrom" instead.

List only the contents of CD #3, consulting records on actual CD inserted (at least CD #3 or higher), which is accessible through /dev/hdd device :
# ./cdlist  3  /other/cdrw  /dev/hdd
Similar to the one above. Use this format if you have to. Otherwise, use "cdlist  3  /mnt/cdrom" instead.


The second set of examples aims to demonstrate how cdlist can be used :

Look for home directory elements of user "betul" in the whole set :
# ./cdlist  |  grep  '^/home/betul/'  |  less

# ./cdlist  |  grep  '^/home/betul/'  |  tee /tmp/betuls.list  |  less

Look for any "works" directory in CD #9 :
# ./cdlist  9  |  grep  '/works/'  |  less

Look for any "core" file in CD #10 :
# ./cdlist  10  |  grep  '/core$'  >  corefiles.list

Look for any file or directory name starting with "ISO-8859-9" or "ISO_8859-9" ignoring the case (ISO or iso or Iso, ...) in CD #4 :
# ./cdlist  4  |  grep  -i  '/iso[_-]8859-9'  |  less

Look for any directory name ending in ".d" in the whole set :
# ./cdlist  |  grep  '\.d/'  |  less

Look for any file name ending in ".dba" in the whole set :
# ./cdlist  |  grep  '\.dba$'  |  less

You are looking for various versions of "/home/abdullah/cdbk-manual.html" in the whole CD-set.
# ./cdlist  |  grep  '^/home/abdullah/cdbk-manual\.html$'  |  less

You don't like regexp and/or you don't need power of grep. You want to search using the find feature of your favorite text editor. To search in CD #9, do this first...
# ./cdlist  9  >  editme.txt
...and then edit it using your favorite text editor :
# vi  editme.txt

# kedit  editme.txt

# kwrite  editme.txt

Beware that in some cases (e.g. unfiltered listing of a full CD-set) output list may take tens of megabytes!



 

5. Restoral - cdrest :

Syntax : cdrest  CDRW_MountPoint  [CD_DIR|CD_FILE]  [HD_DestDir]

NOTE: Parameter positioning (order) is important. Don't change them.

CDRW_MountPoint : Where CDR/W is mounted. Must be given.
CD_DIR / CD_FILE : Source directory/file on CDR/W to be restored. If not given, it defaults to "/", which causes the whole CD to be restored. Optional.
HD_DestDir : Destination directory on hard disk. If given, restores relative to the given (alternate) location. Good for testing and prudence. If not given, it defaults to the original location of "CD_DIR / CD_FILE", so the files are restored exactly where they came from. Optional.
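
Judging from the examples in section 5.1, the restored item ends up directly under HD_DestDir, keeping only its own name. The sketch below expresses that reading; cdrest_target is a hypothetical helper, not part of CdBk, and it ignores the ".bz2" handling. The behaviour for a whole-CD restore ("/") to an alternate location is an assumption.

```shell
#!/bin/bash
# cdrest_target [CD_DIR|CD_FILE [HD_DestDir]]
# Hypothetical helper: prints where the item would land on hard disk,
# as inferred from the examples in this manual.
cdrest_target() {
    local src="${1:-/}" dest="${2:-}"
    # Whole CD: restored to "/" or, by assumption, straight into HD_DestDir.
    if [ "$src" = "/" ]; then
        echo "${dest:-/}"
        return
    fi
    src="${src%/}"                                # drop a trailing slash
    [ -n "$dest" ] || dest="$(dirname "$src")"    # default: original parent dir
    dest="${dest%/}"
    echo "$dest/$(basename "$src")"
}
```

For instance, "cdrest_target /etc/inittab /tmp/etc" prints /tmp/etc/inittab, matching the corresponding cdrest example.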
 



5.1 Examples on cdrest Usage :

Restore complete CD to its original location (both are equivalent) :
# ./cdrest  /mnt/cdrom  /

# ./cdrest  /mnt/cdrom

Restore /mnt/cdrom/home/abdullah/ directory to its original location /home/abdullah/ (both are equivalent) :
# ./cdrest  /mnt/cdrom  /home/abdullah  /home

# ./cdrest  /mnt/cdrom  /home/abdullah

Restore /mnt/cdrom/root/sistem/notes.txt file to its original location (/root/sistem/notes.txt) :
# ./cdrest  /mnt/cdrom  /root/sistem/notes.txt
Now look at the example below.

Restore the /mnt/cdrom/root/sistem/notes.txt.bz2 file as above, except that it is not decompressed automatically after restoral (not very useful) :
# ./cdrest  /mnt/cdrom  /root/sistem/notes.txt.bz2
This is only applicable to single-file restorals (i.e. not applicable to directories). Directories are always recursively restored and recursively auto-decompressed. (By the way, cdrest is intelligent enough not to decompress ".bz2" files that were already in .bz2 format on hard-disk before backup.) This feature can also be used when both compressed and non-compressed versions of a file exist in the same directory on the CD. This is possible (depending on CdBk setup) if they also existed that way on hard-disk when they were backed up. In that case, the version you name will be restored as-is. If only one notes.txt.bz2 file exists on the CD (as it should) then this feature only affects whether it will be auto-decompressed or not.

Restore /mnt/cdrom/usr/share/wallpaper.jpg to its original location :
# ./cdrest  /mnt/cdrom  /usr/share/wallpaper.jpg

Restore /mnt/cdrom1/abdullah/sistem/ directory to an alternate location (/tmp/abdullah/sistem/) :
# ./cdrest  /mnt/cdrom1  /abdullah/sistem  /tmp/abdullah

Restore /mnt/cdrom1/etc/inittab file to an alternate location (/tmp/etc/inittab) :
# ./cdrest  /mnt/cdrom1  /etc/inittab  /tmp/etc

Restore a device file to its original location (both are equivalent) :
# ./cdrest  /mnt/cdrom  /dev/loop15  /dev

# ./cdrest  /mnt/cdrom  /dev/loop15

Device files are not (and can not be) compressed on the fly, so no .bz2 suffix here.

Restore a symlink (both are equivalent) :
# ./cdrest  /mnt/cdrom  /etc/X11/X

# ./cdrest  /mnt/cdrom  /etc/X11/X  /etc/X11

Symbolic links are not (and can not be) compressed on the fly, so no .bz2 suffix here.

Restore a symlink to an alternate location (/tmp/dir1/dir2/dir3/mouse) :
# ./cdrest  /mnt/cdrom  /dev/mouse  /tmp/dir1/dir2/dir3
Note that, because symlinks are correctly restored as-is (with no-dereference), a relative symlink restored to an alternate location will be broken.
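
Why a relocated relative symlink breaks can be seen with a small stand-alone experiment. It does not use cdrest at all: "cp -P" stands in here for any copy that keeps the link as-is (no dereference), and the function name and temporary paths are made up for the demonstration.

```shell
#!/bin/bash
# Demonstration only: copy a relative symlink without dereferencing it
# and check whether it still resolves at the new location.
demo_relative_symlink() {
    local work
    work="$(mktemp -d)" || return 1
    mkdir -p "$work/orig" "$work/alt"
    echo data > "$work/orig/target.txt"
    ln -s target.txt "$work/orig/link"          # relative link: points to ./target.txt
    cp -P "$work/orig/link" "$work/alt/link"    # copy the link itself, not its target
    # [ -e ] follows the link, so it fails when the target is missing.
    if [ -e "$work/orig/link" ]; then echo "original: ok"; else echo "original: broken"; fi
    if [ -e "$work/alt/link" ]; then echo "relocated: ok"; else echo "relocated: broken"; fi
    rm -rf "$work"
}
```

The relocated link still says "target.txt", but no such file exists next to it in the alternate directory. Restoring the link's target to the same alternate location as well is one way to keep it working.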



 

6. Features & To-Do List :

[ To be completed. ]


6.1 Features :

[ To be completed. ]


6.2 To Do :

[ To be completed. ]



 

7. Limitations & Bugs :

[ To be completed. ]


7.1 Limitations :

[ To be completed. ]


7.2 Known Bugs :

[ To be completed. ]



 

8. Changelog :



 

9. License & No-Warranty :

Copyright © 2001 Abdullah Ramazanoglu <ar018@yahoo.com>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License - Version 2 as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.