This directory used to hold my hot-backup-over-network software. The
version here was very old, though, and I am no longer updating it.
This README still exists for two purposes: (1) to preserve the text
below, which is documentation I don't think I have anywhere else, and
(2) to tell people trying to use this directory where to look for the
current version. [Date of this writing is 2022-05-28. The previous
update was in 2010.]

There are two major pieces to the code: livebackup, which is the
userland code, and diskwatch, which is the kernel code.

livebackup I now distribute principally as a git repo
(git://git.rodents-montreal.org/livebackup is the thing to git clone),
with an unpacked view of it available in my FTP area (also available
over HTTP), ftp.rodents-montreal.org:/mouse/git-unpacked/livebackup/.

diskwatch I also distribute as part of git repos. I have current
versions for my NetBSD systems derived from 1.4T, 4.0.1, and 5.2. I
export git repos holding my NetBSD /usr/src trees (and /usr/xsrc for
4.0.1 and 5.2, though those aren't relevant to livebackup). If $OS is
1.4T, 4.0.1, or 5.2, then the thing to git clone is
git://git.rodents-montreal.org/Mouse/netbsd-fork/$OS/src, and an
unpacked view of it is available on ftp.rodents-montreal.org in
/mouse/git-unpacked/Mouse/netbsd-fork/$OS/src/.

For people just wanting the diskwatch code, look in
.../$OS/src/HEAD-link/tree/sys/dev/pseudo/ - you want the diskwatch*
files from there. There are small hooks in the various disk drivers,
too; for those, at the moment, all I have to suggest is cloning the
relevant git repo and using git log to look for commits that touch the
disk driver(s) you care about.

The rest of this file (after the line of equal signs below) is the
original README. Some of it is out of date (for example, where it
talks about stuff in various directories here) but much of it is not
(such as where it talks about backup files and vnconfig).

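As a rough sketch of the fetch steps described above (not something
from the original distribution): the repo URLs are the ones given in
the text, but the variable and function names, the local directory
name, and the choice of OS=5.2 are illustrative assumptions. The
commands are wrapped in functions so that nothing is fetched until you
call them.

```shell
# OS must be one of 1.4T, 4.0.1, or 5.2 (the trees named in the text).
OS=5.2                                   # assumption: pick your own
LB_REPO=git://git.rodents-montreal.org/livebackup
SRC_REPO=git://git.rodents-montreal.org/Mouse/netbsd-fork/$OS/src

# Fetch the userland code (livebackup).
fetch_livebackup() {
    git clone "$LB_REPO"
}

# Fetch the /usr/src tree that carries the diskwatch kernel code.
# The local directory name is hypothetical.
fetch_kernel_tree() {
    git clone "$SRC_REPO" "netbsd-$OS-src"
}

# Run inside the cloned tree: list commits touching one driver file.
# $1 is the path of whichever disk driver you care about.
driver_history() {
    git log --oneline -- "$1"
}
```

Call fetch_kernel_tree, cd into the resulting directory, and then run
driver_history with the path of the driver source file in question.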
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

================

Your attention is drawn to, in particular, the CHANGELOG file in
1.4T/user/livebackup/livebackup-*/ - the * represents the date I last
worked on the code, and will change as I put up new versions.

1.4T/user/ is further broken down into the various pieces involved.
The build system I use uses some additional stuff, which is in
1.4T/aux/; if you don't want to set up a build system like mine, you
can just dump the .c and .h files in the same directory and use cc -I.
when compiling. 1.4T/kaux/ contains the patches to add timer sockets.
While these are used by the livebackup code, they are not strictly
part of it, so I've kept their patches separate.

Since I don't have a 2.0 system really set up properly for my build
system, I've just dumped everything into one directory there. The
aux/ directory holds bits that don't really have anything to do with
the livebackup software (at this writing, only one file). The kernel/
and kaux/ subdirectories are patch trees, with directory structure
parallel to /usr/src. (For example, kaux/sys/sys/socket.h is a patch
for /usr/src/sys/sys/socket.h.)

The port to 2.0 is a bit rough - the code really wants to use timer
sockets for timing, and rather than redesign it, I have it fork
another process with a pipe between them which acts enough like a
timer socket to make the code work. I suspect someone who knows
kqueue could use it to implement something much like timer sockets.

The 2.0 port now includes the server side. It works for me, in that
it starts without crashing and exhibits basic functionality; if you
have reason to think it's busted somehow, please let me know!

Note that after applying the kernel patches, you will need to create
entries in /dev.
For 2.0, assuming you used the patch to sys/conf/majors as
distributed, these should look like

	crw-------  1 root  165, 0 May  4 18:31 diskwatch0ctl
	crw-------  1 root  165, 1 Feb 10 22:00 diskwatch0data
	crw-------  1 root  165, 2 Feb 10 22:00 diskwatch0dbg
	crw-------  1 root  165, 4 May  5 07:20 diskwatch1ctl
	crw-------  1 root  165, 5 Feb 10 22:00 diskwatch1data
	crw-------  1 root  165, 6 Feb 10 22:00 diskwatch1dbg

(for two diskwatch devices; add 4 more to the minor numbers for each
additional device). For 1.4T, it will depend on which architecture
you're using - check your architecture's conf.c (eg,
sys/arch/alpha/alpha/conf.c for an Alpha, sys/arch/i386/i386/conf.c
for i386, etc) for the diskwatch major number. (The minor number
scheme is the same for 1.4T and 2.0.)

It's entirely possible I've missed something; if you think you've
found something I've missed, please let me know. Because of time
constraints, I have to get this stuff up for FTP before testing it
nearly as thoroughly as I'd like, so if you think something is missing
it is reasonably likely that you're right; don't waste a lot of time
looking for your mistake on the theory that I couldn't possibly have
got it wrong. :-)

After corresponding with someone trying to use this, there are some
things that need remark.

 - The code uses labeled control structure. These take the form of
   strings in angle brackets <"like this"> on various flow control
   constructs. When building native, these are recognized by the
   compiler (I've added them to the gcc I use); I have a program which
   is designed to run between the preprocessor and the compiler proper
   which converts them to semantically equivalent gotos and labels,
   for the benefit of those who don't want to do likewise to their
   compilers. Look in ../mouseware/ (that is, a sibling directory to
   the one this README is in) and read the README and PACKAGES files;
   the piece you want is called lcs-cvt.

 - There is a bug in the diskwatch code: if a user process sets a
   diskwatch unit watching a partition, and then (the same or another)
   user process tries to set the same diskwatch unit watching a
   different partition, the code will get confused: it will be
   watching both partitions but will have lost track of the first one.
   This is usually only a minor issue, since (unless you're crazy
   enough to let non-root users access the diskwatch devices), you
   have to be root to exploit it, but it does need to be fixed.

 - There is no way to watch a single partition multiple times. This
   is actually a shortcoming of the kernel diskwatch code, not of
   lb/lbd, and I know of nothing even approximating a workaround in
   the strict sense. But if you just want to keep multiple
   simultaneous backups of a partition, you can put the backup copy in
   its own partition on the server and then use lb on the server to
   back that partition up somewhere else.

 - When a client starts a rescan, it always scans from the beginning
   of the disk to the end. If it never completes a rescan, it will
   never have a complete backup. It would probably be better to start
   at a random place, so that after a suitably large number of partial
   rescans it probably will have hit all the disk. (Of course, this
   is at best a stopgap measure; you really should let it finish a
   rescan to have a good copy.)

 - For use in environments where network bandwidth is relatively
   precious, it should be possible to have a rescan send over
   checksums of a whole chunk of blocks, breaking it down into
   individual blocks only if the checksums over the chunk differ.
   Pro: less network bandwidth used (important for roaming users who
   have bandwidth caps). Con: checksums some blocks twice (slow); if
   a chunk is too large, most chunks will differ and it will be a net
   lose - but if a chunk is too small, too many identical checksums
   will be sent which could be collapsed into larger chunks. Finding
   the sweet spot here is somewhat nontrivial.
   (Thought: try to find it automatically?)

Someone asked me if you can do a "loopback" backup, a backup where the
partition being backed up and the backup copy are both on the same
machine. Yes, you can do this (though of course there's little point
unless they're on different spindles). If you want to be extra-sure
that the data won't actually escape onto the network, you can use
127.0.0.1 in both lb's "address to contact lbd at" field and lbd's
"address to listen for connections at" field.

You need to build a kernel with "pseudo-device diskwatch 2" (the
number needs to be at least the number of partitions you're siccing lb
on simultaneously) and boot that kernel before lb will start. If you
get whines like

	lb: can't open diskwatch control device /dev/diskwatch0ctl: Device not configured

then this is probably what's wrong. (It could also be that you have
the device major/minor numbers mismatched between your /dev entries
and your kernel, so check those if you think your kernel is supposed
to have diskwatch in it.)

At first sight it may look as though you can use lb on RAW_PART
(partition c on most ports, partition d on i386 and maybe a few
others) to back up an entire drive. While this will initially appear
to work, it will not work right, because the mechanism that sidetracks
copies of writes works only when the partition being written to is the
partition being watched. Watching one partition will not notice
writes occurring to another partition, even if the sectors actually
written belong to both partitions. (RAW_PART is the commonest case of
such overlap, but this is actually true more generally.) Thus, using
RAW_PART will update the backup partition whenever it does a rescan,
but unless that's the partition that's actually being used, you won't
get *live* backups.

It is reasonable to take a backup image as written by lbd and use
vnconfig to attach a vnd to it to peek inside. However, there are two
caveats.
The first is the gotcha mentioned in one of the "to be fixed" items
above: the file may not be quite as large as it should be. The other
is that stretches of 0x00s on the disk tend to turn into holes in the
backup image, since lbd won't write them during its initial scan if
it's creating the file - and vnd does not get along well with holes in
its backing file. There are at least three ways to deal with this.

 - Fix vnd. This is the best long-term fix, but is probably beyond
   most users of my software.

 - Pre-create the backup file, at least as big as the partition being
   backed up, before first having lbd update it - but use a tool that
   writes the entire file (such as dd if=/dev/zero) - tools that
   merely set the file size without writing the whole file won't help
   with this problem.

 - Fill in the holes with something like

	dd if=thefile of=thefile conv=notrunc bs=1048576

   before attaching the vnd to the file.

The name livebackup is perhaps unfortunate. While preparing to talk
about this software at BSDCan 2005, I discovered there is a commercial
product called LiveBackup, from storactive.com; the name is apparently
too obvious a name for the functionality (while their code is
file-level rather than filesystem-level and for Windows rather than
NetBSD, it sounds as though it's otherwise philosophically similar,
backing up changes in real time or near-real-time). My code has
nothing to do with their product except for the coincidence of the
internal name. (Directories have to have _some_ name.)

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
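
As a footnote to the vnd caveats above, here is a minimal sketch of
the two dd-based workarounds. It is not taken from the livebackup
distribution. The file names, the scratch directory, and the 4 MB size
are illustrative assumptions; a real image needs to be at least as big
as the partition being backed up.

```shell
# Demo values only (assumptions): use your partition's size and your
# own file names in practice.
SIZE_MB=4
WORK=$(mktemp -d)
IMG="$WORK/backup.img"
HOLEY="$WORK/holey.img"

# Workaround: pre-create the image by really writing every block,
# rather than merely setting the file size.
dd if=/dev/zero of="$IMG" bs=1048576 count="$SIZE_MB" 2>/dev/null

# Stand-in for an image lbd created with holes: write a single byte
# at the last offset, which typically leaves the rest unallocated.
printf 'x' | dd of="$HOLEY" bs=1 seek=$((SIZE_MB * 1048576 - 1)) 2>/dev/null

# Workaround: rewrite the file onto itself so the holes get filled.
# conv=notrunc keeps dd from truncating the file it is also reading.
before=$(cksum < "$HOLEY")
dd if="$HOLEY" of="$HOLEY" conv=notrunc bs=1048576 2>/dev/null
after=$(cksum < "$HOLEY")

# On NetBSD, as root, the image could then be attached with e.g.
#   vnconfig vnd0 "$IMG"
```

The before and after checksums are there only to show that the
in-place rewrite leaves the file's contents unchanged.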