lb - live backup

This documentation file is in the public domain.

lb mirrors a disk over the network (usually to a file, but potentially
to another disk).

There are two modes of operation: client and server.  As a client, it
receives notifications from the kernel of writes and mirrors them over
the net to the server.  If it loses the connection to the server, it
retries persistently to reestablish it until it succeeds.  As a server,
it listens for the client to connect, then handles the server side of
the mirroring.  If it loses the connection to the client, it goes back
to waiting for the client to connect.

You will need one <address,port> pair for each filesystem to be
mirrored, since there is no other demultiplexing layer.

Since lb is intended to be useful even over long-haul networks, it
encrypts all communication, leveraging a shared secret in a manner
vaguely akin to (but quite different in detail from) Diffie-Hellman, to
generate a key used for encryption in each direction.

The network protocol between the client and server is fairly simple.
At startup, the client and server exchange data from which shared
encryption keys are derived.  After this exchange, all communication is
encrypted.  A verifier is then exchanged, so that each end can confirm
that the shared keys are in fact shared.  Finally, a simple packetized
protocol is run in each direction.

In detail:

On startup, each end generates and sends, unencrypted, 16 bytes of
random data.  If the data sent by the server is S, the data sent by the
client is C, and the shared key is K, then each end computes thus:

Let M[0] = SHA1( K || S || 0x73 0x2d 0x3e 0x63 || C || K )
Let M[i] = SHA1( K || S || 0x73 0x2d || M[i-1] || 0x3e 0x63 || C || K )
	for i in 1..31
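
As an illustration only, the M[] chain above might be computed like
this in Python.  This is a sketch, not lb's actual code: the function
name derive_m and its argument names are invented for the example.
Note that the constant bytes 0x73 0x2d 0x3e 0x63 spell "s->c" in ASCII.

```python
import hashlib

def derive_m(k, s, c):
    """Return the list M[0]..M[31] for the server->client direction.

    k is the shared secret K; s and c are the 16-byte random strings
    sent by the server and the client.  (Illustrative names only.)
    """
    # M[0] = SHA1( K || S || 0x73 0x2d 0x3e 0x63 || C || K )
    m = [hashlib.sha1(k + s + b"\x73\x2d\x3e\x63" + c + k).digest()]
    for _ in range(1, 32):
        # M[i] = SHA1( K || S || 0x73 0x2d || M[i-1] || 0x3e 0x63 || C || K )
        prev = m[-1]
        m.append(hashlib.sha1(k + s + b"\x73\x2d" + prev + b"\x3e\x63" + c + k).digest())
    return m
```

Each M[i] is a 20-byte SHA-1 digest, so the result is 32 strings of 20
bytes each.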

Compute a 237-byte string A by overlapping the M[] values thus (ie,
each shifted by 7 bytes relative to the last) and adding overlapped
bytes modulo 256 (ie, with carries within, but not between, bytes):

M[0] = x x x x x x x x x x x x x x x x x x x x
M[1] =               x x x x x x x x x x x x x x x x x x x x
M[2] =                             x x x x x x x x x x x x x x...
M[3] =                                           x x x x x x x...
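
The overlap-and-add step pictured above can be sketched as follows.
This is a hedged Python illustration, not lb's code; overlap_add is a
made-up name, and the all-0xff input is just dummy data standing in
for the real M[] values.

```python
def overlap_add(blocks, shift=7):
    """Overlay equal-length byte strings, each offset `shift` bytes
    from the previous one, adding overlapping bytes modulo 256."""
    width = len(blocks[0])
    out = bytearray(shift * (len(blocks) - 1) + width)
    for i, block in enumerate(blocks):
        for j, byte in enumerate(block):
            # Reducing modulo 256 keeps carries within each byte.
            out[i * shift + j] = (out[i * shift + j] + byte) % 256
    return bytes(out)

# 32 blocks of 20 bytes, shifted 7 bytes each: 7*31 + 20 = 237 bytes.
a = overlap_add([b"\xff" * 20 for _ in range(32)])
```

With 32 inputs of 20 bytes and a shift of 7, the output length is
7*31 + 20 = 237, matching the size of A.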

The 256-byte string A || SHA1(A)[0..18] is used as an arcfour key;
after keying, the first 65536 bytes of the key stream are discarded.
The result is used to encrypt the server->client data stream.

The same computation is repeated, with S and C interchanged and the
0x73 and 0x63 in the M[] computation interchanged, for client->server
encryption.
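
For illustration, keying arcfour from A and discarding the start of the
key stream could look like the Python sketch below.  The class Arcfour
and the helper stream_for are names invented for this example; the
cipher itself is the standard arcfour (RC4) algorithm.  One such stream
would be built per direction, each from its own A.

```python
import hashlib

class Arcfour:
    """Plain arcfour (RC4) stream cipher with an optional initial discard."""

    def __init__(self, key, drop=0):
        s = list(range(256))
        j = 0
        for i in range(256):            # key scheduling
            j = (j + s[i] + key[i % len(key)]) % 256
            s[i], s[j] = s[j], s[i]
        self.s, self.i, self.j = s, 0, 0
        self.crypt(bytes(drop))         # throw away `drop` key stream bytes

    def crypt(self, data):
        """XOR data with the next len(data) key stream bytes."""
        s, i, j = self.s, self.i, self.j
        out = bytearray()
        for b in data:
            i = (i + 1) % 256
            j = (j + s[i]) % 256
            s[i], s[j] = s[j], s[i]
            out.append(b ^ s[(s[i] + s[j]) % 256])
        self.i, self.j = i, j
        return bytes(out)

def stream_for(a):
    """Build the cipher for one direction from the 237-byte string A."""
    if len(a) != 237:
        raise ValueError("A must be 237 bytes")
    key = a + hashlib.sha1(a).digest()[:19]   # A || SHA1(A)[0..18], 256 bytes
    return Arcfour(key, drop=65536)
```

Since arcfour encryption and decryption are the same XOR operation, the
receiving end builds an identical stream and calls crypt() on what it
reads.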

After this is done, each end generates another 16 bytes of random data,
which it writes to the peer, encrypted with the first 16 bytes of the
encryption stream for that direction.  The peer decrypts them and echoes
them back encrypted suitably for the other direction - thus, each side
should get its random token back unchanged, after all encryption and
decryption have been done.  If this is not so, the end that discovers
the problem summarily drops the connection.
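
The verifier exchange can be simulated in a single process, as in the
Python sketch below.  It is a simplified model under stated
assumptions, not lb's code: the names arcfour, xor and verify are
invented, the 65536-byte discard is left out for brevity, and it
assumes each end sends its own token before the echo on a given
stream.

```python
import os

def arcfour(key):
    """Generator yielding arcfour key stream bytes for `key`."""
    s, j = list(range(256)), 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]

def xor(stream, data):
    """Encrypt or decrypt data with the next bytes of the key stream."""
    return bytes(b ^ next(stream) for b in data)

def verify(client_c2s, client_s2c, server_c2s, server_s2c):
    """Return (client_ok, server_ok) given the keys each end derived."""
    c_send, c_recv = arcfour(client_c2s), arcfour(client_s2c)
    s_send, s_recv = arcfour(server_s2c), arcfour(server_c2s)
    c_token, s_token = os.urandom(16), os.urandom(16)
    # Each end sends its token, encrypted for its outgoing direction.
    wire_c = xor(c_send, c_token)
    wire_s = xor(s_send, s_token)
    # Each end decrypts the peer's token and echoes it back the other way.
    echo_from_server = xor(s_send, xor(s_recv, wire_c))
    echo_from_client = xor(c_send, xor(c_recv, wire_s))
    # Each end decrypts the echo and compares it with what it sent.
    client_ok = xor(c_recv, echo_from_server) == c_token
    server_ok = xor(s_recv, echo_from_client) == s_token
    return client_ok, server_ok
```

When both ends derived the same keys, both checks pass; if either key
differs, the token comes back garbled and the end that notices would
drop the connection.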

All multi-byte numbers below are sent big-endian (network byte order).

Assuming the crypto exchange passes, each end drops into a simple
packetizing protocol.  A packet consists of a one-byte type, followed
by additional data whose quantity and interpretation depend on the
type.  The types are (see lb.h for numeric values):

LB_DATA (client->server only)
	This represents a block of data to be written to the backup.
	It is followed by 516 bytes: first 4 bytes of block number,
	then 512 bytes of block data.

LB_RQSUMS (client->server only)
	This is a request to send back checksums of some blocks.  It is
	followed by 10 bytes: first 4 bytes of starting block number,
	then 4 bytes of block count, then 1 byte of blocking factor,
	then 1 byte of checksum type.  The server responds with a
	stream of LB_SUMS packets.  If the block count is N and the
	blocking factor is F, then ceil(N/F) LB_SUMS packets will be
	generated.  All but the last will contain F checksums; the last
	will contain N-(F*(ceil(N/F)-1)) checksums.  (This is modified
	if the request is aborted with LB_STOPSUM.)

	Checksum types are given below.

LB_STOPSUM (client->server only)
	This aborts an LB_RQSUMS the server is still responding to.  It
	carries no additional data.  It always provokes an LB_ABORTED;
	if the server is still generating LB_SUMS packets when it
	processes the LB_STOPSUM, it stops doing so as soon as it sees
	the LB_STOPSUM packet.  It is an error for the client to send
	another LB_RQSUMS before a previous one has finished, without
	doing an LB_STOPSUM first.  It is not an error to send an
	LB_STOPSUM when no LB_RQSUMS is in progress, so the race
	between the server finishing an LB_RQSUMS and the client
	aborting it is a non-issue.

LB_SIZE (client->server only)
	This passes the partition size from the client to the server.
	It is normally sent when the client finishes a rescan.  It
	carries 4 bytes of data, which give the partition size in
	blocks.

LB_STATUS (client->server only)
	This reports client status - a string of 0 to 255 octets - to
	the server, for reporting to humans.  It carries 1-256 bytes of
	data: the first byte is the string length, with that many more
	bytes holding the string.

LB_SUMS (server->client only)
	This carries checksums.  It is followed by a variable number of
	checksums, as described under LB_RQSUMS.

LB_ABORTED (server->client only)
	This indicates that an LB_STOPSUM has been processed.  It
	carries no additional data.  If an LB_RQSUMS was in progress,
	no further LB_SUMS will be generated for it after the
	LB_ABORTED.

LB_PING (either direction)
LB_PONG (either direction)
	These form an are-you-alive test.  Neither type carries any
	additional data.  LB_PING always provokes an LB_PONG response,
	without affecting anything else that may be in progress;
	LB_PONG is never sent except in response to LB_PING.

LB_CAP (either direction)
	This negotiates capabilities.  It comes in three forms: query,
	negative response, and positive response.  Query is used to
	inquire whether the other end supports (and wishes to perform)
	some capability.  Negative responses are used for "no" answers
	to queries; positive, for "yes".

	The first byte of data indicates which kind of packet it is
	(LB_CAP_QUERY, LB_CAP_NEGATIVE, or LB_CAP_POSITIVE).  This is
	then followed by one octet of capability name length, then that
	many octets of capability name.  For QUERY and NEGATIVE, there
	is no further data.  For POSITIVE, there is one more octet,
	which is either zero, if the capability does not need an opcode
	octet, or an opcode octet, chosen from the dynamically
	allocated range, which is to be used for that capability.  This
	opcode applies only to traffic sent by the end which queried;
	if the capability involves traffic in the other direction, it
	either needs to be negotiated with a QUERY/POSITIVE exchange in
	the other direction or it needs to choose and communicate a
	suitable opcode octet some other way.

	Capabilities are, in general, enabled separately in each
	direction, though of course a capability definition could
	specify that enabling it in one direction does in some sense
	imply enabling it in the other direction as well.
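
As a rough illustration of the packet layouts described above, here is
a Python sketch that builds a few of them (before encryption).  The
numeric type values are placeholders chosen for the example; the real
ones are in lb.h.  The function names are likewise invented, and
rejecting a blocking factor of zero is an assumption, since the text
above does not say what that case means.

```python
import struct

# Placeholder values for illustration only; see lb.h for the real ones.
LB_DATA, LB_RQSUMS, LB_STATUS, LB_CAP = 1, 2, 3, 4
LB_CAP_QUERY, LB_CAP_NEGATIVE, LB_CAP_POSITIVE = 0, 1, 2

def pack_data(blkno, block):
    """LB_DATA: type, 4-byte big-endian block number, 512 bytes of data."""
    if len(block) != 512:
        raise ValueError("block must be 512 bytes")
    return struct.pack(">BI", LB_DATA, blkno) + block

def pack_rqsums(start, count, factor, cktype):
    """LB_RQSUMS: type, start block, block count, blocking factor, checksum type."""
    return struct.pack(">BIIBB", LB_RQSUMS, start, count, factor, cktype)

def sums_per_packet(count, factor):
    """Checksums in each LB_SUMS reply, assuming no LB_STOPSUM intervenes."""
    if factor == 0:
        raise ValueError("blocking factor 0 is not covered by this sketch")
    npackets = -(-count // factor)          # ceil(N/F)
    if npackets == 0:
        return []
    return [factor] * (npackets - 1) + [count - factor * (npackets - 1)]

def pack_status(text):
    """LB_STATUS: type, one length byte, then 0 to 255 bytes of string."""
    if len(text) > 255:
        raise ValueError("status string too long")
    return bytes([LB_STATUS, len(text)]) + text

def pack_cap(kind, name, opcode=0):
    """LB_CAP: type, kind, name length, name; POSITIVE adds an opcode octet."""
    pkt = bytes([LB_CAP, kind, len(name)]) + name
    if kind == LB_CAP_POSITIVE:
        pkt += bytes([opcode])              # zero means no opcode needed
    return pkt
```

For example, a request for 10 blocks with a blocking factor of 4 would
draw three LB_SUMS packets carrying 4, 4, and 2 checksums.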

Checksum types:

CKT_SHA1
	Each checksum is 20 bytes, containing the SHA-1 hash of the
	block.

CKT_SUM_SHA1
	Each checksum is 21 bytes, the first containing the mod-256 sum
	of all bytes in the block, the rest containing the SHA-1 hash
	of the block.
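
The two checksum formats can be sketched in Python as follows.  The
function names are invented for the example; only the byte layouts
come from the descriptions above.

```python
import hashlib

def ckt_sha1(block):
    """CKT_SHA1: the 20-byte SHA-1 hash of the block."""
    return hashlib.sha1(block).digest()

def ckt_sum_sha1(block):
    """CKT_SUM_SHA1: one byte of mod-256 byte sum, then the SHA-1 hash."""
    return bytes([sum(block) % 256]) + hashlib.sha1(block).digest()
```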