New cloud member connects to any existing cloud member Cloud members have: - cloud-wide master key, K - cloud IP Cloud members attempt to establish a complete graph of connections Delays get flooded throughout the cloud in an attempt to use well-working paths B(x) is a one-byte string whose byte contains x. HASH(s) is the SHA256 hash operation. EECB(p,k) is ECB Rijndael encryption of p with key k. p and k must each be 32 bytes long. DECB(p,k) is ECB Rijndael decryption of p with key k. p and k must each be 32 bytes long. CRC32(s) is CRC-32 with polynomial edb88320 and initial value 31bb29b0. on connection: - listener sends 16 random bytes, Rl. - Connecter computes Rc = 16 random bytes t = CRC32(K||Rl||K||Rc||K) b = one-byte XOR of all bytes of t xs = B(b) || B(b+1) || ... || B(b+31) s = Rl || t || Rc hk = HASH(K) for (i=0;i<64;i++) s = EECB(EECB(0,HASH(K||B(i)||s||B(i)||K)) XOR xs, hk) and sends Rc || b || s - Listener checks this. On failure, listener silently throws away all state, leaving the connecter with a half-open connection. On success, listener sleeps for a short time (a few seconds) before continuing. - Each end computes Kl2c = HASH( "->" || K || Rl || "->" || Rc || K || "->" ) Kc2l = HASH( "<-" || K || Rl || "<-" || Rc || K || "<-" ) - Kl2c and Kc2l are master keys in the listener-to-connecter and connecter-to-listener directions. In each direction, this key is called Kd; the other direction's Kd is Kd'. - Immediately after the above, each direction rekeys. To rekey, an end: - Generates 32 random bytes, Rk, and sends EECB(Rk,Kd) (in the clear). - Computes t = Kd || Rk || Kd' k = "" for (i=0;i<8;i++) t = EECB(HASH(t||Rk||B(i)||Rk'||t),Kd) k = k || t The resulting k is used as an arcfour key for data that end sends, with the first 64KB of the arcfour keystream discarded. The receiver must of course mimic the computation using the decoded Rk and the Kd and Kd' values the other end used, to obtain the key to decrypt the data stream. - Generates two random bytes, L1 and L2, and sends L1 and L2 as the next two bytes of data (ie, the first two bytes encrypted with the new arc4 keystream), L1 first. The number of bytes before the next rekey, ie, the number of bytes sent between L2 and the next rekey's EECB(Rk,Kd), is 0x3fff0000 + (L1 * 256) + L2. Senders and receivers must keep track of bytes sent and received to know where rekeys occur in the data stream. Data layered atop the above low-level protocol consists of a stream of packets. A packet consists of length, type, and contents; the contents may be up to 65535 bytes long. The type can be one of IGNORE The whole packet is ignored by its receiver. PING PONG These allow a member to (a) determine connectivity to and (b) find out the round-trip time to another member. WHERE-ARE-YOU I-AM-AT These allow cloud members to learn where they might be able to connect directly to other could members. DISTANCE This carries cloud members' reachabilities. IP This is an IP packet for another cloud member. In detail: A packet consists of: 1 T Type 6d IGNORE 46 PING 1a PONG 91 WHERE-ARE-YOU 37 I-AM-AT e3 DISTANCE 8c IP (These were chosen randomly under the constraint that each pair of codes have a Hamming distance at least 4.) 2 L Length (big-endian) L Contents Detailed packet descriptions: IGNORE Contains anything. Ignored by its receiver. PING Contains anything. Provokes a PONG from its receiver. PONG Sent only in response to a PING. Contains the same thing as the PING which provoked it. WHERE-ARE-YOU Contains dst cloud member IP address src cloud member IP address nonce zero or more bytes chosen by sender This is routed as if it were an IP packet for dst; that cloud member then generates an I-AM-AT in response. I-AM-AT Sent only in response to a WHERE-ARE-YOU. Contains dst cloud member IP address N one-byte count of following addresses loc[] N IP address/port pairs nonce zero or more bytes; copied from WHERE-ARE-YOU This is routed as if it were an IP packet for dst, which is the src from the provoking WHERE-ARE-YOU. N and loc[] are IP addresses and ports at which the I-AM-AT's originator thinks it might be reachable by dst. nonce is copied from the WHERE-ARE-YOU. If packet length limits prevent including the entire nonce, it should be truncated. If N would naïvely be more than 255, the cloud member generating the I-AM-AT must choose no more than 255 addresses to include in the packet. DISTANCE Contains addr cloud member IP address delay delay, a 4-byte integer in network byte order This indicates that the delay to addr via the member from which the DISTANCE was received is delay. Delays are given in milliseconds. Upon receiving a DISTANCE from a sender S giving distance D to addr A, a receiver R updates its record of the last delay received from S for A to D and computes the minimum delay to A across all its neighbours as its delay to A. If the delay to A changes as a result of this, R generates DISTANCEs to its neighbours for A, with the delay in the packet to neighbour N being R's delay for A plus R's PING-measured delay to N. Every cloud member has a permanently-fixed zero delay value for itself. IP Contains a cloud member's IP address (see below), followed by zero or more bytes of packet contents. This packet is routed to the cloud member whose IP it bears; that member then treats everything after the address as a reeceived IP packet. If the receiver believes the destination member is unreachable, it should silently drop the packet. An IP address is represented as a protocol ID byte followed by zero or more bytes depending on the protocol ID. When IP addresses are accompanied by port IDs, the port information is also protocol-specific and follows the address. Currently defined protocol ID bytes, with descriptions of their address and port formats: 0 IPv4 Address: four bytes in network byte order. Port: two bytes in network byte order. 1 IPv6 Address: sixteen bytes in network byte order. Port: two bytes in network byte order. Each member monitors the delay to each of its neighbours with PING/PONG packets. Each member also periodically sends each of its neighbours a DISTANCE packet, with addr set to its own address and delay set to the PING-measured delay to that neighbour. In the absence of topology changes, these are sent out approximately once a minute, with some randomness to avoid periodic traffic spikes: the time between each DISTANCE and the next, in seconds, is a number uniformly distributed between 60 and 65. Given reason, such as a topology change, this time may be cut short and a new DISTANCE generated immediately.