[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[freehaven-cvs] Adjust pynchon-gate message-encrypting-and-packing f...
Update of /home/freehaven/cvsroot/doc/pynchon-gate
In directory moria.mit.edu:/tmp/cvs-serv1935
Modified Files:
pynchon-spec.txt
Log Message:
Adjust pynchon-gate message-encrypting-and-packing format to explain how to summarize undelivered messages for receipt/decryption in later cycles. Still not perfect.
Index: pynchon-spec.txt
===================================================================
RCS file: /home/freehaven/cvsroot/doc/pynchon-gate/pynchon-spec.txt,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- pynchon-spec.txt 15 Aug 2004 20:43:29 -0000 1.5
+++ pynchon-spec.txt 1 Sep 2004 19:47:22 -0000 1.6
@@ -5,6 +5,32 @@
original design by Bram Cohen and Len Sassaman)
+
+Table of Contents
+
+ 1. Overview
+ 1.1. Notation and terminology
+ 1.2. Cryptographic functions
+ 2. Encrypting and packaging messages
+ 2.1. Key maintenance
+ 2.2. Packaging messages
+ 2.2.1. Message synopses
+ 2.3. Managing pending messages
+ 2.4. Decrypting packaged messages
+ 3. The bucket pool
+ 3.1. Bucket pool formats
+ 3.2. Constructing the bucket pool
+ 3.3. Distributing the bucket pool
+ 4. Retrieving the bucket pool
+ 4.1. PIR protocol
+ 4.1.1. Error codes
+ 4.1.2. Optimizing the distributor implementation
+ 4.2. Retrieving a bucket
+ 4.3. Retrieving a cycle's worth of messages
+ 5. Account administration
+ 6. System information
+
+
1. Overview
The Pynchon Gate Protocol (PynGP) uses Private Information Retrieval to
@@ -54,21 +80,41 @@
R(n) -- n octets of random data.
INT(v,n) -- The nonnegative integer "v", expressed as a n-octet value
in network (big-endian) order.
- PRNG(s,n) -- A cryptographically string pseudorandom sequence of n octets,
+ PRNG(s,n) -- A cryptographically strong pseudorandom sequence of n octets,
generated using the seed s.
Z(n) -- A sequence of n zero octets.
ENC(s,k) -- Encrypt the string s using a stream cipher, with the key k.
DEC(s,k) -- Decrypt the string s using a stream cipher, with the key k.
+ COMPRESS(s) -- Compress the string s using the "deflate" algorithm and the
+ zlib message format as defined in RFCs 1950 and 1951.
+
+ "abc" -- A three-byte string, containing the ASCII values for the
+ characters "a", "b", and "c". Unless otherwise noted, strings are
+ _not_ NUL-terminated.
+
+ [61 62 63] -- A three-byte string, containing bytes with the hexadecimal
+ values 61, 62, and 63.
+
+ x[a:b] -- A substring of the string x, starting on the a'th character of
+ x, and ending with the b-1'th character of x. (Thus, x[0:LEN(x)] =
+ x.)
HASH_LEN -- The length of the output of our cryptographic digest function.
SEED_LEN -- The length of the seed for our PRNG.
- K -- A client-selected security parameter: how many distributors will the
- client connect to?
+ KEY_LEN -- The length of the keys used by our encryption functions.
- "abc" -- A three-character string. Unless otherwise noted, strings are
- _not_ NUL-terminated.
+ K -- A client-selected security parameter determining how many
+ distributors the client will connect to.
-2. Packaging messages
+ (We require that HASH_LEN >= KEY_LEN and HASH_LEN >= SEED_LEN)
+
+1.2. Cryptographic functions
+
+ To implement PynGP, we recommend AES-128 in counter mode for encryption
+ (so that ENC == DEC); SHA-256 for our hash function; and ENC(Z(n),s) as
+ our PRNG.
+
+2. Encrypting and packaging messages
2.1. Key maintenance
@@ -85,6 +131,9 @@
SUBKEY(j+1,i) = H(SUBKEY(j,i) | "NEXT SECRET")
SUBKEY(0,i) = H(S[i] | "NEXT SECRET")
+ {Rationale: We use a separate chain of keys for each cycle so that it is
+ easier for a user to resynchronize after missing a few cycles.}
+
Also, the nymserver derives an opaque UserID for each recipient each
cycle:
UserID[i] = H(S[i] | "USER ID")
@@ -102,27 +151,46 @@
nymserver. These messages can be emails, padding, or other status
messages.
- The messages in a cycle i are encrypted in order, with the j'th message on
- i encrypted using SUBKEY(j,i). Encryption uses a stream cipher, so that
- encrypted message length is the same as plain-text length. Since keys are
- not re-used, no IV is necessary.
+ The messages in a cycle i are encrypted in order, with the j'th message
+ received on i encrypted and identified using derivatives of SUBKEY(j,i) as
+ follows:
+
+ MsgID(j,i) = H(SUBKEY(j,i) | "MESSAGE ID")
+ MsgKey(j,i) = H(SUBKEY(j,i) | "MESSAGE KEY")
+
+ MsgKey(j,i) is used as a key to encrypt the message, and MsgID(j,i) is
+ prepended to the encrypted message in order to identify it to the user
+ (Message IDs are necessary because messages may be packaged out of order.)
+ Encryption uses a stream cipher, so that encrypted message length is the
+ same as plain-text length. Since keys are not re-used, no IV is
+ necessary.
+
+ {Rationale: We derive MsgKey from SUBKEY rather than using SUBKEY
+ directly, so that a user can store MsgKey(j,i) without worrying that a
+ compromise will expose MsgKey(j+1,i).}
+
+ [XXXX Should we just have MsgID(j,i) = H(MsgKey(j,i)) ? - NM ]
+
+ The 0'th subkey is reserved for the INDEX message. The 1'st subkey is
+ reserved for the SUMMARY message.
All messages begin with a single-octet 'TYPE' field. The values for TYPE
are:
- PAD_REST [00]
- MAIL [01]
- ACK [02]
+ INDEX [00]
+ PAD_REST [01]
+ MAIL [02]
+ ACK [03]
+ SUMMARY [04]
ERROR [FF]
All message types except for PAD_REST use in the following format:
TYPE Message type [1 octet]
- LEN Message length [4 octets]
DATA Message body [LEN octets]
HASH Hash of message [HASH_LEN octets]
- HASH is equal to H(TYPE|LEN|DATA). It is redundant.
+ HASH is equal to H(TYPE|DATA). It is redundant.
PAD_REST uses a variable length format:
@@ -130,38 +198,120 @@
PADDING Random data [Up to end of the current bucket]
PAD_REST can only appear as the last of a user's messages in a given
- cycle; thus, it does not need a length field. (In fact, it had better
- *not* have a length field, since it needs to be able to fit in one octet.)
+ cycle; thus, it does not need a length field.
Other message types have bodies as follows:
- TYPE: MAIL [01]
+ TYPE: INDEX [00]
+ DATA: A description of the contents of all messages in today's
+ buckets. It begins with the number of messages in today's
+ buckets, encoded as a 4-byte big-endian integer. Then,
+ for each message, in includes:
+ ID -- The MsgID for the message [HASH_LEN octets]
+ LEN -- The length of the encrypted message [4 octets]
+ NOTES: There MUST be exactly one index message in each cycle's worth
+ of buckets, and it MUST appear first. The index message
+ itself is not included in the count of messages.
+
+ The index message is always encrypted with MsgKey(0,i).
+
+ TYPE: MAIL [02]
DATA: One or more email messages, concatenated, then compressed.
- Each message M is packaged as INT(LEN(M),4)|M .
+ Each message M is prefixed with a length field, as
+ INT(LEN(M),4)|M .
NOTES: Nymservers SHOULD generate MAIL messages for incoming email
as greedily as possible, to minimize the amount of time that
unencrypted mail is stored.
- TYPE: ACK [02]
+ TYPE: ACK [03]
DATA: C Cookie [HASH_LEN octets]
NOTES: Acknowledges that the nymserver has received a control
message from the user and processed it. The 'cookie'
must match the cookie sent in the control message being
acknowledged.
+ TYPE: SUMMARY [04]
+ DATA: Message synopses for messages not included in this cycle's
+ bucket. See 2.2.1 below.
+
TYPE: ERROR [FF]
DATA: ERR Error code [2 octets]
MSG Message [Variable]
+2.2.1. Message synopses
- {XXXX Add more message types, including 'summary of pending mail'.}
+ When it is not possible for the nymserver to package all of a user's
+ pending mail, it instead packages as much as it can, and uses the rest of
+ the user's space to include a synopsis for each undelivered message it
+ received in that cycle.
- {XXXX There's a problem with our encryption scheme and our long-term
- goals. We'd like to be able to queue pending mail if there's too much
- mail, or set rule for delivering mail according to user-set priorities.
- But this seems to conflict with the idea of chaining encryption keys and
- greedily encrypting everything as soon as it comes in. This needs an
- answer!}
+ The synopsis for an email consists of one or more of the email's headers,
+ in RFC822 format. The following headers MUST be included in synopsis when
+ present: "From", "To", "Cc", "In-Reply-To", "Message-ID", "Subject".
+ Other headers MAY be included or omitted. Nymservers MAY add additional
+ headers for user convenience, for example by running SpamAssassin or
+ similar tools to mark suspicious messages as probably spam.
+
+ The nymservers SHOULD compress and encrypt each synopsis at the same time
+ as it encrypts its corresponding messages. To encrypt the synopsis for
+ message j, received on cycle i, the nymserver uses the key: SynopKey(j,i)
+ = H(SUBKEY(j,i) | "SYNOPSIS KEY")
+
+ To pack multiple synopses into a SUMMARY message, each is first prefixed
+ with its corresponding message ID, then a 4-octet integer size field.
+
+2.3. Managing pending messages
+
+ When a nymserver receives an email for a user, it encrypts that message
+ and its synopsis as described in 2.2 and 2.2.1. above, and stores them,
+ indexed by the user, by the message ID (described in 2.2) and by the cycle
+ in which they were received.
+
+ When building a user's buckets for a given cycle, the nymserver tries
+ to include all the pending messages for that user. If this is not
+ possible, the nymserver instead includes:
+
+ - As many pending messages as possible.
+ - All ACK and ERR messages.
+ - A SUMMARY message listing all pending messages *not* included in
+ the current cycle.
+
+ [XXXX Should we list everything that's pending, or should we list
+ everything that's pending and new? -NM]
+
+ When choosing which messages to include, the nymserver SHOULD by default
+ include oldest first, but MAY allow users to configure other preferences,
+ and MUST allow users to specifically override its choice of messages by
+ requesting particular messages to be delivered or deleted (see section 5
+ below).
+
+ The nymserver SHOULD delete pending messages that it has not been able
+ to deliver for a long time, and MAY delete messages to force disk quota
+ compliance.
+
+2.4. Decrypting packaged messages
+
+ Upon receiving their messages for a cycle, clients decrypt the messages
+ as follows.
+
+ 1. First, decrypt the INDEX message with MsgKey(0,i). Use the
+ INDEX message to split the bucket into messages; associate each
+ message with its message ID.
+
+ 2. For each message ID in the INDEX message, check to see whether we
+ have stored that message ID from a previous cycle. If so, use the
+ corresponding message key to decrypt the message.
+
+ 3. See whether we have a message with MsgID(1,i). If so, this is
+ the SUMMARY message. Decrypt it.
+
+ 4. Generate MsgID(2,i) ... MsgID(N,i) until a message ID is found
+ which is not listed in the SUMMARY message or the INDEX message.
+ Decrypt all messages listed in the INDEX message. For each
+ MsgID in the summary, store it and its corresponding key until
+ the message is received in a later cycle, or until the user tells
+ the nymserver to delete the message, or until the message is "too
+ old"
3. The bucket pool
@@ -563,7 +713,8 @@
5. Account administration
- {XXXX Write me.}
+ {XXXX Write me. Clients need a way to set preferences, create accounts,
+ request messages to be included, request message to be deleted.}
6. System information
***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs in the body. http://freehaven.net/