[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[freehaven-cvs] Checkpoint spec notes and performance notes in CVS

To: freehaven-cvs@freehaven.net
Subject: [freehaven-cvs] Checkpoint spec notes and performance notes in CVS
From: nickm@seul.org (Nick Mathewson)
Date: Fri, 7 May 2004 02:13:14 -0400 (EDT)
Delivered-to: archiver@seul.org
Delivered-to: freehaven-cvs-outgoing@seul.org
Delivered-to: freehaven-cvs@seul.org
Delivery-date: Fri, 07 May 2004 02:13:59 -0400
Reply-to: freehaven-dev@freehaven.net
Sender: owner-freehaven-cvs@freehaven.net
Update of /home/freehaven/cvsroot/doc/pynchon-gate
In directory moria.mit.edu:/tmp/cvs-serv3116

Added Files:
	pynchon-performance.txt pynchon-spec.txt 
Log Message:
Checkpoint spec notes and performance notes in CVS

--- NEW FILE: pynchon-performance.txt ---
   N -- number of users
   S -- number of distributors
   K -- security parameter for PIR.
   B -- bucket size
   Vol[i] -- Volume of message received on a given "day" by user i.
   CVol -- Volume of messages received on a given "day" by user i,
       after compression.
   MCVol[i] -- Maximum compressed volume that user i can receive in
        one "day".
   W -- Window in which users must retrieve their messages (in "days").

   ME -- metaindex entry size (36 bytes)
   IE -- index entry size (40 bytes)
   SS -- stream seed size (32 bytes)

   Number of message buckets needed (m):
     SUM( CEIL( CVol[i] / B ) )

   Number of index buckets needed (I):

     CEIL( (N * IE) / (B-IE) )

   Size of metaindex (MS):

     ME * I

   Total storage needed per pool (PS):

     B * (m+I) + MS

   Storage needed at each distributer:

     PS * W

   Bandwidth of nymserver/collator per "day" (assuming idealized
      Bittorrent):

     SUM(Vol[i]) + PS

     (I'm assuming an idealized Bittorrent where every participant has
     equal bandwidth; where everybody but the original seed downloads
     the entire file once; where everybody uploads the entire file
     once; and where tracker bandwidth is negligible.)

   Client bandwidth per "day":

     User i needs to fetch
         CEIL( MCVol[i] / B ) + 1
     buckets per day.  Call this Buckets[i].  Retrieving each bucket
     takes K-1 requests of SS bytes each, and one request of (m+I)
     bits.  Each such request yields B bytes.  So for PIR, User i uses
     bandwidth of:

         Buckets[i] * [ (K-1)*SS + (m+I)/8 + B ]

     Call the above volume ClientPIRVol[i].

     Additionally, each user needs to fetch the metaindex via
     bittorrent.  This should take (assuming the idealized bittorrent
     characteristics above), an additional (2*MS) bytes per "day".

   Distributer bandwidth per "day", assuming idealized Bittorrent:

     PS + SUM( ClientB[i]) ) / S

===============
Comparison: Type I nymservers

   (Ignoring Type I overhead.)

   r -- reply block size
   L -- reply block path length
   S -- number of mixes on mix net.

   Storage needed at nymserver:

      r * N

   Bandwidth needed at nymserver, per "day":

      SUM(Vol[i]) + SUM(CVol[i])

   Bandwidth needed by user i, per "day":

      CVol[i]

   Bandwidth needed across entire mix net, per day:

      2*L*SUM(CVol[i])

      (every node in each reply block must receive and send every
      message.)

   Bandwidth used by mix node, per "day":

      2*L*SUM(CVol[i])/S

===============
Comparison: alt.anonymous.messages, used securely.

   S -- number of news servers

   Storage needed at each news server:

      W * SUM( CVol[i] )

   Bandwidth needed at news server, per day:

      SUM( CVol[i] ) * (N/S + 1)

      (Each server retrieves each message once, and relays all posts
      to N/S users.)

   Client bandwidth per "day":

      SUM( CVol[i] )

===============
Comparison: Type III nymserver design (Underhill)

   r -- reply block size (2K)
   P -- payload size (28K)
   M -- packet size (32K)
   W -- maximum interval at which users must replenish reply blocks,
        in "days".
   S -- number of mixes on mix net.

   All of this stuff is *very* approximate.

   Storage needed at nymserver:

      (best case -- holding only reply blocks.)
      >= r * SUM( CEIL(CVol[i]/P) * W )

      (worst case -- out of reply blocks for everybody.)
      < SUM( CEIL(CVol[i]/P) * W)

   Bandwidth needed at nymserver, per day:

      (Nymserver receives messages, sends compressed messages in
      packets, and receives SURBs.)

      SUM(Vol[i]) + SUM( CEIL(CVol[i]/P) * M) + SUM(CEIL(CVol[i]/P)*r)
      == SUM(Vol[i]) + (M+r)*SUM(CEIL(CVol[i]/P))

   User i bandwidth, per "day":

      CVol[i]*CEIL(CVol[i]/P)

   Bandwidth per mix on mix net, per "day"

      [2 * L * (M+r) * SUM(CEIL(CVol[i]/P)) ] / S

--- NEW FILE: pynchon-spec.txt ---
$Id: pynchon-spec.txt,v 1.1 2004/05/07 06:13:12 nickm Exp $
                  Pynchon Gate Protocol Draft Specification

                       (Spec draft by Nick Mathewson;
               original design by Bram Cohen and Len Sassaman)


1. Overview

   The Pynchon Gate Protocol (PynGP) uses Private Information Retrieval to
   provide strong pseudonymous email delivery.  The protocol works (roughly)
   as follows:

       ----------     ----------
       I Sender I ... I Sender I
       ----------     ----------
            I  email    I
            V           V
       ----------------------
       I Nymserver/Collator I<--------------------
       ----------------------                    I Administrative
           I bucket pool  I                      I messages
           V              V                      I
    ---------------     ---------------          I
    I Distributor I ... I Distributor I          I
    ---------------     ---------------          I
          I  PIR buckets I                       I
          V              V                       I
          ----------------                       I
          I   Recipient  I------------------------
          ----------------

   Messages arrive at a nymserver, addressed to pseudonymous identities.  The
   nymserver encrypts the messages as they arrive, and stores them until the
   end of the current fixed "cycle".  Once a cycle has elapsed, the nymserver
   collates the messages into a "bucket pool," and relays the entire bucket
   pool for the cycle to a set of distributors.

   To retrieve her mail, the recipient retrieves a description of each
   cycle's bucket pool, and performs a PIR operation with several
   distributors to retrieve "her" buckets.  If she performs the operation
   correctly, and if any of the distributors she chooses are honest, then the
   distributors cannot learn which buckets she has downloaded.

   The recipient can administer her account by sending "control messages" to
   the nymserver through an anonymizing layer such as the Type III remailer
   network.

1.1. Notation and terminology

   H(x) -- A cryptographic digest of a string x.
   x|y  -- The string x concatenated with the string y.
   LEN(x) -- The length of a string x.
   R(n) -- n octets of random data.
   INT(v,n) -- The nonnegative integer "v", expressed as a four-octet value
       in network (big-endian) order.
   PRNG(s,n) -- A cryptographically string pseudorandom sequence of n octets,
       generated using the seed s.
   Z(n) -- A sequence of n zero octets.
   ENC(s,k) -- Encrypt the string s using a stream cipher, with the key k.
   DEC(s,k) -- Decrypt the string s using a stream cipher, with the key k.

   HASH_LEN -- The length of the output of our cryptographic digest function.
   SEED_LEN -- The length of the seed for our PRNG.
   K -- A client-selected security parameter: how many distributors will the
        client connect to?

   "abc" -- A three-character string.  Unless otherwise noted, strings are
            _not_ NUL-terminated.

2. Packaging messages

2.1. Key maintenance

   For every recipient, the nymserver stores a long-term public key, and a
   short-term shared secret.  Shared secrets are set at account creation
   time, and can be reset by recipients later on.

   The shared secret is updated every cycle, such that, if S[i] is the
   shared secret in cycle i, then
      S[i+1] = H(SS[i]|"NEXT CYCLE").

   From each SS[i], the nymserver derives a set of sub-secrets for individual
   messages received that cycle.  The j'th sub-secret on day i is:
      SUBKEY(j+1,i) = H(SUBKEY(j,i) | "NEXT SECRET")
      SUBKEY(0,i) = H(S[i] | "NEXT SECRET")

   Also, the nymserver derives an opaque UserID for each recipient each
   cycle:
      UserID[i] = H(S[i], "")

   Implementations SHOULD discard old secrets as soon as they are no longer
   needed.  Thus, at the start of each cycle i, a nymserver SHOULD derive
   S[i+1], UserID[i], and SUBKEY(0,i), and immediately discard S[i].  After
   using each SUBKEY(j,i), the nymserver SHOULD discard calculate
   SUBKEY(j+1,i) and discard SUBKEY(j,i).  After each cycle, the nymserver
   SHOULD discard the last SUBKEY, and UserID[i].

2.2. Packaging messages

   Every cycle, a user receives zero or more encrypted 'messages' from the
   nymserver.  These messages can be emails, padding, or other status
   messages.

   The messages in a cycle i are encrypted in order, with the j'th message on
   i encrypted using SUBKEY(j,i).  Encryption uses a stream cipher, so that
   encrypted message length is the same as plain-text length.  Since keys are
   not re-used, no IV is necessary.

   All messages begin with a single-octet 'TYPE' field.  The values for TYPE
   are:

         PAD_REST  [00]
         MAIL      [01]
         ACK       [02]
         ERROR     [FF]

   All message types except for PAD_REST use in the following format:

         TYPE     Message type    [1 octet]
         LEN      Message length  [4 octets]
         DATA     Message body    [LEN octets]
         HASH     Hash of message [HASH_LEN octets]

   HASH is equal to H(TYPE|LEN|DATA).  It is redundant.

   PAD_REST uses a variable length format:

         TYPE     PAD_REST type   [1 octet]
         PADDING  Random data     [Up to end of the current bucket]

   PAD_REST can only appear as the last of a user's messages in a given
   cycle; thus, it does not need a length field.  (In fact, it had better
   *not* have a length field, since it needs to be able to fit in one octet.)

   Other message types have bodies as follows:

         TYPE:  MAIL [01]
         DATA:  One or more email messages, in /var/spool/mail 'mbox' format,
                compressed.
         NOTES: Nymservers SHOULD generate MAIL messages for incoming email
                as greedily as possible, to minimize the amount of time that
                unencrypted mail is stored.

         TYPE:  ACK [02]
         DATA:  C         Cookie         [HASH_LEN octets]
         NOTES: Acknowledges that the nymserver has received a control
                message from the user and processed it.  The 'cookie'
                must match the cookie sent in the control message being
                acknowledged.

         TYPE:  ERROR [FF]
         DATA:  ERR      Error code     [2 octets]
                MSG      Message        [Variable]


   {XXXX Add more message types, including 'summary of pending mail'.}

   {XXXX Can we _have_ any idea of pending mail and mail priorities that
    still works with our chained encryption keys, and doesn't stop us from
    greedily encrypting?  Must re-think.}

3. The bucket pool

3.1. Bucket pool formats

   Users find and validate their buckets through a three-level indexing
   structure:

          ------------------------------
          I        Meta-index          I
          ------------------------------
           I                      I
           V                      V
      ----------------     ----------------
      I Index Bucket I ... I Index Bucket I
      ----------------     ----------------
       I ... I ... I ...
       I     V     V
       V
   ------------------   ------------------
   I Message bucket I-->I Message bucket I--> ...
   ------------------   ------------------

   For each cycle, all users download a common signed meta-index, from which
   they learn:
         1. Which index bucket points to their messages.
         2. The hash of that index bucket.

   Users retrieve the correct index bucket via PIR, and learn:
         1. Which message bucket(s) contain their messages
         2. The hash of the first such message bucket.

   Finally, users retrieve their message buckets via PIR.  Each message
   bucket contains:
         1. A portion of the user's packaged messages.
         2. A hash of the user's next message bucket (if this message bucket
            is not the user's last).

   Metadata:

      V      Protocol version        [2 octets]
      NSID   Nymserver ID            [HASH_LEN octets]
      CYC    Cycle number            [4 octets]
      BS     Bucket size             [4 octets]
      MLen   Length of metaindex     [4 octets]
      MI     Metaindex               [MLen octets]
      SLen   Length of signature     [2 octets]
      SIG    Signature               [SLen octets]

   The protocol version for this version of the spec is 0.  The cycle number
   is the index of cycle for which this metadata applies; cycle numbers
   should begin at 0 and increase by 1 for each cycle.

   SIG is an RSA signature of the hash of the metadata up through MI, using
   OAEP padding and PKCS1 encoding.

   Metaindex entry

      UID    First UserID in index bucket [HASH_LEN octets]
      IDX_H  Hash of index bucket         [HASH_LEN octets]

   Index entry

      UID    UserID                               [HASH_LEN octets]
      M_IDX  Index of user's first message bucket [4 octets]
      M_H    Hash of user's first message bucket  [HASH_LEN octets]

   Message bucket

      H      Hash of next message bucket          [HASH_LEN octets]
      DATA   Packaged message data                [BS-HASH_LEN octets]

3.2. Constructing the bucket pool

   At the end of every cycle, the collator takes each recipient's packaged
   messages for that cycle and assembles them into buckets, as follows:

      1. Choose a size in bytes for the cycle's buckets.  Call this size
         BUCKET_SIZE.

      2. Sort all users who will receive messages in this cycle by their
         UserIDs.  Let USER[i] = the i'th such user.

         Let N be the number of users who will receive any messages in this
         cycle.

      3. Determine the number N_INDEX of "index buckets" that
         will be needed.  This will be equal to

            N_INDEX = CEIL( (N * INDEX_ENTRY_LEN) / USERS_PER_BUCKET )

         where
            USERS_PER_BUCKET = FLOOR(BUCKET_SIZE / INDEX_ENTRY_LEN)

      4. For each user USER[i], determine the number of buckets needed to
         hold all of U's packaged messages.  This will be equal to:

            N_BUCKETS[i] = CEIL( VOLUME[i] / (BUCKET_SIZE - HASH_LEN) )

         where VOLUME[i] is the total length of User[i]'s packaged messages.

         Let FIRST_MSG_BUCKET[i] = N_INDEX + SUM{j=0..i-1} N_BUCKETS[j]

         Let TOT_MSG_BUCKETS be the sum of N_BUCKETS[i] for all U.

      6. Construct the message buckets.

         For each element of USER[i], in reverse order:

             Concatenate the packaged messages of USER[i] into a string M.

             Let PAD_LEN = BUCKET_SIZE-HASH_LEN *
                           CEIL( LEN(M) / (BUCKET_SIZE-HASH_LEN) ) - LEN(M)

             Append a PAD_REST message to M, of length PAD_LEN.

             Divide M into N_BUCKETS[i] strings of length BUCKET_SIZE-HASH_LEN

             For j = N_BUCKETS[i]-1 down to 0:

                Let INDEX = FIRST_MSG_BUCKET[i] + j

                Let BUCKET[INDEX] = H(BUCKET[INDEX+1]) | S[j]

      7. Construct the index buckets.

         For j = 0 to N_INDEX-1,

             Let FIRST_USER[j] = j*(USERS_PER_BUCKET)
             Let LAST_USER[j] = (j+1)*(USERS_PER_BUCKET) - 1

             Let B = ""

             For i = FIRST_USER[j] through LAST_USER[j],

                 Let IdxEnt = UserID[USER[i]]            |
                              INT(FIRST_MSG_BUCKET[i],4) |
                              H(BUCKET[FIRST_MSG_BUCKET[i]])

                 Let B = B | IdxEnt

             Let BUCKET[j] = B.

      8. Construct the meta-index.

         Let MI = ""

         For j = 0 to N_INDEX-1:

            Let MetaIdxEnt = UserID[FIRST_USER[j]] | H(BUCKET[j])

            Let MI = MI | MetaIdxEnt

         The meta-index is equal to the final value of MI.


   Note: in practice, collators should run steps 6, 7, and 8 in parallel,
   so that they only need to process each message bucket once.

3.3. Distributing the bucket pool

   Every distributor needs a copy of the bucket pool.  To get it, at the
   end of each cycle, the distributors download it via Bittorrent.

   {XXXX Specify how they know what URL to use; how they know when it's
    ready, etc.}

   After downloading the pool, distributors check that all of its hashes
   match the values computed above.

4. Retrieving the bucket pool

4.1. PIR protocol

   The protocol's transport layer is TLS: users negotiate a TLS connection
   with a given distributor, and then relay PIR messages as described below.

   Implementations SHOULD prefer cipher-suites that support ephemeral keys,
   and MUST NOT support weak TLS cipher-suites.

   Distributors MUST provide a two-level certificate chain with their TLS
   handshake.  One certificate MUST be a self-signed certificate for their
   long-term public key, and the other certificate should be signed by the
   first one.  The connection MUST use the private key of the second
   certificate.  Distributors SHOULD rotate these private keys regularly.

   Clients MUST validate these certificates when they first connect to each
   distributor, and MUST close any connection to a distributor whose
   certificates are non-conformant.  Clients MAY cache the results of
   certificate validation.

   [XXXX Say something about TLS sessions.]

   The PIR message format is as follows:

       TYPE   [1 octet]
       LEN    [4 octets]
       DATA   [Len octets]
       HASH   [HASH_LEN octets]

   HASH is equal to H(TYPE | LEN | DATA).

   The format of 'DATA' is dependent on 'TYPE'.  Recognized values of TYPE
   are:

       0 -- VERSION
       1 -- SHORT_PIR_REQUEST
       2 -- LONG_PIR_REQUEST
       3 -- PIR_RESPONSE
       4 -- GET_METADATA
       5 -- METADATA
     255 -- ERROR

   The corresponding formats of DATA are:

       TYPE: VERSION [0]
       DATA: Sequence of
                 VER  Protocol version  [2 octets]
                 VER  Protocol version  [2 octets]
                   ...

       TYPE: SHORT_PIR_REQUEST [1]
       DATA: NYMID   Nymserver ID    [HASH_LEN octets]
             CYC     Cycle number    [4 octets]
             SEED    PRNG seed       [SEED_LEN octets]

       TYPE: LONG_PIR_REQUEST [2]
       DATA: NYMID   Nymserver ID    [HASH_LEN octets]
             CYC     Cycle number    [4 octets]
             MASK    Bitmask         [CEIL(N_BUCKETS / 8) octets]

       TYPE: PIR_RESONSE [3]
       DATA: XB      Xor-ed buckets  [BUCKET_SIZE octets]

       TYPE: GET_METADATA [4]
       DATA: NYMID   Nymserver ID    [HASH_LEN octets]
             CYC     Cycle number    [4 octets]

       TYPE: METADATA [5]
       DATA: M       Metadata        [variable length]

       TYPE: ERROR [255]
       DATA: CODE    Error code      [2 octets]
             MSG     Message         [variable length]


   The protocol exchange proceeds as follows.

      1.  The first message MUST be sent by the client, and must be a VERSION
          message listing all the protocol versions that the client
          understands.  (Currently, the only known version is 0.)  The server
          MUST reply with a VERSION message choosing a single one of those
          versions, or with an ERROR message (code 'BAD_VERSION').

          Once the client receives the VERSION message from the server, it
          then sends one or more messages to the server.  The server replies
          to each of these messages in sequence.  The client need not wait
          for a reply before sending its next message.

      2.  If the client sends a GET_METADATA message, the server tries to
          find the signed metadata for the identified nymserver and cycle.
          If the metadata is found, the server answers with a METADATA
          message containing the metadata.

          If the server does not recognize the nymserver ID, it answers with
          an ERROR message (code 'BAD_NYMSERVER').  If it does recognize the
          nymserver, but it no longer has the data for the requested cycle,
          or it has not yet retrieved the data for the requested cycle, it
          answers with an ERROR message (code 'CYCLE_EXPIRED' or
          'CYCLE_NOT_YET') respectively.

      3.  If the client sends a LONG_PIR_REQUEST message, the server first
          tries to find the bucket pool for the requested nymserver/cycle
          pair.  If it cannot, it answers with an ERROR message as in [2]
          above.

          Assuming the bucket pool is found, the server checks whether
          Len(MASK) == CEIL(N_BUCKETS/8).  If not, it answers with an ERROR
          message (code 'BAD_MASK_LEN').

          Finally, the server treats MASK as sequence of bits, running from
          the MSB of the first byte of MASK to the LSB of the last byte of
          mask.  Let Bit[i] be the i'th such bit.  The server computes a PIR
          response as follows:

              Let XB = Z(BUCKET_SIZE).

              For i = 0 .. N_BUCKETS-1:
                 If Bit[i] is 1:
                    Let XB = XB XOR BUCKET[i]

          The server then answers the client with a PIR_RESPONSE message
          whose body is XB.

      4.  If the client sends a SHORT_PIR_REQUEST message with a given SEED,
          the server computes

               MASK = PRNG(SEED, CEIL(N_BUCKETS/8))

          and responds as if the client had sent a LONG_PIR_REQUEST message
          with the computed mask.

4.1.1. Error codes

   The values for error codes are given below.

   BAD_VERSION    [00 00]  (No version was recognized)
   BAD_NYMSERVER  [00 01]  (The specified nymserver was not recognized)
   CYCLE_EXPIRED  [00 02]  (The requested cycle's data is no longer retained)
   CYCLE_NOT_YET  [00 03]  (The requested cycle's data is not yet retrieved)
   BAD_MASK_LEN   [00 04]  (The provided mask was too long or too short)

   OTHER          [FF FF]  (Something else happened.)

4.1.2. Optimizing the distributor implementation.

   To avoid thrashing disk and memory caches, distributors should consider
   handling all pending PIR requests in parallel, as follows:

   Repeat:
      For i = 0 .. N_BUCKETS-1:
        For j = 1 .. N_PENDING_REQUESTS:
          If Request[j] has Bit[i]=1, then:
             Let XB[j] = XB[j] XOR BUCKET[i]
          If End_bucket[j] == i:
             Send XB[j] as a response to Request[j], and remove Request[j]
             from the pending list.

        If any new request has arrived:
          Put it at some index r.
          Let XB[r] = Z(BUCKET_SIZE).
          Let End_bucket[r] = i.

4.2. Retrieving a bucket

   To download a specific bucket B from cycle CYC of nymserver NYMID, clients
   should behave as follows:

      1. Connect to K chosen distributors and negotiate protocol versions, if
         not already connected.
      2. Generate K-1 random seeds SEED[1..K-1].
      3. Let MASK = SEED[1] XOR SEED[2] ... XOR SEED[K-1]
      4. Flip the B'th bit of MASK.
      5. Permute the chosen distributors into a randomly selected order.

         Send the first K-1 distributors SHORT_PIR_REQUEST messages, each
         with a different one of the K-1 elements of SEED

         Send the last distributor a LONG_PIR_REQUEST message with MASK.

      6. Compute the XOR of the responses.  This is the value of the bucket.

4.3. Retrieving a cycle's worth of messages.

   To download _all_ of its messages for a given cycle CYC of nymserver
   NYMID, clients should behave as follows:

      1. Connect to K randomly chosen distributors.

      2. Ask a randomly chosen distributor for the metadata for NYMID,CYC.
         Make sure that the metadata is signed correctly, and has the correct
         values for NYMID and CYC.

      3. Compute our UserID for the CYC.  Call this "U".

      4. Find the largest I such that the I'th element of the metaindex has
         UID <= U.

      5. Use PIR to retrieve index bucket I.  Verify that the hash of this
         bucket is equal to IDX_H in the I'th element of the metaindex.

      6. Find the largest J such that the J'th element of the index bucket
         has UID <= U.

         If the UID at J is U, then we have messages today.  Otherwise, we
         don't, but we need to retrieve messages anyway to avoid leaking our
         identity.

         Set M_IDX and M_H from the selected index entry.

      7. Use PIR to retrieve MAX_BUCKETS buckets starting at M_IDX.

         Verify that, for each downloaded BUCKET[j]
             M_H = HASH(BUCKET[j])                      {if j == M_IDX}
             BUCKET[j-1][0:HASH_LEN] == H(BUCKET[j])    {otherwise}

         (Clients MUST verify all buckets, even buckets for messages they
         don't want, so that compromised distributors can't learn whether the
         client really wanted MAX_BUCKETS buckets or not.)

      8. Unpack the messages in the bucket.

   {XXXX We say 'verify' above, but not what to do on failure.}

5. Account administration

   {XXXX Write me.}

6. System information

   {XXXX Write me.}

Performance analysis:

   N -- number of users
   S -- number of distributors
   K -- security parameter for PIR.
   B -- bucket size
   Vol[i] -- Volume of message received on a given "day" by user i.
   CVol -- Volume of messages received on a given "day" by user i,
       after compression.
   MCVol[i] -- Maximum compressed volume that user i can receive in
        one "day".
   W -- Window in which users must retrieve their messages (in "days").

   ME -- metaindex entry size (36 bytes)
   IE -- index entry size (40 bytes)
   SS -- stream seed size (32 bytes)

   Number of message buckets needed (m):
     SUM( CEIL( CVol[i] / B ) )

   Number of index buckets needed (I):

     CEIL( (N * IE) / (B-IE) )

   Size of metaindex (MS):

     ME * I

   Total storage needed per pool (PS):

     B * (m+I) + MS

   Storage needed at each distributer:

     PS * W

   Bandwidth of nymserver/collator per "day" (assuming idealized
      Bittorrent):

     SUM(Vol[i]) + PS

     (I'm assuming an idealized Bittorrent where every participant has
     equal bandwidth; where everybody but the original seed downloads
     the entire file once; where everybody uploads the entire file
     once; and where tracker bandwidth is negligible.)

   Client bandwidth per "day":

     User i needs to fetch
         CEIL( MCVol[i] / B ) + 1
     buckets per day.  Call this Buckets[i].  Retrieving each bucket
     takes K-1 requests of SS bytes each, and one request of (m+I)
     bits.  Each such request yields B bytes.  So for PIR, User i uses
     bandwidth of:

         Buckets[i] * [ (K-1)*SS + (m+I)/8 + B ]

     Call the above volume ClientPIRVol[i].

     Additionally, each user needs to fetch the metaindex via
     bittorrent.  This should take (assuming the idealized bittorrent
     characteristics above), an additional (2*MS) bytes per "day".

   Distributer bandwidth per "day", assuming idealized Bittorrent:

     PS + SUM( ClientB[i]) ) / S

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
Comparison: Type I nymservers

   (Ignoring Type I overhead.)

   r -- reply block size
   L -- reply block path length
   S -- number of mixes on mix net.

   Storage needed at nymserver:

      r * N

   Bandwidth needed at nymserver, per "day":

      SUM(Vol[i]) + SUM(CVol[i])

   Bandwidth needed by user i, per "day":

      CVol[i]

   Bandwidth needed across entire mix net, per day:

      2*L*SUM(CVol[i])

      (every node in each reply block must receive and send every
      message.)

   Bandwidth used by mix node, per "day":

      2*L*SUM(CVol[i])/S

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
Comparison: alt.anonymous.messages, used securely.

   S -- number of news servers

   Storage needed at each news server:

      W * SUM( CVol[i] )

   Bandwidth needed at news server, per day:

      SUM( CVol[i] ) * (N/S + 1)

      (Each server retrieves each message once, and relays all posts
      to N/S users.)

   Client bandwidth per "day":

      SUM( CVol[i] )

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
Comparison: Type III nymserver design (Underhill)

   r -- reply block size (2K)
   P -- payload size (28K)
   M -- packet size (32K)
   W -- maximum interval at which users must replenish reply blocks,
        in "days".
   S -- number of mixes on mix net.

   All of this stuff is *very* approximate.

   Storage needed at nymserver:

      (best case -- holding only reply blocks.)
      >=3D r * SUM( CEIL(CVol[i]/P) * W )

      (worst case -- out of reply blocks for everybody.)
      < SUM( CEIL(CVol[i]/P) * W)

   Bandwidth needed at nymserver, per day:

      (Nymserver receives messages, sends compressed messages in
      packets, and receives SURBs.)

      SUM(Vol[i]) + SUM( CEIL(CVol[i]/P) * M) + SUM(CEIL(CVol[i]/P)*r)
      =3D=3D SUM(Vol[i]) + (M+r)*SUM(CEIL(CVol[i]/P))

   User i bandwidth, per "day":

      CVol[i]*CEIL(CVol[i]/P)

   Bandwidth per mix on mix net, per "day"

      [2 * L * (M+r) * SUM(CEIL(CVol[i]/P)) ] / S

===============================================================================


On Sun, Apr 18, 2004 at 10:34:31PM -0700, Len Sassaman wrote:
 [...]
> 
> Reviewing the rest of your comments now...

FYI, I'm currently msging with bram about this stuff on #p2p-hackers.

We've currently made the following changes in my proposal (correct me
if I'm wrong, Bram):

   1.  The client doesn't use bittorrent to get the metaindex;
       instead, she just fetches it from any distributor.

   2.  Instead of containing a hash of *all* of a user's buckets, the
       index entry for the user just contains a hash of the user's
       first bucket.  The hash of the user's first bucket starts with
       a hash of each of the user's remaining buckets.  This way, if a
       bucket is corrupted, the user can tell which one.

Did I miss anything?



***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/
Prev by Author: [freehaven-cvs] Commit preproceedings PDF.
Next by Author: [freehaven-cvs] remove extraneous notes
Previous by thread: [freehaven-cvs] Commit preproceedings PDF.
Next by thread: [freehaven-cvs] remove extraneous notes
Index(es):
- Author
- Thread