
RFC: The Standards Game



Hi, all.  I'd like your suggestions for the following document.  It's
meant to be a cute introduction to our directory agreement problem to
interest people who don't care about anonymity networks.

I'm not looking for _solutions_ to the directory agreement problem from
the mailing list: we've already discussed it a few times and I don't
think we're any closer to a solution than we were in August.  Instead,
what I want to know is this:
    - Do you think this is an interesting example to get people
interested in the problem?
    - How can I make it more accurate, clear, helpful, or amusing?
    - Where should I circulate this to try to get some answers?

Please do _not_ send emails of the form "Doubtlessly, the FooBarBaz
community has solved this."  Every time I've followed up on a suggestion
of this form, I've found a great deal of research on a superficially
similar yet completely unrelated set of problems. ;)

Thanks for your help,
				Nick

============
                           The Standards Game

     (This game describes an interesting design problem we encountered
     while building Mixminion.  It is not meant to be an accurate
     reflection of revolutionaries, standards, or standards committees.)

Down with the tyranny of the capitalist Internet!  The Silicon Valley
People's Revolutionary Front has decided to create a new set of network
protocols from scratch.  Everything will be replaced, from ethernet to
AIM to ASN.1.

There's a small problem, of course: the SVPRF is a very loosely knit
organization, with no official membership list.  The SVPRF has no
leadership, either: only a bunch of old-timers and young turks
squabbling over who best serves the interests of the common programmer.
The SVPRF never meets face-to-face, but conducts all of its affairs
online.  Finally, the SVPRF has probably been infiltrated at every level
by agitators from the IETF oligarchy, who will stop at nothing to ensure
that their networking project fails.

Here's how standardization works.  Standards are composed of "Features".
Each Feature is either:

    * "Broken"  (it won't work or is obviously insecure),
    * "Hostile" (it has hidden security holes introduced deliberately
                 by the adversary), and
    * "Good"    (it is secure and correct).

There is a finite number of Good Features.  There is a potentially
limitless number of Broken and Hostile Features.  Anybody, with a
little effort, can tell which Features are Broken.  Only Experts,
however, can distinguish Hostile Features from Good ones.  Most members
know (or think they know) a few Experts.

Features can't be used on their own; instead, they must be grouped
together.  Such a group of Features is called a "Standard".  Two
Standards are compatible if and only if they have _exactly_ the same
Features.
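The compatibility rule above is strict enough to state in one line of
code.  Here is a minimal sketch; the Feature names are invented for
illustration and are not part of the game:

```python
# Hypothetical sketch: model a Standard as an immutable set of Features.
# Two Standards are compatible iff they contain exactly the same Features.

def compatible(standard_a, standard_b):
    """Standards are compatible iff their Feature sets are identical."""
    return frozenset(standard_a) == frozenset(standard_b)

s1 = frozenset({"ipv9", "quantum-dns"})
s2 = frozenset({"quantum-dns", "ipv9"})
s3 = frozenset({"ipv9"})

print(compatible(s1, s2))  # True: exactly the same Features
print(compatible(s1, s3))  # False: a mere subset is not compatible
```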

The goal of the standardization process is to put a Standard in the
hands of every SVPRF member.  Each member can use only a single
Standard.  Obviously, since the SVPRF membership is hopelessly
uncoordinated, different members may decide to choose any standard they
like.  Nonetheless, every honest member (who is not an agent of the
IETF) wants the following:
     1. Uniformity. (The more honest members who decide to use a given
        Standard, the better.)
     2. Richness. (The more Good features in a Standard, the better.)
     3. Correctness. (The fewer Broken features in a Standard, the
        better.)
     4. Security. (The fewer Hostile features in a Standard, the
        better.)

How well can the SVPRF members meet their goals?

Below, I have two scenarios that may help lead to a design.

====== Simplified scenario

    (This situation reflects my current thinking about simple ways to
    solve the problem.)

In this scenario, a Standard is the product of a Standards Committee.
Once a Standards Committee is formed, its members propose Features, and
vote yea or nay on every proposed Feature.  After a limited amount of
time, the floor is closed for new Features, and all the Committee
members sign a Standard containing all the Features they accepted.

Committees are formed of self-proclaimed Experts.  Anybody can claim to
be an Expert.  For a Committee to be formed, every member of the
would-be Committee must agree that every other member is a real Expert.
There can be more than one Committee, but nobody can be a member of more
than one Committee.
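The formation rule can be sketched directly.  In the sketch below, the
`trusts` relation and the member names are assumptions made for
illustration, not part of the scenario:

```python
# Hypothetical sketch of the committee-formation rule in the simplified
# scenario: a would-be Committee forms only if every member accepts
# every other member as a real Expert.

def committee_forms(members, trusts):
    """members: iterable of names; trusts: dict mapping each name to
    the set of people that person considers real Experts."""
    members = set(members)
    return all(members - {m} <= trusts.get(m, set()) for m in members)

trusts = {
    "alice": {"bob", "carol"},
    "bob":   {"alice", "carol"},
    "carol": {"alice"},          # carol does not accept bob as an Expert
}
print(committee_forms({"alice", "bob"}, trusts))           # True
print(committee_forms({"alice", "bob", "carol"}, trusts))  # False
```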

Users (SVPRF members) choose Standards based on who is on the committee
that generated them.  A user chooses a Standard if she likes the
composition of the Committee that generated it.

There is a tension between the benefit of having small Committees
(easier to verify that most members are Experts) and having larger
Committees (harder to corrupt a majority of members).

Assume that genuine Experts are usually (but not always) right about
which Features are Good, and that genuine Experts are usually (but not
always) right about who is a real Expert.

Assume that users are sometimes confused about expertise, but that they
usually know two or three real Experts.  Try to provide the guarantee
that _if_ users choose mostly good Experts, they get good Standards.

In this scenario, what process should would-be Committee members use to
pick their committees, in order to make Standards as good as possible?
What process should users follow to choose between competing Committees?

====== General scenario

Assume that users and Experts exist as above, but remove the requirement
that Standards are determined by a vote by a committee.  Have them be
chosen by any process you like.

In order to determine user utility, define the value of a Standard S as
follows, with respect to a given security parameter L (say, between 4
and 8).

    1) [Uniformity] If N users have adopted S, the value of S is at
       most lg N.  Start with V0 = lg N.

    2) [Security and richness] If S contains F features, including K hostile
       features, the value of S increases with F and decreases with K. Let

             V1  = [1-(Comb(K,L) (K/F)**L)-.01*(K/F)**2] * V0

       (Where ** denotes exponentiation, and Comb(K,L) is the number
       of combinations of K taken L at a time.)

    (Ignore correctness for now, since Broken features are
    comparatively easy to detect.  Feel free to simplify the equations
    above if it helps your analysis.)

    (See footnote [3] for justification of these equations.)
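Taken literally, the formulas above fit in a few lines of code.  This
sketch assumes Python's math.comb for Comb(K,L) and picks L = 6 from
the suggested range; the example numbers are invented:

```python
# Hedged sketch of the value formulas, taken literally.  "lg" is log
# base 2; Comb(K, L) is "K choose L".  Parameter names follow the text.
from math import log2, comb

def standard_value(n_users, n_features, n_hostile, L=6):
    """V1 per the General scenario (L between 4 and 8; 6 chosen here)."""
    v0 = log2(n_users)                      # [Uniformity] V0 = lg N
    k_over_f = n_hostile / n_features
    p_full_path = comb(n_hostile, L) * k_over_f ** L
    p_endpoints = 0.01 * k_over_f ** 2
    return (1 - p_full_path - p_endpoints) * v0

# e.g. 1024 users of a Standard with 20 Features, 2 of them Hostile:
print(standard_value(1024, 20, 2))
```

Note that when K < L, math.comb(K, L) is 0, so the full-path term
vanishes and only the entry/exit term reduces the value.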

Assume that every user wants to maximize their V1.  Assume that
legitimate Experts want to help them, and that fake Experts want to
minimize V1 for as many users as possible without getting caught.  What
should users and good experts do?

====== Motivation

Here's why this matters.  In Mixminion, clients send each of their
messages along a client-selected sequence of mixes.  They learn about
mixes by regularly downloading mix directories.  Consequences of this
design include:

      1) [Uniformity] As many clients as possible should share identical
         directories.  Clients with different directories are
         distinguishable.[1]

      2) [Correctness] Directories should contain only reliable mixes.
         If a client chooses a sequence of mixes including an unreliable
         mix, the client's message might not be delivered.
	
      3) [Security] Directories should not contain hostile mixes.  If a
         client chooses a sequence of mixes including only hostile
         mixes, the client's anonymity is completely compromised.[2]

      4) [Richness] Directories should contain as many good mixes as
         possible.  The more good mixes in the network, the more traffic
         the network can handle, and the harder it is for an adversary
         to compromise a large threshold of the mixes.

A directory generated by a single server is unacceptable, since that
server would be a single point of failure.  Nor can we count on a single
well-coordinated group of directory servers: the remailer community is
hopelessly fragmented and vulnerable to social engineering attacks.
(All an adversary would have to do to break such a group is to convince
half of them to accept a new member, and convince the other half that
the new member is hostile.)

So, can you help us?  For more information, see:
  http://www.mixminion.net/minion-design.pdf
  http://www.mixminion.net/dir-agreement.html
  http://archives.seul.org/mixminion/dev/Aug-2003/msg00011.html
  http://archives.seul.org/mixminion/dev/Aug-2003/msg00012.html


[1] Actually, if client directories overlap to a significant extent,
    their users are not *completely* distinguishable.  For example,
    suppose there are two directories D1 and D2, with K1 and K2 users
    respectively.  If a user of D1 chooses an exit server that's in
    both D1 and D2, his anonymity is at most lg(K1+K2).
    Otherwise, his anonymity is at most lg(K1).  Thus [in this
    situation], his expected anonymity is at most

              |D1-D2|          |D1 /\ D2|
              ------- lg(K1) + ---------- lg(K1+K2).
                |D1|              |D1|

    Assuming a large number of potential directories exist, the user's 
    expected anonymity is at most:

          1     ---
        ----    >     lg(K_s)
        |D1|    ---
              s in D1

    where K_s is the number of users whose chosen directories include
    the server s.

    (The exit server is probably the only server that matters in this
    calculation, unless a significant number of servers are
    compromised.  Also, the directories should probably have a high
    degree (>50%) of overlap for this to hold: see
          http://freehaven.net/anonbib/author.html#danezis:pet2003
    ).
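    The bound above can be checked numerically.  In this sketch the
    directory contents and per-server user counts are invented for
    illustration:

```python
# Hedged sketch of footnote [1]'s bound: a D1 user's expected anonymity
# is at most (1/|D1|) * sum over servers s in D1 of lg(K_s), where K_s
# is the number of users whose chosen directories include s.
from math import log2

def expected_anonymity_bound(d1, users_per_server):
    """d1: set of servers in the directory; users_per_server: dict
    mapping each server s to K_s."""
    return sum(log2(users_per_server[s]) for s in d1) / len(d1)

# Two servers: "m1" appears only in D1 (100 users), "m2" is shared
# with a second 100-user directory, so K_m2 = 200.
d1 = {"m1", "m2"}
users = {"m1": 100, "m2": 200}
print(expected_anonymity_bound(d1, users))
```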


[2] A client's anonymity can also be slightly reduced if all but one mix
    in their chosen path is compromised, or if the first and last mix
    are compromised.

[3] V1 is supposed to represent a user's anonymity.  Since we don't have
    a good idea of how good some attacks are, however, some of the
    parameters are estimates.

    First, V0 is the anonymity a user has if an attacker learns
    which directory he is using.  (I assume that users of different
    directories are totally distinguishable; but see [1] above.) 

    The first factor in V1 is meant to be the probability that a user
    isn't exposed by an attacker.  The term "Comb(K,L)*(K/F)**L" is the
    probability that the attacker controls an entire path; the term
    "(K/F)**2" is the probability that the attacker controls the entry
    and exit point.  (I'm ignoring loss of anonymity due to a path
    compromised in all but one hop.)  The ".01" constant reflects a
    somewhat conservative estimate that end-to-end traffic analysis
    probably requires _at least_ hundreds of messages to succeed
    against real-world mix nets.
============
