First go at directory server details



I promised Nick I'd write up more details of the redundant dirserver
design. I'm not going to have time to do a really thorough job on it
for quite a while; so I'm going to write a start now, so people have a
sense of where it'll go, and then pick it back up when I have more time.
Or feel free to pick it up for me. :)

There are $d$ redundant directory servers, where d is like 5. When a new
remailer wants to join the network, it broadcasts its serverdesc to all
dirservers. When a dirserver receives a serverdesc from a new remailer,
it broadcasts it to all dirservers. Thus either a serverdesc is known
by all honest dirservers, or it is known by no honest dirservers.
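
A minimal sketch of that flooding step, in Python. The names here
(DirServer, receive, make_dirservers) are made up for illustration and
are not from any Mixminion code, and it only models honest dirservers
that relay everything they hear:

```python
# Hypothetical sketch: all names are illustrative assumptions.
class DirServer:
    def __init__(self, name):
        self.name = name
        self.peers = []           # the other dirservers
        self.serverdescs = set()  # every serverdesc we have heard of

    def receive(self, serverdesc):
        # Relay only the first time we see a descriptor, so the
        # flooding terminates instead of echoing forever.
        if serverdesc in self.serverdescs:
            return
        self.serverdescs.add(serverdesc)
        for peer in self.peers:
            peer.receive(serverdesc)


def make_dirservers(d=5):
    # Build d dirservers that all know about each other.
    servers = [DirServer(f"dir{i}") for i in range(d)]
    for s in servers:
        s.peers = [p for p in servers if p is not s]
    return servers
```

Even if a remailer only manages to reach one honest dirserver, calling
receive() on that one is enough for every honest dirserver in this model
to end up with the serverdesc.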

Each honest dirserver now has the same list of alleged serverdescs in
the system. Dishonest dirservers know about this list, and might know
about some more too.

Dirservers locally compute a list of \emph{active} nodes. A remailer node
is active if it passes the reliability (pinging) tests of that dirserver.

Each dirserver then broadcasts its list of active nodes to the other
dirservers. There's a deadline each night by which they must have
done that. Then each dirserver locally computes its version of the
\emph{consensus directory} --- say, it includes all nodes listed by at
least 3 dirservers. Then it broadcasts it to the dirservers, along with
a signature, and hashes of the active node lists it received from the
other dirservers. If any of the hashes don't match the active node lists
that a dirserver itself received, the operators are notified and freak out
appropriately. Each dirserver now puts the signatures together with the
consensus directory, and offers it to the world.
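
The local consensus step and the hash cross-check could be sketched
roughly like this. The function names, the sorted-and-joined list
encoding, and the choice of SHA-256 are all assumptions for the sake of
a runnable example; the design above doesn't pin any of them down:

```python
import hashlib
from collections import Counter


def list_hash(active_nodes):
    # Hash a canonical (sorted) encoding so equal lists hash equally.
    # SHA-256 and the newline-joined encoding are assumptions.
    data = "\n".join(sorted(active_nodes)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def consensus_directory(active_lists, threshold=3):
    # active_lists maps dirserver name -> list of active node names.
    # A node makes the consensus if at least `threshold` dirservers list it.
    counts = Counter()
    for nodes in active_lists.values():
        counts.update(set(nodes))  # each dirserver counts once per node
    return sorted(n for n, c in counts.items() if c >= threshold)


def mismatched_hashes(my_lists, claimed_hashes):
    # claimed_hashes maps dirserver name -> hash that some peer reports
    # for that dirserver's active list. Return the names where the peer's
    # hash disagrees with the list we received ourselves.
    return sorted(
        name
        for name, claimed in claimed_hashes.items()
        if name in my_lists and list_hash(my_lists[name]) != claimed
    )
```

A non-empty result from mismatched_hashes() is the "notify the operators"
case: some dirserver sent different active lists to different peers.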

There's a deadline by which it must be available; this deadline is before
the time when the new directory should be used by clients. All remailer
nodes must go fetch the new one during this window.

Now, when the user downloads the mixminion client, it comes with $d$
hard-coded keys, one for each dirserver. When he needs a current
directory, he goes to his favorite node and pulls down the consensus
dir. He verifies the sigs based on the public keys he knows, and if a
threshold of them (3 as above) are correct, he accepts the directory.
It doesn't matter which node he goes to, whether the node is evil, etc:
a good directory is a good directory.
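
The client-side check might look something like the following. This is
only a sketch: accept_directory and its arguments are hypothetical, and
verify stands in for whatever real public-key signature check the client
would use (it is passed in by the caller, not implemented here):

```python
def accept_directory(directory, signatures, known_keys, verify, threshold=3):
    # directory:  the consensus directory as fetched from any node
    # signatures: dirserver name -> signature that came with the directory
    # known_keys: dirserver name -> public key shipped with the client
    # verify:     verify(key, directory, signature) -> bool, supplied by
    #             the caller (an assumed interface)
    valid_signers = set()
    for name, signature in signatures.items():
        key = known_keys.get(name)
        if key is None:
            continue  # signer is not one of our hard-coded dirservers
        if verify(key, directory, signature):
            valid_signers.add(name)  # each dirserver counts at most once
    return len(valid_signers) >= threshold
```

Nothing here depends on where the directory came from, which is the
point: only the hard-coded keys and the threshold matter.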

Issues to elaborate on in the future:

1) what happens if you want to add a dirserver?

It seems like you just add his public key to the distribution, and he
starts signing consensus directories. People with the old code won't
care as long as they still meet threshold.

2) what happens if you want to ditch a dirserver?

Again, you remove him from the distrib, and he can stop signing things.
As long as we don't ditch too many at once, we should be ok: from the
point of view of clients running the old code, a removed dirserver is
equivalent to a malicious silent dirserver, for a while.

3) what if the dirservers disagree about who is a dirserver?

The ultimate question is, who decides which dirservers go into the
distrib? And if there are multiple distribs, what the heck do we do?

4) how do we decide if a node is active?

Current theory is that we send 3-hop pings, with the node in question
at either the first, second, or third hop. We need to think more, in
terms of whether this weights our tests towards middle-hops or edge-hops,
and how much we can learn given that one of the nodes we're not testing
might fail. And we want to take some longer-term view of results, so we
avoid having nodes oscillate on and off the list.
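
To make the ping layout concrete, here is a sketch that builds one
3-hop path for each position of the node under test. ping_paths is a
hypothetical helper, and picking the two other hops uniformly at random
is an assumption, not something we've settled on:

```python
import random


def ping_paths(target, nodes, rng=random):
    # Build three 3-hop paths with `target` at the first, second, and
    # third hop respectively. The other two hops are drawn at random
    # from the remaining nodes (an assumed selection rule).
    others = [n for n in nodes if n != target]
    if len(others) < 2:
        raise ValueError("need at least two nodes besides the target")
    paths = []
    for position in range(3):
        path = rng.sample(others, 2)   # two distinct helper hops
        path.insert(position, target)  # put the target at this hop
        paths.append(path)
    return paths
```

Note that a lost ping on any of these paths doesn't say which of the
three hops dropped it, which is exactly the "how much can we learn"
question above.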

See my paper about the problems with this:
http://freehaven.net/doc/casc-rep/casc-rep.pdf. Paul and I have a page
of notes on improvements to this idea and ways to adapt it to free-route
networks. It's going to be many moons before that gets done though.

5) what about talking about how good the node is, not just whether he's
acceptably active?

This may not be wise, for two reasons:
- In the casc-rep paper above, we argue that giving details lets an
adversary control more traffic. We would need to do more research to
find a path selection algorithm that avoids this attack.
- When building the consensus algorithm, we need some way to reconcile
these quantitative values.

Reputation systems are an ongoing research problem. For example: if we
simply list the acceptably active ones and ignore the others, then the
others won't be on the consensus directory for that day, so they'll see
very little use. Fine, except then they know that most of the use they
do get is pings from the dirservers.

6) what about DoSing a dirserver, or otherwise screwing things up?

As long as you don't DoS "too many" dirservers, then we don't care. But
that's the problem: we need the threshold of acceptable signatures to
be low so knocking out some dirservers doesn't cripple us, but we need
it to be high so controlling some dirservers doesn't let you include
arbitrary nodes. 

My instinct is to keep the threshold pretty low, on the theory that
if we pick dirservers as our friends, more than a few malicious ones
are unlikely.
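
The tension is easy to see with a little counting. This sketch assumes
the simplest possible model (d dirservers, a directory is accepted with
at least `threshold` valid signatures, and DoSed and malicious
dirservers are counted separately); threshold_tradeoff is a made-up
helper name:

```python
def threshold_tradeoff(d, threshold):
    # Simple counting under the assumed model described above.
    if not 1 <= threshold <= d:
        raise ValueError("threshold must be between 1 and d")
    return {
        # dirservers that can be knocked out while the rest can still
        # produce `threshold` signatures
        "dos_tolerated": d - threshold,
        # colluding dirservers that still fall one signature short of
        # signing an arbitrary directory on their own
        "malicious_tolerated": threshold - 1,
    }
```

With d = 5 and a threshold of 3 this gives 2 and 2; dropping the
threshold to 2 survives 3 DoSed dirservers but only 1 malicious one.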

7) degraded mode; or, what do you do when you can't get enough signatures
for the day?

Targeted DoS can affect anonymity. If you DoS the dirservers for
even part of a day, you can generally reduce the set of nodes on the
active list, since fewer pings get out, fewer responses return, etc.
Targeted DoS against remailer nodes can push them under the threshold
for being listed.

A successful DoS against enough dirservers for a whole day means they're
not going to create a useful consensus directory for the next day. This
means that clients will fetch a new directory, find that it doesn't have
a threshold of signatures, and freak out. As well they should: there
are enough unknowns already in terms of how much anonymity clients can
reliably expect, and reducing it even further may mean that users shouldn't
expect the usual anonymity from the system that day.

And actually, it doesn't take a whole day; you just need to do it during
the interval they're supposed to be combining their stats, and they'll
conclude that the others have failed.

We need a more careful investigation of how many dirservers we should
have, what our thresholds should be, etc.

Anonymity is hard. I'm going to get some sleep.
--Roger