bridge and bridge authority proposal



Hi folks,

Here are some details on my plans for bridges and bridge authorities.
They're still fluid because I haven't actually built it, so it's hard
to know if they will turn out to be the right plans when it comes down
to coding, but it's at least a start.

If you don't know what I'm talking about, go take a look at the
blocking resistance draft design (and slides and video if you prefer)
at https://tor.eff.org/documentation#DesignDoc

There are three components that need to be added: bridge directory
authorities; bridges themselves; and the client side ("bridge users" or
"bridge clients"). (We need a better name than "bridge user" -- perhaps
a suitably ethnic but suitably inoffensive version of Alice? Or does a
name exist that matches those two constraints? :)

Piece one: bridge directory authorities.

  This part is easy. I've added a new config option
  BridgeAuthoritativeDir. I've also revamped the code so you can opt
  to be a V1 or V2 authority, or you can be a bridge authority, but you
  don't need to be both. (In fact, I suspect we will want to think harder
  about our logic ("what exactly do we serve") if some authority wants
  to be both -- but I don't see a need for that quite yet, so I'm leaving
  it for another time.)

  There's a new wrapper authdir_mode_bridge() that tells whether we are
  acting as a bridge directory authority. When we are, we decline to
  generate or serve v1 directories/running-routers or v2 network statuses.
  Otherwise we answer dir queries as normal, and we allow uploads of
  server descriptors as normal.
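
  The serving rule above could be sketched roughly like this. It's an
  illustration in Python, not Tor's C code: only the name
  authdir_mode_bridge comes from this mail, while the AuthorityConfig
  class, its fields, and the document-kind labels are made up for the
  example.

```python
from dataclasses import dataclass

# Document kinds a bridge authority declines to generate or serve.
# These labels are hypothetical, chosen only for this sketch.
DECLINED_BY_BRIDGE_AUTHORITY = {
    "v1-directory",
    "v1-running-routers",
    "v2-networkstatus",
}


@dataclass
class AuthorityConfig:
    v1_authority: bool = False
    v2_authority: bool = False
    bridge_authority: bool = False  # stands in for BridgeAuthoritativeDir


def authdir_mode_bridge(config: AuthorityConfig) -> bool:
    """Tell whether we are acting as a bridge directory authority."""
    return config.bridge_authority


def should_serve(config: AuthorityConfig, document_kind: str) -> bool:
    """Decline v1/v2 documents in bridge mode; answer everything else as normal.

    The case of an authority that is both a bridge authority and a V1/V2
    authority is left open in the text; this sketch simply declines.
    """
    if authdir_mode_bridge(config) and document_kind in DECLINED_BY_BRIDGE_AUTHORITY:
        return False
    return True
```

  For instance, should_serve(AuthorityConfig(bridge_authority=True),
  "v2-networkstatus") is False, while the same query against a plain
  V2 authority is answered as usual.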

  One day we'll want some way to enumerate the bridges we hear about,
  rather than just listing them internally and never publishing the
  list. My first plan, once all the work described in this email
  is finished, is to have bridge authorities write a list of bridge
  descriptors to disk, and then the humans can manually tell the IP:ORPort
  of a few bridges to testers and to people in need. After that works
  we can produce a second plan.

Piece two: bridges.

  Bridges are just Tor servers that publish to a different location.

  My next step is to change the DirServer config syntax so it can
  accept a 'bridge' flag that marks an authority as a bridge authority.

  Then we need a way to tell servers where they should publish. I was
  thinking of adapting the PublishServerDescriptor config option for
  that. Currently it's only used for controllers like Blossom, and
  it's a boolean, but we might make it more general and let it take
  "v1", "v2", and/or "bridge" arguments too.  We could retain "0"
  for "don't publish to anything" and "1" for "publish to whatever
  you think best" for backward compatibility. Or we could retain "0"
  for "don't publish to anything" and "1" for "a synonym for v2" for a
  different sort of backward compatibility that we mark as deprecated. Or
  if adapting this config option is dumb, we could add a separate
  PublishServerDescriptorToWhere config option, but that seems overkill.
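
  To make the generalized option concrete, here is a small Python sketch
  of how its value might be interpreted. Nothing here is settled: the
  function name, the comma-or-space separator, and the legacy_one
  parameter are assumptions for illustration, and the default shown
  picks the "1 is a synonym for v2" reading.

```python
KNOWN_TARGETS = {"v1", "v2", "bridge"}


def parse_publish_option(value: str, legacy_one=("v2",)) -> frozenset:
    """Return the set of authority types a server should publish to.

    "0" means publish nowhere. "1" is kept for backward compatibility and
    expands to `legacy_one`. The other reading of "1" discussed in the
    text ("whatever you think best") would need a different expansion.
    """
    # Assumption: targets may be separated by commas and/or whitespace.
    tokens = [t.lower() for t in value.replace(",", " ").split()]
    if not tokens:
        raise ValueError("empty PublishServerDescriptor value")
    if tokens == ["0"]:
        return frozenset()
    if tokens == ["1"]:
        return frozenset(legacy_one)
    unknown = [t for t in tokens if t not in KNOWN_TARGETS]
    if unknown:
        raise ValueError(f"unknown publish target(s): {unknown}")
    return frozenset(tokens)
```

  With this reading, "v2, bridge" yields both targets, "0" yields the
  empty set, and mixing the legacy booleans with named targets (say
  "0 v2") is rejected rather than guessed at.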

  Bridges would likely want to set RelayBandwidthRate and
  RelayBandwidthBurst. Good thing they mostly work now.

Piece three: clients.

  This is the trickiest part. Users of bridges want to use a set of
  bridges as their first hops -- rather than entry guards. So the easy
  part is a new config option "UseBridges 0|1", and a new LINELIST
  config option "Bridge IP:ORPort [fingerprint]".

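  As a rough illustration of the proposed Bridge line, the Python sketch
  below parses its argument. The BridgeLine type and parse_bridge_line
  name are hypothetical, and two details are assumptions of mine rather
  than part of the proposal: the address is IPv4, and the optional
  fingerprint is 40 hex digits that may be split into groups by spaces.

```python
import ipaddress
import re
from typing import NamedTuple, Optional


class BridgeLine(NamedTuple):
    address: str
    or_port: int
    fingerprint: Optional[str]  # 40 hex digits, or None if omitted


def parse_bridge_line(value: str) -> BridgeLine:
    """Parse the argument of one 'Bridge IP:ORPort [fingerprint]' line."""
    parts = value.split()
    if not parts:
        raise ValueError("empty Bridge line")
    addr_text, sep, port_text = parts[0].rpartition(":")
    if not sep:
        raise ValueError("expected IP:ORPort")
    # IPv4Address raises a ValueError subclass on malformed input.
    address = str(ipaddress.IPv4Address(addr_text))
    if not port_text.isdigit() or not 0 < int(port_text) < 65536:
        raise ValueError(f"bad ORPort: {port_text!r}")
    # Assumption: everything after the address is the fingerprint.
    fingerprint = "".join(parts[1:]).upper()
    if fingerprint and not re.fullmatch(r"[0-9A-F]{40}", fingerprint):
        raise ValueError("fingerprint must be 40 hex digits")
    return BridgeLine(address, int(port_text), fingerprint or None)
```

  Since the option is a LINELIST, a client would call something like
  this once per Bridge line and keep the results in its bridges list.
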
  Now, when UseBridges is set, it is necessary that all circuits
  and dir fetches traverse a bridge as their first hop. In order to
  be able to bootstrap, users need to be able to learn networkstatus
  documents. They could do this by
    a) connecting to the bridge and sending it a begin_dir request. Not
    so good because now every bridge needs to be a dir cache.
    b) connecting to the bridge and sending a begin request to exit to
    a directory authority's port. Not so good because now bridges can't
    just have "reject *:*" as their exit policy.
    c) Doing a create-fast to the bridge, and then some sort of
    extend-fast to the directory authority, and querying the authority
    via begin_dir from there. Not so good because the Tor protocol
    doesn't support that (and it wouldn't get the full security that
    the Tor extend provides, because the bridge could bluff).

  As a first solution, I suggest we go with a) -- if the bridge has
  a defined dirport, then it mirrors dir info quickly and often, and if
  it doesn't, then it mirrors dir info just as a normal Tor client does,
  but in any case the bridge user can dip into the bridge's directory info
  and learn enough to bootstrap. So long as the bridge can make circuits,
  this means the bridge user should be able to make those circuits too.
  To make things simpler for the first go, we can just demand that for
  now bridges must define their dirport.

  (This choice has implications for future designs where Tor clients
  know different pieces of the directory -- it will be harder to keep
  secret which pieces you know if your bridge clients can just query you.)

  As a little bonus, if the bridge user fetches his dir info from the
  bridge, he'll be sure to ask for descriptors that he can get (since
  they're the ones the bridge is trying to get too), and he saves some
  bandwidth for the bridge (though only download bandwidth so that
  doesn't matter as much).

  I'm inclined to keep the "bridges" list on bridge clients separate
  from the "entry guards" list, on the theory that sometimes people will
  require bridges and sometimes they won't and we don't want to mingle
  things. But the parallel between "bridge users use a bridge as their
  first hop and do a begin-dir to it to learn dir info" and the future
  plans of "Tor users use an entry guard as their first hop and do a
  begin-dir to it to learn dir info" is eerie, and I expect that down
  the road we will evaluate whether to merge them somehow.

  So somebody watching a bridge will see it make connections to a fixed
  handful of nodes: those are the circuits the bridge operator is
  generating, and the other circuits are probably relayed traffic
  from the bridge user. This introduces anonymity research questions
  ("what are the implications", "can we do better"), which I leave open
  for now in the interest of getting a first prototype up. Feel free to
  answer them, and we can change our mind down the road.

  The details of keeping state inside Tor, remembering that you need
  to build your circuits through a bridge, having "one-hop" circuits vs.
  "three-hop" circuits, etc. are going to get messy, and that's where the
  bulk of the work will come in. A lot of that work is already underway
  with client-side support for begin_dir.

  We probably want a way to cache bridge descriptors in the datadir and
  keep them separate from "main" Tor server descriptors. Which leads to
  the next section.

Descriptor purposes: how to tell them apart.

  It turns out we've encountered a similar issue in the past, when
  controllers wanted to give us router descriptors that Tor shouldn't use
  when it's making its own paths. We solved it then by adding a 'purpose'
  to descriptors -- 'general' purpose is for normal descriptors, whereas
  'controller' purpose is for others. When Tor chooses nodes for its
  paths, it only chooses from the general-purpose descriptors.

  The controller specifies the purpose it has in mind when it invokes
  postdescriptor. The descriptor itself doesn't contain its purpose --
  after all, a Tor server is a Tor server, and different people can use
  it in different contexts.

  So how does this apply here? When we learn a bridge descriptor,
  e.g. from connecting to the IP:ORPort and using a begin_dir to ask for
  /tor/server/authority, or from asking a bridge authority for a new one,
  we tag it with the 'bridge' purpose so we can remember what to use it for.

  The specific problem we're solving is how to make sure that the first
  hop has the 'bridge' purpose when UseBridges is set. But the more general
  case is that we want a way to tell Tor to use certain purposes in
  certain positions in the path. When we have more purposes out there,
  I can imagine that onion_populate_cpath() and friends could assign a
  desired purpose in each step of the cpath, so when we choose a router
  for that step we choose from among the right pool of routers. This
  would let us handle N different Tor networks down the road, and we
  could build paths that traverse several of them. And we could put tags
  on dirserver lines to specify the purpose that should be assigned to
  all the descriptors we learn from it. And eventually we will need a
  better word than 'purpose' to describe what we're doing with it. But
  no need to solve this stuff until we get closer to it.
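
  The "desired purpose per step" idea might look something like the
  Python sketch below. It is not how onion_populate_cpath() works today:
  the Router class and both function names are invented for the example,
  and picking uniformly at random is a simplification that ignores
  everything else Tor weighs when it chooses nodes.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Router:
    nickname: str
    purpose: str  # e.g. "general", "controller", "bridge"


def hop_purposes_for(use_bridges: bool, length: int = 3) -> List[str]:
    """With UseBridges set, the first hop must be a bridge; the rest stay general."""
    first = "bridge" if use_bridges else "general"
    return [first] + ["general"] * (length - 1)


def choose_path(routers: Sequence[Router], hop_purposes: Sequence[str],
                rng: Optional[random.Random] = None) -> List[Router]:
    """Pick one distinct router per hop, each from the pool with that hop's purpose."""
    if rng is None:
        rng = random.Random()
    path: List[Router] = []
    for purpose in hop_purposes:
        # Only routers tagged with the desired purpose are candidates.
        pool = [r for r in routers if r.purpose == purpose and r not in path]
        if not pool:
            raise LookupError(f"no unused router with purpose {purpose!r}")
        path.append(rng.choice(pool))
    return path
```

  Calling choose_path(routers, hop_purposes_for(True)) then yields a
  bridge followed by two general-purpose routers, and 'controller'
  descriptors are never picked because no hop asks for them.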

  Nick proposed that we add a little header section to each descriptor
  before we write it to disk, explaining its purpose and maybe other
  features about it. I think this is a great idea. Nick is better at
  choosing formats for these things than I am, so I will propose a proof
  of concept and let him improve it:
  "Add the following two lines above the 'router' line:
    local-status version-num
    purpose foo
  where version-num is the version of local-status we're using (always
  1 for now), and foo is the purpose we'd like to remember for this
  descriptor. Later we might add an 'origin' line or some other line. The
  local-status section is over when we reach a 'router' line."
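
  Here is a small Python sketch of writing and reading that header, just
  to show the proof-of-concept format round-trips. The function names
  are made up, and the parser assumes every header line is "keyword
  value", which goes slightly beyond what is specified above.

```python
LOCAL_STATUS_VERSION = 1  # "always 1 for now"


def annotate(descriptor: str, purpose: str) -> str:
    """Prepend the two-line local-status section to a descriptor."""
    if not descriptor.startswith("router "):
        raise ValueError("descriptor must begin with a 'router' line")
    return f"local-status {LOCAL_STATUS_VERSION}\npurpose {purpose}\n{descriptor}"


def split_annotated(text: str):
    """Return (fields, descriptor); the section ends at the first 'router' line."""
    fields = {}
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith("router "):
            # Everything from here on is the untouched descriptor.
            return fields, "".join(lines[index:])
        keyword, _, value = line.strip().partition(" ")
        fields[keyword] = value
    raise ValueError("no 'router' line found")
```

  Because unknown keywords are simply collected into the fields dict, a
  later 'origin' line would pass through this reader without changes.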

  We also need to record other statistics about a given router, such as
  the uptime periods that directory authorities track -- but these stats
  will change significantly more often than the descriptor itself, so we
  should probably store them in some other file. I'll ignore that topic
  here.

There. This should be enough to tackle for now.

--Roger