
Re: [tor-talk] Fwd: [Full-disclosure] tor vulnerabilities?



On Sat, Jun 29, 2013 at 4:43 PM, Cool Hand Luke
<coolhandluke@xxxxxxxxxxxxxxxx> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> the below text was posted to pastebin.com (see original e-mail to the
> full-disclosure list at the end of this message).
>
>
> - ----- BEGIN PASTEBIN -----
> Tor LOL:
>
> directory authorities are the point of contact for clients to locate
> relays/exit nodes/guard nodes/etc. This is determined by a consensus
> document that goes through an elaborate process to ensure its integrity
> and cause bad directory authorities to be identified also via consensus.
>
> However, Tor developers are not the quickest lot, and this is basically
> the only document that they serve that has integrity control on it. Most
> interestingly, the public keys for every other node in the network is
> served without any form of signature or other form of integrity control.
>
> As such, a rogue directory authority, which anyone can be simply with a
> configuration option and an IP, can introduce path bias and other such
> tricks by serving the wrong keys for relays/guards/exits that it doesnt
> control. This can result in essentially directing clients through the
> network by causing decryption failures, thereby allowing determination
> of the source and end-point of a given tor connection with little more
> than a couple relays and some rogue directory authorities. Moreover, it
> can use the simple-minded metrics made to identify rogue guard nodes and
> couple that together with the behavior of public key cryptography to
> actually cause legitimate guard nodes to be flagged as having excessive
> extend cell failures causing it ultimately to be marked as bad.

I think this guy is confused.  I tried to tell him as much when he
twittered at me last night; you can see more or less the full record
if you look at the @nickm_tors from last night.


tl;dr: relay onion keys are indeed authenticated by the consensus
document.  On discussion, it appears that the guy thinks we aren't
actually authenticating them, though.  He posted
http://i.imgur.com/uVQTKlT.png to try to explain what he has in mind.

The attack doesn't work, though, as far as I can tell.  Here's what I
started writing up about it.



Some preliminary notes to clear up:

  - Being in the microdescriptor cache (the one implemented in
    microdesc.c as microdesc_map) is not sufficient for a
    microdescriptor to actually get *used*. It has to be linked to a
    node_t. The function that does that is nodelist_add_microdesc() in
    nodelist.c.  More on this below.

  - Directory authorities and directory mirrors are different.
    Directory authorities are a closed set, whose public keys are
    distributed with the source.  Anybody can be a _directory mirror_
    simply with a configuration option and an IP.

  - There are indeed three paths to the microdescs_add_to_cache()
    function.  One of them (in directory.c, not dirserv.c), passes a
    list "which" of the microdescriptor digests we requested
    microdescriptors for.  The other two don't.  But those are the ones
    that are reading microdescriptors from disk, so those
    microdescriptors were already checked on a previous run of the
    program.  (Also, adding them to microdesc_map is harmless; see
    above.)

  - Note that a corrupt directory mirror could try to influence path
    selection, by simply not answering requests for some nodes'
    microdescriptors, and pretending not to have them. I'll call this
    the "response filtering" attack. (Note also that it has nothing to
    do with cryptographic verification.) To resist it:

        * When clients want a directory resource, and they don't receive
          it, they request it from other directory mirrors until they
          do get it.

        * Clients don't build client circuits until they have
          information for a sufficient fraction of the nodes in the
          network, as calculated in nodelist.c,
          update_router_have_minimum_dir_info().

    So unless the "response filtering" attacker controls all the
    directory mirrors that the client uses, they can't prevent the user
    from learning microdescriptors for all the nodes they want.  And if
    they temporarily prevent the user from learning a given descriptor,
    the extent to which they can distort the user's view of the network
    is limited by the minimum_dir_info check.
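
The two defenses above can be sketched as a small Python model.  This is
not Tor's C code: the function names, the mirror interface (a callable
returning a dict), the simple node-count fraction, and the 0.75 threshold
are all assumptions made for illustration.

```python
def fetch_from_mirrors(wanted, mirrors):
    """Ask each mirror in turn for whatever is still missing.

    `wanted` is a set of digests; each mirror is a callable that takes a
    set of digests and returns a dict {digest: microdescriptor}.
    (Hypothetical interface, for illustration only.)
    """
    got = {}
    missing = set(wanted)
    for mirror in mirrors:
        if not missing:
            break
        answer = mirror(missing)
        for digest, md in answer.items():
            if digest in missing:      # ignore anything we did not ask for
                got[digest] = md
                missing.discard(digest)
    return got, missing


def have_minimum_dir_info(n_known, n_listed, min_fraction=0.75):
    """Refuse to build circuits until enough of the network is known.

    The 0.75 default is an assumed placeholder, not Tor's real value.
    """
    return n_listed > 0 and n_known / n_listed >= min_fraction


# A filtering mirror withholds "D2"; an honest mirror supplies it.
all_mds = {"D1": "M1", "D2": "M2", "D3": "M3"}
honest = lambda ds: {d: all_mds[d] for d in ds if d in all_mds}
filtering = lambda ds: {d: all_mds[d] for d in ds
                        if d in all_mds and d != "D2"}

got, missing = fetch_from_mirrors({"D1", "D2", "D3"}, [filtering, honest])
```

In this model the filtering mirror only delays the client: as long as one
mirror on the list is honest, `missing` ends up empty.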



Okay, so let's walk through the code.

    Here's what's *supposed* to happen.

    The client decides to make a request for microdescriptors.  This
    happens in update_microdesc_downloads, where they call
    microdesc_list_missing_digest256 to get a list of the
    microdescriptor digests listed in the microdesc-flavor consensus
    such that the client does not have and is not already trying to
    fetch a microdescriptor with that digest.  The client passes this
    list to launch_descriptor_downloads, which actually does the work of
    sending the requests to one or more directory mirrors.  The list of
    microdescriptor digests requested is encoded in the
    "requested_resource" field of the directory connection.

    The directory mirror responds with a buffer, which the client hopes
    will contain microdescriptors with those digests.  In directory.c,
    the client reconstructs the list of which digests it asked for (by
    calling dir_split_resource_into_fingerprints) and passes that list
    of requested digests, along with the directory's response, to
    microdescs_add_to_cache().
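
A rough Python model of the request side described above may help.  The
names below are illustrative stand-ins for the C functions mentioned, and
the "-" separator used to encode the request is an assumption of this
sketch, not a statement about the wire format.

```python
def list_missing_digests(consensus_digests, cache, in_flight):
    """Digests the consensus lists that we neither hold nor are fetching."""
    return sorted(d for d in set(consensus_digests)
                  if d not in cache and d not in in_flight)


consensus_digests = ["D1", "D2", "D3", "D4"]
cache = {"D1": "M1"}      # already on hand
in_flight = {"D3"}        # already being fetched

to_request = list_missing_digests(consensus_digests, cache, in_flight)

# Encode the request; the "-" separator is an assumption of this sketch.
requested_resource = "-".join(to_request)

# When the reply arrives, recover what was asked for from the connection
# itself, not from anything the mirror says.
requested_digests = requested_resource.split("-")
```

The point of the round trip is that the list of requested digests comes
from the client's own state, so the mirror has no say in it.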

    In microdescs_add_to_cache, the client first calls
    microdescs_parse_from_string.  Now "descriptors" contains a list of
    the received microdescriptors.  For every microdescriptor,
    md->digest is a digest of all of its textual contents.

    Then, it makes sure that the directory did not send it any
    microdescriptors it hadn't asked for.  It does this by using a
    temporary map, "requested".  It initializes requested as mapping D
    to 1 for every digest in requested_digests256.  It then iterates
    over the microdescriptors.  If a microdescriptor's digest is in
    "requested", it sets the value in "requested" for that digest to 2,
    indicating that the microdescriptor was found.  If the
    microdescriptor's digest is not in "requested", it frees the
    microdescriptor, removes it from the "descriptors" list, and logs a
    message.

    (The function then removes every digest corresponding to a received
    microdescriptor from the 'requested_digests256' list, so that the
    caller knows what it didn't receive.)

    Notice that at this point, it has not checked whether the
    microdescriptors' digests match the digests listed for particular
    nodes or not -- only that the client actually requested
    microdescriptors with those digests.  It hasn't even matched
    microdescriptors up with nodes!  That comes later.
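
The filtering step described above can be modelled in a few lines of
Python.  This is a sketch under assumptions, not the real function: the
names `digest256` and `filter_response` are made up here, and the toy
microdescriptor texts are placeholders.

```python
import hashlib


def digest256(text):
    """SHA-256 over the full text of a microdescriptor."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_response(requested_digests, response_texts):
    """Keep only microdescriptors whose digest was requested.

    Returns (kept, still_missing).  Uses the 1 -> 2 marking described
    above: 1 means "asked for", 2 means "asked for and received".
    """
    requested = {d: 1 for d in requested_digests}
    kept = []
    for text in response_texts:
        d = digest256(text)
        if d in requested:
            requested[d] = 2
            kept.append((d, text))
        # otherwise it was never requested, so it is dropped
        # (the real code also logs a message here)
    still_missing = [d for d in requested_digests if requested[d] == 1]
    return kept, still_missing


m1, m2 = "onion-key AAA\n", "onion-key BBB\n"
forged = "onion-key EVIL\n"
kept, still_missing = filter_response(
    [digest256(m1), digest256(m2)], [m1, forged])
```

Because the digest is computed by the client over the received text, a
forged microdescriptor cannot claim a digest it does not actually have.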


    Now we move on to microdescs_add_list_to_cache.  Our job here is to
    store the newly received microdescriptors to disk, insert them
    into microdesc_map, and finally pass them to
    nodelist_add_microdesc.


    Before we pass the microdescriptors to nodelist_add_microdesc(),
    let's recap where we are.  The microdesc_map contains
    microdescriptors, indexed by their digests.  These are all
    microdescriptors that we read from the disk cache from an earlier
    session, or ones we received in reply to a request that we made to
    a directory mirror.  They are not yet associated with nodes. We
    have not yet checked that they still match the consensus.


    Now we get to nodelist_add_microdesc.  This part is key.  It looks
    up, in the microdesc consensus, whether we have any routerstatus
    whose listed microdescriptor digest (stored in its descriptor_digest
    field) matches the digest of the microdescriptor we have received.
    If so, it finds the corresponding node_t object, and associates the
    microdescriptor with that node_t.

            [Aside: if we already have a microdescriptor when we get a
            new consensus, it gets associated with the node_t in the
            nodelist_set_consensus function, where we look it up using
            microdesc_cache_lookup_by_digest256 with the microdescriptor
            digest listed in the consensus.]
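
Here is a minimal Python sketch of that association step.  The `Node`
class and `add_microdesc` function are simplified stand-ins invented for
this example, and the consensus is modelled as a plain dict that is
assumed to have been signature-checked already.

```python
class Node:
    """Stand-in for node_t: identity plus the microdescriptor in use."""

    def __init__(self, identity):
        self.identity = identity
        self.md = None


def add_microdesc(consensus, nodes, md_digest, md):
    """Attach md to the node whose consensus entry lists md_digest.

    `consensus` maps identity -> listed microdescriptor digest.
    Returns the node, or None if no entry lists this digest.
    """
    for identity, listed_digest in consensus.items():
        if listed_digest == md_digest:
            nodes[identity].md = md
            return nodes[identity]
    return None  # no routerstatus lists this digest; md stays unused


consensus = {"ID1": "D1", "ID2": "D2"}
nodes = {ident: Node(ident) for ident in consensus}

hit = add_microdesc(consensus, nodes, "D1", "M1")
miss = add_microdesc(consensus, nodes, "D9", "M9")
```

The lookup is keyed on what the consensus says, never on anything inside
the microdescriptor, so a microdescriptor cannot pick its own node.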

    Associating the microdescriptor with a node_t might seem like an
    afterthought, but it's actually the security-critical part here.
    Here's why:

    When do we use an onion-key from a microdescriptor?  When we extract
    it in extend_info_from_node().  But that only looks at the
    microdescriptor currently associated with a node by
    nodelist_add_microdesc().  If the microdesc wasn't associated with a
    node there, we wouldn't even find its onion key.
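
To make the distinction concrete, here is a toy Python illustration (the
dict layout and the `onion_key_for` helper are assumptions of this
sketch): a microdescriptor sitting in the cache contributes nothing until
it is linked to a node.

```python
def onion_key_for(node):
    """Return the onion key a circuit extension would use, or None."""
    md = node.get("md")
    return md["onion_key"] if md is not None else None


# A microdescriptor can sit in the cache without being linked to a node.
microdesc_map = {"D_stray": {"onion_key": "KEY_STRAY"}}
node = {"identity": "ID1", "md": None}

before = onion_key_for(node)
node["md"] = {"onion_key": "KEY1"}   # what the association step would do
after = onion_key_for(node)
```

Note that `onion_key_for` never consults `microdesc_map` at all; only the
node's own link matters.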




So, what could go wrong?

1. Suppose that we start up with a microdescriptor cache that contains
   some microdescriptors which aren't in the consensus.

   In this case, microdescs_add_list_to_cache will indeed add them to
   microdesc_map, indexed by their digests.  But they won't get
   associated with nodes, so they won't affect client behavior.

2. Suppose that the directory mirror sends the same (requested)
   microdescriptor more than once in a given response.

   In this case, the "md2 = HT_FIND(microdesc_map, &cache->map, md)"
   check in microdescs_add_list_to_cache will make only one copy get
   added.

   In any case, they will only get associated with nodes if they match
   the digest listed for that node in the consensus.


3. Suppose that the directory mirror sends some microdescriptors in the
   directory response that did not have their digests listed in the
   request.

   In that case, their digests won't be found in
   requested_digests256, and they'll be dropped.


Okay, now let's try the real thought experiment.

   Suppose that according to the consensus, node N1 with identity ID1
   has a microdescriptor M1 with digest D1, and node N2 with identity
   ID2 has a microdescriptor M2 with digest D2, and so on.  Suppose that
   the client sees a consensus that lists D1 for ID1, D2 for ID2, and so
   on.  Suppose that the client requests D1...Dn.  Suppose that the
   directory mirror sends back ANYTHING OTHER THAN M1...Mn.  What could
   happen?


   First off, any members of the response that are duplicates will get
   dropped, and any whose digests don't appear in D1...Dn will get
   dropped.  It will be as if the directory mirror didn't send them at
   all.

   So the only stuff that will make it into the microdescriptor cache
   will be microdescriptors whose digests match a subset of
   D1...Dn.  Assuming that SHA256 is collision-resistant, that means
   that a subset of M1...Mn will make it in.

   Can anything cause the client to associate M1 with a node other than
   N1?  No, since this association is done explicitly by the <ID1,D1>
   mapping in the node's routerstatus in the signed consensus.

   So the directory mirror can, at worst, cause the client to have a
   subset of the answers it requested.  This reduces to the "response
   filtering" attack above, which has defenses.
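
The whole thought experiment can be run as a compact Python model.  This
is a sketch under stated assumptions, not Tor's implementation:
`process_response` collapses the checks into one function, the consensus
is a dict assumed to be already signature-verified, and the
microdescriptor texts are toy placeholders.

```python
import hashlib


def d256(text):
    """SHA-256 over the full microdescriptor text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def process_response(consensus, requested, response):
    """Return {identity: microdescriptor text} after all checks.

    consensus: identity -> digest (assumed already signature-verified)
    requested: set of digests the client asked for
    response:  list of microdescriptor texts, fully attacker-controlled
    """
    cache = {}
    for text in response:
        digest = d256(text)
        # Drop anything unrequested, and any duplicate.
        if digest in requested and digest not in cache:
            cache[digest] = text
    # Association is driven only by the consensus mapping.
    return {ident: cache[dg] for ident, dg in consensus.items()
            if dg in cache}


truth = {"ID1": "onion-key K1\n",
         "ID2": "onion-key K2\n",
         "ID3": "onion-key K3\n"}
consensus = {ident: d256(md) for ident, md in truth.items()}
requested = set(consensus.values())

# Hostile reply: one genuine entry twice, one forgery, one entry withheld.
hostile = [truth["ID1"], truth["ID1"], "onion-key EVIL\n", truth["ID3"]]
view = process_response(consensus, requested, hostile)
```

Whatever goes into `hostile`, every entry of `view` agrees with `truth`;
the mirror can only make entries go missing.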
_______________________________________________
tor-talk mailing list
tor-talk@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk