[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-bugs] #12505 [BridgeDB]: Refactor Bridges.py and Dist.py in BridgeDB



#12505: Refactor Bridges.py and Dist.py in BridgeDB
-------------------------+-------------------------------------------------
     Reporter:  isis     |      Owner:  isis
         Type:  defect   |     Status:  assigned
     Priority:  major    |  Milestone:
    Component:           |    Version:
  BridgeDB               |   Keywords:  bridgedb-dist, bridgedb-1.0.x,
   Resolution:           |  isis2014Q3Q4, isisExB
Actual Points:           |  Parent ID:
       Points:           |
-------------------------+-------------------------------------------------

Comment (by isis):

 Some problems in `lib/bridgedb/Dist.py` (from a comment on #4771):

 > The function is `bridgedb.Dist.IPBasedDistributor.getBridgesForIP()`,
 and it currently looks like:
 >
 > {{{
 >     def getBridgesForIP(self, ip, epoch, N=1, countryCode=None,
 >                         bridgeFilterRules=None):
 >         """Return a list of bridges to give to a user.
 >
 >         :param str ip: The user's IP address, as a dotted quad.
 >         :param str epoch: The time period when we got this request.
 This can
 >                           be any string, so long as it changes with
 every
 >                           period.
 >         :param int N: The number of bridges to try to give back.
 (default: 1)
 >         :param str countryCode: The two-letter geoip country code of the
 >             client's IP address. If given, the client will be placed in
 that
 >             "area". Clients within the same area receive the same
 bridges per
 >             period. If not given, the **ip** is truncated to it's CIDR
 /24
 >             representation and used as the "area". (default: None)
 >         :param list bridgeFilterRules: A list of callables used filter
 the
 >                                        bridges returned in the response
 to the
 >                                        client. See
 :mod:`~bridgedb.Filters`.
 >         :rtype: list
 >         :return: A list of :class:`~bridgedb.Bridges.Bridge`s to include
 in
 >                  the response. See
 >
 :meth:`bridgedb.HTTPServer.WebResource.getBridgeRequestAnswer`
 >                  for an example of how this is used.
 >         """
 >         logging.info("Attempting to return %d bridges to client %s..."
 >                      % (N, ip))
 >
 >         if not bridgeFilterRules:
 >             bridgeFilterRules=[]
 >
 >         if not len(self.splitter):
 >             logging.warn("Bailing! Splitter has zero bridges!")
 >             return []
 >
 >         logging.debug("Bridges in splitter:\t%d" % len(self.splitter))
 >         logging.debug("Client request epoch:\t%s" % epoch)
 >         logging.debug("Active bridge filters:\t%s"
 >                       % ' '.join([x.func_name for x in
 bridgeFilterRules]))
 >
 >         area = self.areaMapper(ip)
 >         logging.debug("IP mapped to area:\t%s"
 >                       % logSafely("{0}.0/24".format(area)))
 >
 >         key1 = ''
 >         pos = 0
 >         n = self.nClusters
 >
 >         # only one of ip categories or area clustering is active
 >         # try to match the request to an ip category
 >         for category in self.categories:
 >             # IP Categories
 >             if category.contains(ip):
 >                 g = filterAssignBridgesToRing(self.splitter.hmac,
 >                                               self.nClusters +
 >                                               len(self.categories),
 >                                               n)
 >                 bridgeFilterRules.append(g)
 >                 logging.info("category<%s>%s", epoch, logSafely(area))
 >                 pos = self.areaOrderHmac("category<%s>%s" % (epoch,
 area))
 >                 key1 = getHMAC(self.splitter.key,
 >                                "Order-Bridges-In-Ring-%d" % n)
 >                 break
 >             n += 1
 >
 >         # if no category matches, use area clustering
 >         else:
 >             # IP clustering
 >             h = int( self.areaClusterHmac(area)[:8], 16)
 >             # length of numClusters
 >             clusterNum = h % self.nClusters
 >
 >             g = filterAssignBridgesToRing(self.splitter.hmac,
 >                                           self.nClusters +
 >                                           len(self.categories),
 >                                           clusterNum)
 >             bridgeFilterRules.append(g)
 >             pos = self.areaOrderHmac("<%s>%s" % (epoch, area))
 >             key1 = getHMAC(self.splitter.key,
 >                            "Order-Bridges-In-Ring-%d" % clusterNum)
 >
 >         # try to find a cached copy
 >         ruleset = frozenset(bridgeFilterRules)
 >
 >         # See if we have a cached copy of the ring,
 >         # otherwise, add a new ring and populate it
 >         if ruleset in self.splitter.filterRings.keys():
 >             logging.debug("Cache hit %s" % ruleset)
 >             _, ring = self.splitter.filterRings[ruleset]
 >
 >         # else create the ring and populate it
 >         else:
 >             logging.debug("Cache miss %s" % ruleset)
 >             ring = bridgedb.Bridges.BridgeRing(key1,
 self.answerParameters)
 >             self.splitter.addRing(ring,
 >                                   ruleset,
 >
 filterBridgesByRules(bridgeFilterRules),
 >                                   populate_from=self.splitter.bridges)
 >
 >         # get an appropriate number of bridges
 >         numBridgesToReturn = getNumBridgesPerAnswer(ring,
 >
 max_bridges_per_answer=N)
 >         answer = ring.getBridges(pos, numBridgesToReturn)
 >         return answer
 > }}}
 >
 > A couple things to note:
 >
 >  * The `countryCode` parameter is entirely unused. What was it for?
 >
 >    The only place in BridgeDB's code where
 `IPBasedDistributor.getBridgesForIP()` is called is in
 `bridgedb.HTTPServer.WebResourceBridges.getBridgeRequestAnswer()`, where
 the `countryCode` is passed in, and it is the two-letter geoIP countryCode
 for the client's IP address.
 >
 >    Are we supposed to be grouping clients by which country they are
 coming from? Shouldn't we group them by whether or they are coming from
 Tor or another known open proxy, and if not, what country they are coming
 from?
 >
 >    Or was the `countryCode` parameter supposed to be the country which
 the bridge shouldn't be blocked in?
 >
 >    (By the way, the docstring was written by me. It was my best guess as
 to what `countryCode` was supposed to be for. There was previously no
 documentation on it.)
 >
 >  * Should we still be grouping clients by `/24`s? What adversary is that
 effective against? I realise that it isn't very difficult to get a class C
 subnet, but it isn't very difficult to get addresses in different `/24`s.
 Should we make the groups bigger, i.e. group clients by which `/16` they
 are coming from?
 >
 >  * Why are we still using the `/24` (the `area`) in the code for serving
 bridges to clients coming from Tor exits? This means that changing your
 exit node would get you different bridges. (But not the same bridges as
 people not using Tor.)

 (Fixed as part of #4771)

 >  * It seems like a lot of these bugs come from
 [https://gitweb.torproject.org/bridgedb.git/commit?id=f022b905ca01a193aabd4d78107f27fce85c40cd
 commit f022b905ca01a193aabd4d78107f27fce85c40cd] which implemented #4297
 and is where this whole business of making an "IPv5 hashring" and putting
 Tor users in it came fromâ we should probably look at the other changes in
 that commit and review them.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12505#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs