[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-dev] Defending against guard discovery attacks by pinning middle nodes



>> And yes again. In this model, an ultra-mega-secret HS should use a
>> long chain of guards. Of course, at some point, it is easier to do a
>> congestion attack to identify the first guard being used by the HS.
>> That is still a win, though, in that such an attack takes more
>> technical skill and effort.
> 
> I think this brings up another good point about hidden services that the
> "you must use a single guard forever" model causes us to ignore.
> 
> Imagine if we had k-Conflux routing, where you could pick 1-k guards,
> and send packets to a single exit or RP.
> 
> Under this model, congestion attacks would be more difficult, because
> you have to also do the combinatorics to choke out the Choose(N, k)
> combinations of k of N total guards. If you only choked a subset, conflux
> would rebalance to the remaining free paths (assuming good, responsive
> flow control).

That is a good point, and if you had a set of guards for which you had high collective trust, then I think multiple guards for load balancing and making congestion attacks more difficult would be a good idea. I think you need them to be collectively trustworthy because otherwise each added guard gives the adversary another chance to be selected, which multiplies his ability to observe user traffic in the network. I am skeptical that you wouldn’t be able to apply the congestion attack to the guards one-by-one, because the load balancing would take some time to notice the throughput drop and adjust. If you had multiple guards, then you could use them to provide stronger protection against discovery via congestion by redundantly sending the same packet to each one (PF Syverson and I analyzed this idea in a PETS10 paper <http://ohmygodel.com/publications/active-pets10.pdf>).

> After reading your back-of-the-envelope calculations on node rotation
> and guard lifetime for multi-tiered guards in your parent reply, I think
> that it stands to reason that some topology like this is best (with
> Conflux):
> 
> HS -> Guard_1 -> Guard_2 -> Guard_3 -> RP
> 
> The idea would be that Guard_3 would rotate on the order of hours,
> Guard_2 would come from a set that is rotated on the order of days
> (based on the expected duration for the adversary to become Guard_3), and
> Guard_1 would rotate on the order of months (based on the expected
> duration for the adversary to become Guard_2).

Why set guard 2 to expire in days? If that is less than surveillance speed, then once the adversary had guard 3, it’s game over.

> I would prefer if we had closed-form or code-based versions of this
> calculation, though, so we could play with set sizes and lifespans for
> the 3 hops. We also want to play with questions like "How many circuits
> should we use at a time?" and "How big should the set of middle nodes we
> choose from be?" Making all of that parameterized will help us tune it.
> Heck, the surface might even be plottable or traversable with gradient
> descent.

I took a stab at this by writing a quick Monte Carlo simulator (in python). Calculating exact probabilities gets incredibly complicated when you have several parameters and somewhat complicated compromise scenarios, and simulation in this case is nearly as accurate and reasonably fast. The simulator considers the HS to use just one circuit, and each node on the circuit expires at an individual rate. It outputs the distribution of how long it takes for the adversary to identify the HS. The parameters you might modify (set near the end of the script) are
  1. node_expiration_times: a list of node expirations in order from HS (in hours), e.g. [3,2,1] means the first hop (aka the HS’s entry guard) expires in 3 hours, the second hop expires 2 hours, and the third hop in 1 hour
  2. surveillance_time: time needed by the adversary to accomplish surveillance (in hours), e.g. if 24 then the adversary compromises a node that expires in 24 or more hours from the current time
  3. adv_relay_probs: list containing the probability of selecting an adversarial relay in each position, ordered from HS, e.g. [0.05, 0.01, 0.01] means that the adversary controls 0.05 of entry guard consensus weight, 0.01 of second (aka middle) hop weight, and 0.01 of third-hop weight

The script is attached. I hope it is useful. If so, maybe I can develop it more next week.

Best,
Aaron
import random
import math

def check_for_compromise(sample_expiration_times, cur_time, adv_position,
    surveillance_time):
    """Checks if adversary in adv_position has surveillance_time left in each previous node
    before they expire. If so, the HS is compromised via repeated surveillance."""
    if (adv_position == 0):
        return True
    else:
        if (sample_expiration_times[adv_position-1] >=\
            cur_time + surveillance_time):
            return check_for_compromise(sample_expiration_times,
                cur_time + surveillance_time, adv_position-1, surveillance_time)
        else:
            return False


def simulate_guard_attacks(num_samples, node_expiration_times, surveillance_time,
    adv_relay_probs, simulation_length):
    """Simulates HS compromise via guard surveillance or adversarial relay selection.
    Returns list of simulated times until compromise, where inf indicates no compromise."""
    compromise_times = []
    for sample in xrange(num_samples):
        if (sample % 1000 == 0):
            print('{} samples finished.'.format(sample))
        sample_expiration_times = [0]*len(node_expiration_times)
        compromise_time = float('inf')
        for cur_time in xrange(simulation_length):
            for i in xrange(len(sample_expiration_times)):
                if (sample_expiration_times[i] <= cur_time):
                    # relay expired, choose new one
                    sample_expiration_times[i] = cur_time + node_expiration_times[i]
                    # check if compromised
                    p = random.random()
                    if (p <= adv_relay_probs[i]):
                        # node compromised, adversary attempts surveillance toward the client
                        if (check_for_compromise(sample_expiration_times, cur_time, i,
                            surveillance_time)):
                            compromise_time = cur_time
                            break
            if (compromise_time != float('inf')):
                break
        compromise_times.append(compromise_time)
    return compromise_times

def print_compromise_time_stats(compromise_times, simulation_length):
    compromise_times_sorted = sorted(compromise_times)
    try:
        first_uncomp_sample = compromise_times_sorted.index(float('inf'))
        prob_compromise = float(first_uncomp_sample) / len(compromise_times_sorted)
    except ValueError:
        prob_compromise = 1
    print('Probability of compromise over {} days: {}'.format(simulation_length/24.0,
        prob_compromise))
    print('Min time to compromise: {} days'.format(compromise_times_sorted[0]/24.0))
    print('Median time to compromise: {} days'.format(\
        compromise_times_sorted[(len(compromise_times_sorted)-1)/2]/24.0))
    print('Mean time tompromise: {} days'.format(\
        sum(compromise_times)/len(compromise_times)/24.0))
    print('Max time to compromise: {} days'.format(compromise_times_sorted[-1]/24.0))

num_samples = 10000
node_expiration_times = [90*24, 5*24, 12] # node expirations in order from HS (in hours)
surveillance_time = 1*24 # time for adversary to accomplish surveillance (in hours)
adv_relay_probs = [0.05, 0.01, 0.01] # node probability in each position, ordered from HS
prob_compromise_during_simulation = 0.99999 # allowed prob of uncompromised sample (affects sim len)
simulation_length = min(node_expiration_times) *\
    int(math.ceil(math.log(1-prob_compromise_during_simulation) / math.log(1-min(adv_relay_probs))))
compromise_times = simulate_guard_attacks(num_samples, node_expiration_times, surveillance_time,
    adv_relay_probs, simulation_length)
print_compromise_time_stats(compromise_times, simulation_length)

_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev