Re: [tor-bugs] #31788 [Core Tor/Tor]: Circuit padding trace simulator
#31788: Circuit padding trace simulator
-----------------------------------------------+------------------------
Reporter: mikeperry | Owner: (none)
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Core Tor/Tor | Version:
Severity: Normal | Resolution:
Keywords: circpad-researchers-want, wtf-pad | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------------------------+------------------------
Comment (by pulls):
Replying to [comment:4 mikeperry]:
> Replying to [comment:3 pulls]:
> > The implementation now requires a client and relay trace (got a lazy
python script to simulate a relay trace from a client trace as well). My
biggest gripe right now is time. Every time a client/relay machine
triggers padding, a corresponding event has to be added to the
relay/client trace with an estimated time. This estimate will always be,
well, wrong. I'm not sure it's possible to make this estimate in such a
way that it'll fool time-based classifiers, even if we add in guard traces
for better estimates and patches as you mention Mike.
>
> To be clear: I believe this simulator will only be accurate enough to do
preliminary tuning of defenses against attacks, especially for expensive
classifiers. I think final attack and defense evaluation, and possibly
even some final tuning, should be done on the live network. At least until
we discover that for all of our tested attack+defense combinations, the
live network and the simulator agree.
>
> What do you mean by "wrong" though? We should try to make the simulator
as close as possible. We are aware of the circuitmux problem, as well as
delay introduced by libevent callbacks. These are both paths we hope we
can optimize, though. Are there others?
Agreed, we're on the same page. By "wrong" I mean, in addition to what you
mentioned inside of tor, everything between traffic leaving tor at the
client/relay and arriving in tor at the relay/client: basically the
Internet! ;) We can never accurately capture this in a simulation now;
that's more something for Shadow++ to strive for.
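
To illustrate the estimate problem, here is a rough sketch (not how
circpad-sim is actually implemented; the event format, the labels and
the inject_padding helper are all made up for this example). When one
side triggers padding, an event is added to the other side's trace at
"send time + estimated delay", and that delay is exactly the part we
cannot get right:

```python
import bisect
from typing import List, Tuple

# Hypothetical event format: (timestamp_in_seconds, label).
Event = Tuple[float, str]


def inject_padding(peer_trace: List[Event], sent_at: float,
                   est_delay: float) -> float:
    """Insert a padding event into the peer's time-sorted trace.

    est_delay is an assumed one-way delay between client and relay.
    Any fixed or sampled value here is only an estimate of what the
    real network would have done.
    """
    arrival = sent_at + est_delay
    # insort keeps the trace ordered by timestamp (first tuple element).
    bisect.insort(peer_trace, (arrival, "padding_received"))
    return arrival


# Client sends padding at t=0.25 with an assumed delay of 0.125 s.
relay_trace = [(0.1, "cell_received"), (0.5, "cell_received")]
arrival_time = inject_padding(relay_trace, sent_at=0.25, est_delay=0.125)
```

The cell ordering in relay_trace only comes out right if est_delay is
close enough to reality that the padding event lands between the correct
neighbours, which is why ordering is easier to get right than timing.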
> > Right now I think it might be best as a starting point to just try to
use the simulator to find optimal machines against attacks like Deep
Fingerprinting that ignores time. Once we have a better understanding of
how feasible and costly that is we can look more closely at how time
changes things.
>
> Do you mean ignores the time deltas between the client/middles and the
guard?
Ignores time completely (beyond the ordering of cells and their
directions). Deep Fingerprinting, like Wang's kNN and so on, operates on
pure cell/packet traces:
1
1
-1
-1
Etc., no time there at all. Since the simulator cannot get time exactly
right, and some nasty deep learning machinery can likely pick up on padding
cells being sampled from some non-real distribution given enough samples,
a first step is to get the simulation to produce correct cell traces with
high probability. Finding (reasonably efficient) machines that can defend
against attacks operating only on cells would be an awesome first step and
would hopefully teach us a lot.
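
For concreteness, a minimal sketch of how a timed trace collapses into
the direction-only form shown above. The (timestamp, direction) event
format, the function name and the zero-padding to a fixed length are
assumptions for this example, not something taken from the simulator:

```python
from typing import Iterable, List, Tuple

# Hypothetical event format: (timestamp_in_seconds, direction), where
# direction is +1 for a cell sent by the client and -1 for a cell
# received by the client.
Event = Tuple[float, int]


def to_direction_sequence(events: Iterable[Event], length: int) -> List[int]:
    """Drop timestamps and keep only cell order and direction.

    The result is truncated or zero-padded to a fixed length, on the
    assumption that the classifier expects fixed-size input.
    """
    ordered = sorted(events, key=lambda e: e[0])  # order by time only
    directions = [d for _, d in ordered][:length]
    return directions + [0] * (length - len(directions))


example = [(0.00, 1), (0.02, 1), (0.31, -1), (0.33, -1)]
sequence = to_direction_sequence(example, length=6)
print(sequence)  # [1, 1, -1, -1, 0, 0]
```

Timestamps only matter here through the ordering they induce, so a
simulated trace is indistinguishable from a real one for this kind of
attack as long as the cell order is right.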
> > Any thoughts on this? Have I missed some other reason than time
estimates for including guard traces?
>
> Well, I have always assumed that the most realistic adversary for these
attacks is one that runs them from inside the Tor network, where they have
much higher resolution over circuit construction and usage, and have full
circuit multiplexing information.
>
> We can simulate such an adversary by looking at client traces, or guard
TLS traces, I suppose.
Thanks for clarifying, makes sense. With access to the client and
(padding) relay traces, would it be possible to get closer to an "ideal"
trace for an attacker (not necessarily the most realistic, but stronger)?
For example, observing the exact time that cells are sent at the client
and when cells are forwarded from the relay (at the relay) to the client
seems close to ideal, right? That way you minimize the network noise.
Padding machines that can defend against such an attacker should also be
able to deal with attackers with noisier traces.
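
A small sketch of what I mean by combining the two vantage points. This
is purely illustrative: the function name and input format are made up,
and it assumes the client and relay clocks are synchronized, which is
only realistic inside a simulation:

```python
import heapq
from typing import List, Tuple

# (timestamp_in_seconds, direction): +1 = client to relay,
# -1 = relay to client.
Event = Tuple[float, int]


def ideal_attacker_trace(client_sent: List[float],
                         relay_forwarded: List[float]) -> List[Event]:
    """Merge per-direction timestamps taken at their point of origin.

    Outgoing cells are stamped when the client sends them and incoming
    cells when the relay forwards them, so neither direction includes
    network delay. Assumes both lists share one clock.
    """
    outgoing = ((t, 1) for t in sorted(client_sent))
    incoming = ((t, -1) for t in sorted(relay_forwarded))
    # heapq.merge lazily merges the two already-sorted streams.
    return list(heapq.merge(outgoing, incoming))


merged = ideal_attacker_trace([0.0, 0.2], [0.1, 0.3])
print(merged)  # [(0.0, 1), (0.1, -1), (0.2, 1), (0.3, -1)]
```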
> > Also, if some other researcher working on this wants to collaborate
please reach out.
>
> I now have some time to help with this a bit for the next couple weeks.
Can you put your work in a branch on github?
Awesome, it's here: https://github.com/pylls/circpad-sim . I added you as
a collaborator and updated the README with the current state of things.
I'm going to work on input and output next so that I can clean up some of
the debug code. I'll continue to work actively on it now, apart from a
trip Sunday-Wednesday.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31788#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online