
Re: [tor-dev] SkypeMorph



On 12-03-28 02:28 AM, Roger Dingledine wrote:
On Mon, Mar 26, 2012 at 03:04:47PM -0400, Hooman wrote:
Can you give us some guesses about next steps for resolving these issues
(or explaining why they aren't actually as worrisome as they appear)?

A) It looks like the transport has no notion of adapting to network
conditions, i.e. congestion control. So it will basically fall apart on
a low-bandwidth or congested network.
True, but as mentioned in section 8.2 of the technical report, this
can be fixed by considering Skype video calls on different networks,
depending on the network status. (The way Skype's bandwidth usage
varies with available bandwidth has been studied; for example, see
http://www.tlc-networks.polito.it/oldsite/mellia/papers/skype_info08.pdf)
Isn't that like saying TCP congestion control can be implemented by
sampling capacity and traffic load on a variety of networks, and then
hard-coding the TCP window and resend algorithms to suit the network
you think you're running on?

I'm not worried here so much about whether your flow adapts to network
conditions like a real Skype flow would (though I agree that's an
issue). I'm worried about whether your flow would fail to back off at
all in the face of congestion, leading to a) Skypemorph not getting its
packets through because so many of them get dropped, and b) Skypemorph
ruining the network it's running on.

B) It sends at a constant rate of 43KB/s in each direction all the
time. Even if users are willing to tolerate that, it doesn't scale on
the bridge/relay side if there are lots of users. I wonder how feasible
a "traffic shaping" approach would be (where the flow rate drops off
if there's no underlying traffic), and how much that would screw with
your statistics. Which leads to:
43KB/s is per connection, so each client gets this bandwidth, while
the bridge can have multiple connections.
Right. But if a bridge wants to handle 10 Skypemorph users, the bridge
needs to be sending out 430KB/s all the time. That means volunteer users
can't operate these bridges at home (unless they live in Japan, Korea,
or Sweden I guess). It also greatly increases the overall traffic cost
of running a bridge.

For example, during the February weekend when Iran blocked SSL, my
obfsproxy bridge was easily handling ~500 users at once. With Skypemorph
that's 172 Mbit/s of duplex traffic?
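
The figures above can be checked with a short calculation. This is only a sketch of the arithmetic: it assumes 1 KB = 1000 bytes, and the function names are made up for illustration.

```python
# Aggregate one-way bridge bandwidth if every client is sent a constant
# 43 KB/s, as described in the thread. Assumes 1 KB = 1000 bytes.
PER_CLIENT_KBPS = 43  # kilobytes per second, per direction


def bridge_rate_kbytes(clients, per_client=PER_CLIENT_KBPS):
    """Total constant send rate in KB/s for the given number of clients."""
    return clients * per_client


def kbytes_to_mbit(kbytes_per_s):
    """Convert KB/s to Mbit/s (8 bits per byte, 1000 KB per MB)."""
    return kbytes_per_s * 8 / 1000


for n in (10, 500):
    kb = bridge_rate_kbytes(n)
    print(f"{n} clients: {kb} KB/s = {kbytes_to_mbit(kb):.1f} Mbit/s")
```

With 500 clients this gives 21500 KB/s, i.e. 172 Mbit/s in each direction.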
I will answer the first two questions here. We are going to fix this. As I mentioned, we will do what Skype does: use different bandwidth levels for SkypeMorph's output, depending on network status (which we can detect the same way TCP detects congestion) or on the amount of bandwidth the bridge is willing to dedicate to each client. Another option is to reduce the bandwidth provided to each client as the number of clients increases.
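
One way to picture the two ideas in this reply (stepping between bandwidth levels on a congestion signal, and capping each client by the bridge's budget) is the rough sketch below. It is not the SkypeMorph implementation: the rate levels other than 43 KB/s, the loss signal, and all function names are assumptions chosen for illustration.

```python
# Hypothetical output rate levels in KB/s; only 43 comes from the thread.
RATE_LEVELS = [5, 10, 20, 43]


def per_client_cap(bridge_budget, clients):
    """Share of the bridge's bandwidth budget (KB/s) available to each client."""
    return bridge_budget / max(clients, 1)


def next_level(level, loss_seen):
    """Step down one level when loss is observed, otherwise step up one level."""
    if loss_seen:
        return max(level - 1, 0)
    return min(level + 1, len(RATE_LEVELS) - 1)


def send_rate(level, bridge_budget, clients):
    """Rate actually used: the current level, limited by the per-client cap."""
    return min(RATE_LEVELS[level], per_client_cap(bridge_budget, clients))
```

For example, with a 430 KB/s budget, `send_rate(3, 430, 10)` stays at 43 KB/s, while doubling the client count to 20 caps each client at 21.5 KB/s. A real design would also need a defined way to detect loss and would have to check that the resulting rate changes still look like Skype's.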
C) The packet size and timing distributions only aim to match the
first-order properties of Skype. At the same time, DPI vendors have
already been in a battle with Skype traffic for a while now. How advanced
do you think DPI vendors are at detecting Skype-like traffic, and thus at
distinguishing your traffic from real Skype traffic? Similarly, how bad is
it that you don't follow through with the TCP side of the Skype handshake?
The TCP connections are more of control connections and they send a
small number of messages during the call and we actually have some
ideas on how to deal with this, like handing the sockets for these
connections to our software after we fake a call.
Ok.

What do you think about the "first-order properties" question about size
and timing (e.g. I bet real Skype traffic does not draw its packet size
and timing independently from the size and timing of the previous packet)?
Combined with the fact that DPI vendors have quite a bit of experience
targeting Skype traffic in particular, I worry that they've thought
about this specific question more than we have.
Yes, we can definitely go beyond first-order statistics. It should be fairly straightforward to do so.
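
As an illustration of what going beyond first-order statistics could mean, the sketch below draws each packet size conditioned on the previous one, using transitions observed in a recorded trace. The trace format and function names are assumptions; the thread does not say which model would be used.

```python
import random
from collections import defaultdict


def build_transitions(trace):
    """Map each packet size to the list of sizes that followed it in the trace."""
    followers = defaultdict(list)
    for prev, cur in zip(trace, trace[1:]):
        followers[prev].append(cur)
    return followers


def sample_sizes(trace, n, rng):
    """Draw n packet sizes, each conditioned on the previously drawn size."""
    followers = build_transitions(trace)
    out = []
    prev = None
    for _ in range(n):
        # Fall back to the marginal distribution (the whole trace) at the
        # start, or when the previous size never had a successor.
        choices = followers.get(prev) or trace
        prev = rng.choice(choices)
        out.append(prev)
    return out
```

The same idea extends to inter-packet times, or to joint (size, time) pairs, at the cost of needing more trace data to fill the transition table.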

D) The morphing output is basically identical to the naive shaping. Are
you sure you did it right?
As mentioned in the report, the original traffic morphing does
not consider timing at all (which makes it less effective against
DPI), and it aims to minimize the overhead, i.e. the number of
padding bytes sent on the wire.
Right. Minimizing padding bytes on the wire is a big reason to like it.

When we introduced the inter-packet
timing feature, it was no longer possible to go with the same
construction, since packets may not be sent right away. As a result
we tried a different approach for traffic morphing: we buffered
packets received from Tor, then when it is time to send the next
packet, we simply estimate the original packet size by a sample from
Tor's packet size distribution. I know there are other ways this
can be done, but in our experiment we didn't observe any tangible
difference in the outcome.
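
The buffered approach described above could look roughly like the sketch below. This is one reading of the description, not the code from the report: the size distributions, the layout of the morphing matrix, and the function name are all hypothetical.

```python
import random

# Hypothetical distributions; real ones would come from measured traces.
TOR_SIZES = [512, 1024]        # source (Tor) packet sizes
TOR_PROBS = [0.7, 0.3]         # probability of each source size
SKYPE_SIZES = [100, 200, 300]  # target (Skype-like) packet sizes
# MORPH[j] is the distribution over SKYPE_SIZES given source size TOR_SIZES[j].
MORPH = [[0.5, 0.3, 0.2],
         [0.1, 0.4, 0.5]]


def next_packet(buffer, rng):
    """Build one outgoing packet when the send timer fires.

    buffer is a bytearray of data received from Tor and not yet sent.
    Returns (payload_bytes, padding_length).
    """
    # Estimate the "original" size by sampling the Tor size distribution.
    j = rng.choices(range(len(TOR_SIZES)), weights=TOR_PROBS)[0]
    # Morph: pick the target size from the matrix row for that source size.
    target = rng.choices(SKYPE_SIZES, weights=MORPH[j])[0]
    # Take as much buffered data as fits, and pad the remainder.
    payload = bytes(buffer[:target])
    del buffer[:target]
    return payload, target - len(payload)
```

When the buffer is empty, the whole packet is padding, which is where the constant overhead discussed in this thread comes from. A real packet would also need a length field so the receiver can strip the padding.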
Hrm. So that means your traffic morphing algorithm doesn't try to reduce
padding bytes? That makes your graph 5 make more sense. But is it really
accurate to call it morphing still? It would be great to explore that
tradeoff more.
We called it SkypeMorph since we are still using the morphing matrix. I personally believe we can find a way to minimize the amount of padding while keeping the timing and sizes statistically indistinguishable from Skype's. However, the traffic morphing technique depends heavily on the characteristics of the source protocol (Tor), and it's not easy to guess the timing patterns of users behind Tor. For example, if we use traces from web browsing behind Tor as the input to our software, and our client then uses Tor to download multimedia content, traffic morphing would not perform very well.

--Roger

_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
