Re: [tor-bugs] #30716 [Circumvention/Obfs4]: Improve the obfs4 obfuscation protocol
#30716: Improve the obfs4 obfuscation protocol
-------------------------------------------------+-------------------------
Reporter: phw | Owner: phw
Type: task | Status: assigned
Priority: High | Milestone:
Component: Circumvention/Obfs4 | Version:
Severity: Normal | Resolution:
Keywords: sponsor28, anti-censorship-roadmap-august | Actual Points:
Parent ID: | Points: 20
Reviewer: | Sponsor: Sponsor28-must
-------------------------------------------------+-------------------------
Changes (by phw):
* cc: dcf (added)
Comment:
Website fingerprinting attacks typically operate on traffic traces, which
are often encoded as sequences of the form:
{{{
<time>,+/-<packet length>
}}}
`+<packet length>` refers to packets going from the client to the server
and `-<packet length>` refers to packets going from the server to the
client. For example:
{{{
1567548098,+1500
1567548098,+800
1567548099,-1500
1567548099,-1500
1567548100,-700
}}}
Interestingly, packet lengths may not even be necessary. In their
[https://arxiv.org/pdf/1801.02265.pdf CCS'18 paper], Sirinam et al. write
in Section 5.1.1:
> However, we performed preliminary evaluations to compare the WF attack
> performance between using packet lengths and without packet lengths, i.e.,
> only packet direction, as feature representations. Our result showed that
> using packet lengths does not provide a noticeable improvement in the
> accuracy of the attack. Therefore, we follow Wang et al.’s methodology and
> consider only the direction of the packets.
The traffic trace above can therefore be reduced to:
{{{
+1
+1
-1
-1
-1
}}}
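To make the reduction concrete, here is a minimal Go sketch that parses a trace in the `<time>,+/-<packet length>` format and keeps only the packet directions. The names `Packet`, `ParseTrace`, and `Directions` are made up for this illustration and are not part of obfs4 or of any attack implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Packet is one trace entry. Dir is +1 for client-to-server and -1 for
// server-to-client, matching the sign convention used in the trace.
type Packet struct {
	Time   int64
	Dir    int
	Length int
}

// ParseTrace parses newline-separated entries of the form
// "<time>,+/-<packet length>". (Hypothetical helper for illustration.)
func ParseTrace(trace string) ([]Packet, error) {
	var pkts []Packet
	for _, line := range strings.Split(strings.TrimSpace(trace), "\n") {
		fields := strings.SplitN(strings.TrimSpace(line), ",", 2)
		if len(fields) != 2 || len(fields[1]) < 2 {
			return nil, fmt.Errorf("malformed line %q", line)
		}
		ts, err := strconv.ParseInt(fields[0], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad timestamp in %q: %v", line, err)
		}
		var dir int
		switch fields[1][0] {
		case '+':
			dir = 1
		case '-':
			dir = -1
		default:
			return nil, fmt.Errorf("missing direction sign in %q", line)
		}
		length, err := strconv.Atoi(fields[1][1:])
		if err != nil || length < 0 {
			return nil, fmt.Errorf("bad length in %q", line)
		}
		pkts = append(pkts, Packet{Time: ts, Dir: dir, Length: length})
	}
	return pkts, nil
}

// Directions drops timestamps and lengths, keeping only packet direction.
func Directions(pkts []Packet) []int {
	dirs := make([]int, len(pkts))
	for i, p := range pkts {
		dirs[i] = p.Dir
	}
	return dirs
}

func main() {
	trace := "1567548098,+1500\n1567548098,+800\n1567548099,-1500\n1567548099,-1500\n1567548100,-700"
	pkts, err := ParseTrace(trace)
	if err != nil {
		panic(err)
	}
	for _, d := range Directions(pkts) {
		fmt.Printf("%+d\n", d) // prints +1, +1, -1, -1, -1 on separate lines
	}
}
```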
Note that obfs4 makes no attempt to defend against website fingerprinting
attacks. Its goal is to evade protocol classification, but the two
problems (and their respective attacks) overlap to some extent, which is
why obfs4 would be better off with defences against such attacks.
[https://lists.torproject.org/pipermail/tor-dev/2017-June/012310.html As
dcf already pointed out], obfs4 only sends data when the application
(e.g., Tor) has data to send. Then, depending on which `iatMode` is used,
obfs4 may append padding to the application's data and add inter-arrival
delays. Coming back to the example above, obfs4 can only **extend** a
packet burst but not **break** a burst. That is, obfs4 can turn the packet
sequence
{{{
+1
+1
-1
-1
-1
}}}
into the sequence
{{{
+1
+1
+1 (padding packet, which extends a burst)
-1
-1
-1
}}}
but not into the sequence
{{{
+1
-1 (padding packet, which breaks a burst)
+1
-1
+1 (padding packet, which breaks a burst)
-1
-1
}}}
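One way to see the difference between the three sequences is to collapse each into its bursts, i.e., runs of consecutive packets in the same direction. The Go sketch below does that; the function name `Bursts` is made up for this illustration, and it assumes every entry is either +1 or -1.

```go
package main

import "fmt"

// Bursts collapses a direction sequence of +1/-1 entries into signed run
// lengths, e.g. [+1 +1 -1 -1 -1] becomes [2 -3].
// (Hypothetical helper; assumes each entry is exactly +1 or -1.)
func Bursts(dirs []int) []int {
	var bursts []int
	for _, d := range dirs {
		n := len(bursts)
		if n > 0 && (bursts[n-1] > 0) == (d > 0) {
			bursts[n-1] += d // same direction: the current burst grows
		} else {
			bursts = append(bursts, d) // direction change: a new burst starts
		}
	}
	return bursts
}

func main() {
	original := []int{+1, +1, -1, -1, -1}
	extended := []int{+1, +1, +1, -1, -1, -1}
	broken := []int{+1, -1, +1, -1, +1, -1, -1}

	fmt.Println(Bursts(original)) // [2 -3]
	fmt.Println(Bursts(extended)) // [3 -3]
	fmt.Println(Bursts(broken))   // [1 -1 1 -1 1 -2]
}
```

Extending a burst changes a run length but leaves the number of bursts at two, whereas the burst-breaking sequence has six bursts.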
I spent some time looking into ways to fix this issue. It turns out that
we can add the ability to break packet bursts to obfs4 without losing
backwards compatibility, allowing a brand-new, burst-breaking obfs4 client
to talk to an old obfs4 server (however, see below for a caveat). I
implemented a simple proof-of-concept, for now called
[https://trac.torproject.org/projects/tor/wiki/doc/PluggableTransports/BabyNameBook
sharknado], in my
[https://dip.torproject.org/phw/obfs4/commit/8da050f29866444b9af685d277c20b7ab142593a
feature/30716 branch]. The idea is simple: instead of having obfs4 write
directly to its socket, it now writes to the `SharknadoConn` struct, which
implements the `net.Conn` interface. After each call to `Read`, there's a
1 in 10 chance of sending padding, regardless of whether the application
has data waiting.
There are several remaining challenges:
* Effectively breaking bursts may require the client and the server to
cooperate. For example, when the client receives the beginning of a burst,
the adversary (who's somewhere between the client and the server) may
already have seen the entire packet sequence, so we cannot break it
anymore. We may be able to address this by having the server send only a
few packets of its burst and then wait until it has received the client's
burst-breaking packets.
* We should find a way to make obfs4's packet sequences server-specific by
incorporating the server's shared secret into the sequence generation
process, just like it's done for packet lengths and inter-arrival times.
* We need to build an evaluation framework to understand what works and
what doesn't.
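To illustrate the second point, a server-specific padding schedule could be derived deterministically from the shared secret. The Go sketch below is only one possible shape for this: the function `PaddingSchedule`, the `burst-breaking:` label, and the use of SHA-256 with `math/rand` are assumptions for this example and do not describe how obfs4 derives its packet lengths or inter-arrival times. `math/rand` is not cryptographically secure, so a real design would need a proper DRBG.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// PaddingSchedule returns n deterministic pad/no-pad decisions derived
// from secret, where each decision is true with probability prob.
// (Hypothetical helper; math/rand is used for illustration only and is
// not suitable where unpredictability matters.)
func PaddingSchedule(secret []byte, n int, prob float64) []bool {
	// Hash the secret under a fixed label so the seed is specific to
	// this purpose, then use the first 8 bytes of the digest as the seed.
	digest := sha256.Sum256(append([]byte("burst-breaking:"), secret...))
	seed := int64(binary.BigEndian.Uint64(digest[:8]))
	rng := rand.New(rand.NewSource(seed))

	schedule := make([]bool, n)
	for i := range schedule {
		schedule[i] = rng.Float64() < prob
	}
	return schedule
}

func main() {
	// Placeholder secrets; two servers with different secrets get
	// different, but individually reproducible, schedules.
	a := PaddingSchedule([]byte("server-A-secret"), 20, 0.1)
	b := PaddingSchedule([]byte("server-B-secret"), 20, 0.1)
	fmt.Println(a)
	fmt.Println(b)
}
```

Because both endpoints can derive the same schedule from the same secret, the resulting packet sequences would differ from server to server while remaining reproducible for a given server.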
Any thoughts?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30716#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online