[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[freehaven-cvs] Revise related work, add start of statistical disclo...



Update of /home/freehaven/cvsroot/doc/pynchon-gate
In directory moria.mit.edu:/tmp/cvs-serv25113

Modified Files:
	pynchon.tex pynchon.bib 
Log Message:
Revise related work, add start of statistical disclosure, fix equations in design

Index: pynchon.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/pynchon-gate/pynchon.tex,v
retrieving revision 1.25
retrieving revision 1.26
diff -u -d -r1.25 -r1.26
--- pynchon.tex	16 Sep 2004 20:05:52 -0000	1.25
+++ pynchon.tex	16 Sep 2004 21:06:33 -0000	1.26
@@ -8,6 +8,9 @@
 
 \setcounter{page}{1}
 
+\newcommand\SUBKEY{\mbox{Subkey}}
+\newcommand\UserID{\mbox{UserID}}
+
 \begin{document}
 
 \title{The Pynchon Gate}
@@ -134,35 +137,41 @@
 problem of {\it receiving} messages sent to a pseudonymous address.
 
 \subsubsection{Reply blocks and return addresses.}
-Chaum~\cite{chaum-mix} describes a method of using \emph{return addresses}
-with forward-secure mix-nets. However, the system relies upon all selected
-component nodes of the mix being operational in order for mail to be
-delivered. This makes the system too unreliable for practical
-use.\footnote{Forward-only messages through a mix-net, however, are
+In 1981, Chaum~\cite{chaum-mix} described a method of using \emph{return
+  addresses}
+in mix-nets: recipients encode a reply path, and allow senders to affix
+messages to the encoded path.  As the message modes through the network, the
+path is decoded and the message encoded at each hop, until an encoded message
+reaches its eventual recipient.  This system relies upon all selected
+component nodes of the chosen path remaining operational in order for mail to be
+delivered, which can make the system too unreliable for practical
+use if significant time elapses between path generation and message
+origination.\footnote{Forward-only messages through a mix-net, however, are
 sufficiently reliable. The client software is able to evaluate network
 health information~\cite{echolot} before sending a message, and thus can
 construct robust remailer chains based on the current health of the
 remailer network.}
 
-In addition to reliability issues, simple reply-block systems suffer from
+In addition to reliability issues, some implementations of these ``reply
+blocks'' suffer from
 a pseudonym management perspective. Cypherpunk nym servers based on the
 first generation implementation of Chaum's mix-nets (Type I
 remailers~\cite{hal-remailer}), such as {\tt
 alpha.c2.net}~\cite{alpha-faq} and {\tt
-nym.alias.net}~\cite{nym-alias-net}, implemented the concept of a central
+nym.alias.net}~\cite{nym-alias-net}, implemented a central
 reply-block repository which allowed pseudonym-holders to receive messages
-delivered to a traditional email address. Unfortunately, reply-block
-systems based on Type I remailers allow multiple-use reply blocks. Attacks
-on multiple-use reply blocks are discussed in~\cite{remailer-attacks}. For
-a system to permit multiple-use reply blocks, it must inherently risk
-replay-attacks~\cite{tcmay}. The Type I anonymous remailer system suffers
+delivered to a traditional email address. Unfortunately, Type I remailers
+allow multiple uses of their reply blocks, which are vulnerable to replay and
+flooding attacks as discussed in~\cite{remailer-attacks,tcmay}
+The Type I anonymous remailer system suffers
 greatly because of this. Type II (Mixmaster) and Type III
 (Mixminion~\cite{mixminion}) systems do not permit multiple-use reply
 blocks, and contain replay-attack protection mechanisms~\cite{replay}.
 
 \subsubsection{Single-use reply blocks.} 
 While the Type II system does not have any means of support for anonymous
-reply blocks, the Type III system introduces single-use reply blocks
+reply blocks, the Type III (Mixminion) system introduces single-use reply
+blocks
 (SURBs)~\cite{surb} as a means of avoiding the replay attack issues. The
 Type III protocol requires the recipient to create a large number of reply
 blocks to be used by those who wish to send her mail. In practice, this is
@@ -173,7 +182,7 @@
 set, which is a significant security improvement over Type I, and recent
 work has been done by Danezis and Laurie on attack-resistant anonymous
 packet formats suitable for reply messages~\cite{minx}. However, since
-reply blocks are still being used, the reliability issues remain.\footnote
+reply blocks are still being used, reliability issues remain.\footnote
 {If any given node in the pre-selected SURB is defunct at the time mail is
 set to be delivered, the mail will be lost.} Reply block systems are also
 susceptible to intersection attacks~\cite{disad-free-routes}. A global
@@ -181,29 +190,31 @@
 enough time and data, will be able to reliably determine who is talking to
 whom via statistical correlation~\cite{statistical-disclosure}.
 
-\subsubsection {Network-level client anonymity.}
+%XXXX cite underhill
 
+\subsubsection {Network-level client anonymity.}
 The ZKS Freedom Network~\cite{freedom2-arch} provided anonymous IP access
 to a POP3 server~\cite{freedom2-mail}, enabling its users to maintain
 pseudonyms using standard email protocols. Freedom was discontinued due to
-the high expense of operating the network, in particular the bandwidth
-costs. Other network-level anonymity systems, such as
-Pipenet~\cite{pipenet} and Onion Routing~\cite{goldschlag96} could be used
+high operating expenses, especially in bandwidth costs.
+Other network-level anonymity systems, such as
+Pipenet~\cite{pipenet}, Onion Routing~\cite{goldschlag96}, the Java Anon
+Proxy~\cite{jap}, or Tor~\cite{tor-design}, could be used
 in much the same fashion; unfortunately, they also suffer the same
-barriers to deployment~\cite{fiveyearslater}. The Java Anonymous
-Proxy~\cite{jap} has had greater adoption, but has suffered an anonymity
-compromise~\cite{jap-backdoor,jap-pr}. Low-latency anonymity systems such
-as these are also at greater risk of failure than the high-latency mixes,
-and are more susceptible to traffic analysis
-attacks~\cite{danezis-pet2004,gd-thesis,mixmaster-reliable}.
+barriers to deployment~\cite{fiveyearslater}.% The Java Anonymous
+%Proxy~\cite{jap} has had greater adoption, but has suffered an anonymity
+%compromise~\cite{jap-backdoor,jap-pr}.
+Low-latency anonymity systems such as these are also far more susceptible to
+traffic analysis methods such as end-to-end timing attacks than are
+high-latency mix
+networks~\cite{danezis-pet2004,gd-thesis,mixmaster-reliable}.
 
 \subsubsection{Network-level server anonymity.}
-
 The second generation implementation of Onion Routing~\cite {tor-design}
-implements the concept of rendezvous points~\cite {ian-thesis}, which
+implements rendezvous points~\cite {ian-thesis} that
 allow users to offer location-hidden services. A user wishing to
-anonymously receive messages could operate an email server in a
-location-hidden fashion. Messages would be delivered to the server over
+anonymously receive messages could use this to receive mail at a hidden
+location: Messages would be delivered to the server over
 the Onion Routing network, and successful delivery would not require the
 sender to know the IP address of the destination server.
 
@@ -212,7 +223,6 @@
 the previously mentioned concerns with these anonymity systems.
 
 \subsubsection{Re-encryption mixes.}
-
 Re-encryption mixes~\cite{universal} aim to improve the reliability of
 anonymous message systems. Recent work has shown that re-encryption mixes
 can be used to facilitate anonymous message replies~\cite
@@ -220,44 +230,50 @@
 re-encryption mixes do improve on the robustness of simple reply blocks in
 a Chaumian mix-net, reliability problems are still possible. Re-encryption
 mixes require that the security vs. reliability tradeoffs be made by the
-sender at the time that the message is sent. A more desirable property
+sender at the time that the message is sent.  A more desirable property
 would be to allow the recipient to make security determinations at the
 time the message is retrieved.
 
-Mix-nets inherently rely upon cryptographic primitives for their security.
-While the cryptographic primitives utilized in traditional Chaumian
-mix-nets are fairly well understood, the additional primitives in
-re-encryption mixes have received less rigorous analysis. Analysis of
-cryptographic primitives has a long history of being fraught with peril,
-and in many cases flaws remained unnoticed until years after a system
-first achieved wide deployment.\footnote{Re-encryption mixes rely on the
-ElGamal cryptosystem~\cite{elgamal}, aspects of which have been discovered
-to contain flaws~\cite{bleichenbacher,nguyen} that have
-significantly affected deployed cryptographic
-applications~\cite{pgp5-elgamal,gpg-compromised}. While these
-particular flaws are related to ElGamal signature schemes rather than the
-encryption techniques used by re-encryption mixes, general concerns about
-implementation security of the ElGamal cryptosystem remain.}
+%Mix-nets inherently rely upon cryptographic primitives for their security.
+%While the cryptographic primitives utilized in traditional Chaumian
+%mix-nets are fairly well understood, the additional primitives in
+%re-encryption mixes have received less rigorous analysis. Analysis of
+%cryptographic primitives has a long history of being fraught with peril,
+%and in many cases flaws remained unnoticed until years after a system
+%first achieved wide deployment.\footnote{Re-encryption mixes rely on the
+%ElGamal cryptosystem~\cite{elgamal}, aspects of which have been discovered
+%to contain flaws~\cite{bleichenbacher,nguyen} that have
+%significantly affected deployed cryptographic
+%applications~\cite{pgp5-elgamal,gpg-compromised}. While these
+%particular flaws are related to ElGamal signature schemes rather than the
+%encryption techniques used by re-encryption mixes, general concerns about
+%implementation security of the ElGamal cryptosystem remain.}
+%
+%% I think the thing above is kinda FUDdy -- the entire anonymity field is
+%% less well analyzed than the discrete log problem.
 
 \subsubsection{Broadcast messages and dead-drops.}
 Chaum discusses a traffic-analysis prevention method wherein all reply
 mail in the anonymous mail system is sent to all possible recipients. A
-more friendly optimization has already been attempted in the form of
+less invasive optimization has already been implemented in the form of
 Usenet mail drops~\cite{aam}: an anonymous remailer can be instructed to
 deliver mail to a newsgroup, rather than to an email recipient. Such mail
-could be encrypted with a recipient's private key, and left for her to
-collect from the newsgroup. If all users of the system are using the same
-newsgroup and are behaving in an identical manner (for instance, by
+can be encrypted to a recipient's private key, and left for her to
+collect from the newsgroup. If recipients use the same
+newsgroup and behave identically (for instance, by
 downloading the entire set of newsgroup messages daily), the possible
 statistical attacks on direct mail delivery of reply messages to
 individual email addresses are avoided. This solution also removes the
-necessity for a reply-block based system as the drop location could be
-agreed upon out-of-band.
+necessity for reply-blocks, as the drop location can be
+established upon out-of-band.
 
-This ``send everything everywhere'' approach suffers massive scalability
-problems. As the number of users in the system increases (and thus, the
-anonymity strength of the system also increases) the resource requirements
-become prohibitive. Users will be encouraged to ``cheat'' and only download
+Of course, this ``send everything everywhere'' approach suffers massive
+scalability
+problems. As the number of users in the system increases,
+% (and thus, the
+%anonymity strength of the system also increases)
+each user's bandwidth requirements become prohibitive.
+Users are thus encouraged to ``cheat'' and only download
 sections of the newsgroup that they are sure contain their messages, or
 not download on days that they don't expect messages. This allows an
 attacker to gather information about messages in which an individual has
@@ -267,7 +283,6 @@
 \subsection{Known attacks against pseudonymity systems}
 \label{subsec:known-attacks}
 %XXXX writeme
-
 We discuss the security implications in pesudonymity systems throughout
 this paper. Most attacks on pseudonymity systems fall into one of the
 following categories.
@@ -300,12 +315,52 @@
 
 \subsection{Statistical disclosure against reply-block-based nym servers}
 \label{subsec:disclosure}
+Nym servers based on reply blocks (discussed in section
+\ref{subsec:related-work} above) are currently the most popular option for
+receiving messages pseudonymously.  Nevertheless, they are especially
+vulnerable to end-to-end traffic analysis.
+
+Suppose an adversary is eavesdropping on the nym server, and on all
+recipients.  The attacker wants to know which user (call her Alice) is
+associated with a given pseudonym (say, {\tt nym33}).  The adversary can
+mount an {\it intersection attack}, by noticing that Alice receives more
+messages, on average, after the nym server has
+received a message for {\tt nym33} than when it has not.\footnote{This
+  task is especially easy if the adversary can distinguish reply messages
+  from non-reply messages, as is possible with Type I remailers.}
+Over time, the adversary will notice that this correlation holds for Alice
+but not for other users, and deduce that Alice is likely associated with {\tt
+  nym33}.
+
+Recent work~\cite{statistical-disclosure,e2e-traffic} has studied the use of an
+implementation of these intersection attacks called ``Statistical
+Disclosure,'' where an attacker compares network behavior when Alice has sent
+to network when she is absent in order to link an {\it anonymous} sender
+Alice to her regular
+recipients $\mbox{Bob}_1 ...  \mbox{Bob}_N$.   Against {\it pseudonymous}
+recipients, however, these attacks are significantly easier: in the anonymity
+case we assume that many senders may send to any given recipient
+$\mbox{Bob}_i$, but with pseudonymous delivery, we only one
+user sends or receives messages for a given pseudonym.  This additional
+linkability makes the attack converge more quickly.
+
+To examine this effect, we ran a modified version of the attack simulations
+described in~\cite{e2e-traffic}, assuming a single target pseudonym and $N$
+non-target pseudonyms providing cover.  In order to make the attacker's job
+as difficult as possible (and thus establish an upper bound for security), we
+assume that all users behave identically: they receive messages with equal
+probability according to independent geometric distributions in each unit of
+time (receiving no messages with probability $1-P_M$); they each use
+identical reply blocks with path length $\ell$ through mixes in a steady
+state that delay each message each round with probability $P_D$.
+
+% XXXX Describe results
 
 \section{The Pynchon Gate Design}
 \label{sec:design}
 % XXXX write stuff here
 
-\section{Overview and Rationale}
+\subsection{Overview and Rationale}
 The Pynchon Gate is a network of servers that provide anonymous message
 retrieval capabilities. The servers receive messages for many different
 pseudonym accounts via email\footnote{The servers could also receive
@@ -332,7 +387,9 @@
 the security properties of PIR is trivial in comparison to many
 cryptographic primitives.
 
+
 \section{Component Details}
+% Fold this into the previous section
 
 The Pynchon Gate consists of a number of logical components, some of which
 may or may not coexist on the same network device or as part of the same
@@ -358,19 +415,19 @@
 These keys are created at account creation time. The shared secret can be
 reset by the nym holder after creation.
 
-The shared secret is updated every cycle, such that, if S[i] is the shared
-secret in cycle i, then S[i+1] = H(S[i]|"NEXT CYCLE").
-
-From each S[i], the nymserver derives a set of sub-secrets for individual
-messages received that cycle.  The j'th sub-secret on day i is:
-  SUBKEY(j+1,i) = H(SUBKEY(j,i) | "NEXT SECRET") 
-  SUBKEY(0,i) = H(S[i] | "NEXT SECRET")
+The shared secret is updated every cycle, such that, if $S[i]$ is the shared
+secret in cycle $i$, then $S[i+1] = H(S[i]|\mbox{\tt "NEXT CYCLE"})$, where
+$|$ denotes concatenation.
 
-%XXXX Turn me into latex!
+From each $S[i]$, the nymserver derives a set of sub-secrets for individual
+messages received that cycle.  The $j$'th sub-secret on day i is
+  $\SUBKEY(j+1,i) = H(\SUBKEY(j,i) | \mbox{\tt "NEXT SECRET"})$,
+with
+  $\SUBKEY(0,i) = H(S[i] | \mbox{\tt "NEXT SECRET"})$
 
 We use a separate chain of keys for each cycle so that it is easier for a
 user to resynchronize after missing a few cycles.
-   
+
 Messages sent from the nym owner to the nym server include a simple MAC to
 authenticate them; they are encrypted at the anonymous mail forwarding
 system layer. Messages sent from the nym server to the nym owner are
@@ -379,8 +436,8 @@
 in the clear, and the messages for the nym user are encrypted and MAC'd
 using that iteration of the dynamic key.
 
-%XXXX Turn me into latex!
-      UserID[i] = H(S[i] | "USER ID")
+%XXXX Integrate me into the text
+  $\UserID{}[i] = H(S[i] | \mbox{\tt "USER ID"})$
 
 Dynamic key rotation allows us to achieve forward secrecy. Without dynamic
 key rotation it would be trivial for an attacker to archive all data sent
@@ -392,10 +449,11 @@
 In the interest of retaining little information for an attacker,
 implementations should discard old secrets \emph{as soon as they are no
 longer needed.} Thus, at the start of each cycle i, a nymserver should
-derive S[i+1], UserID[i], and SUBKEY(0,i), and immediately discard S[i].
-After using each SUBKEY(j,i), the nymserver should calculate SUBKEY(j+1,i)
-and discard SUBKEY(j,i).  After each cycle, the nymserver should discard
-the last SUBKEY, and UserID[i].
+derive $S[i+1]$, $\UserID{}[i]$, and $\SUBKEY(0,i)$, and immediately discard
+$S[i]$.
+After using each $\SUBKEY(j,i)$, the nymserver should calculate $\SUBKEY(j+1,i)$
+and discard $\SUBKEY(j,i)$.  After each cycle, the nymserver should discard
+the last $\SUBKEY(j,i)$, and $\UserID{}[i]$.
 
 % XXXX Clean up the above. I lifted this from the spec -- it should be
 % made pretty.  --Len.

Index: pynchon.bib
===================================================================
RCS file: /home/freehaven/cvsroot/doc/pynchon-gate/pynchon.bib,v
retrieving revision 1.13
retrieving revision 1.14
diff -u -d -r1.13 -r1.14
--- pynchon.bib	16 Sep 2004 20:01:45 -0000	1.13
+++ pynchon.bib	16 Sep 2004 21:06:33 -0000	1.14
@@ -505,3 +505,14 @@
   editor = {Rebecca N. Wright}, 
   publisher = {Springer-Verlag, LNCS 2742}, 
 }
+
+@InProceedings{e2e-traffic,
+   author = {Nick Mathewson and Roger Dingledine},
+   title = {Practical Traffic Analysis: Extending and Resisting Statistical Disclosure},
+   booktitle = {Proceedings of Privacy Enhancing Technologies workshop (PET 2004)},
+   year = {2004},
+   month = {May},
+   series = {LNCS},
+   www_section = traffic,
+   www_pdf_url = "http://freehaven.net/doc/e2e-traffic/e2e-traffic.pdf";,
+}
\ No newline at end of file

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/