
[tor-commits] [tech-reports/master] adding section on obfuscation techniques



commit f586165e9f9e0f5e1c2b62c85c1ac776248837b6
Author: A. Johnson <aaron.m.johnson@xxxxxxxxxxxx>
Date:   Tue Dec 23 10:00:21 2014 -0600

    adding section on obfuscation techniques
---
 2015/hidden-service-stats/hidden-service-stats.tex |   60 +++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/2015/hidden-service-stats/hidden-service-stats.tex b/2015/hidden-service-stats/hidden-service-stats.tex
index 7760e83..d4b586e 100644
--- a/2015/hidden-service-stats/hidden-service-stats.tex
+++ b/2015/hidden-service-stats/hidden-service-stats.tex
@@ -353,7 +353,8 @@ for the future that says \emph{why} they are a bad idea.
 We start with statistics that are not specific to the three roles of
 relays in the hidden-service protocol, but that apply to all of them.
 
-\subsubsection{Time from circuit extension to circuit purpose change}
+\subsubsection{Time from circuit extension to circuit purpose change} 
+\label{subsubsec:time_circ_ext_to_purpose_change}
 
 % (the following distinction cannot be made, AFAIK.  here's what happens:
 % we receive a CREATE (?) cell from another relay that establishes the
@@ -489,6 +490,7 @@ We would learn what fraction of clients and what fraction of services run
 older tor versions (0.2.3.x or older).
 
 \subsubsection{Time from circuit purpose change to tearing down circuit}
+\label{subsubsec:time_circ_purpose_change_to_teardown}
 
 \textbf{Details:}
 %
@@ -527,6 +529,7 @@ relay got chosen X times instead of the measured average Y.
 
 \subsubsection{Time from establishing introduction point to tearing down
 circuit (1.1.4.)}
+\label{subsubsec:time_intro_to_teardown}
 
 \textbf{Details:}
 %
@@ -545,6 +548,7 @@ available for a short time only, and what fraction is available most of
 the time.
 
 \subsubsection{Number of descriptor publish request (3.1.1.)}
+\label{subsubsec:num_descriptor_publish}
 
 \textbf{Details:}
 %
@@ -589,6 +593,7 @@ This is a bit related to differential privacy as we understand it, but
 much more basic.
 
 \subsubsection{Number of descriptor updates per service (3.1.2.)}
+\label{subsubsec:num_decriptor_updates}
 
 \textbf{Details:}
 %
@@ -1076,6 +1081,59 @@ The benefit gained from this statistic is not huge though.
 %
 No obvious risks.
 
+\section{Obfuscation methodology}
+The published statistics shouldn't reveal private information to an
+adversary when combined with plausible background knowledge. We will use
+techniques to provide uncertainty about any specific hidden service,
+client, or connection, while maintaining good accuracy in the aggregate
+statistics. These techniques include
+\begin{itemize}
+\item Releasing aggregate statistics over time, such as total counts or
+averages in a given period
+\item Adding noise (i.e. random inaccuracy)
+\item Limiting accuracy to a certain granularity via rounding (aka
+``binning'')
+\item Adding time-delay to the release of statistics such that the output
+doesn't reveal information about ongoing activity
+\item Using cryptographic techniques to hide the source of information,
+such as anonymizing reports from individual relays
+\end{itemize}
+
+
+\subsection{Adversary knowledge}
+We can expect that the adversary may know things such as
+\begin{itemize}
+\item The addresses of a large number of publicly-available services
+(e.g. by crawling the Web)
+\item A minimum amount of traffic received by a given hidden service
+(e.g. due to sending that traffic himself)
+\item The introduction points of a service (by obtaining the descriptor)
+\item The availability of the service (by attempting to connect
+periodically)
+\item Roughly the number of client connections and amount of client
+traffic (possibly leaked by the service itself, e.g. a web forum)
+\end{itemize}
+
+\subsection{Counts}
+
+\subsection{Distributions}
+For many statistics, it would be very helpful to understand the
+distribution of values. For example, the distribution of descriptor
+fetches could reveal whether most hidden services are never used or
+whether a few hidden services account for most hidden-service activity.
+Releasing information about the distribution could be useful for the
+following statistics:
+\begin{itemize}
+\item Time from circuit extension to circuit purpose change
+(Sec.~\ref{subsubsec:time_circ_ext_to_purpose_change})
+\item Time from circuit purpose change to tearing down circuit
+(Sec.~\ref{subsubsec:time_circ_purpose_change_to_teardown})
+\item Time from establishing introduction point to tearing down
+circuit (Sec.~\ref{subsubsec:time_intro_to_teardown})
+\item Number of descriptor updates per service
+(Sec.~\ref{subsubsec:num_decriptor_updates})
+\end{itemize}
+
 \section{Recommendation}
 \label{sec:recommendation}
 

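Illustrative sketch (not part of the commit): the "adding noise" and
"binning" items in the new Obfuscation methodology section could be
combined roughly as follows. The commit does not specify a noise
distribution, a bin size, or the order of the two steps, so the Laplace
distribution, the round-up-then-add-noise order, and the names bin_up,
laplace_noise and obfuscate_count are all assumptions made here for
illustration only.

```python
import math
import random


def bin_up(value: float, bin_size: int) -> int:
    """Round value up to the next multiple of bin_size (the 'binning' step)."""
    return int(math.ceil(value / bin_size)) * bin_size


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples.

    The Laplace distribution is an assumption; the report only says 'noise'.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate_count(true_count: int, bin_size: int, scale: float,
                    rng: random.Random) -> int:
    """Bin a raw count, then add noise before it is reported."""
    binned = bin_up(true_count, bin_size)
    return int(round(binned + laplace_noise(scale, rng)))


if __name__ == "__main__":
    rng = random.Random()  # hypothetical: a relay would use its own RNG
    # Hypothetical parameters: bin size 8, noise scale 8.
    print(obfuscate_count(13, bin_size=8, scale=8.0, rng=rng))
```

With these assumed parameters, any true count from 9 to 16 is first mapped
to 16, so the reported value alone does not distinguish between them, and
the added noise further blurs which bin a single report came from. Larger
bin sizes and noise scales give more uncertainty about any one service at
the cost of accuracy in the aggregate.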

