[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[or-cvs] r15255: Finished results including discussion of service publication (projects/hidserv/trunk/doc)



Author: kloesing
Date: 2008-06-14 12:24:00 -0400 (Sat, 14 Jun 2008)
New Revision: 15255

Modified:
   projects/hidserv/trunk/doc/report.pdf
   projects/hidserv/trunk/doc/report.tex
Log:
Finished results including discussion of service publication times.

Modified: projects/hidserv/trunk/doc/report.pdf
===================================================================
(Binary files differ)

Modified: projects/hidserv/trunk/doc/report.tex
===================================================================
--- projects/hidserv/trunk/doc/report.tex	2008-06-14 16:11:37 UTC (rev 15254)
+++ projects/hidserv/trunk/doc/report.tex	2008-06-14 16:24:00 UTC (rev 15255)
@@ -5,14 +5,16 @@
 \usepackage{url}
 \begin{document}
 \title{Tor Hidden Service Performance Analysis}
+\author{The Tor Project\footnote{Please direct questions and comments to
+tor-assistants@xxxxxxxxxxxxxxx}}
 \date{June 15, 2008}
 \maketitle
 
 \begin{abstract}
-This document describes measurements of setting up and accessing a Tor
-hidden service. The objective is to spot performance bottlenecks in the
-hidden service protocol and suggest which parts should be changed in order
-to improve the overall performance of Tor hidden services.
+This document describes a performance analysis of setting up and accessing
+a Tor Hidden Service. The objective is to spot performance bottlenecks in
+the hidden service protocol and suggest which parts should be changed in
+order to improve the overall performance of Tor Hidden Services.
 \end{abstract}
 
 \section{Motivation}
@@ -56,6 +58,12 @@
 breaking?
 \end{description}
 
+\section{Hidden Service Protocol}
+
+\emph{TODO Christian: Your protocol description might also help for
+understanding service publication, so it might be useful to have it here in
+a separate section.}
+
 \section{Measurement Setup}
 
 \subsection{External View Measurements}
@@ -79,30 +87,67 @@
 
 \paragraph{Service Publication}
 
-There are at least three log statements in the server-side Tor process log
-that can be used to determine service publication: the first informs about
-starting up Tor and opening the log file, the second reports that Tor has
-enough directory information to build circuits, and the third announces
-that a hidden service descriptor has been uploaded to the hidden service
-directory:
+The process of service publication can best be analyzed by evaluating Tor's
+log statements:
 
 % log statements taken from test-id 1212519002008
+\begin{enumerate}
+\item
 \begin{verbatim}
 Jun 03 20:50:02.100 [notice] Tor 0.2.1.0-alpha-dev (r14739)
   opening new log file.
+\end{verbatim}
+\item
+\begin{verbatim}
 Jun 03 20:50:11.151 [notice] We now have enough directory
   information to build circuits.
-Jun 03 20:51:22.878 [info] connection_dir_client_reached_eof():
-  Uploaded rendezvous descriptor (status 200 ("Service
-  descriptor stored"))
 \end{verbatim}
+\item
+\begin{verbatim}
+Jun 03 20:50:12.697 [info] rend_services_introduce(): Giving
+  up on sabotage as intro point for stuptdu2qait65zm.
+\end{verbatim}
+\item
+\begin{verbatim}
+Jun 03 20:50:18.633 [info] rend_service_intro_established():
+  Received INTRO_ESTABLISHED cell on circuit 1560 for
+  service stuptdu2qait65zm
+\end{verbatim}
+\item
+\begin{verbatim}
+Jun 03 20:51:18.997 [info] upload_service_descriptor():
+  Sending publish request for hidden service
+  stuptdu2qait65zm
+\end{verbatim}
+\item
+\begin{verbatim}
+Jun 03 20:51:22.878 [info]
+  connection_dir_client_reached_eof(): Uploaded rendezvous
+  descriptor (status 200 ("Service descriptor stored"))
+\end{verbatim}
+\end{enumerate}
 
-Using these log statements both, \emph{initialization time} (time that is
-required to download enough directory information; here: 9.051 seconds) and
-\emph{service establishment time} (time that is used to establish
-introduction points and upload a descriptor; here: 71.727 seconds) can be
-calculated by subtracting the appropriate two timestamps.
+From these log statements the following times can be calculated by
+subtracting the appropriate timestamps:
 
+\begin{description}
+\item[Overall Service Publication] The only time that is relevant from
+a user perspective is the delay between starting Tor (1) until a hidden
+service is available for clients (6); here 80.778 seconds
+\item[Service Establishment] A certain part of the overall service
+publication time is required by Tor for downloading enough directory
+information to build circuits before being able to setup a hidden service.
+However, the focus here is to speed up the hidden service related parts of
+Tor. Therefore, service establishment time focuses only on the time between
+Tor knows enough directory information (2) until uploading the first hidden
+service descriptor (6); here: 71.727 seconds
+\end{description}
+
+The remaining events of giving up an introduction point candidate (3),
+successful establishment of an introduction point (4) and initiating upload
+of a hidden service descriptor (5) are used to further examine individual
+components of service establishment times.
+
 \paragraph{Connection Establishment}
 
 The connection time (from a user perspective) is the time that it takes
@@ -130,6 +175,10 @@
   connection." from source clientapp!
 \end{verbatim}
 
+A more in-depth measurement of the multitude of substeps involved in
+connection establishment is described below in
+Section~\ref{sec:internalview}.
+
 \paragraph{Message Transfer}
 
 The client process sends a numbered message to the server process every
@@ -175,7 +224,8 @@
   connected." from source clientapp!
 \end{verbatim}
 
-\subsection{Connection Establishment}
+\subsection{Internal View Measurements}
+\label{sec:internalview}
 
 \emph{TODO Christian: Describe setup here, possibly re-use parts of diploma
 thesis.}
@@ -188,18 +238,26 @@
 log files is available
 online.\footnote{\url{http://freehaven.net/~karsten/hidserv/test-env.tar.gz}}
 
+During evaluation it has turned out that there is a bug in Tor that leads
+to a delay in service publication (see below for details). This made it
+necessary to perform a second set of measurements between June 12, 9:22pm
+and Jun 13 3:31pm with a new test every single minute instead of every five
+minutes. The resulting 1,090 data samples are also available for
+download.\footnote{\url{http://freehaven.net/~karsten/hidserv/test-env2.tar.gz}}
+
 \emph{TODO Christian: Add meta-data of your measurements here; we should
 consider making your raw data available, too.}
 
 \subsection{Service Publication}
 
-From the 1,090 measured publication times, the 1\% highest values (698,
-901, 1214, 1220, 1222, 1224, 1228, 1241, 1290, 3073, and 3685 seconds) were
-considered as outliers and thereby discarded in the statistical analysis.
-(Nevertheless the corresponding test cases need to be analyzed in detail to
-find out the reasons for these high delays.) Figure~\ref{fig:publtime}
-shows the overall service publication times as a user experiences the
-process of starting up Tor until a hidden service is available for clients.
+Figure~\ref{fig:publtime} shows the overall service publication times as a
+user experiences the process of starting up Tor until a hidden service is
+available for clients. From the 1,090 measured publication times, the 1\%
+highest values (698, 901, 1214, 1220, 1222, 1224, 1228, 1241, 1290, 3073,
+and 3685 seconds) were considered as outliers and thereby discarded in the
+statistical analysis. (Nevertheless, especially outliers need to be
+analyzed separately, because they are accountable for a substantial amount
+of variance.)
 
 \begin{figure}
 \centering
@@ -215,11 +273,13 @@
 \label{fig:publtime}
 \end{figure}
 
-However, these values consist of at least two parts of which one is
-independent from hidden services: a) initialization time and b) service
-establishment time. Therefore, Figure~\ref{fig:esttime} shows only service
-establishment times. This time only the top two value had to be discarded
-as outliers (362 and 3665 seconds).
+However, these values still contain initialization times, i.e.\ the time
+that Tor needs to download enough directory information to build circuits.
+Figure~\ref{fig:esttime} contains only service establishment times that are
+directly related to establishing a hidden service. This time only the top
+two value had to be discarded as outliers (362 and 3665 seconds). These
+times will later be used to measure improvements of hidden service
+publication.
 
 \begin{figure}
 \centering
@@ -245,8 +305,8 @@
 there are four service establishment times below 30 seconds (13, 18, 27,
 and 28 seconds).
 
-It turned out that the reason for this is a bug in the code which is now
-fixed. See SVN revision r15113 for
+It turned out that the reason for this is a minor bug in the code which is
+now fixed. See SVN revision r15113 for
 details.\footnote{\url{http://archives.seul.org/or/cvs/Jun-2008/msg00231.html}}
 This is not meant as confirmation for the usefulness of the 30-second
 delay, but only to make the implementation consistent with the
@@ -255,9 +315,10 @@
 \paragraph{Slow Decrease of Frequencies}
 
 Figure~\ref{fig:esttime} shows an unexpectedly large number of service
-establishment times of 70 seconds or greater. From the histogram one could
-assume that establishment times would steadily decline from their maximum
-at 50 seconds which however is not the case.
+establishment times of 90 seconds or greater. The histogram exhibits a
+decline of establishment times from their maximum at 50 seconds to a very
+low frequency of values around 90 seconds. But there is still a significant
+number of test cases with establishment times greater than 90.
 
 An in-depth analysis of the log files has revealed an even more severe bug.
 While setting up a hidden service, some valid introduction circuits were
@@ -266,14 +327,15 @@
 SVN revision
 r15149.\footnote{\url{http://archives.seul.org/or/cvs/Jun-2008/msg00268.html}}
 
-Figure~\ref{fig:esttime2} shows service establishment times of a subsequent
-measurement with a bugfixed Tor version. Again, three values were discarded
-as outliers (3665, 3680, and 10931 seconds). These new results already show
-an improvement of mean times and standard deviations.
+This bugfix made it necessary to perform a second set of measurements with
+special focus on service publicaton. Figure~\ref{fig:esttime2} shows
+service establishment times with a bugfixed Tor version. Again, three
+values were discarded as outliers (3665, 3680, and 10931 seconds). The
+histogram features a clearly smoother decline of frequencies than the
+histogram of the first measurements. The mean has decreased to 56.73
+($-6.52$), median to 51.99 ($-6.28$), and standard deviation to 15.82
+($-2.85$).
 
-\emph{TODO: Does the bugfix improve publication times
-\textbf{significantly}?}
-
 \begin{figure}
 \centering
 \includegraphics[width=0.8\textwidth]{esttime2.png}
@@ -290,22 +352,84 @@
 
 \paragraph{Outliers}
 
-The outlier values 362 and 3665 in the original measurements appear to be
-unbelievably high for a service establishment time.
+The outlier values of 3665, 3680, and 10931 seconds in the second set of
+measurements using the bugfixed Tor version appear to be unbelievably high
+for a service establishment time. They further obey a certain pattern, as
+the conversion to hours clarifies: 1:00:04, 1:00:36, and 3:01:26 hours.
 
-\emph{TODO Karsten: What happened there?}
+A closer look into the log files reveals that the delay occurs in all three
+cases after Tor decided to upload a descriptor and up to the moment at
+which it is stored on the directory servers. The explanation for this
+behavior is that Tor attempts to upload a descriptor, fails, and waits for
+the next regular descriptor republication one hour later to perform a
+second upload attempt. As a solution to avoid these outliers, a
+significantly shorter timeout or exception handling routine for failed
+descriptor uploads appears useful.
 
-\paragraph{30-Second Delay}
+\paragraph{Exceeding the 30-Seconds Delay}
 
-%Even though these two bugfixes are probably a good improvement for service
-%publication times and hidden services in general, they prevent a deeper
-%analysis of the reasons for high service establishment times for the time
-%being. The next step will be to perform new measurements with a fixed Tor
-%version and further investigate those results.
+% see test IDs 1213300081565 for 34-seconds delay and 1213347721610 for
+% 41-seconds delay
+The current logic for publishing a hidden service descriptor is to wait for
+30 seconds after the last change to the set of introduction points. Though,
+in some test cases these 30 seconds were exceeded: In one test case, there
+is a 34 seconds gap inbetween two introduction establishments, but without
+an attempt to upload a descriptor. In another case, a descriptor is not
+uploaded until 41 seconds after an introduction point establishment.
 
-\emph{TODO Karsten: How can we show whether the 30-second delay is
-useful or not?}
+An inspection of the log files revealed that the abandoning of an
+introduction point candidate resets the 30-seconds counter, too, even
+though it would have no effect on the uploaded descriptor. Hence, given
+that the 30-seconds delay is a useful means to estimate when a descriptor
+is stable, the 30-seconds counter should only be reset when an actual
+change to the descriptor to be uploaded occurs.
 
+\paragraph{Initial Upload of Empty Descriptors}
+
+There is another effect of the current behavior to reset the 30-seconds
+counter when abandoning an introduction point candidate: In very rare cases
+the 30-seconds counter is initialized by such an abandoning event, but then
+Tor does not succeed in establishing an introduction point within the next
+30 seconds. The result is the upload of an empty hidden service descriptor,
+i.e.\ a descriptor containing zero introduction points. Under certain
+circumstances this might be considered an improvement for clients if the
+most recent descriptor originates from a previous run of Tor and is
+entirely wrong. However, in most cases it would be sufficient and even more
+useful to wait for the construction of a non-empty descriptor before
+uploading.
+
+\paragraph{Failing Introduction Point Establishments}
+
+% bug was introduced in r12900, fix should occur in rendservice.c somewhere
+% around line 791
+In some cases Tor establishes up to 13 introduction points before uploading
+the first descriptor. It appears rather unlikely, that 10 out of 13 circuit
+establishment fail. In fact, it turned out that there is another major bug
+in Tor that completely ignores introduction points when they were
+established by cannibalizing an existing circuit. Even though there may in
+general be other reasons for failing establishment of a circuit, this bug
+was the reason for all 10 failed establishments in the considered case.
+Fixing this bug will probably significantly speed up and reduce variance in
+service publication time.
+
+\paragraph{Usefulness of 30-Seconds Delay}
+
+The 30-seconds delay before publishing a descriptor unsurprisingly leads to
+a lower bound of service establishment time of 30 seconds. However, these
+30 seconds originally emerged from a guess of the developers. There is no
+proof that a shorter period might also suffice to upload a stable descriptor,
+i.e.\ one that does not have to be replaced shortly afterwards. The other
+way around it has not yet been evaluated whether a longer delay might
+significantly improve stability.
+
+One of the next steps in the attempt to speed up service publication is to
+measure successful introduction establishments during the first half hour
+(not stopping at the first descriptor publication). From these times one
+can determine the number of necessary descriptor publications for different
+delays than 30 seconds. Further, it might make sense to adapt the delay
+depending on the number of previously published descriptors or the extent
+of changes to the last published descriptor.
+
 \subsection{Connection Establishment}
 
 \begin{figure}
@@ -342,10 +466,11 @@
 \label{fig:restime}
 \end{figure}
 
-\emph{TODO Steven: Any insights from variable correlations here?}
-
 \emph{TODO Karsten: What to do with the results in
-Figures~\ref{fig:reqtime} and \ref{fig:restime}?}
+Figures~\ref{fig:reqtime} and \ref{fig:restime}? are messages ordered? what
+about the phenomenon of large response times and small request times for
+the same connection? Any insights from variable correlations here (add
+diagram)?}
 
 \subsection{Connection Durability}
 
@@ -387,7 +512,13 @@
 Ideas what changes are most likely to improve the overall performance.
 
 \begin{itemize}
-\item Fix more bugs and produce less new ones.
+\item Handle upload failures for each directory separately. (See Outliers)
+\item Restrict resetting 30 seconds delay to events that have an effect of
+the uploaded descriptor only.
+\item Prevent upload of empty descriptors if we can upload a non-empty one
+shortly after.
+\item Fix bug that completely ignores introduction points originating from
+cannibalized circuits.
 \item Think about 30-seconds delay when publishing descriptor.
 \item $\cdots$
 \end{itemize}