[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: Anonymity-preserving collection of usage data of a hidden service authoritative directory



On Sun, Apr 29, 2007 at 05:09:55PM +0200, Karsten Loesing wrote:
>     http://88.84.144.63/hsusage-patch
> 
> It is my first C code, so when reviewing it, could you please tell me
> whatever I can do better the next time? I testet my code (of course
> using PuppeTor :) ), but maybe there is still some storage leak or
> whatever. Thanks!

Looks great as a first step. But since you ask :), here are some code
comments for a revision of the patch:

a) We should enable this only when a certain config option is set. Nick
suggests the name "HSAuthorityRecordStats". Adding a new config option
is easy -- look at http://archives.seul.org/or/cvs/Sep-2006/msg00015.html
for the basic two steps, and then the optional third step is to look at
options_validate() and add a line like
  if (options->HSAuthorityRecordStats && !options->HSAuthoritativeDir)
    REJECT("HSAuthorityRecordStats is set but we're not running as "
           "a hidden service authority.");
And then it needs an entry in the man page.

b) It needs a free_all() style function. See tor_free_all() in main.c,
and you can model it like rep_hist_free_all() if you want. The goal is
to clean up all your stuff on shutdown, so it's easier to run tools to
detect leaks.

c) We prefer /* comments */ rather than // comments, since we're aiming
to keep C compliant.

d) rend_store_desc() can be called from two different places in
directory.c -- one if it's a newly published descriptor, and one if
we're a HS client fetching a descriptor. I would suggest a new
argument to rend_store_desc called "published" which is 1 if it was
just published to us and 0 otherwise. Then you can do the right thing
inside rend_store_desc.

e) in main.c, use a #define for your "900", e.g. WRITE_HSUSAGE_INTERVAL,
so it fits with the other FOO_INTERVALs.

f) s/occurences/occurrences/g/. And then consider
s/occurrences/count/g. :)

g) Make your patch not complain when you build with './configure
--enable-gcc-warnings', or when you run 'make check-spaces'. I didn't
try the gcc-warnings, but I noticed a few places that check-spaces ought
to complain.

h) When comments for structs start with "A structure for", you can
usually take out that phrase and it will be just as good.

i) In hs_usage_note_publish_total() et al, you can just use time(NULL)
as an arg to hs_usage_add_service_related_observation() rather than
keeping a separate variable for it. In fact, it is probably best to
hand the time in as an arg to hs_usage_note_publish_total() -- the goal
is to make individual functions as independent as possible of their
calling environment, so later we could write some unit tests with these
functions, *tell* them what time they should think it is, and make sure
they behave correctly.

j) Your variables are uint64_t's just in case we get more than 4 billion
hidden service hits or updates in a 15 minute period? That's more than
4 million per second. I guess it doesn't hurt since this code is seldom
used, so it's fine to keep it there. But wow. :)

k) A minor issue -- in your comparisons toward the end of rephist,
e.g. "now>current_period->start_of_next_period", we try to add spaces
on either side of the >, because > and -> can look quite similar.

l) hs_usage_insert_value() makes me uncomfortable. For example, your
"create first elem" and "append to end" codeblocks are nearly the
same. But again, this stuff isn't called much, so it's just a question
of code bloat and readability.

> The refined specification and some implementation notes are at the
> bottom of this mail.

These look really good.

> Btw: When reading the code I found the command-line option
> "--ignore-missing-torrc" that I didn't find in the docs.

Yeah, we don't document our commandline switches like this very
well. This one was only added recently. Where are the rest of
them documented? :) We should add this one.

Thanks!
--Roger