[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-dev] Proposal xyz : Count Unique IP addresses in an anonymous way



On Tue, 21 Mar 2017 16:06:59 +0000, Jaskaran Singh wrote:
...
> > On the other hand side you can indeed keep the filter rather small
> > because one bridge doesn't get that many collisions, and you don't
> > need to make it anywhere as big as to avoid collision with 2^32 entries.
> > Could also be dynamically sized depending on the number of clients seen
> > - you need aging anyway, so the next table can have a different size.
> > 
> I feel that this isn't a nice solution. Suppose you have 10 cells and 3
> hash functions at the beginning. Later when inputs exceed a threshold,
> you increase the size of bloom filter to make it to 20 cells.

10 is way too low for any population (if 'cell' means 'bit');
I played with a bit of code for that.

> 3 hash functions would map to the whole range which means the inputs
> that were mapped to 10 cells would now map to something completely
> different. Hence, error rate would be, I guess exactly 100%.

No, basically you need to retain the old bloom filter while seeding the
new one for the amount of time of your entry timeout.

(And given the other discussion regarding brute-forcing the 2^32 space,
either you need a really time-consuming hash, or you count on the
pre-seeding with random IP addresses as the only means to cover the
real ones.)

I wonder what would happen if you implement the aging by just
randomly clearing bits in the bloom filter at an appropriate rate.

Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800
_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev