
Re: [tor-dev] Revisiting prop224 client authorization



teor wrote:
> 
>> On 3 Nov. 2016, at 10:37, s7r <s7r@xxxxxxxxxx> wrote:
>>
>> I am very happy with the torspec patch.
>>
>> Not quoting entirely, only want to add something wrt randomizing the
>> value for fake clients based on David's and teor's comments:
>>
>> David Goulet wrote:
>> [SNIP]
>>>
>>> - I think "superencrypted" -> "super-encrypted" would be nicer as everything
>>>  in the descriptor has that separation of words. Or even "client-encrypted" if
>>>  we want to add extra semantics. No strong opinion apart from the "-" :).
>>>
>>
>> I prefer super-encrypted vs. client-encrypted.
>>
>>> - [XXX consider randomization of the value 16]
>>>
>>>  If it's fixed, we basically create bucket so a client can know that there
>>>  are 0-16 clients or 16-32 clients and so on.
>>>
>>>  If we randomize that value and let's say it's 7 then we have bucket of 7. If
>>>  that value is randomized _every_ new descriptor, we create multiple size of
>>>  buckets but over time someone could deduce (maybe) the low bound of clients
>>>  by observing all random values and thus assume there are 0-<low bound>.
>>>
>>>  I'm uncertain here what's best but seems that in any case, bucketing is
>>>  happening as we pad with fake "auth-client". So I would assume here, out of
>>>  my head to be safe, that we might want _all_ services to kind of look the
>>>  same thus a fixed value would make sense following that train of thought.
>>>
>>> I'm liking the rest here! We'll have to think also on some padding in the
>>> INTRODUCE1 cell to avoid leaking client auth is being used.
>>>
>>
>> This is true, we create buckets no matter what, but I think it's better
>> if one has to watch a hidden service for a lot more time to determine
>> the probable number rather than being able to tell from the first
>> descriptor that there are 0-16 clients, 16-32 clients and so on.
>>
>> I fully agree that randomizing _every_ new descriptor does not help and
>> probably in short time someone could deduce a possible number, but I am
>> slightly uncomfortable with a global fixed value for this. One more
>> idea, if it's not helpful we can just go ahead with a fixed value of 16.
>>
>> I think it's better if we pick a random number between 8 and 32 fake
>> clients and remember the picked value so it will be used for every new
>> descriptor until something in our setup changes or enough time has
>> passed. In order to know when to reset it, we save it (in our state)
>> along with:
>> 1. The number of real authorized clients when the random value was picked.
>> 2. Timestamp when the random value was picked + an end of life for the
>> random value.
>>
>> We reset the random value of fake authorized clients and also its end of
>> life when:
>>
>> a) number of real authorized clients in torrc changes from what we have
>> in our state.
>> b) end of life for the random value is reached. End of life will be
>> timestamp + a random period between 30 and 90 days.
>> c) obvious case when Tor is re-installed and old state is lost.
>>
>> We call this function on every HUP and (re)start. We can tune the
>> numbers 8 - 32 and period 30 - 90 days as you like.
>>
>> This way there are a lot of buckets and significantly more time needed
>> for an observer to deduce a probable number. It is quite possible one
>> can never deduce a "probable enough" number.
>>
>> We combine this with faking extra if needed in the encrypted portion to
>> the next multiple of 10k bytes.
>>
>> It's true that it won't help if the hidden service operator changes the
>> number of authorized clients every hour for a long period but in
>> practice this doesn't happen - number of authorized clients changes
>> rarely. And even in this scenario it still makes things a lot more
>> confusing.
>>
>> Compared to other parts of prop 224, this is easy to code and should be
>> worth the effort. What do you think?
> 
> If you want to do it this way, with noise and buckets, ask someone who is
> good at differential privacy to do the numbers for you, rather than guessing.
> 
> You'll need to know the level of activity you want to hide.
> 
> T
> 

As I said, the numbers can be changed - I was only illustrating an example.
I guessed some numbers that seemed reasonable to me so I could give an
example, and also because it's not a critical part. We only try to hide
the number of real authorized clients, or make it as hard as possible
for an observer to deduce a number close to the real number of
authorized clients, that's all.
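
To show what the fixed value gives away, here is a rough Python sketch of
padding with a fixed bucket of 16 and of what an observer can read off a
single descriptor. The function names are made up for this mail, and
showing one full bucket when there are no real clients is my assumption,
not something taken from the torspec patch.

```python
def padded_count(real_clients, bucket=16):
    """Total auth-client entries after padding up to a multiple of `bucket`."""
    # Ceiling division; at least one full bucket is always shown
    # (assumption: a service with no real clients still lists `bucket` entries).
    buckets = max(1, -(-real_clients // bucket))
    return buckets * bucket


def observer_range(total, bucket=16):
    """Range of real client counts consistent with `total` visible entries."""
    low = 0 if total == bucket else total - bucket + 1
    return low, total
```

With a global fixed bucket, one descriptor is enough to place the service
in a range such as 0-16 or 17-32.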

I think that even guessed numbers, picked without deep knowledge of
differential privacy, are a lot better than a global fixed value of 16.
But as I said, this doesn't need to be a debate: I am not against the
fixed value, I am only saying it's better to randomize if a workable
solution exists.
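
To make the example from my previous mail concrete, here is a minimal
Python sketch of the "pick, remember, reset" logic. Everything in it is
illustrative: the names FakeClientState, pick_state and load_or_reset are
made up, the 8-32 and 30-90 day ranges are the guessed example numbers,
and drawing the lifetime in whole days is my simplification. A real
implementation would use Tor's own CSPRNG and state file rather than
Python's random module.

```python
import random
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds


@dataclass
class FakeClientState:
    fake_value: int     # remembered random value (8..32 in the example)
    real_clients: int   # number of real authorized clients when picked
    picked_at: float    # timestamp when the value was picked
    expires_at: float   # picked_at + random end of life (30..90 days)


def pick_state(real_clients, now, rng):
    """Pick a fresh random value and a fresh end of life."""
    lifetime = rng.randint(30, 90) * DAY
    return FakeClientState(rng.randint(8, 32), real_clients, now, now + lifetime)


def load_or_reset(state, real_clients, now, rng):
    """Meant to be called on every HUP and (re)start."""
    if state is None:                       # (c) old state is lost
        return pick_state(real_clients, now, rng)
    if state.real_clients != real_clients:  # (a) torrc client count changed
        return pick_state(real_clients, now, rng)
    if now >= state.expires_at:             # (b) end of life reached
        return pick_state(real_clients, now, rng)
    return state                            # keep the remembered value
```

For example, load_or_reset(None, 3, time.time(), random.SystemRandom())
would create the initial state, and later calls with the same client
count return it unchanged until the end of life passes.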
