Re: [tor-bugs] #4957 [Metrics Data Processor]: Decide how to sanitize pluggable transport lines in bridge descriptors
#4957: Decide how to sanitize pluggable transport lines in bridge descriptors
------------------------------------+---------------------------------------
Reporter: karsten | Owner: karsten
Type: task | Status: new
Priority: normal | Milestone:
Component: Metrics Data Processor | Version:
Keywords: | Parent:
Points: | Actualpoints:
------------------------------------+---------------------------------------
Comment(by karsten):
Replying to [comment:2 asn]:
> Transport lines will look like this:
> {{{
> transport SP <methodname> SP <address:port> [SP arglist] NL
> }}}
> and there is also an optional field for supplemental data:
> {{{
> transport-info SP <methodname> [SP arglist] NL
> }}}
> I think you can ignore `transport-info` for now since it's not
implemented and there are no transports that need it yet.
So, it looks like the contents of `transport-info` lines will be no more
sensitive than the `[SP arglist]` part of `transport` lines, right? If we
want to keep `[SP arglist]` in `transport` lines, we might as well keep
`transport-info` lines, even if they're not in use yet.
> As far as sanitization is concerned, I'm not sure which approach is
better. I'm also not completely sure how bridge descriptors are used; I
assume they are used when analyzing bridge stats, and when a user wants to
look at the descriptor of her bridge in atlas. Are there other use cases?
Those are the two major use cases. I'm mainly interested in the bridge
stats part, though. It would be good to see how widely the different
transports are deployed and maybe be able to infer which of them are
blocked or not.
> Some sanitization approaches:
>
> a) No sanitization. Pluggable transports and their ports are disclosed to
people who know a bridge.
Note that everyone can learn the contents of sanitized bridge descriptors
by downloading the tarballs or rsync'ing them from metrics. It's not just
people who know a bridge who'll receive the sanitized descriptors.
If this a) includes leaving in the `address` part, I disagree. We should
sanitize the `address` part in the same way that we sanitize bridge IP
addresses. We can probably leave the `port` part in, because it ''might''
give us some hints whether a specific port works better than other ports
for a given transport.
What does the `arglist` tell us that would be useful for statistical
analysis? There are no shared secrets in that line, are there? If we
take out the `arglist` part, I think we are effectively deciding against keeping
`transport-info` lines in the future, because their only purpose seems to
be to add another `arglist` to an existing transport.
> b) Sanitization. Only display whether the bridge supports pluggable
transports or not. Or maybe the number of transports it supports. Or maybe
something else.
The simple fact that a bridge supports pluggable transports or the number
of supported transports seems hardly useful for statistical analysis.
What we ''could'' do is only keep `transport SP <methodname>` for each
transport that a bridge supports. But I don't see yet how the sanitized
address and (non-sanitized) port are sensitive information that we'd have
to remove.
> c) Paranoia. '''Don't''' display any pluggable transport-related
information.
That's bad, because we should come up with ''some'' stats to show how
successful pluggable transports are, if we can.
> If I were to select one I would probably go with a). It's good both for
analysis and for users who want to know more about their bridges.
I agree.
> I'm also not sold by the use case of a bridge operator who supports
multiple transports, has a public bridge, and wants to hide some of her
transports from her users. However, Tor users have many different use
cases and I only know of a few, so if others think that b) or c) (or d))
are more reasonable (or support a larger range of use cases) I'm OK with
it.
Okay. Here's what I'm going to do, unless you or somebody else tells me
it's a bad idea:
- Sanitize `transport` lines by sanitizing the `address` part similar to
how we sanitize other addresses and keeping the rest of the line
unchanged.
- Leave in `transport-info` lines without changing them at all.
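To make the two rules above concrete, here is a rough sketch in Python. It is only an illustration, not the actual sanitizer: `sanitize_address`, `sanitize_line`, and `SECRET` are hypothetical names, and the keyed-hash mapping into `10.x.x.x` is just a placeholder for whatever we already do for bridge IP addresses. It also assumes the `<address:port>` field can be split at its last colon.

```python
import hashlib
import hmac

# Hypothetical secret; a real sanitizer would manage this separately.
SECRET = b"hypothetical-secret"


def sanitize_address(address: str, secret: bytes = SECRET) -> str:
    """Placeholder address sanitization (an assumption, not the real scheme).

    Maps any address string to a stable 10.x.x.x value using a keyed hash,
    so the same input always yields the same sanitized output.
    """
    digest = hmac.new(secret, address.encode("ascii"), hashlib.sha256).digest()
    return "10.%d.%d.%d" % (digest[0], digest[1], digest[2])


def sanitize_line(line: str) -> str:
    """Sanitize one descriptor line according to the two rules above.

    - transport lines: replace only the address part, keep the method name,
      the port, and any arglist unchanged.
    - transport-info lines (and anything else): returned unchanged.
    """
    parts = line.rstrip("\n").split(" ")
    if parts[0] == "transport" and len(parts) >= 3:
        # Split "<address:port>" at the last colon so the port survives.
        address, sep, port = parts[2].rpartition(":")
        if sep:
            parts[2] = sanitize_address(address) + ":" + port
    return " ".join(parts)
```

For example, `sanitize_line("transport obfs2 192.0.2.1:443 shared=foo")` would keep `transport`, `obfs2`, `:443`, and `shared=foo`, and only swap out the address, while a `transport-info` line passes through untouched.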
Does that make sense? (Thanks!)
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/4957#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs