Re: [tor-bugs] #1839 [BridgeDB]: Rotate available bridges over time
#1839: Rotate available bridges over time
-----------------------------+-------------------------------------------
Reporter: arma | Owner: isis
Type: enhancement | Status: needs_review
Priority: blocker | Milestone:
Component: BridgeDB | Version:
Resolution: | Keywords: bridgedb-dist, bridgedb-0.3.2
Actual Points: | Parent ID:
Points: |
-----------------------------+-------------------------------------------
Comment (by isis):
Some IRC logs, because they started off with arma reviewing the design of
this ticket's implementation and recent changes in #4771, and then drifted
to other future BridgeDB/Metrics related tasks.
{{{
05:50 armadev | i was looking at #1839
05:50 -zwiebelbot:#tor-dev- tor#1839: Rotate available bridges over time -
https://bugs.torproject.org/1839
05:50 isis@ | oh great, thanks
05:50 armadev | i need to look more at the plan there, but i
continue to think that the strategy of "don't let an attacker learn very
many bridges in a given time period no matter how much effort they
put in" is a good one
05:50 armadev | but that made me remember another thing i was
wanting us to look at
05:51 armadev | which is #10 on
https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges
05:51 armadev | right now, we have a hash ring design, where the
"address" of the requestor maps to a point on the hash ring, and we give
them the next k bridges?
05:51 * coderman_ wants more redteam arma blog
05:52 armadev | so that naturally will lead somebody who can
attack some points in this ring to learn all the rest of the bridges if
they do this attack
05:52 armadev | whereas we could imagine something other than a
hash ring, or rather, using the ring differently, to make it so all
bridges map to a small closed cycle
05:52 armadev | i haven't thought through the details and maybe
it cannot be made to work easily, but i wanted to raise the topic again.
05:55 isis@ | so the alternative that i could do would be to
have "consistent" hashrings, which is something used usually in backend
systems for data replication, where if the number of duplicates N=3, then
you end up with the resource placed into three buckets, spaced as evenly
as possible around the main hashring
05:56 isis@ | with a replication level of N=1, this would
result in each bridge being in its own little subgroup, and no others
05:57 isis@ | BridgeDB kind of has like four classes which try
to implement this concept, and IMO they all do a really bad job, with code
duplication, unused code, and half-implemented stuff all over
the place
05:57 isis@ | i am finishing up my branch which cleans up all
the hashring code today, it is for #12505
05:57 -zwiebelbot:#tor-dev- tor#12505: Refactor Bridges.py and Dist.py in
BridgeDB - https://bugs.torproject.org/12505
05:59 isis@ | anyway, if we did this, then we could easily say
"every distributor gets one main consistent hashring which is split into X
subhashrings. depending on what week it is, only one of those subhashrings
is available." then, in the subhashring, rotate the clients around the
ring with a different frequency
06:00 isis@ | does that sound like it would solve #10?
06:00 armadev | but it's still a ring. this is good for the
"change what you're giving out over time" feature, but no, i think it
doesn't address #10.
06:00 armadev | the issue is that my address maps to bridges 5,
6, and 7
06:00 armadev | and your address maps to bridges 7, 8, and 9
06:01 armadev | so if the adversary sees your behavior, it
learns about 7, and from 7 it learns about me, and then it learns about 5
and 6
06:01 armadev | it would be better, for #10, if every address
maps to a trio of bridges that are the same trio that other people get
when they're mapped there
06:01 armadev | rather than these partially overlapping sets
that we do now.
06:02 isis@ | ah, i see
06:02 isis@ | yes, the overlap has also bothered me, but i
wasn't thinking of the zig-zag problem
06:02 armadev | not sure this one needs to be solved now
06:02 isis@ | hmm. the overlap is a much more difficult one to
solve
06:02 armadev | and i think "different bridges at different
times" is a more important topic to do
06:03 armadev | the blog post describes a potential solution.
06:04 armadev | but 'more work remains' before that solution
will actually do what we want.
06:04 armadev | it's the sort of thing we should write up as a
math problem for somebody's grad class, and then sit back and wait
06:04 isis@ | the overlap is also even harder to solve
because, when BridgeDB parses new descriptors, it rebuilds all the
hashrings entirely, causing rings to add and lose bridges. however, for
the bridges which remain, their place in the hashring remains the same.
06:04 armadev | rather than get caught up in ourselves
06:05 armadev | huh. yeah.
06:05 armadev | which leads me to another topic that we should
be pondering:
06:05 armadev | all of these steps we take to make it less
likely for an attacker to Get All The Bridges lead to more bridges going
unused for some time periods
06:05 armadev | we should think about ways to tell the bridge
operator when they're in action, and when they're in reserve
06:05 armadev | so we can reassure them that being in reserve is
a great and valuable role.
06:06 armadev | (some people run a bridge for a day, then stop.
if they were in reserve the whole time, technically speaking, that wasn't
a great and valuable role after all.)
06:06 isis@ | so, e.g. if you ask for bridges right now
(without #1839 deployed) and you get bridges A, B, and C, and then
BridgeDB reparses and rebuilds, and B goes offline, then three hours later
you ask for more bridges, you'll likely get bridges A, B, and D
06:07 armadev | perhaps you mean C goes offline?
06:07 armadev | otherwise, this sounds bad :)
06:08 isis@ | oh yeah. that. :)
06:08 armadev | hey, it's bridgedb, you never know
06:09 isis@ | haha, the thing is becoming a tiny bit more
well-behaved now
06:09 isis@ | just a tiny bit
06:09 isis@ | i think you once had a ticket for designing some
bridge statistics interface for BridgeDB…
06:09 * isis is looking for it
06:09 isis@ | #7877
06:09 -zwiebelbot:#tor-dev- tor#7877: Web interface for looking up bridge
status? - https://bugs.torproject.org/7877
06:09 isis@ | why did that never happen?
06:10 isis@ | do we still want that to happen?
06:10 Yawning | hm
06:10 isis@ | or do we consider Globe to solve that problem?
06:10 armadev | didn't we do something related to #7877 in
globe?
06:11 armadev | except, i vaguely remember hearing from karsten
that he decided to drop that data point from the globe interface, because
i-don't-remember-why
06:11 armadev | it does seem a bit silly for bridgedb to grow a
new interface for users,
06:12 armadev | when it's already exporting stuff to globe and
globe is already an interface for users
06:12 armadev | but it might be wise for us to export a bit more
stuff from bridgedb to globe, so it can give that stuff to users
06:12 isis@ | once the database stuff for prop#226 is merged,
we get a pretty neat structure to build statistics gathering and analysis
tools on top of
06:12 -zwiebelbot:#tor-dev- Prop#226: "Scalability and Stability
Improvements to BridgeDB: Switching to a Distributed Database System and
RDBMS" [OPEN]
06:12 armadev | and i guess, step zero is for globe to resume
giving out that info at all
06:13 isis@ | yeah, i suppose i could also more easily support
giving the metrics server access to certain queries, so that it benefits
from BridgeDB keeping state and all
06:14 isis@ | plus then metrics wouldn't have to do a bunch of
crazy reparsing and recalculation of any things which bridgedb already
does
06:16 armadev | yeah, hm, the 'pool assignment' entry on globe
appears empty
06:16 armadev | for e.g.
https://globe.torproject.org/#/bridge/1513028CD43BD34798D829719D76E6EC3F5391CA
06:17 armadev | #13921
06:17 isis@ | yeah, see #13921
06:17 -zwiebelbot:#tor-dev- tor#13921: Remove "bridge pool assignment" UI
element from Atlas/Globe - https://bugs.torproject.org/13921
06:17 isis@ | which replaces it in Globe with the `transport`
field instead
06:17 isis@ | showing which transports a bridge currently
supports
06:17 armadev | well, great, but that removes the thing i was
just talking about where we give feedback to the user about whether her
bridge is in action or what
06:18 armadev | which i think will become even more important
with #1839
06:22 isis@ | armadev: well, right, but then we should
probably do either #2755 or…
06:22 -zwiebelbot:#tor-dev- tor#2755: Reconsider BridgeDB's pool
assignment file implementation and deployment -
https://bugs.torproject.org/2755
06:22 * isis can't find the other ticket
06:23 isis@ | i had a ticket that was for adding somewhere in
the bridge-extrainfo descriptor a line like `BridgeDistribution 0` or
`BridgeDistribution https`
06:23 armadev | isis: to me #2755 is more about documenting how
bridges were given out over the past, so we can match up load and blocking
measurements with distribution to find patterns.
06:33 * isis found the torrc `BridgeDistribution
https` tickets, they are #13727 and #13504
06:33 -zwiebelbot:#tor-dev- tor#13727: BridgeDB should not distribute Tor
Browser's default bridges - https://bugs.torproject.org/13727
06:33 -zwiebelbot:#tor-dev- tor#13504: Bridges in Tor Browser Bundles
should be public so that we have metrics on them -
https://bugs.torproject.org/13504
07:03 armadev | isis: so in summary (there are a lot of
tickets), where are we at with the goals of remembering how we gave out
bridges at which time, so we can use that to study the effectiveness of
bridge distribution strategies in the past? and where are we at
communicating to the operator what strategies we've used recently to give
out her bridge?
07:08 isis | currently, there is a pile of assignments.log
files which continued to be produced and never got synced to Metrics
07:09 isis | i could do #2755 soon, and ask karsten to allow
BridgeDB to start syncing to Metrics again
07:09 -zwiebelbot:#tor-dev- tor#2755: Reconsider BridgeDB's pool
assignment file implementation and deployment -
https://bugs.torproject.org/2755
07:09 * karsten looks at #2755
07:10 isis | or, if karsten likes, i can provide an interface
to BridgeDB's newer databases, so that the Metrics server can obtain data
without additional processing/storage
07:10 karsten | isis: or should we think about better usage
statistics here?
07:11 karsten | well, Metrics has only data that is archived by
CollecTor.
07:11 isis | sure, that sounds better than a string that
likely has no meaning to most operators
07:11 karsten | we could come up with better stats that are
collected by CollecTor and then displayed/processed by Metrics and/or
Onionoo.
07:11 armadev | if there is historical
how-we-distributed-it-when data that we have but we're not keeping, that's
a bit sad
07:12 isis | i have #14453 and #10218 which are along those
lines
07:12 karsten | you mean past assignment.log files?
07:12 -zwiebelbot:#tor-dev- tor#14453: Implement statistics gathering for
number of Bridges-per-Transport in BridgeDB -
https://bugs.torproject.org/14453
07:12 -zwiebelbot:#tor-dev- tor#10218: Provide
"users-per-transport-per-country" statistics for obfsbridges -
https://bugs.torproject.org/10218
07:12 armadev | (though ideally there is how-it-got-blocked-when
data somewhere out there, that we are not collecting and not keeping, and
we'd ideally like to have both.)
07:12 isis | karsten: yes, i have some past assignments.log
files
07:13 armadev | karsten: i think i don't mean statistics
summaries, but rather, the underlying data.
07:13 armadev | the sort of thing that researchers are going to
want, a year from now, when they ask how that blocking event happened and
which bridges it affected.
07:13 karsten | we could convert existing logs into the new
format.
07:14 karsten | the yet-to-be-designed format.
07:15 isis | the assignments.log files, did Metrics used to
sanitise them by replacing the fingerprints with hashed fingerprints?
07:15 karsten | isis: it seems #10218 is for little-t-tor, not
bridgedb.
07:15 karsten | yes, that's what it did. and it sorted them by
hashed fingerprint, so that the order didn't reveal anything.
07:16 karsten | maybe more.
07:16 isis | steps like those are something that BridgeDB
could easily do to begin with, if it would make the processing less
intense
07:16 karsten | that's right.
07:17 karsten | it totally should do those steps.
07:18 isis | and BridgeDB is parsing all the bridges into
stem classes anyway, and is going to store them as json in couchDB, if
that json is something more accessible
07:19 karsten | json is easier than inventing our own data
format, yes.
07:19 karsten | we still need to think what to put into the json
though.
07:19 Yawning | xmllllllll
07:19 karsten | xml in json, ok.
07:19 Yawning | :D
07:19 isis | one idea i had earlier was to allow collecTor to
have certain queries on the new database (or the output of the query and
some processing) for whatever statistics we wish to extract
07:21 karsten | ideally, collector would fetch a thing every
hour or so, verify it, and store it.
07:21 Yawning | armadev: would #15515 count as what you want out
of the defense at the intro point?
07:21 -zwiebelbot:#tor-dev- tor#15515: Don't allow multiple INTRODUCE1s on
the same circuit - https://bugs.torproject.org/15515
07:21 Yawning | or do you want something more sophisticated?
07:21 isis | karsten: i was just going to put everything in
the json, that way BridgeDB could do cooler stuff with detecting when
certain fields have changed
07:22 isis | karsten: verify means verify the descriptor
signatures?
07:22 karsten | ah, mostly that it's valid json and contains
certain required fields.
07:22 karsten | I think.
07:23 karsten | not sure about putting in everything, including
things that are already contained elsewhere,
07:23 karsten | but it might be possible to remove certain
fields while exporting to collector.
07:23 isis | i planned on writing protobufs to define what
data was valid for BridgeDB to be exporting
07:24 karsten | okay, happy to learn what exactly that means. :)
07:24 Yawning | it's google's serialization format
07:24 isis |
https://developers.google.com/protocol-buffers/docs/overview
07:24 Yawning | you feed a definition file into a code generator
and it outputs code that marshals/demarshalls stuffs
07:25 karsten | nice, ok.
07:25 isis | basically, i write a .proto file and it
generates python, java, c, and/or go
07:25 Yawning | https://capnproto.org/
07:25 Yawning | see also
07:25 Yawning | which is protobufs redesigned by the author
after he left google
07:25 Yawning | haven't used it, claims to be better
07:26 isis | lol, i can't tell if they are joking
07:26 isis | "∞% faster!!"
07:27 Yawning | heh
07:27 Yawning | if you read on they clarify what they mean
07:28 Yawning | ymmv, protobufs is a fine format to use
07:28 Yawning | and this thing may eat all ur dataz
07:28 karsten | isis: okay, want to start a list of things to
put into that json that are safe to be collected and published by
collector?
07:28 * isis totally thinks they are joking at
"Time-traveling RPC"
07:29 karsten | isis: and is this something for a tor proposal?
07:29 isis | but it is interesting and if they are not
totally lying their pants off, then SUBSCRIBE
07:29 Yawning | isis: the idea they're doing is actually p clever
07:29 karsten | isis: or as addition to bridgedb-spec (if that
exists)?
07:30 karsten | oh, yes, it still exists, I helped write it..
07:30 isis | karsten: currently, there is no proposal for
better bridge statistics
07:30 isis | although we could start one
07:30 isis | and bridgedb-spec.txt lives in the top-level of
tor-spec.git now
07:30 karsten | oh, nice.
07:30 karsten | you mean bridgedb has changed since Date: Fri
Jul 5 01:40:49 2013 +0000
07:31 karsten | I should git pull..
07:31 isis | hah, it's almost entirely rewritten
07:31 isis | i am finishing the final refactorings now
07:32 isis | which is why i will have time and ability to do
cool stuff, like the social distributor
07:32 isis | (and better bridge metrics, if we want that)
07:34 karsten | isis: just let me know if I can help with the
stats side of things. it would be useful for bridge operators
(onionoo/atlas/globe) and for sponsors (metrics).
07:35 isis | karsten: is that sponsor S, or sponsors in
general
07:35 karsten | sponsors in general. I don't know what S wants.
07:35 isis | karsten: ok, i will start making a proposal, and
ask you to review it
07:36 karsten | sounds great!
}}}
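
To make the overlap ("zig-zag") problem from the 06:00-06:02 exchange concrete, here is a minimal sketch contrasting the sliding "next k bridges" lookup with one possible way of handing out fixed, non-overlapping trios. This is an illustration only: it is not BridgeDB's code and not the solution described in the blog post, and all function names (`ring_position`, `build_ring`, `next_k`, `fixed_group`) are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_position(value):
    """Map a string (bridge name or client address) to a point on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


def build_ring(bridges):
    """Return the bridges sorted by their position on the ring."""
    return sorted(bridges, key=ring_position)


def _start_index(ring, address):
    """Index of the first bridge at or after the address's ring position."""
    positions = [ring_position(b) for b in ring]
    return bisect_left(positions, ring_position(address)) % len(ring)


def next_k(ring, address, k=3):
    """Sliding window: the k bridges clockwise from the address.

    Two nearby addresses get partially overlapping answers, which is what
    lets an adversary walk from one client's bridges to another's.
    """
    start = _start_index(ring, address)
    return [ring[(start + i) % len(ring)] for i in range(k)]


def fixed_group(ring, address, k=3):
    """Disjoint groups (hypothetical): snap the start index down to a
    multiple of k, so every address landing in a group gets the same set.

    If len(ring) is not a multiple of k, the last group is smaller.
    """
    start = _start_index(ring, address)
    start -= start % k
    return ring[start:start + k]
```

With `fixed_group`, learning one bridge reveals at most the other members of its own group rather than a chain of neighbours. The sketch defines groups by index, so the boundaries shift whenever the ring is rebuilt with bridges added or removed, which is the difficulty raised at 06:04; it does not attempt to solve that.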
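
The weekly rotation proposed at 05:59 ("only one of those subhashrings is available") could be sketched roughly as follows. This assumes bridges are assigned to a sub-hashring by hashing their fingerprint and that the period is exactly one week measured from the Unix epoch; both are assumptions for illustration, and the function names are hypothetical rather than anything in BridgeDB. The second part of the proposal, rotating clients within the active sub-hashring at a different frequency, is not covered.

```python
import hashlib

WEEK_SECONDS = 7 * 24 * 3600


def subring_for_bridge(fingerprint, num_subrings):
    """Assign a bridge to one sub-hashring based on its fingerprint."""
    digest = hashlib.sha256(fingerprint.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_subrings


def active_subring(unix_time, num_subrings, period_seconds=WEEK_SECONDS):
    """Index of the sub-hashring that is available during this period."""
    return (unix_time // period_seconds) % num_subrings


def available_bridges(fingerprints, unix_time, num_subrings):
    """Bridges a distributor may hand out at the given time."""
    active = active_subring(unix_time, num_subrings)
    return [fp for fp in fingerprints
            if subring_for_bridge(fp, num_subrings) == active]
```

Because the assignment depends only on the fingerprint, a bridge stays in the same sub-hashring across rebuilds, and every bridge comes up for distribution once per full cycle of `num_subrings` periods. The rest of the time it is "in reserve", which is the situation armadev wants to be able to report to operators.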
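
The sanitising steps karsten confirms at 07:15 (replace fingerprints with hashed fingerprints, then sort by hashed fingerprint so the order reveals nothing) might look something like this if BridgeDB did them itself. The line layout (`<fingerprint> <rest of line>`) and the choice of SHA-1 over the decoded fingerprint bytes are assumptions made for this sketch, not a statement of the actual assignments.log or Metrics formats; the function names are hypothetical.

```python
import hashlib


def hash_fingerprint(fingerprint_hex):
    """Return the uppercase hex SHA-1 of the decoded fingerprint bytes.

    The hash function and encoding are assumptions for this sketch.
    """
    return hashlib.sha1(bytes.fromhex(fingerprint_hex)).hexdigest().upper()


def sanitise_assignments(lines):
    """Hash the leading fingerprint on each line and sort the result.

    Each non-empty line is assumed to start with a hex fingerprint
    followed by a space and the rest of the assignment.
    """
    sanitised = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fingerprint, _, rest = line.partition(" ")
        sanitised.append((hash_fingerprint(fingerprint) + " " + rest).rstrip())
    # Each line begins with the hashed fingerprint, so a plain sort orders
    # the output by hashed fingerprint and hides the original ordering.
    return sorted(sanitised)
```

A call such as `sanitise_assignments(open("assignments.log"))` would return the sanitised lines ready to be written out; the file name here is taken from the discussion, while the content of the remaining fields is left untouched.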
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/1839#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online