[or-cvs] r9884: Cobbled together a TODO file from sleepless ramblings. Made (in torflow/trunk: . TorCtl)
Author: mikeperry
Date: 2007-03-20 00:48:10 -0400 (Tue, 20 Mar 2007)
New Revision: 9884
Added:
torflow/trunk/TODO
Modified:
torflow/trunk/TorCtl/PathSupport.py
torflow/trunk/metatroller.py
Log:
Cobbled together a TODO file from sleepless ramblings. Made it coherent (I
hope).
Fixed a bug using OrderedExitGenerator. Cleared up a pending circuit issue
with newnym. Also made stream+circ id members more uniform.
Added: torflow/trunk/TODO
===================================================================
--- torflow/trunk/TODO 2007-03-20 03:26:51 UTC (rev 9883)
+++ torflow/trunk/TODO 2007-03-20 04:48:10 UTC (rev 9884)
@@ -0,0 +1,73 @@
+- Add an ORCONN_BW event to Tor to emit read/write info and also queue sizes
+ - See tordiffs/orconn-bw.diff but it probably should be a separate event,
+ not hacked onto ORCONN
+ - Use nodemon.py to rank nodes based on total bytes, queue sizes, and the
+ ratio of these two
+ - Does it agree with results from metatroller's bandwidth stats?
+
+- More NodeRestrictions/PathRestrictions in TorCtl/PathSupport.py
+ - BwWeightedGenerator
+ - NodeRestrictions:
+ - Uptime/LongLivedPorts (Does/should hibernation count?)
+ - Published/Updated
+ - GeoIP (http://www.maxmind.com/app/python)
+ - NodeCountry
+ - PathRestrictions:
+ - Family
+ - GeoIP (http://www.maxmind.com/app/python)
+ - OceanPhobicRestrictor (avoids Pacific Ocean or two Atlantic crossings)
+ or ContinentRestrictor (avoids doing more than N continent crossings)
+ - EchelonPhobicRestrictor
+ - Does not cross international boundaries for client->Entry or
+ Exit->destination hops
+ - Perform statistical analysis on paths
+ - How often does Tor choose foolish paths normally?
+ - (4 Atlantic/Pacific crossings)
+ - What is the distribution for Pr(ClientLocation|MiddleNode,ExitNode)
+ and Pr(EntryNode|MiddleNode,ExitNode) for these various path choices?
+ - Mathematical analysis probably required because this is a large joint
+ distribution (not GSoC)
+ - Empirical observation possible if you limit to the top 10% of the
+ nodes (which carry something like 90% of bandwidth anyway).
+ - Make a few million paths without actually building real
+ circuits and tally them up in a 3D table
+ - See PathSupport.py unit tests for some examples on this
+ - See also:
+ http://swiki.cc.gatech.edu:8080/ugResearch/uploads/7/ImprovingTor.pdf
+ - You can also perform predecessor observation of this strategy
+ empirically. But it is likely the GeoIP stuff is easier to implement
+ and just as effective.
+
+- Create a PathWatcher that StatsHandler can extend from so people can gather
+ stats from regular Tor usage
+
+- Use GeoIP to make a map of Tor servers color-coded by their reliability
+ - Or augment an existing Tor map project with this data
+
+- Add circuit prebuilding and port history learning for keeping an optimal
+ pool of circuits available for use
+ - Build circuits in parallel to speed up scanning
+
+- Rewrite soat.pl in Python/C++ and leverage an HTML parser to extract
+ object/script tags to make a fingerprint of a dynamic page.
+ - Scan for changes to this fingerprint and also to any original embedded
+ objects
+ - Make a multilingual keyword list of commonly censored terms to google for
+ using this scanner
+ - Improve checking of changes to documents outside of Tor
+ - Improve SSL handling/verification. openssl client is broken.
+ - Parallelize scanning
+ - Improve interaction between soat+metatroller so soat knows
+ which exit was responsible for a given ip/url
+
+- Design Reputation System (not for GSoC)
+ - Emit some kind of penalty multiplier based on circuit/stream failure rate
+ and the ratio of directory "observed" bandwidth vs avg stream bandwidth
+ - Add keyword to directory for clients to use instead of observed
+ bandwidth for routing decisions
+ - Make sure scanners don't listen to this keyword to avoid
+ "Creeping Death"
+ - Queue lengths from the node monitor can also figure into this penalty
+ multiplier
+ - Figure out interface to report this and also BadExit determinations
+ - Probably involves voting among many scanners
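
The "make a few million paths ... and tally them up in a 3D table" item in the TODO above can be sketched roughly as follows. This is only an illustration under stated assumptions: the node table, the bandwidth-weighted selection, and every function name here are hypothetical and are not taken from PathSupport.py. A Counter keyed by (entry, middle, exit) stands in for the 3D table.

```python
import random
from collections import Counter

# Hypothetical node table: (name, country, bandwidth). Invented for illustration;
# in practice this would be the top slice of routers by bandwidth.
NODES = [
    ("a", "US", 500), ("b", "DE", 300), ("c", "JP", 120),
    ("d", "US", 50), ("e", "FR", 20), ("f", "DE", 10),
]

def pick_weighted(rng, candidates):
    """Choose one node with probability proportional to its bandwidth."""
    total = sum(bw for _, _, bw in candidates)
    point = rng.uniform(0, total)
    for node in candidates:
        point -= node[2]
        if point <= 0:
            return node
    return candidates[-1]  # guard against floating-point leftovers

def make_path(rng, nodes):
    """Pick entry, middle and exit without reusing a node."""
    path = []
    for _ in range(3):
        remaining = [n for n in nodes if n not in path]
        path.append(pick_weighted(rng, remaining))
    return path

def tally_paths(nodes, count, seed=0):
    """Tally (entry, middle, exit) names: a sparse stand-in for the 3D table."""
    rng = random.Random(seed)
    table = Counter()
    for _ in range(count):
        entry, middle, exit_node = make_path(rng, nodes)
        table[(entry[0], middle[0], exit_node[0])] += 1
    return table

def entry_distribution(table, middle, exit_name):
    """Estimate Pr(EntryNode | MiddleNode, ExitNode) from the tallies."""
    counts = dict((e, c) for (e, m, x), c in table.items()
                  if m == middle and x == exit_name)
    total = float(sum(counts.values()))
    if not total:
        return {}
    return dict((e, c / total) for e, c in counts.items())

table = tally_paths(NODES, 10000)
dist = entry_distribution(table, "b", "a")
```

No circuits are built here; the paths exist only as tuples, so a few million iterations stay cheap. A country-aware tally would key the Counter on the country field instead of the node name.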
Modified: torflow/trunk/TorCtl/PathSupport.py
===================================================================
--- torflow/trunk/TorCtl/PathSupport.py 2007-03-20 03:26:51 UTC (rev 9883)
+++ torflow/trunk/TorCtl/PathSupport.py 2007-03-20 04:48:10 UTC (rev 9884)
@@ -87,6 +87,9 @@
self.sorted_r = sorted_r
self.rewind()
+ def reset_restriction(self, rstr_list):
+ self.rstr_list = rstr_list
+
def rewind(self):
self.routers = copy.copy(self.sorted_r)
@@ -104,14 +107,14 @@
if pathlen == 1:
circ.exit = path_sel.exit_chooser(circ.path)
circ.path = [circ.exit]
- circ.cid = self.extend_circuit(0, circ.id_path())
+ circ.circ_id = self.extend_circuit(0, circ.id_path())
else:
circ.path.append(path_sel.entry_chooser(circ.path))
for i in xrange(1, pathlen-1):
circ.path.append(path_sel.middle_chooser(circ.path))
circ.exit = path_sel.exit_chooser(circ.path)
circ.path.append(circ.exit)
- circ.cid = self.extend_circuit(0, circ.id_path())
+ circ.circ_id = self.extend_circuit(0, circ.id_path())
return circ
######################## Node Restrictions ########################
@@ -436,6 +439,7 @@
if self.order_exits:
if self.__ordered_exit_gen:
exitgen = self.__ordered_exit_gen
+ exitgen.reset_restriction(self.exit_rstr)
else:
exitgen = self.__ordered_exit_gen = \
OrderedExitGenerator(80, sorted_r, self.exit_rstr)
@@ -457,7 +461,7 @@
class Circuit:
def __init__(self):
- self.cid = 0
+ self.circ_id = 0
self.path = [] # routers
self.exit = None
self.built = False
@@ -470,7 +474,7 @@
class Stream:
def __init__(self, sid, host, port, kind):
- self.sid = sid
+ self.strm_id = sid
self.detached_from = [] # circ id #'s
self.pending_circ = None
self.circ = None
@@ -603,19 +607,21 @@
self.new_nym = False
plog("DEBUG", "Obeying new nym")
for key in self.circuits.keys():
- if len(self.circuits[key].pending_streams):
+ if (not self.circuits[key].dirty
+ and len(self.circuits[key].pending_streams)):
plog("WARN", "New nym called, destroying circuit "+str(key)
+" with "+str(len(self.circuits[key].pending_streams))
+" pending streams")
unattached_streams.extend(self.circuits[key].pending_streams)
+ self.circuits[key].pending_streams.clear()
# FIXME: Consider actually closing circ if no streams.
self.circuits[key].dirty = True
for circ in self.circuits.itervalues():
- if circ.built and not circ.dirty and circ.cid not in badcircs:
+ if circ.built and not circ.dirty and circ.circ_id not in badcircs:
if circ.exit.will_exit_to(stream.host, stream.port):
try:
- self.c.attach_stream(stream.sid, circ.cid)
+ self.c.attach_stream(stream.strm_id, circ.circ_id)
stream.pending_circ = circ # Only one possible here
circ.pending_streams.append(stream)
except TorCtl.ErrorReply, e:
@@ -639,10 +645,10 @@
plog("NOTICE", "Error building circ: "+str(e.args))
for u in unattached_streams:
plog("DEBUG",
- "Attaching "+str(u.sid)+" pending build of "+str(circ.cid))
+ "Attaching "+str(u.strm_id)+" pending build of "+str(circ.circ_id))
u.pending_circ = circ
circ.pending_streams.extend(unattached_streams)
- self.circuits[circ.cid] = circ
+ self.circuits[circ.circ_id] = circ
self.last_exit = circ.exit
def circ_status_event(self, c):
@@ -658,16 +664,17 @@
if c.status == "EXTENDED":
self.circuits[c.circ_id].last_extended_at = c.arrived_at
elif c.status == "FAILED" or c.status == "CLOSED":
+ # XXX: Can still get a STREAM FAILED for this circ after this
circ = self.circuits[c.circ_id]
del self.circuits[c.circ_id]
for stream in circ.pending_streams:
- plog("DEBUG", "Finding new circ for " + str(stream.sid))
+ plog("DEBUG", "Finding new circ for " + str(stream.strm_id))
self.attach_stream_any(stream, stream.detached_from)
elif c.status == "BUILT":
self.circuits[c.circ_id].built = True
try:
for stream in self.circuits[c.circ_id].pending_streams:
- self.c.attach_stream(stream.sid, c.circ_id)
+ self.c.attach_stream(stream.strm_id, c.circ_id)
except TorCtl.ErrorReply, e:
# No need to retry here. We should get the failed
# event for either the circ or stream next
@@ -709,8 +716,16 @@
if s.strm_id not in self.streams:
plog("NOTICE", "Succeeded stream "+str(s.strm_id)+" not found")
return
- self.streams[s.strm_id].circ = self.streams[s.strm_id].pending_circ
- self.streams[s.strm_id].circ.pending_streams.remove(self.streams[s.strm_id])
+ if s.circ_id and self.streams[s.strm_id].pending_circ.circ_id != s.circ_id:
+ # Hrmm.. this can happen on a new-nym.. Very rare, putting warn
+ # in because I'm still not sure this is correct
+ plog("WARN", "Mismatch of pending: "
+ +str(self.streams[s.strm_id].pending_circ.circ_id)+" vs "
+ +str(s.circ_id))
+ self.streams[s.strm_id].circ = self.circuits[s.circ_id]
+ else:
+ self.streams[s.strm_id].circ = self.streams[s.strm_id].pending_circ
+ self.streams[s.strm_id].pending_circ.pending_streams.remove(self.streams[s.strm_id])
self.streams[s.strm_id].pending_circ = None
self.streams[s.strm_id].attached_at = s.arrived_at
elif s.status == "FAILED" or s.status == "CLOSED":
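
The OrderedExitGenerator fix in the hunks above appears to address a cached generator holding on to the restriction list it was constructed with. A toy sketch of that pattern, assuming restrictions are plain predicate functions (the class below is hypothetical and much simpler than the real generator):

```python
class OrderedGenerator(object):
    """Toy stand-in for a generator that walks routers in sorted order."""

    def __init__(self, sorted_r, rstr_list):
        self.sorted_r = sorted_r
        self.rstr_list = rstr_list
        self.position = 0

    def reset_restriction(self, rstr_list):
        # Swap in the current restrictions but keep our place in the ordering.
        self.rstr_list = rstr_list

    def next_router(self):
        """Return the next router passing every restriction, wrapping around."""
        for _ in range(len(self.sorted_r)):
            router = self.sorted_r[self.position % len(self.sorted_r)]
            self.position += 1
            if all(ok(router) for ok in self.rstr_list):
                return router
        return None  # nothing satisfies the current restrictions

routers = ["r1", "r2", "r3", "r4"]
gen = OrderedGenerator(routers, [lambda r: r != "r2"])
first = gen.next_router()                       # "r1"
gen.reset_restriction([lambda r: r != "r3"])    # restrictions changed since caching
second = gen.next_router()                      # "r2" is allowed now
third = gen.next_router()                       # "r3" is skipped, giving "r4"
```

Without the reset_restriction call, the reused generator would keep filtering on the old list while continuing from its saved position.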
Modified: torflow/trunk/metatroller.py
===================================================================
--- torflow/trunk/metatroller.py 2007-03-20 03:26:51 UTC (rev 9883)
+++ torflow/trunk/metatroller.py 2007-03-20 04:48:10 UTC (rev 9884)
@@ -450,6 +450,7 @@
else: start_f = len(c.path)-1
# Count failed
+ # XXX: Differentiate between extender and extendee
for r in self.circuits[c.circ_id].path[start_f:len(c.path)+1]:
r.circ_failed += 1
if not reason in r.reason_failed:
@@ -533,12 +534,12 @@
return
# Verify circ id matches stream.circ
- if s.status not in ("NEW" or "NEWRESOLVE"):
+ if s.status not in ("NEW", "NEWRESOLVE", "REMAP"):
circ = self.streams[s.strm_id].circ
if not circ: circ = self.streams[s.strm_id].pending_circ
- if circ and circ.cid != s.circ_id:
+ if circ and circ.circ_id != s.circ_id:
plog("WARN", str(s.strm_id) + " has mismatch of "
- +str(s.circ_id)+" v "+str(circ.cid))
+ +str(s.circ_id)+" v "+str(circ.circ_id))
if s.status == "DETACHED":
if self.streams[s.strm_id].attached_at:
@@ -558,7 +559,7 @@
# Update strm_chosen count
for r in self.circuits[s.circ_id].path: r.strm_chosen += 1
- # Update bw stats
+ # Update bw stats. XXX: Don't do this for resolve streams
if self.streams[s.strm_id].attached_at:
lifespan = self.streams[s.strm_id].lifespan(s.arrived_at)
for r in self.streams[s.strm_id].circ.path:
@@ -623,6 +624,7 @@
else:
s.write("250 LASTEXIT=0 (0) OK\r\n")
elif command == "NEWEXIT" or command == "NEWNYM":
+ # XXX: Separate this
clear_dns_cache(c)
h.new_nym = True # GIL hack
plog("DEBUG", "Got new nym")
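
The status check changed in the metatroller.py hunk above is worth spelling out, since the old form fails quietly. In Python, `("NEW" or "NEWRESOLVE")` is not a tuple: `or` returns its first truthy operand, so the expression is just the string "NEW", and `not in` then becomes a substring test. A small standalone demonstration (the helper names are made up for illustration):

```python
broken = ("NEW" or "NEWRESOLVE")          # evaluates to the string "NEW"
fixed = ("NEW", "NEWRESOLVE", "REMAP")    # a real tuple of statuses

def skips_check_broken(status):
    # Substring test against "NEW", not membership in a set of statuses.
    return status in broken

def skips_check_fixed(status):
    return status in fixed

results = [(s, skips_check_broken(s), skips_check_fixed(s))
           for s in ("NEW", "NEWRESOLVE", "REMAP", "SUCCEEDED")]
```

With the old expression, NEWRESOLVE events were not skipped and so went through the circ id verification, which is presumably not what was intended.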