[or-cvs] [metrics/master] Remove long outdated TODO list.

Author: Karsten Loesing <karsten.loesing@xxxxxxx>
Date: Sat, 27 Mar 2010 10:34:39 +0100
Subject: Remove long outdated TODO list.
Commit: 855b66941479ad49fe6e5c28a2f8d5634a5f8748

 TODO |  190 ------------------------------------------------------------------
 1 files changed, 0 insertions(+), 190 deletions(-)
 delete mode 100644 TODO

diff --git a/TODO b/TODO
deleted file mode 100644
index 21c2b6a..0000000
--- a/TODO
+++ /dev/null
@@ -1,190 +0,0 @@
- - Not done
- * Top priority
- . Partially done
- o Done
- d Deferrable
- D Deferred
- X Abandoned
-Tasks for September or later:
- . Configure Mike's bandwidth scanner on gabelmoo
-   . Include measured bandwidths in votes
- . Exiting traffic by port
-*  . July 31: Evaluate data together with Steven. Write report on measured
-*    exit port data.
- - Alternative requirements for flags
-   - Display actual MTBF/WFU requirements for weakened requirements.
-     Evaluation has finished. Look at results and include them in report.
-*  - June 30: Write proposal for weakened requirements for being a Guard;
-*    see TODO.022.
- - Client requests to directories
-   - Compare bytes.txt output to dirreq download times
-   - Figure out why estimations are that far off. Where is the flaw in the
-     math?
- - Clients connecting to entry nodes
-*  - July 31: Write report on measured entry stats data.
- - Circuit build timeouts
-   - Find out why there are bumps at full seconds even though seconds start
-     at random times on the relays; possibly measure full distributions of
-     cell times in circuit queues, not just the deciles.
- - Evaluate Roger's reduced circuit window patch
-   - Switch to 1 MiB downloads
- - Work organization
-   - Add medium and low priority items from Roger's performance mail to
-     this list, too. This list contains only the two high priority items.
- - Directory archives
-   - Look at entropy of directory over the years. Right now the relay
-     choices are not uniform. It is way more likely that clients choose
-     fast relays than slow ones. If we re-normalize it, what is the
-     equivalent number of uniformly-weighted relays in the network?
-     mikeperry has some equations for this in his torflow, but it would be
-     interesting to see whether that number is going up over time, and how
-     it compares to number-of-relays and amount-of-bandwidth.
-   - How many of the German relays that disappeared in 2008 were set
-     up at the end of 2007?
-   - Is the (major) reason for disappearing nodes in France in mid-2008
-     that OVH stopped supporting Tor relay operation?
-   - Examine bandwidth-per-relay ratios for various countries. Do changes
-     in bandwidth per country result from a few or a lot of relays joining
-     or leaving?
-   - Investigate very old Tor versions. Do these nodes have their contact
-     info set? A possible explanation for these nodes not being updated is
-     that they might run on machines without their owners' knowledge.
-   - Compare descriptors collected on gabelmoo with those collected by
-     tor26. What fraction of descriptors is missing? Is it worth combining
-     both archives?
-   - Investigate whether the loss of German relays in 2008 was due to the
-     pervasive dynamic IP reachability testing bugs. How?
-   - Compare observed/history bandwidth by time of day to see if relays are
-     underutilized at night and saturated during the day.
-   - For comparison of relays on dynamic IP addresses, don't count relays
-     that were up for only a short time; consider using a dynamic IP
-     database.
-   - Consider recording bandwidth usage on relays by putting 1 random
-     second of every 15-minute interval into extra-info documents, rather
-     than the sum of transported bytes. Suggestion by Roger/Steven.
- - Tor exit list
-   - Permit queries whether a certain IP was an exit for a certain target
-     at a certain time.
- - Client requests to directories
-   - Why do authorities (at least moria1 and moria2) see such a high
-     request-to-address ratio? Shouldn't clients ask at most once? A
-     possible explanation is that people are running Tor in a way where
-     their cache doesn't survive, maybe old-school Torpark variants or
-     something. Another explanation is people running relays that aren't
-     reachable and so aren't ignored in the geoip stats. Investigate further.
-   - Figure out if there are better GeoIP databases available that focus
-     more on small countries and that are still affordable.
-   - Try to estimate the number of concurrent Tor users from active
-     circuits and the probability of clients picking a relay for their
-     circuits. This only requires that we know how many circuits users
-     build on average. Hmm.
-   - Investigate the algorithm in global_write_bucket_low() that contains
-     the prioritization of some directory requests over others. This
-     algorithm was written when v1 was popular and v2 was new. Do the
-     conditions in that function require an update?
-   - Consider using a dynamic IP database to determine how many users are
-     on dynamic IP addresses.
-   - Investigate assumption that 1 IP address is equivalent to 1 user;
-     consider dynamic IP addresses and NAT, too.
-   - Add statistics to analyze failure types of directory requests and
-     include transmission times of failed requests exceeding a certain
-     threshold of 50% of all bytes.
- - Cells in circuit queues
-   - Extend statistics to medians, and 1st/9th deciles; problematic as
-     these statistics require keeping more history on Tor relays.
-   - Also extend statistics to outbuffer sizes.
-   - Investigate classification of circuits on a relay: Do most circuits
-     stay inactive, but a few become active, send their cells, become
-     inactive, get new cells and become active, and keep oscillating? Or
-     are there active circuits that just stay active for seconds at a time
-     because they cannot clear their queue?
-   - Investigate timing of circuits flushing their queues: For relays that
-     rate limit, what fraction of each second do they spend with empty
-     write buckets? The theory from earlier analyses is that for most
-     relays that rate limit, they have a full second's worth of data queued
-     up already, and at the top of each second, they pull off one second's
-     worth of bytes, send them, and then go dormant again until the next
-     second. Two approaches to fix this behavior are lowering the circuit
-     window sizes, so there's less data in flight on the network, and
-     reducing the granularity of the token bucket refills, so it sends
-     bytes more regularly throughout the second; but first the theory needs
-     to be confirmed.
-   - Another theory is that some relays refuse to read from a relay for
-     a period of multiple seconds. Can this be confirmed by the
-     measurements?
-   - Instrument edge streams and how they add cells to their circuits, and
-     how they flush them on the socks side.
- - Directory archives
-   - Do guards that have had the guard flag for a long time (weeks or
-     months) have more load than guards that just got their guard flag? Try
-     to find a possible correlation between advertised bandwidth and the
-     time a relay spent in the network with the Guard flag. (see 4.5 in
-     performance roadmap)
-   - Analyze 2004 and 2005 data, too.
- - Bridge archives
-   - Investigate bridge churn to determine how many bridges users need.
-   - Look through the bridge relay stats and see how much churn there is.
-     Roger is guessing that the 400-some bridges we have running by end of
-     June do not indicate that only 400-some people set up bridges. Rather
-     they indicate that only 400-some of them have their bridge still up
-     and reachable right now.
-   - Are bridges known to be available when users receive their addresses?
-     In one reported case, 9 bridges were unavailable only 2 hours after
-     receiving their addresses.
-   - Estimate how many bridge users we're skipping because only
-     super-stable bridges report any stats.
- - Measure throughput and latency between relays
-   - Implement opportunistic measuring of cell transfer times and bandwidth
-     between relays.
-   - Decide if statistics should be measured in the future in aggregate
-     form.
- - Measure throughput
-   . Write report on measured throughput data from torperf.
-   - Evaluate speedracer results.
-   - Passively measure throughput in Tor clients when configured.
-   - Improve usability so that non-developer users in countries like
-     Tunisia can measure throughput themselves. This can be speedracer,
-     torperf, or some other tool. Consider implementing as Vidalia plugin
-     once the plugin infrastructure is in place.
- - Measure latencies
-   - Evaluate circuit-build times in buildtimes data.
-   - Passively measure circuit-build times in clients when configured.
-   - Measure latencies as clients would experience them. Run a Tor client
-     somewhere that makes "typical" Tor circuits (i.e. just let Tor choose
-     its own paths, but set UseEntryGuards to 0, and only make requests to
-     port 80 so it builds circuits which exit there), and send pings every
-     so often, and track how long they take. One easy way to ping is to
-     make a request to an IP address that we know is refused by the last
-     hop's exit policy. Say, Then measure the time between
-     sending the connect cell, and receiving the end cell. We expect this
-     latency to go down over time, a) because we lower the circuit window,
-     and b) because Tor has on-average-better circuits based on Mike
-     Perry's plans.
- - Metrics portal
-   - Write down architecture for TorStatus extension.
-   - Implement extensions.
-   - Load directory archives into MySQL database and optimize database
-     schema so that evaluations are executed quickly.
-   - Set up extended TorStatus.