[tor-commits] [stem/master] Rewriting the descriptor tutorial
commit a86d250b6702f7dae9ed74778a6e0a5cb2f9358a
Author: Damian Johnson <atagar@xxxxxxxxxxxxxx>
Date: Sun Mar 3 23:11:30 2013 -0800
Rewriting the descriptor tutorial
Replacing the 'Mirror, Mirror' tutorial with a new one that gives a better
overview of the various descriptors and how to get/use them. This keeps the old
example (listing the fastest exits), but otherwise is a full rewrite.
---
docs/tutorial/mirror_mirror_on_the_wall.rst | 167 +++++++++++++++++++------
docs/tutorial/the_little_relay_that_could.rst | 2 +-
2 files changed, 132 insertions(+), 37 deletions(-)
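
Note: the ranking step used by the tutorial's final example (group exits by
observed bandwidth, then list the fastest fifteen) can be tried without a
running tor. Below is a minimal sketch under stated assumptions: descriptors
are stood in for by plain tuples, the relay names and bandwidths are made up,
and the top_exits() helper is hypothetical rather than part of stem.

```python
# Sketch only: descriptors are stood in for by plain tuples of
# (nickname, observed_bandwidth, allows_exiting). These are made-up values,
# not real relays, and none of this is part of stem's API.
SAMPLE_RELAYS = [
    ("relayA", 300, True),
    ("relayB", 500, False),  # not an exit, so it is skipped
    ("relayC", 300, True),
    ("relayD", 900, True),
]


def get_bw_to_relay(relays):
    """Map each observed bandwidth to the nicknames of exits reporting it."""
    bw_to_relay = {}

    for nickname, observed_bandwidth, allows_exiting in relays:
        if allows_exiting:
            bw_to_relay.setdefault(observed_bandwidth, []).append(nickname)

    return bw_to_relay


def top_exits(bw_to_relay, limit=15):
    """Return up to `limit` (nickname, bandwidth) pairs, fastest first."""
    ranked = [
        (nickname, bw_value)
        for bw_value in sorted(bw_to_relay, reverse=True)
        for nickname in bw_to_relay[bw_value]
    ]
    return ranked[:limit]


if __name__ == "__main__":
    ranked = top_exits(get_bw_to_relay(SAMPLE_RELAYS))

    for count, (nickname, bw_value) in enumerate(ranked, 1):
        print("%i. %s (%i B/s)" % (count, nickname, bw_value))
```

Unlike the tutorial script, this variant slices the ranked list instead of
calling sys.exit() inside the loop, which makes the cutoff easier to reuse
from other code. Relays that share a bandwidth value keep their input order.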
diff --git a/docs/tutorial/mirror_mirror_on_the_wall.rst b/docs/tutorial/mirror_mirror_on_the_wall.rst
index d9a4dd8..3679ec1 100644
--- a/docs/tutorial/mirror_mirror_on_the_wall.rst
+++ b/docs/tutorial/mirror_mirror_on_the_wall.rst
@@ -1,47 +1,159 @@
Mirror Mirror on the Wall
--------------------------
+=========================
-A script that tells us our contributed bandwidth is neat and all, but now let's figure out who the *biggest* exit relays are.
+* :ref:`what-is-a-descriptor`
+* :ref:`where-can-i-get-the-current-descriptors`
+* :ref:`where-can-i-get-past-descriptors`
+* :ref:`putting-it-together`
-Information about the Tor relay network come from documents called **descriptors**. Descriptors can come from a few things...
+.. _what-is-a-descriptor:
-1. The Tor control port with GETINFO options like **desc/all-recent** and **ns/all**.
-2. Files in Tor's data directory, like **cached-descriptors** and **cached-consensus**.
-3. The descriptor archive on `Tor's metrics site <https://metrics.torproject.org/data.html>`_.
+What is a descriptor?
+---------------------
-We've already used the control port, so for this example we'll use the cached files directly. First locate Tor's data directory. If your torrc has a DataDirectory line then that's the spot. If not then check Tor's man page for the default location.
+Tor is made up of two parts: the application and a distributed network of a few
+thousand volunteer relays. Information about these relays is public, and made
+up of documents called **descriptors**.
-Tor has several descriptor types. For bandwidth information we'll go to the server descriptors, which are located in the **cached-descriptors** file. These have somewhat infrequently changing information published by the relays themselves.
+There are several different kinds of descriptors, the most common ones being...
-To read this file we'll use the :class:`~stem.descriptor.reader.DescriptorReader`, a class designed to read descriptor files. The **cached-descriptors** is full of server descriptors, so the reader will provide us with :class:`~stem.descriptor.server_descriptor.RelayDescriptor` instances (a :class:`~stem.descriptor.server_descriptor.ServerDescriptor` subclass for relays).
+====================================================================== ===========
+Descriptor Type                                                        Description
+====================================================================== ===========
+`Server Descriptor <../api/descriptor/server_descriptor.html>`_        Information that relays publish about themselves. Tor clients once downloaded this information, but now they use microdescriptors instead.
+`ExtraInfo Descriptor <../api/descriptor/extrainfo_descriptor.html>`_  Relay information that tor clients do not need in order to function. This is self-published, like server descriptors, but not downloaded by default.
+`Microdescriptor <../api/descriptor/microdescriptor.html>`_            Minimalistic document that just includes the information necessary for tor clients to work.
+`Network Status Document <../api/descriptor/networkstatus.html>`_      Though tor relays are decentralized, the directories that track the overall network are not. These central points are called **directory authorities**, and every hour they publish a document called a **consensus** (aka, network status document). The consensus in turn is made up of **router status entries**.
+`Router Status Entry <../api/descriptor/router_status_entry.html>`_    Relay information provided by the directory authorities including flags, heuristics used for relay selection, etc.
+====================================================================== ===========
+
+.. _where-can-i-get-the-current-descriptors:
+
+Where can I get the current descriptors?
+----------------------------------------
+
+To work, tor needs up-to-date information about the relays within the
+network. As such, getting current descriptors is easy: *just run tor*.
+
+Tor only gets the descriptors that it needs by default, so if you're scripting
+against tor you may want to set some of the following in your `torrc
+<https://www.torproject.org/docs/faq.html.en#torrc>`_. Keep in mind that these
+add a small burden to the network, so don't set them in a widely distributed
+application. And, of course, please consider `running tor as a relay
+<https://www.torproject.org/docs/tor-doc-relay.html.en>`_ so you give back to
+the network!
+
+::
+
+ # Descriptors have a range of time during which they're valid. To get the
+ # most recent descriptor information, regardless of whether tor needs it,
+ # set the following.
+
+ FetchDirInfoExtraEarly 1
+
+ # If you aren't actively using tor as a client then tor will eventually stop
+ # downloading descriptor information that it doesn't need. To prevent this
+ # from happening set...
+
+ FetchUselessDescriptors 1
+
+ # Tor no longer downloads server descriptors by default, opting for
+ # microdescriptors instead. If you want tor to download server descriptors
+ # then set...
+
+ UseMicrodescriptors 0
+
+ # Tor doesn't need extrainfo descriptors to work. If you want tor to download
+ # them anyway then set...
+
+ DownloadExtraInfo 1
+
+Now that tor is happily chugging along, up-to-date descriptors are available
+through tor's control socket...
+
+::
+
+  from stem.control import Controller
+
+  with Controller.from_port(control_port = 9051) as controller:
+    controller.authenticate()
+
+    for desc in controller.get_network_statuses():
+      print("found relay %s (%s)" % (desc.nickname, desc.fingerprint))
+
+... or by reading directly from tor's data directory...
+
+::
+
+  from stem.descriptor import parse_file
+
+  for desc in parse_file("/home/atagar/.tor/cached-consensus"):
+    print("found relay %s (%s)" % (desc.nickname, desc.fingerprint))
+
+.. _where-can-i-get-past-descriptors:
+
+Where can I get past descriptors?
+---------------------------------
+
+Descriptor archives are available on `Tor's metrics site
+<https://metrics.torproject.org/data.html>`_. These archives can be read with
+the `DescriptorReader <../api/descriptor/reader.html>`_...
::
- import sys
   from stem.descriptor.reader import DescriptorReader
+
+  with DescriptorReader(["/home/atagar/server-descriptors-2013-03.tar"]) as reader:
+    for desc in reader:
+      print("found relay %s (%s)" % (desc.nickname, desc.fingerprint))
+
+.. _putting-it-together:
+
+Putting it together...
+----------------------
+
+As discussed above there are three methods for reading descriptors...
+
+* With the :class:`~stem.control.Controller` via methods like :func:`~stem.control.Controller.get_server_descriptors` and :func:`~stem.control.Controller.get_network_statuses`.
+* By reading the file directly with :func:`~stem.descriptor.__init__.parse_file`.
+* Reading with the `DescriptorReader <../api/descriptor/reader.html>`_. This is best if you want to read everything from a directory or archive.
+
+Now let's say you want to figure out who the *biggest* exit relays are. You
+could use any of the methods above, but for this example we'll use the
+:class:`~stem.control.Controller`. This uses server descriptors, so keep in
+mind that you'll likely need to set "UseMicrodescriptors 0" in your torrc for
+this to work.
+
+::
+
+  import sys
+
+  from stem.control import Controller
   from stem.util import str_tools
-
+
   # provides a mapping of observed bandwidth to the relay nicknames
   def get_bw_to_relay():
     bw_to_relay = {}
-
-    with DescriptorReader(["/home/atagar/.tor/cached-descriptors"]) as reader:
-      for desc in reader:
+
+    with Controller.from_port(control_port = 9051) as controller:
+      controller.authenticate()
+
+      for desc in controller.get_server_descriptors():
         if desc.exit_policy.is_exiting_allowed():
           bw_to_relay.setdefault(desc.observed_bandwidth, []).append(desc.nickname)
-
+
     return bw_to_relay
-
+
   # prints the top fifteen relays
-
+
   bw_to_relay = get_bw_to_relay()
   count = 1
-
+
   for bw_value in sorted(bw_to_relay.keys(), reverse = True):
     for nickname in bw_to_relay[bw_value]:
       print("%i. %s (%s/s)" % (count, nickname, str_tools.get_size_label(bw_value, 2)))
       count += 1
-
+
       if count > 15:
         sys.exit()
@@ -64,20 +176,3 @@ To read this file we'll use the :class:`~stem.descriptor.reader.DescriptorReader
14. politkovskaja2 (24.93 MB/s)
15. wau (24.72 MB/s)
-This can be easily done through the controller too...
-
-::
-
- def get_bw_to_relay():
- bw_to_relay = {}
-
- with Controller.from_port(control_port = 9051) as controller:
- controller.authenticate()
-
- for desc in controller.get_server_descriptors():
- if desc.exit_policy.is_exiting_allowed():
- bw_to_relay.setdefault(desc.observed_bandwidth, []).append(desc.nickname)
-
- return bw_to_relay
-
-
diff --git a/docs/tutorial/the_little_relay_that_could.rst b/docs/tutorial/the_little_relay_that_could.rst
index e3f42d0..96216d1 100644
--- a/docs/tutorial/the_little_relay_that_could.rst
+++ b/docs/tutorial/the_little_relay_that_could.rst
@@ -1,5 +1,5 @@
The Little Relay that Could
----------------------------
+===========================
Let's say you just set up your very first `Tor relay
<https://www.torproject.org/docs/tor-doc-relay.html.en>`_ (thank you!), and now