[tor-commits] [ooni-probe/master] Do a very big cleanup and refactor of all the code in the repo.



commit 7d6901f1552067bce9595db6a84f8f5245d8f28c
Author: Arturo Filastò <art@xxxxxxxxx>
Date:   Fri Nov 9 17:37:32 2012 +0100

    Do a very big cleanup and refactor of all the code in the repo.
    
    * Move daphn3 protocol to to-be-ported
    * Remove rfc3339 support, we will use seconds since epoch
    * Refactor code that should have been in nettests
    * Eliminate code duplication
    * Remove Python metaclass virtuosism insanity
---
 CHANGES.yaml                     |    5 -
 HACKING                          |  141 +++++++++--------
 TODO                             |    2 +-
 before_i_commit.sh               |    2 +-
 nettests/core/http_host.py       |    3 +-
 ooni/config.py                   |    4 +-
 ooni/inputunit.py                |   12 ++
 ooni/lib/rfc3339.py              |  283 ----------------------------------
 ooni/nettest.py                  |   20 +--
 ooni/nodes.py                    |   24 ++--
 ooni/oonicli.py                  |   23 ++--
 ooni/protocols/daphn3.py         |  311 --------------------------------------
 ooni/reporter.py                 |   18 ++-
 ooni/runner.py                   |   16 ++-
 ooni/templates/httpt.py          |   44 +-----
 ooni/utils/__init__.py           |  104 +------------
 ooni/utils/date.py               |   30 ----
 ooni/utils/geodata.py            |   29 ++--
 ooni/utils/hacks.py              |   38 +----
 ooni/utils/net.py                |   47 ++++++-
 ooni/utils/otime.py              |    8 +
 ooniprobe.conf                   |    2 +-
 to-be-ported/protocols/daphn3.py |  311 ++++++++++++++++++++++++++++++++++++++
 23 files changed, 531 insertions(+), 946 deletions(-)
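
Note on the "seconds since epoch" change: the commit drops ooni/lib/rfc3339.py and
switches ooni/config.py from date.timestamp() to otime.timestamp() when building
report filenames. The body of ooni/utils/otime.py is not shown here, so the
following is only a minimal sketch of the idea, not the actual implementation.
The names epoch_timestamp and report_filenames are hypothetical and chosen for
illustration; only the "%s_<timestamp>.%s" filename pattern and the
"report"/"yamloo" and "packets"/"pcap" pairs come from the diff.

```python
import time


def epoch_timestamp(now=None):
    """Return whole seconds since the Unix epoch as a string.

    Hypothetical helper: `now` can be passed explicitly to make the
    result reproducible; by default the current time is used.
    """
    if now is None:
        now = time.time()
    return str(int(now))


def report_filenames(now=None):
    """Build the report and pcap filenames from a single timestamp.

    Mirrors the filename pattern used in ooni/config.py, but with the
    hypothetical epoch_timestamp() helper standing in for otime.timestamp().
    """
    base_filename = "%s_" + epoch_timestamp(now) + ".%s"
    yamloo_filename = base_filename % ("report", "yamloo")
    pcap_filename = base_filename % ("packets", "pcap")
    return yamloo_filename, pcap_filename
```

An integer epoch value needs no timezone handling, which is most of what the
removed rfc3339 module was doing; the trade-off is that the filenames are less
readable to a human.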

diff --git a/CHANGES.yaml b/CHANGES.yaml
deleted file mode 100644
index 86206f1..0000000
--- a/CHANGES.yaml
+++ /dev/null
@@ -1,5 +0,0 @@
-# Add each new entry to the top of the file.
-
-- version: 0.0.1
-  date:    Thu Jul  7 16:03:36 EDT 2011
-  changes: Initial version.
diff --git a/HACKING b/HACKING
index a0bf89c..2fdc392 100644
--- a/HACKING
+++ b/HACKING
@@ -34,80 +34,85 @@ Code Structure
 ---------
 
 - HACKING
-    The document you are currently reading.
+  The document you are currently reading.
 
-- oonib/
-    Contains the OONI probe backend to be run on the ooni-net
+- inputs/
+  Contains input files for tests.
 
-- oonid/
-    Contains the OONI daemon that can be used to interrogated from the cli to
-    run tests.
+- oonib/
+  Contains the OONI probe backend to be run on the ooni-net
 
 - ooni/
-    Contains the main ooni probe comand line client
-
-- ooni/assets/
-    Where we store all the asset files that are
-    used when running OONI tests.
+  Contains the main ooni probe comand line client
 
 - ooni/config.py
-    Parts of the code related to parsing OONI
-    configuration files and making them accessible
-    to other components of the software.
-
-- ooni/logo.py
-    File containing some funny ASCII art. Yes, we
-    do enjoy ASCII art and are not afraid to admit it!
-
-- ooni/nodes.conf
-    The configuration file for nodes. This contains the
-    list of network and code execution nodes that can be
-    used to run tests off of.
-
-- ooni/ooniprobe.py
-    The main OONI-probe command line interface. This is
-    responsible for parsing the command line arguments and
-    passing the arguments to the underlying components.
-
-- ooni/ooni-probe.conf
-    The main OONI-probe configuration file. This can be used
-    to configure your OONI CLI, tell it where it should report
-    to, where the asset files are located, what should be used
-    for control, etc.
-
-- ooni/plugoo/__init__.py
-    All the necessary "goo" for making OONI probe work. This
-    means loading Assets, creating Reports, running Tests,
-    interacting with Nodes.
-
-- ooni/plugoo/assets.py
-    This is a python object representation of the data that is
-    located inside the asset directory.
-
-- ooni/plugoo/nodes.py
-    The part of code responsible for interacting with OONI Nodes.
-    Nodes can be Network or Code Execution. Network nodes are
-    capable of receiving packets and fowarding them onto their
-    network. This means that to run a test on X network nodes you
-    consume X*test_cost bandwith. Code Execution nodes accept units
-    of work, they are therefore capable of receiving a set of tests
-    that should be completed by a set of Network nodes or run locally.
-
-- ooni/plugoo/reports.py
-    Takes care of transforming the output of a test into a report. This
-    may mean pushing the result data to a remote backend or simply writing
-    a local file.
-
-- ooni/plugoo/tests.py
-    The main OONI probe test class. This provides all the necessary scaffold
-    to write your own test based on the OONI paradigm.
-
-- ooni/oonitests/
-    Contains all the "offical" OONI tests that are shipped.
-
-- ooni/utils.py
-    Helper functions that don't fit into any place, but are not big enough to
-    be a dependency by themselves.
+  Parts of the code related to parsing OONI
+  configuration files and making them accessible
+  to other components of the software.
+
+- ooni/inputunit.py
+  In here we have functions related to the creation of input
+  units. Input units are how the inputs to be fed to tests are
+  split up into.
+
+- ooni/nettest.py
+  In here is the NetTest API definition. This is how people
+  interested in writing ooniprobe tests will be specifying
+  them.
+
+- ooni/nodes.py
+  Mostly broken code for the remote dispatching of tests.
+
+- ooni/oonicli.py
+  In here we take care of running ooniprobe from the command
+  line interface
+
+- ooni/reporter.py
+  In here goes the logic for the creation of ooniprobe
+  reports.
+
+- ooni/runner.py
+  Handles running ooni.nettests as well as
+  ooni.plugoo.tests.OONITests.
+
+- ooni/kit/
+  In here go utilities that can be used by tests.
+
+- ooni/lib/
+  XXX this directory is to be removed.
+
+- ooni/utils/
+  In here go internal utilities that are useful to ooniprobe
+
+- ooni/utils/geodata.py
+  In here go functions related to the understanding of
+  geographical information of the probe
+
+- ooni/utils/hacks.py
+  When some software has issues and we need to fix it in a
+  hackish way, we put it in here. This one day will be empty.
+
+- ooni/utils/log.py
+  log realted functions.
+
+- ooni/utils/net.py
+  utilities for networking related operations
+
+- ooni/utils/onion.py
+  Utilities for working with Tor.
+  XXX this code should be removed and merged into txtorcon.
+
+- ooni/utils/otime.py
+  Generation of timestamps, time conversions and all the rest
+
+- ooni/utils/txscapy.py
+  Tools for making scapy work well with twisted.
+
+- ooniprobe.conf
+  The main OONI-probe configuration file. This can be used
+  to configure your OONI CLI, tell it where it should report
+  to, where the asset files are located, what should be used
+  for control, etc.
 
 Style guide
 -----------
diff --git a/TODO b/TODO
index 63d950c..b8524f3 100644
--- a/TODO
+++ b/TODO
@@ -105,4 +105,4 @@ It's important to make the new code asych and based on Twisted.  It should
 respect the design goals of the new ooni-probe model. Also, importing new,
 non-standard libraries should be discussed first, if the new test is to be
 used in the core of OONI (packaging scapy and twisted already makes our
-codebase quite large).
\ No newline at end of file
+codebase quite large).
diff --git a/before_i_commit.sh b/before_i_commit.sh
index f08a5f5..4a37686 100755
--- a/before_i_commit.sh
+++ b/before_i_commit.sh
@@ -34,5 +34,5 @@ echo "If you do, it means something is wrong."
 echo "Read through the log file and fix it."
 echo "If you are having some problems fixing some things that have to do with"
 echo "the core of OONI, let's first discuss it on IRC, or open a ticket"
-
+less *yamloo
 rm -f *yamloo
diff --git a/nettests/core/http_host.py b/nettests/core/http_host.py
index e09fc84..b87594d 100644
--- a/nettests/core/http_host.py
+++ b/nettests/core/http_host.py
@@ -14,8 +14,7 @@ from ooni.templates import httpt
 
 class UsageOptions(usage.Options):
     optParameters = [
-                     ['url', 'u', 'http://torproject.org/', 'Test single site'],
-                     ['backend', 'b', 'http://ooni.nu/test/', 'Test backend to use'],
+                     ['backend', 'b', 'http://ooni.nu/test/', 'Test backend to use']
                     ]
 
 
diff --git a/ooni/config.py b/ooni/config.py
index f3d1a80..cec9146 100644
--- a/ooni/config.py
+++ b/ooni/config.py
@@ -8,7 +8,7 @@ import yaml
 
 from twisted.internet import reactor, threads
 
-from ooni.utils import date
+from ooni.utils import otime
 from ooni.utils import Storage
 
 def get_root_path():
@@ -24,7 +24,7 @@ def oreport_filenames():
     returns
     yamloo_filename, pcap_filename
     """
-    base_filename = "%s_"+date.timestamp()+".%s"
+    base_filename = "%s_"+otime.timestamp()+".%s"
     yamloo_filename = base_filename % ("report", "yamloo")
     pcap_filename = base_filename % ("packets", "pcap")
     return yamloo_filename, pcap_filename
diff --git a/ooni/inputunit.py b/ooni/inputunit.py
index 3b0c491..484631b 100644
--- a/ooni/inputunit.py
+++ b/ooni/inputunit.py
@@ -1,3 +1,15 @@
+#-*- coding: utf-8 -*-
+#
+# inputunit.py 
+# -------------
+# IN here we have functions related to the creation of input
+# units. Input units are how the inputs to be fed to tests are
+# split up into.
+#
+# :authors: Arturo Filastò, Isis Lovecruft
+# :license: see included LICENSE file
+
+
 class InputUnitFactory(object):
     """
     This is a factory that takes the size of input units to be generated a set
diff --git a/ooni/lib/rfc3339.py b/ooni/lib/rfc3339.py
deleted file mode 100644
index e664ce7..0000000
--- a/ooni/lib/rfc3339.py
+++ /dev/null
@@ -1,283 +0,0 @@
-#!/usr/bin/env python
-#
-# Copyright (c) 2009, 2010, Henry Precheur <henry@xxxxxxxxxxxx>
-#
-# Permission to use, copy, modify, and/or distribute this software for any
-# purpose with or without fee is hereby granted, provided that the above
-# copyright notice and this permission notice appear in all copies.
-#
-# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
-# REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
-# FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
-# INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
-# LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
-# OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
-# PERFORMANCE OF THIS SOFTWARE.
-#
-'''Formats dates according to the :RFC:`3339`.
-
-Report bugs & problems on BitBucket_
-
-.. _BitBucket: https://bitbucket.org/henry/clan.cx/issues
-'''
-
-__author__ = 'Henry Precheur <henry@xxxxxxxxxxxx>'
-__license__ = 'ISCL'
-__version__ = '5.1'
-__all__ = ('rfc3339', )
-
-import datetime
-import time
-import unittest
-
-def _timezone(utc_offset):
-    '''
-    Return a string representing the timezone offset.
-
-    >>> _timezone(0)
-    '+00:00'
-    >>> _timezone(3600)
-    '+01:00'
-    >>> _timezone(-28800)
-    '-08:00'
-    >>> _timezone(-1800)
-    '-00:30'
-    '''
-    # Python's division uses floor(), not round() like in other languages:
-    #   -1 / 2 == -1 and not -1 / 2 == 0
-    # That's why we use abs(utc_offset).
-    hours = abs(utc_offset) // 3600
-    minutes = abs(utc_offset) % 3600 // 60
-    sign = (utc_offset < 0 and '-') or '+'
-    return '%c%02d:%02d' % (sign, hours, minutes)
-
-def _timedelta_to_seconds(timedelta):
-    '''
-    >>> _timedelta_to_seconds(datetime.timedelta(hours=3))
-    10800
-    >>> _timedelta_to_seconds(datetime.timedelta(hours=3, minutes=15))
-    11700
-    '''
-    return (timedelta.days * 86400 + timedelta.seconds +
-            timedelta.microseconds // 1000)
-
-def _utc_offset(date, use_system_timezone):
-    '''
-    Return the UTC offset of `date`. If `date` does not have any `tzinfo`, use
-    the timezone informations stored locally on the system.
-
-    >>> if time.localtime().tm_isdst:
-    ...     system_timezone = -time.altzone
-    ... else:
-    ...     system_timezone = -time.timezone
-    >>> _utc_offset(datetime.datetime.now(), True) == system_timezone
-    True
-    >>> _utc_offset(datetime.datetime.now(), False)
-    0
-    '''
-    if isinstance(date, datetime.datetime) and date.tzinfo is not None:
-        return _timedelta_to_seconds(date.dst() or date.utcoffset())
-    elif use_system_timezone:
-        if date.year < 1970:
-            # We use 1972 because 1970 doesn't have a leap day (feb 29)
-            t = time.mktime(date.replace(year=1972).timetuple())
-        else:
-            t = time.mktime(date.timetuple())
-        if time.localtime(t).tm_isdst: # pragma: no cover
-            return -time.altzone
-        else:
-            return -time.timezone
-    else:
-        return 0
-
-def _string(d, timezone):
-    return ('%04d-%02d-%02dT%02d:%02d:%02d%s' %
-            (d.year, d.month, d.day, d.hour, d.minute, d.second, timezone))
-
-def rfc3339(date, utc=False, use_system_timezone=True):
-    '''
-    Return a string formatted according to the :RFC:`3339`. If called with
-    `utc=True`, it normalizes `date` to the UTC date. If `date` does not have
-    any timezone information, uses the local timezone::
-
-        >>> d = datetime.datetime(2008, 4, 2, 20)
-        >>> rfc3339(d, utc=True, use_system_timezone=False)
-        '2008-04-02T20:00:00Z'
-        >>> rfc3339(d) # doctest: +ELLIPSIS
-        '2008-04-02T20:00:00...'
-
-    If called with `user_system_timezone=False` don't use the local timezone if
-    `date` does not have timezone informations and consider the offset to UTC
-    to be zero::
-
-        >>> rfc3339(d, use_system_timezone=False)
-        '2008-04-02T20:00:00+00:00'
-
-    `date` must be a `datetime.datetime`, `datetime.date` or a timestamp as
-    returned by `time.time()`::
-
-        >>> rfc3339(0, utc=True, use_system_timezone=False)
-        '1970-01-01T00:00:00Z'
-        >>> rfc3339(datetime.date(2008, 9, 6), utc=True,
-        ...         use_system_timezone=False)
-        '2008-09-06T00:00:00Z'
-        >>> rfc3339(datetime.date(2008, 9, 6),
-        ...         use_system_timezone=False)
-        '2008-09-06T00:00:00+00:00'
-        >>> rfc3339('foo bar')
-        Traceback (most recent call last):
-        ...
-        TypeError: Expected timestamp or date object. Got <type 'str'>.
-
-    For dates before January 1st 1970, the timezones will be the ones used in
-    1970. It might not be accurate, but on most sytem there is no timezone
-    information before 1970.
-    '''
-    # Try to convert timestamp to datetime
-    try:
-        if use_system_timezone:
-            date = datetime.datetime.fromtimestamp(date)
-        else:
-            date = datetime.datetime.utcfromtimestamp(date)
-    except TypeError:
-        pass
-
-    if not isinstance(date, datetime.date):
-        raise TypeError('Expected timestamp or date object. Got %r.' %
-                        type(date))
-
-    if not isinstance(date, datetime.datetime):
-        date = datetime.datetime(*date.timetuple()[:3])
-    utc_offset = _utc_offset(date, use_system_timezone)
-    if utc:
-        return _string(date + datetime.timedelta(seconds=utc_offset), 'Z')
-    else:
-        return _string(date, _timezone(utc_offset))
-
-
-class LocalTimeTestCase(unittest.TestCase):
-    '''
-    Test the use of the timezone saved locally. Since it is hard to test using
-    doctest.
-    '''
-
-    def setUp(self):
-        local_utcoffset = _utc_offset(datetime.datetime.now(), True)
-        self.local_utcoffset = datetime.timedelta(seconds=local_utcoffset)
-        self.local_timezone = _timezone(local_utcoffset)
-
-    def test_datetime(self):
-        d = datetime.datetime.now()
-        self.assertEqual(rfc3339(d),
-                         d.strftime('%Y-%m-%dT%H:%M:%S') + self.local_timezone)
-
-    def test_datetime_timezone(self):
-
-        class FixedNoDst(datetime.tzinfo):
-            'A timezone info with fixed offset, not DST'
-
-            def utcoffset(self, dt):
-                return datetime.timedelta(hours=2, minutes=30)
-
-            def dst(self, dt):
-                return None
-
-        fixed_no_dst = FixedNoDst()
-
-        class Fixed(FixedNoDst):
-            'A timezone info with DST'
-
-            def dst(self, dt):
-                return datetime.timedelta(hours=3, minutes=15)
-
-        fixed = Fixed()
-
-        d = datetime.datetime.now().replace(tzinfo=fixed_no_dst)
-        timezone = _timezone(_timedelta_to_seconds(fixed_no_dst.\
-                                                   utcoffset(None)))
-        self.assertEqual(rfc3339(d),
-                         d.strftime('%Y-%m-%dT%H:%M:%S') + timezone)
-
-        d = datetime.datetime.now().replace(tzinfo=fixed)
-        timezone = _timezone(_timedelta_to_seconds(fixed.dst(None)))
-        self.assertEqual(rfc3339(d),
-                         d.strftime('%Y-%m-%dT%H:%M:%S') + timezone)
-
-    def test_datetime_utc(self):
-        d = datetime.datetime.now()
-        d_utc = d + self.local_utcoffset
-        self.assertEqual(rfc3339(d, utc=True),
-                         d_utc.strftime('%Y-%m-%dT%H:%M:%SZ'))
-
-    def test_date(self):
-        d = datetime.date.today()
-        self.assertEqual(rfc3339(d),
-                         d.strftime('%Y-%m-%dT%H:%M:%S') + self.local_timezone)
-
-    def test_date_utc(self):
-        d = datetime.date.today()
-        # Convert `date` to `datetime`, since `date` ignores seconds and hours
-        # in timedeltas:
-        # >>> datetime.date(2008, 9, 7) + datetime.timedelta(hours=23)
-        # datetime.date(2008, 9, 7)
-        d_utc = datetime.datetime(*d.timetuple()[:3]) + self.local_utcoffset
-        self.assertEqual(rfc3339(d, utc=True),
-                         d_utc.strftime('%Y-%m-%dT%H:%M:%SZ'))
-
-    def test_timestamp(self):
-        d = time.time()
-        self.assertEqual(rfc3339(d),
-                         datetime.datetime.fromtimestamp(d).\
-                         strftime('%Y-%m-%dT%H:%M:%S') + self.local_timezone)
-
-    def test_timestamp_utc(self):
-        d = time.time()
-        d_utc = datetime.datetime.utcfromtimestamp(d) + self.local_utcoffset
-        self.assertEqual(rfc3339(d),
-                         (d_utc.strftime('%Y-%m-%dT%H:%M:%S') +
-                          self.local_timezone))
-
-    def test_before_1970(self):
-        d = datetime.date(1885, 01, 04)
-        self.failUnless(rfc3339(d).startswith('1885-01-04T00:00:00'))
-        self.assertEqual(rfc3339(d, utc=True, use_system_timezone=False),
-                         '1885-01-04T00:00:00Z')
-
-    def test_1920(self):
-        d = datetime.date(1920, 02, 29)
-        x = rfc3339(d, utc=False, use_system_timezone=True)
-        self.failUnless(x.startswith('1920-02-29T00:00:00'))
-
-    # If these tests start failing it probably means there was a policy change
-    # for the Pacific time zone.
-    # See http://en.wikipedia.org/wiki/Pacific_Time_Zone.
-    if 'PST' in time.tzname:
-        def testPDTChange(self):
-            '''Test Daylight saving change'''
-            # PDT switch happens at 2AM on March 14, 2010
-
-            # 1:59AM PST
-            self.assertEqual(rfc3339(datetime.datetime(2010, 3, 14, 1, 59)),
-                             '2010-03-14T01:59:00-08:00')
-            # 3AM PDT
-            self.assertEqual(rfc3339(datetime.datetime(2010, 3, 14, 3, 0)),
-                             '2010-03-14T03:00:00-07:00')
-
-        def testPSTChange(self):
-            '''Test Standard time change'''
-            # PST switch happens at 2AM on November 6, 2010
-
-            # 0:59AM PDT
-            self.assertEqual(rfc3339(datetime.datetime(2010, 11, 7, 0, 59)),
-                             '2010-11-07T00:59:00-07:00')
-
-            # 1:00AM PST
-            # There's no way to have 1:00AM PST without a proper tzinfo
-            self.assertEqual(rfc3339(datetime.datetime(2010, 11, 7, 1, 0)),
-                             '2010-11-07T01:00:00-07:00')
-
-
-if __name__ == '__main__': # pragma: no cover
-    import doctest
-    doctest.testmod()
-    unittest.main()
diff --git a/ooni/nettest.py b/ooni/nettest.py
index 6221a3f..896ae2a 100644
--- a/ooni/nettest.py
+++ b/ooni/nettest.py
@@ -1,15 +1,12 @@
 # -*- encoding: utf-8 -*-
 #
-#     nettest.py
-# ------------------->
+# nettest.py
+# ----------
+# In here is the NetTest API definition. This is how people
+# interested in writing ooniprobe tests will be specifying them
 #
-# :authors: Arturo "hellais" Filastò <art@xxxxxxxxx>,
-#           Isis Lovecruft <isis@xxxxxxxxxxxxxx>
-# :licence: see LICENSE
-# :copyright: 2012 Arturo Filasto, Isis Lovecruft
-# :version: 0.1.0-alpha
-#
-# <-------------------
+# :authors: Arturo Filastò, Isis Lovecruft
+# :license: see included LICENSE file
 
 import sys
 import os
@@ -22,8 +19,6 @@ from twisted.python import usage
 
 from ooni.utils import log
 
-pyunit = __import__('unittest')
-
 class NetTestCase(object):
     """
     This is the base of the OONI nettest universe. When you write a nettest
@@ -102,7 +97,6 @@ class NetTestCase(object):
 
     usageOptions = None
     requiredOptions = []
-
     requiresRoot = False
 
     localOptions = {}
@@ -159,7 +153,7 @@ class NetTestCase(object):
             raise usage.UsageError("No input file specified!")
 
         self._checkRequiredOptions()
-        
+
         # XXX perhaps we may want to name and version to be inside of a
         # different method that is not called options.
         return {'inputs': self.inputs,
diff --git a/ooni/nodes.py b/ooni/nodes.py
index 155f183..070ffe7 100644
--- a/ooni/nodes.py
+++ b/ooni/nodes.py
@@ -1,16 +1,14 @@
-#!/usr/bin/env python
-# -*- coding: UTF-8
-"""
-    nodes
-    *****
-
-    This contains all the code related to Nodes
-    both network and code execution.
-
-    :copyright: (c) 2012 by Arturo Filastò, Isis Lovecruft
-    :license: see LICENSE for more details.
-
-"""
+#-*- coding: utf-8 -*-
+#
+# nodes.py
+# --------
+# here is code for handling the interaction with remote
+# services that will run ooniprobe tests.
+# XXX most of the code in here is broken or not tested and
+# probably should be trashed
+#
+# :authors: Arturo Filastò, Isis Lovecruft
+# :license: see included LICENSE file
 
 import os
 from binascii import hexlify
diff --git a/ooni/oonicli.py b/ooni/oonicli.py
index 4b63a61..8e4fa14 100644
--- a/ooni/oonicli.py
+++ b/ooni/oonicli.py
@@ -1,16 +1,12 @@
-#!/usr/bin/env python
 # -*- coding: UTF-8
 #
-#    oonicli
-#    *********
+# oonicli
+# -------
+# In here we take care of running ooniprobe from the command
+# line interface
 #
-#    oonicli is the next generation ooniprober. It based off of twisted's trial
-#    unit testing framework.
-#
-#    :copyright: (c) 2012 by Arturo Filastò, Isis Lovecruft
-#    :license: see LICENSE for more details.
-#
-#    original copyright (c) by Twisted Matrix Laboratories.
+# :authors: Arturo Filastò, Isis Lovecruft
+# :license: see included LICENSE file
 
 
 import sys
@@ -27,7 +23,8 @@ from ooni import nettest, runner, reporter, config
 
 from ooni.inputunit import InputUnitFactory
 
-from ooni.utils import net, checkForRoot
+from ooni.utils import net
+from ooni.utils import checkForRoot, NotRootError
 from ooni.utils import log
 
 
@@ -118,9 +115,9 @@ def run():
     if config.privacy.includepcap:
         try:
             checkForRoot()
-        except:
+        except NotRootError:
             log.err("includepcap options requires root priviledges to run")
-            log.err("disable it in your ooniprobe.conf file")
+            log.err("you should run ooniprobe as root or disable the options in ooniprobe.conf")
             sys.exit(1)
         log.debug("Starting sniffer")
         sniffer_d = net.capturePackets(pcap_filename)
diff --git a/ooni/protocols/__init__.py b/ooni/protocols/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/ooni/protocols/daphn3.py b/ooni/protocols/daphn3.py
deleted file mode 100644
index 37c94c7..0000000
--- a/ooni/protocols/daphn3.py
+++ /dev/null
@@ -1,311 +0,0 @@
-import sys
-import yaml
-
-from twisted.internet import protocol, defer
-from twisted.internet.error import ConnectionDone
-
-from scapy.all import IP, Raw, rdpcap
-
-from ooni.utils import log
-from ooni.plugoo import reports
-
-def read_pcap(filename):
-    """
-    @param filename: Filesystem path to the pcap.
-
-    Returns:
-      [{"sender": "client", "data": "\x17\x52\x15"}, {"sender": "server", "data": "\x17\x15\x13"}]
-    """
-    packets = rdpcap(filename)
-
-    checking_first_packet = True
-    client_ip_addr = None
-    server_ip_addr = None
-
-    ssl_packets = []
-    messages = []
-
-    """
-    pcap assumptions:
-
-    pcap only contains packets exchanged between a Tor client and a Tor server.
-    (This assumption makes sure that there are only two IP addresses in the
-    pcap file)
-
-    The first packet of the pcap is sent from the client to the server. (This
-    assumption is used to get the IP address of the client.)
-
-    All captured packets are TLS packets: that is TCP session
-    establishment/teardown packets should be filtered out (no SYN/SYN+ACK)
-    """
-
-    """Minimally validate the pcap and also find out what's the client
-    and server IP addresses."""
-    for packet in packets:
-        if checking_first_packet:
-            client_ip_addr = packet[IP].src
-            checking_first_packet = False
-        else:
-            if packet[IP].src != client_ip_addr:
-                server_ip_addr = packet[IP].src
-
-        try:
-            if (packet[Raw]):
-                ssl_packets.append(packet)
-        except IndexError:
-            pass
-
-    """Form our list."""
-    for packet in ssl_packets:
-        if packet[IP].src == client_ip_addr:
-            messages.append({"sender": "client", "data": str(packet[Raw])})
-        elif packet[IP].src == server_ip_addr:
-            messages.append({"sender": "server", "data": str(packet[Raw])})
-        else:
-            raise("Detected third IP address! pcap is corrupted.")
-
-    return messages
-
-def read_yaml(filename):
-    f = open(filename)
-    obj = yaml.load(f)
-    f.close()
-    return obj
-
-class Mutator:
-    idx = 0
-    step = 0
-
-    waiting = False
-    waiting_step = 0
-
-    def __init__(self, steps):
-        """
-        @param steps: array of dicts for the steps that must be gone over by
-                      the mutator. Looks like this:
-                      [{"sender": "client", "data": "\xde\xad\xbe\xef"},
-                       {"sender": "server", "data": "\xde\xad\xbe\xef"}]
-        """
-        self.steps = steps
-
-    def _mutate(self, data, idx):
-        """
-        Mutate the idx bytes by increasing it's value by one
-
-        @param data: the data to be mutated.
-
-        @param idx: what byte should be mutated.
-        """
-        print "idx: %s, data: %s" % (idx, data)
-        ret = data[:idx]
-        ret += chr(ord(data[idx]) + 1)
-        ret += data[idx+1:]
-        return ret
-
-    def state(self):
-        """
-        Return the current mutation state. As in what bytes are being mutated.
-
-        Returns a dict containg the packet index and the step number.
-        """
-        print "[Mutator.state()] Giving out my internal state."
-        current_state =  {'idx': self.idx, 'step': self.step}
-        return current_state
-
-    def next(self):
-        """
-        Increases by one the mutation state.
-
-        ex. (* is the mutation state, i.e. the byte to be mutated)
-        before [___*] [____]
-               step1   step2
-        after  [____] [*___]
-
-        Should be called every time you need to proceed onto the next mutation.
-        It changes the internal state of the mutator to that of the next
-        mutatation.
-
-        returns True if another mutation is available.
-        returns False if all the possible mutations have been done.
-        """
-        if (self.step) == len(self.steps):
-            # Hack to stop once we have gone through all the steps
-            print "[Mutator.next()] I believe I have gone over all steps"
-            print "                          Stopping!"
-            self.waiting = True
-            return False
-
-        self.idx += 1
-        current_idx = self.idx
-        current_step = self.step
-        current_data = self.steps[current_step]['data']
-
-        if 0:
-            print "current_step: %s" % current_step
-            print "current_idx: %s" % current_idx
-            print "current_data: %s" % current_data
-            print "steps: %s" % len(self.steps)
-            print "waiting_step: %s" % self.waiting_step
-
-        data_to_receive = len(self.steps[current_step]['data'])
-
-        if self.waiting and self.waiting_step == data_to_receive:
-            print "[Mutator.next()] I am no longer waiting"
-            log.debug("I am no longer waiting.")
-            self.waiting = False
-            self.waiting_step = 0
-            self.idx = 0
-
-        elif self.waiting:
-            print "[Mutator.next()] Waiting some more."
-            log.debug("Waiting some more.")
-            self.waiting_step += 1
-
-        elif current_idx >= len(current_data):
-            print "[Mutator.next()] Entering waiting mode."
-            log.debug("Entering waiting mode.")
-            self.step += 1
-            self.idx = 0
-            self.waiting = True
-
-        log.debug("current index %s" % current_idx)
-        log.debug("current data %s" % len(current_data))
-        return True
-
-    def get(self, step):
-        """
-        Returns the current packet to be sent to the wire.
-        If no mutation is necessary it will return the plain data.
-        Should be called when you are interested in obtaining the data to be
-        sent for the selected state.
-
-        @param step: the current step you want the mutation for
-
-        returns the mutated packet for the specified step.
-        """
-        if step != self.step or self.waiting:
-            log.debug("[Mutator.get()] I am not going to do anything :)")
-            return self.steps[step]['data']
-
-        data = self.steps[step]['data']
-        #print "Mutating %s with idx %s" % (data, self.idx)
-        return self._mutate(data, self.idx)
-
-class Daphn3Protocol(protocol.Protocol):
-    """
-    This implements the Daphn3 protocol for the server side.
-    It gets instanced once for every client that connects to the oonib.
-    For every instance of protocol there is only 1 mutation.
-    Once the last step is reached the connection is closed on the serverside.
-    """
-    steps = []
-    mutator = None
-
-    current_state = None
-
-    role = 'client'
-    state = 0
-    total_states = len(steps) - 1
-    received_data = 0
-    to_receive_data = 0
-    report = reports.Report('daphn3', 'daphn3.yamlooni')
-
-    test = None
-
-    def next_state(self):
-        """
-        This is called once I have completed one step of the protocol and need
-        to proceed to the next step.
-        """
-        if not self.mutator:
-            print "[Daphn3Protocol.next_state] No mutator. There is no point to stay on this earth."
-            self.transport.loseConnection()
-            return
-
-        if self.role is self.steps[self.state]['sender']:
-            print "[Daphn3Protocol.next_state] I am a sender"
-            data = self.mutator.get(self.state)
-            self.transport.write(data)
-            self.to_receive_data = 0
-
-        else:
-            print "[Daphn3Protocol.next_state] I am a receiver"
-            self.to_receive_data = len(self.steps[self.state]['data'])
-
-        self.state += 1
-        self.received_data = 0
-
-    def dataReceived(self, data):
-        """
-        This is called every time some data is received. I transition to the
-        next step once the amount of data that I expect to receive is received.
-
-        @param data: the data that has been sent by the client.
-        """
-        if not self.mutator:
-            print "I don't have a mutator. My life means nothing."
-            self.transport.loseConnection()
-            return
-
-        if len(self.steps) == self.state:
-            self.transport.loseConnection()
-            return
-
-        self.received_data += len(data)
-        if self.received_data >= self.to_receive_data:
-            print "Moving to next state %s" % self.state
-            self.next_state()
-
-    def censorship_detected(self, report):
-        """
-        I have detected the possible presence of censorship we need to write a
-        report on it.
-
-        @param report: a dict containing the report to be written. Must contain
-                       the keys 'reason', 'proto_state' and 'mutator_state'.
-                       The reason is the reason for which the connection was
-                       closed. The proto_state is the current state of the
-                       protocol instance and mutator_state is what was being
-                       mutated.
-        """
-        print "The connection was closed because of %s" % report['reason']
-        print "State %s, Mutator %s" % (report['proto_state'],
-                                        report['mutator_state'])
-        if self.test:
-            self.test.result['censored'] = True
-            self.test.result['state'] = report
-        self.mutator.next()
-
-    def connectionLost(self, reason):
-        """
-        The connection was closed. This may be because of a legittimate reason
-        or it may be because of a censorship event.
-        """
-        if not self.mutator:
-            print "Terminated because of little interest in life."
-            return
-        report = {'reason': reason, 'proto_state': self.state,
-                'trigger': None, 'mutator_state': self.current_state}
-
-        if self.state < self.total_states:
-            report['trigger'] = 'did not finish state walk'
-            self.censorship_detected(report)
-
-        else:
-            print "I have reached the end of the state machine"
-            print "Censorship fingerprint bruteforced!"
-            if self.test:
-                print "In the test thing"
-                self.test.result['censored'] = False
-                self.test.result['state'] = report
-                self.test.result['state_walk_finished'] = True
-                self.test.report(self.test.result)
-            return
-
-        if reason.check(ConnectionDone):
-            print "Connection closed cleanly"
-        else:
-            report['trigger'] = 'unclean connection closure'
-            self.censorship_detected(report)
-
-
diff --git a/ooni/reporter.py b/ooni/reporter.py
index cdbf355..6d37838 100644
--- a/ooni/reporter.py
+++ b/ooni/reporter.py
@@ -1,3 +1,12 @@
+#-*- coding: utf-8 -*-
+#
+# reporter.py 
+# -----------
+# In here goes the logic for the creation of ooniprobe reports.
+#
+# :authors: Arturo Filastò, Isis Lovecruft
+# :license: see included LICENSE file
+
 import itertools
 import logging
 import sys
@@ -11,13 +20,12 @@ from yaml.emitter import *
 from yaml.serializer import *
 from yaml.resolver import *
 
-from datetime import datetime
 from twisted.python.util import untilConcludes
 from twisted.trial import reporter
 from twisted.internet import defer, reactor
 
 from ooni.templates.httpt import BodyReceiver, StringProducer
-from ooni.utils import date, log, geodata
+from ooni.utils import otime, log, geodata
 from ooni import config
 
 try:
@@ -160,7 +168,7 @@ class OReporter(YamlReporter):
         self.firstrun = False
         self._writeln("###########################################")
         self._writeln("# OONI Probe Report for %s test" % options['name'])
-        self._writeln("# %s" % date.pretty_date())
+        self._writeln("# %s" % otime.prettyDateNow())
         self._writeln("###########################################")
 
         client_geodata = {}
@@ -194,7 +202,7 @@ class OReporter(YamlReporter):
             client_geodata['countrycode'] = client_location['countrycode']
 
 
-        test_details = {'start_time': repr(date.now()),
+        test_details = {'start_time': otime.utcTimeNow(),
                         'probe_asn': client_geodata['asn'],
                         'probe_cc': client_geodata['countrycode'],
                         'probe_ip': client_geodata['ip'],
@@ -222,7 +230,7 @@ class OReporter(YamlReporter):
         self.writeReportEntry(report)
 
     def allDone(self):
-        log.debug("Finished running everything")
+        log.debug("Finished running all tests")
         self.finish()
         try:
             reactor.stop()
diff --git a/ooni/runner.py b/ooni/runner.py
index d8b7df8..238d7d5 100644
--- a/ooni/runner.py
+++ b/ooni/runner.py
@@ -2,12 +2,11 @@
 #
 # runner.py
 # ---------
-# Handles running ooni.nettests as well as ooni.plugoo.tests.OONITests.
+# Handles running ooni.nettests as well as
+# ooni.plugoo.tests.OONITests.
 #
-# :authors: Isis Lovecruft, Arturo Filasto
+# :authors: Arturo Filastò, Isis Lovecruft
 # :license: see included LICENSE file
-# :copyright: (c) 2012 Isis Lovecruft, Arturo Filasto, The Tor Project, Inc.
-# :version: 0.1.0-pre-alpha
 
 import os
 import sys
@@ -25,7 +24,7 @@ from ooni.nettest import NetTestCase
 
 from ooni import reporter
 
-from ooni.utils import log, date, checkForRoot
+from ooni.utils import log, checkForRoot, NotRootError
 
 def processTest(obj, cmd_line_options):
     """
@@ -42,7 +41,12 @@ def processTest(obj, cmd_line_options):
 
     input_file = obj.inputFile
     if obj.requiresRoot:
-        checkForRoot("test")
+        try:
+            checkForRoot()
+        except NotRootError:
+            log.err("%s requires root to run" % obj.name)
+            sys.exit(1)
+
 
     if obj.optParameters or input_file \
             or obj.usageOptions or obj.optFlags:
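
Note on the hunk above: the root check now raises a dedicated `NotRootError`, and the caller decides how to react. The sketch below restates that pattern in standalone form. It is only an illustration under stated assumptions: `check_for_root`, `can_run`, and the injectable `getuid` argument are hypothetical names, not part of ooni.

```python
import os


class NotRootError(Exception):
    """Raised when a test needs root privileges but the process lacks them."""


def check_for_root(getuid=None):
    # getuid is injectable (hypothetical) so the check can be exercised
    # without actually running as root.
    uid = os.getuid() if getuid is None else getuid()
    if uid != 0:
        raise NotRootError("This test requires root")


def can_run(test_name, requires_root, getuid=None):
    # Mirrors the try/except in processTest, but returns a bool
    # instead of calling sys.exit(1).
    if not requires_root:
        return True
    try:
        check_for_root(getuid)
    except NotRootError:
        print("%s requires root to run" % test_name)
        return False
    return True
```

Raising a specific exception type lets the caller catch only the privilege failure and leaves unrelated errors visible.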
diff --git a/ooni/templates/httpt.py b/ooni/templates/httpt.py
index 1491cbc..4c42a3a 100644
--- a/ooni/templates/httpt.py
+++ b/ooni/templates/httpt.py
@@ -13,50 +13,10 @@ from twisted.internet import protocol, defer
 from twisted.internet.ssl import ClientContextFactory
 
 from twisted.web.http_headers import Headers
-from twisted.web.iweb import IBodyProducer
-
 from ooni.nettest import NetTestCase
 from ooni.utils import log
 
-useragents = [("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6", "Firefox 2.0, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)", "Internet Explorer 7, Windows Vista"),
-              ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)", "Internet Explorer 7, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)", "Internet Explorer 6, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1; .NET CLR 1.1.4322)", "Internet Explorer 5, Windows XP"),
-              ("Opera/9.20 (Windows NT 6.0; U; en)", "Opera 9.2, Windows Vista"),
-              ("Opera/9.00 (Windows NT 5.1; U; en)", "Opera 9.0, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50", "Opera 8.5, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.0", "Opera 8.0, Windows XP"),
-              ("Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.02 [en]", "Opera 7.02, Windows XP"),
-              ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20060127 Netscape/8.1", "Netscape 8.1, Windows XP")]
-
-class StringProducer(object):
-    implements(IBodyProducer)
-
-    def __init__(self, body):
-        self.body = body
-        self.length = len(body)
-
-    def startProducing(self, consumer):
-        consumer.write(self.body)
-        return defer.succeed(None)
-
-    def pauseProducing(self):
-        pass
-
-    def stopProducing(self):
-        pass
-
-class BodyReceiver(protocol.Protocol):
-    def __init__(self, finished):
-        self.finished = finished
-        self.data = ""
-
-    def dataReceived(self, bytes):
-        self.data += bytes
-
-    def connectionLost(self, reason):
-        self.finished.callback(self.data)
+from ooni.utils.net import BodyReceiver, StringProducer, userAgents
 
 class HTTPTest(NetTestCase):
     """
@@ -204,7 +164,7 @@ class HTTPTest(NetTestCase):
         return finished
 
     def randomize_useragent(self):
-        user_agent = random.choice(useragents)
+        user_agent = random.choice(userAgents)[0]
         self.request['headers']['User-Agent'] = [user_agent]
 
     def build_request(self, url, method="GET", headers=None, body=None):
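
Note on `userAgents`: each entry is a `(header string, description)` pair, so code that builds a request header needs only the first element of the chosen pair. A minimal sketch, assuming a shortened two-entry list; `USER_AGENTS`, `pick_user_agent`, and `build_headers` are hypothetical names used for illustration only.

```python
import random

# Shortened copy of the (header string, description) pairs from the diff.
USER_AGENTS = [
    ("Opera/9.20 (Windows NT 6.0; U; en)", "Opera 9.2, Windows Vista"),
    ("Opera/9.00 (Windows NT 5.1; U; en)", "Opera 9.0, Windows XP"),
]


def pick_user_agent(agents, rng=random):
    # Choose one pair at random and keep only the header string.
    ua_string, _description = rng.choice(agents)
    return ua_string


def build_headers(agents, rng=random):
    # Header values are lists of strings, matching the shape used in
    # randomize_useragent.
    return {"User-Agent": [pick_user_agent(agents, rng)]}
```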
diff --git a/ooni/utils/__init__.py b/ooni/utils/__init__.py
index 9961e03..5947519 100644
--- a/ooni/utils/__init__.py
+++ b/ooni/utils/__init__.py
@@ -7,11 +7,7 @@ import os
 import logging
 import string
 import random
-
-try:
-    import yaml
-except:
-    print "Error in importing YAML"
+import yaml
 
 class Storage(dict):
     """
@@ -30,7 +26,6 @@ class Storage(dict):
         >>> o.a
         None
     """
-
     def __getattr__(self, key):
         try:
             return self[key]
@@ -56,99 +51,12 @@ class Storage(dict):
         for (k, v) in value.items():
             self[k] = v
 
-def checkForRoot(what):
-    if os.getuid() != 0:
-        raise Exception("This %s requires root to run" % what)
-
-def get_logger(config):
-    loglevel = getattr(logging, config.loglevel.upper())
-    logging.basicConfig(level=loglevel,
-                    format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
-                    filename=config.logfile,
-                    filemode='w')
-
-    console = logging.StreamHandler()
-    console.setLevel(getattr(logging, config.consoleloglevel.upper()))
-    # Set the console logger to a different format
-    formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
-    console.setFormatter(formatter)
-    logging.getLogger('').addHandler(console)
-
-    return logging.getLogger('ooniprobe')
-
-def parse_asset(asset):
-    parsed = Storage()
-    try:
-        with open(asset, 'r') as f:
-            for line in f.readlines():
-            # XXX This should be rewritten, if the values contain
-            #     #: they will be rewritten with blank.
-            # should not be an issue but this is not a very good parser
-                if line.startswith("#:"):
-                    n = line.split(' ')[0].replace('#:','')
-                    v = line.replace('#:'+n+' ', '').strip()
-                    if n in ('tests', 'files'):
-                        parsed[n] = v.split(",")
-                    else:
-                        parsed[n] = v
-
-                elif line.startswith("#"):
-                    continue
-                else:
-                        break
-    finally:
-        if not parsed.name:
-            parsed.name = asset
-        if not parsed.files:
-            parsed.files = asset
-        return parsed
-
-def import_test(name, config):
-    if name.endswith(".py"):
-        test = Storage()
-        test_name = name.split(".")[0]
-        fp, pathname, description = imp.find_module(test_name,
-                                            [config.main.testdir])
-        module = imp.load_module(name, fp, pathname, description)
-
-        try:
-            test.name = module.__name__
-            test.desc = module.__desc__
-            test.module = module
-        except:
-            test.name = test_name
-            test.desc = ""
-            test.module = module
-
-        return test_name, test
-
-    return None, None
+class NotRootError(Exception):
+    pass
 
-class Log():
-    """
-    This is a class necessary for parsing YAML log files.
-    It is required because pyYaml has a bug in parsing
-    log format YAML files.
-    """
-    def __init__(self, file=None):
-        if file:
-            self.fh = open(file)
-
-    def __iter__(self):
-        return self
-
-    def next(self):
-        lines = []
-        try:
-            line = self.fh.readline()
-            if not line:
-                raise StopIteration
-            while not line.startswith("---"):
-                lines.append(line)
-                line = self.fh.readline()
-            return lines
-        except:
-            raise StopIteration
+def checkForRoot():
+    if os.getuid() != 0:
+        raise NotRootError("This test requires root")
 
 def randomSTR(length, num=True):
     """
diff --git a/ooni/utils/date.py b/ooni/utils/date.py
deleted file mode 100644
index 25250a6..0000000
--- a/ooni/utils/date.py
+++ /dev/null
@@ -1,30 +0,0 @@
-from ooni.lib.rfc3339 import rfc3339
-from datetime import datetime
-
-class odate(datetime):
-    def __str__(self):
-        return "%s" % rfc3339(self)
-
-    def __repr__(self):
-        return "%s" % rfc3339(self)
-
-    def from_rfc(self, datestr):
-        pass
-
-def now():
-    return odate.utcnow()
-
-def pretty_date():
-    cur_time = datetime.utcnow()
-    d_format = "%d %B %Y %H:%M:%S"
-    pretty = cur_time.strftime(d_format)
-    return pretty
-
-def timestamp():
-    cur_time = datetime.utcnow()
-    d_format = "%d_%B_%Y_%H-%M-%S"
-    pretty = cur_time.strftime(d_format)
-    return pretty
-
-
-
diff --git a/ooni/utils/geodata.py b/ooni/utils/geodata.py
index bd61dfd..5c3c481 100644
--- a/ooni/utils/geodata.py
+++ b/ooni/utils/geodata.py
@@ -1,23 +1,22 @@
+# -*- encoding: utf-8 -*-
+#
+# geodata.py
+# **********
+# In here go functions related to the understanding of
+# geographical information of the probe
+#
+# :authors: Arturo Filastò
+# :licence: see LICENSE
+
 import re
-import pygeoip
 import os
-
-from ooni import config
-from ooni.utils import log
+import pygeoip
 
 from twisted.web.client import Agent
 from twisted.internet import reactor, defer, protocol
 
-class BodyReceiver(protocol.Protocol):
-    def __init__(self, finished):
-        self.finished = finished
-        self.data = ""
-
-    def dataReceived(self, bytes):
-        self.data += bytes
-
-    def connectionLost(self, reason):
-        self.finished.callback(self.data)
+from ooni.utils import log, net
+from ooni import config
 
 @defer.inlineCallbacks
 def myIP():
@@ -28,7 +27,7 @@ def myIP():
     result = yield myAgent.request('GET', target_site)
 
     finished = defer.Deferred()
-    result.deliverBody(BodyReceiver(finished))
+    result.deliverBody(net.BodyReceiver(finished))
 
     body = yield finished
 
diff --git a/ooni/utils/hacks.py b/ooni/utils/hacks.py
index e778540..4eef366 100644
--- a/ooni/utils/hacks.py
+++ b/ooni/utils/hacks.py
@@ -1,5 +1,10 @@
 # -*- encoding: utf-8 -*-
 #
+# hacks.py
+# ********
+# When some software has issues and we need to fix it in a
+# hackish way, we put it in here. This one day will be empty.
+# 
 # :authors: Arturo Filastò
 # :licence: see LICENSE
 
@@ -56,36 +61,3 @@ def patched_reduce_ex(self, proto):
         return copy_reg._reconstructor, args, dict
     else:
         return copy_reg._reconstructor, args
-
-class MetaSuper(type):
-    """
-    Metaclass for creating subclasses which have builtin name munging, so that
-    they are able to call self.__super.method() from an instance function
-    without knowing the instance class' base class name.
-
-    For example:
-
-        from hacks import MetaSuper
-        class A:
-            __metaclass__ = MetaSuper
-            def method(self):
-                return "A"
-        class B(A):
-            def method(self):
-                return "B" + self.__super.method()
-        class C(A):
-            def method(self):
-                return "C" + self.__super.method()
-        class D(C, B):
-            def method(self):
-                return "D" + self.__super.method()
-
-        assert D().method() == "DCBA"
-
-    Subclasses should not override "__init__", nor should subclasses have
-    the same name as any of their bases, or else much pain and suffering
-    will occur.
-    """
-    def __init__(cls, name, bases, dict):
-        super(autosuper, cls).__init__(name, bases, dict)
-        setattr(cls, "_%s__super" % name, super(cls))
diff --git a/ooni/utils/net.py b/ooni/utils/net.py
index 3fd4b41..d43261a 100644
--- a/ooni/utils/net.py
+++ b/ooni/utils/net.py
@@ -5,16 +5,55 @@
 # OONI utilities for networking related operations
 
 import sys
+from zope.interface import implements
+
+from twisted.internet import protocol, defer
 from twisted.internet import threads, reactor
+from twisted.web.iweb import IBodyProducer
 
 from scapy.all import utils
 
 from ooni.utils import log, txscapy
 
-def getClientAddress():
-    address = {'asn': 'REPLACE_ME',
-               'ip': 'REPLACE_ME'}
-    return address
+userAgents = [("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6", "Firefox 2.0, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)", "Internet Explorer 7, Windows Vista"),
+              ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)", "Internet Explorer 7, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)", "Internet Explorer 6, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1; .NET CLR 1.1.4322)", "Internet Explorer 5, Windows XP"),
+              ("Opera/9.20 (Windows NT 6.0; U; en)", "Opera 9.2, Windows Vista"),
+              ("Opera/9.00 (Windows NT 5.1; U; en)", "Opera 9.0, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50", "Opera 8.5, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.0", "Opera 8.0, Windows XP"),
+              ("Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.02 [en]", "Opera 7.02, Windows XP"),
+              ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20060127 Netscape/8.1", "Netscape 8.1, Windows XP")]
+
+class StringProducer(object):
+    implements(IBodyProducer)
+
+    def __init__(self, body):
+        self.body = body
+        self.length = len(body)
+
+    def startProducing(self, consumer):
+        consumer.write(self.body)
+        return defer.succeed(None)
+
+    def pauseProducing(self):
+        pass
+
+    def stopProducing(self):
+        pass
+
+class BodyReceiver(protocol.Protocol):
+    def __init__(self, finished):
+        self.finished = finished
+        self.data = ""
+
+    def dataReceived(self, bytes):
+        self.data += bytes
+
+    def connectionLost(self, reason):
+        self.finished.callback(self.data)
 
 def capturePackets(pcap_filename):
     from scapy.all import sniff
diff --git a/ooni/utils/otime.py b/ooni/utils/otime.py
index 11f7be1..719230e 100644
--- a/ooni/utils/otime.py
+++ b/ooni/utils/otime.py
@@ -41,3 +41,11 @@ def utcPrettyDateNow():
 
 def timeToPrettyDate(time_val):
     return time.ctime(time_val)
+
+def timestamp():
+    cur_time = datetime.utcnow()
+    d_format = "%d_%B_%Y_%H-%M-%S"
+    pretty = cur_time.strftime(d_format)
+    return pretty
+
+
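
Note on time handling: the commit message says report times move from RFC 3339 strings to seconds since the epoch, while `timestamp()` keeps a human-readable form. The sketch below shows both conversions on an explicit `datetime`. It is not the body of `otime.py`, which this diff does not show; `utc_epoch_seconds` and `filename_timestamp` are hypothetical names.

```python
import calendar
from datetime import datetime


def utc_epoch_seconds(dt):
    # Treat a naive datetime as UTC and convert it to whole seconds
    # since 1970-01-01 00:00:00 UTC.
    return calendar.timegm(dt.utctimetuple())


def filename_timestamp(dt):
    # Same format string as timestamp() in the diff. %B is the full
    # month name and depends on the active locale.
    return dt.strftime("%d_%B_%Y_%H-%M-%S")


if __name__ == "__main__":
    now = datetime.utcnow()
    print(utc_epoch_seconds(now))
    print(filename_timestamp(now))
```

Epoch seconds are compact and easy to compare. The formatted variant has no colons or spaces, which suggests it is meant for file names.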
diff --git a/ooniprobe.conf b/ooniprobe.conf
index 1e76ad7..47f480a 100644
--- a/ooniprobe.conf
+++ b/ooniprobe.conf
@@ -20,7 +20,7 @@ advanced:
     # XXX change this to point to the directory where you have stored the GeoIP
     # database file. This should be the directory in which OONI is installed
     # /path/to/ooni-probe/data/
-    geoip_data_dir: /home/x/code/networking/ooni-probe/data/
+    geoip_data_dir: /usr/share/GeoIP/
     debug: true
     threadpool_size: 10
 
diff --git a/to-be-ported/protocols/daphn3.py b/to-be-ported/protocols/daphn3.py
new file mode 100644
index 0000000..37c94c7
--- /dev/null
+++ b/to-be-ported/protocols/daphn3.py
@@ -0,0 +1,311 @@
+import sys
+import yaml
+
+from twisted.internet import protocol, defer
+from twisted.internet.error import ConnectionDone
+
+from scapy.all import IP, Raw, rdpcap
+
+from ooni.utils import log
+from ooni.plugoo import reports
+
+def read_pcap(filename):
+    """
+    @param filename: Filesystem path to the pcap.
+
+    Returns:
+      [{"sender": "client", "data": "\x17\x52\x15"}, {"sender": "server", "data": "\x17\x15\x13"}]
+    """
+    packets = rdpcap(filename)
+
+    checking_first_packet = True
+    client_ip_addr = None
+    server_ip_addr = None
+
+    ssl_packets = []
+    messages = []
+
+    """
+    pcap assumptions:
+
+    pcap only contains packets exchanged between a Tor client and a Tor server.
+    (This assumption makes sure that there are only two IP addresses in the
+    pcap file)
+
+    The first packet of the pcap is sent from the client to the server. (This
+    assumption is used to get the IP address of the client.)
+
+    All captured packets are TLS packets: that is TCP session
+    establishment/teardown packets should be filtered out (no SYN/SYN+ACK)
+    """
+
+    """Minimally validate the pcap and also find out what's the client
+    and server IP addresses."""
+    for packet in packets:
+        if checking_first_packet:
+            client_ip_addr = packet[IP].src
+            checking_first_packet = False
+        else:
+            if packet[IP].src != client_ip_addr:
+                server_ip_addr = packet[IP].src
+
+        try:
+            if (packet[Raw]):
+                ssl_packets.append(packet)
+        except IndexError:
+            pass
+
+    """Form our list."""
+    for packet in ssl_packets:
+        if packet[IP].src == client_ip_addr:
+            messages.append({"sender": "client", "data": str(packet[Raw])})
+        elif packet[IP].src == server_ip_addr:
+            messages.append({"sender": "server", "data": str(packet[Raw])})
+        else:
+            raise ValueError("Detected third IP address! pcap is corrupted.")
+
+    return messages
+
+def read_yaml(filename):
+    f = open(filename)
+    obj = yaml.load(f)
+    f.close()
+    return obj
+
+class Mutator:
+    idx = 0
+    step = 0
+
+    waiting = False
+    waiting_step = 0
+
+    def __init__(self, steps):
+        """
+        @param steps: array of dicts for the steps that must be gone over by
+                      the mutator. Looks like this:
+                      [{"sender": "client", "data": "\xde\xad\xbe\xef"},
+                       {"sender": "server", "data": "\xde\xad\xbe\xef"}]
+        """
+        self.steps = steps
+
+    def _mutate(self, data, idx):
+        """
+        Mutate the byte at idx by increasing its value by one.
+
+        @param data: the data to be mutated.
+
+        @param idx: what byte should be mutated.
+        """
+        print "idx: %s, data: %s" % (idx, data)
+        ret = data[:idx]
+        ret += chr(ord(data[idx]) + 1)
+        ret += data[idx+1:]
+        return ret
+
+    def state(self):
+        """
+        Return the current mutation state. As in what bytes are being mutated.
+
+        Returns a dict containing the packet index and the step number.
+        """
+        print "[Mutator.state()] Giving out my internal state."
+        current_state =  {'idx': self.idx, 'step': self.step}
+        return current_state
+
+    def next(self):
+        """
+        Increases by one the mutation state.
+
+        ex. (* is the mutation state, i.e. the byte to be mutated)
+        before [___*] [____]
+               step1   step2
+        after  [____] [*___]
+
+        Should be called every time you need to proceed onto the next mutation.
+        It changes the internal state of the mutator to that of the next
+        mutation.
+
+        returns True if another mutation is available.
+        returns False if all the possible mutations have been done.
+        """
+        if (self.step) == len(self.steps):
+            # Hack to stop once we have gone through all the steps
+            print "[Mutator.next()] I believe I have gone over all steps"
+            print "                          Stopping!"
+            self.waiting = True
+            return False
+
+        self.idx += 1
+        current_idx = self.idx
+        current_step = self.step
+        current_data = self.steps[current_step]['data']
+
+        if 0:
+            print "current_step: %s" % current_step
+            print "current_idx: %s" % current_idx
+            print "current_data: %s" % current_data
+            print "steps: %s" % len(self.steps)
+            print "waiting_step: %s" % self.waiting_step
+
+        data_to_receive = len(self.steps[current_step]['data'])
+
+        if self.waiting and self.waiting_step == data_to_receive:
+            print "[Mutator.next()] I am no longer waiting"
+            log.debug("I am no longer waiting.")
+            self.waiting = False
+            self.waiting_step = 0
+            self.idx = 0
+
+        elif self.waiting:
+            print "[Mutator.next()] Waiting some more."
+            log.debug("Waiting some more.")
+            self.waiting_step += 1
+
+        elif current_idx >= len(current_data):
+            print "[Mutator.next()] Entering waiting mode."
+            log.debug("Entering waiting mode.")
+            self.step += 1
+            self.idx = 0
+            self.waiting = True
+
+        log.debug("current index %s" % current_idx)
+        log.debug("current data %s" % len(current_data))
+        return True
+
+    def get(self, step):
+        """
+        Returns the current packet to be sent to the wire.
+        If no mutation is necessary it will return the plain data.
+        Should be called when you are interested in obtaining the data to be
+        sent for the selected state.
+
+        @param step: the current step you want the mutation for
+
+        returns the mutated packet for the specified step.
+        """
+        if step != self.step or self.waiting:
+            log.debug("[Mutator.get()] I am not going to do anything :)")
+            return self.steps[step]['data']
+
+        data = self.steps[step]['data']
+        #print "Mutating %s with idx %s" % (data, self.idx)
+        return self._mutate(data, self.idx)
+
+class Daphn3Protocol(protocol.Protocol):
+    """
+    This implements the Daphn3 protocol for the server side.
+    It gets instanced once for every client that connects to the oonib.
+    For every instance of protocol there is only 1 mutation.
+    Once the last step is reached the connection is closed on the serverside.
+    """
+    steps = []
+    mutator = None
+
+    current_state = None
+
+    role = 'client'
+    state = 0
+    total_states = len(steps) - 1
+    received_data = 0
+    to_receive_data = 0
+    report = reports.Report('daphn3', 'daphn3.yamlooni')
+
+    test = None
+
+    def next_state(self):
+        """
+        This is called once I have completed one step of the protocol and need
+        to proceed to the next step.
+        """
+        if not self.mutator:
+            print "[Daphn3Protocol.next_state] No mutator. There is no point to stay on this earth."
+            self.transport.loseConnection()
+            return
+
+        if self.role == self.steps[self.state]['sender']:
+            print "[Daphn3Protocol.next_state] I am a sender"
+            data = self.mutator.get(self.state)
+            self.transport.write(data)
+            self.to_receive_data = 0
+
+        else:
+            print "[Daphn3Protocol.next_state] I am a receiver"
+            self.to_receive_data = len(self.steps[self.state]['data'])
+
+        self.state += 1
+        self.received_data = 0
+
+    def dataReceived(self, data):
+        """
+        This is called every time some data is received. I transition to the
+        next step once the amount of data that I expect to receive is received.
+
+        @param data: the data that has been sent by the client.
+        """
+        if not self.mutator:
+            print "I don't have a mutator. My life means nothing."
+            self.transport.loseConnection()
+            return
+
+        if len(self.steps) == self.state:
+            self.transport.loseConnection()
+            return
+
+        self.received_data += len(data)
+        if self.received_data >= self.to_receive_data:
+            print "Moving to next state %s" % self.state
+            self.next_state()
+
+    def censorship_detected(self, report):
+        """
+        I have detected the possible presence of censorship; we need to write
+        a report on it.
+
+        @param report: a dict containing the report to be written. Must contain
+                       the keys 'reason', 'proto_state' and 'mutator_state'.
+                       The reason is the reason for which the connection was
+                       closed. The proto_state is the current state of the
+                       protocol instance and mutator_state is what was being
+                       mutated.
+        """
+        print "The connection was closed because of %s" % report['reason']
+        print "State %s, Mutator %s" % (report['proto_state'],
+                                        report['mutator_state'])
+        if self.test:
+            self.test.result['censored'] = True
+            self.test.result['state'] = report
+        self.mutator.next()
+
+    def connectionLost(self, reason):
+        """
+        The connection was closed. This may be because of a legitimate reason
+        or it may be because of a censorship event.
+        """
+        if not self.mutator:
+            print "Terminated because of little interest in life."
+            return
+        report = {'reason': reason, 'proto_state': self.state,
+                'trigger': None, 'mutator_state': self.current_state}
+
+        if self.state < self.total_states:
+            report['trigger'] = 'did not finish state walk'
+            self.censorship_detected(report)
+
+        else:
+            print "I have reached the end of the state machine"
+            print "Censorship fingerprint bruteforced!"
+            if self.test:
+                print "In the test thing"
+                self.test.result['censored'] = False
+                self.test.result['state'] = report
+                self.test.result['state_walk_finished'] = True
+                self.test.report(self.test.result)
+            return
+
+        if reason.check(ConnectionDone):
+            print "Connection closed cleanly"
+        else:
+            report['trigger'] = 'unclean connection closure'
+            self.censorship_detected(report)
+
+
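
Note on `Mutator._mutate`: it builds a copy of the packet in which the byte at `idx` is increased by one. In the Python 2 code above, `chr(ord(data[idx]) + 1)` raises `ValueError` when that byte is already `\xff`. The sketch below is a Python 3 restatement on `bytes`. The wrap from 0xff to 0x00 is my assumption about the desired behaviour, not something the commit specifies, and `mutate_byte` is a hypothetical name.

```python
def mutate_byte(data, idx):
    """Return a copy of data with the byte at idx increased by one.

    Assumption: 0xff wraps around to 0x00 rather than raising.
    """
    if not 0 <= idx < len(data):
        raise IndexError("idx out of range")
    mutated = (data[idx] + 1) % 256
    return data[:idx] + bytes([mutated]) + data[idx + 1:]


if __name__ == "__main__":
    packet = b"\x17\x52\x15"
    print(mutate_byte(packet, 1))
```

Returning a new object leaves the recorded step data untouched, so later mutations still start from the original packet.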


