
[minion-cvs] Add list of introductory (heh) projects



Update of /home/minion/cvsroot/src/minion
In directory moria.mit.edu:/tmp/cvs-serv2109

Modified Files:
	HACKING 
Log Message:
Add list of introductory (heh) projects

Index: HACKING
===================================================================
RCS file: /home/minion/cvsroot/src/minion/HACKING,v
retrieving revision 1.14
retrieving revision 1.15
diff -u -d -r1.14 -r1.15
--- HACKING	8 Jan 2003 06:28:48 -0000	1.14
+++ HACKING	16 Jan 2003 05:57:01 -0000	1.15
@@ -1,4 +1,4 @@
-Hacking Mixminion                                  -*- Text -*-
+Hacking Mixminion
 
 Requirements:
         Python >= 2.0 (see PORTING NOTES below)
@@ -8,15 +8,191 @@
 
         A working /dev/urandom (see PORTING NOTES below)
 
-Things to hack:
-        See the TODO list.
+THINGS TO HACK:  (Good introductory projects that I won't get to myself for 
+  at least a version or two.)
+
+- We need a Windows port!  (A client-only port would be a good start.)  Most
+  of the code is in Python, and what C we have is pretty standard, so I don't
+  anticipate too many problems.  Some known issues in the source are marked
+  with comments labeled WIN32.  Other areas of potential difficulty are:
+      - The build process.  (I don't know the first thing about how to build
+        Python modules under Windows.  Python distutils should work, but I
+        don't know what compiler is appropriate.  Studying the Python
+        webpages may be useful.)
+      - Binaries and installer.  (We should provide two versions: one that
+        comes with a Python interpreter, and one that doesn't. They should
+        both be easy to generate.)
+      - The signal code may not be correct; Windows signals may work
+        differently in subtle ways.  [This is mostly a server issue, but the
+        client code uses SIGALRM to time out dead connections.]
+      - The 'daemonize' code in ServerMain.py is almost certainly not
+        right on Windows.
+      - We assume the existence of /dev/urandom or some other similar entropy
+        source; these do not exist on Windows.  OpenSSL's RAND_* functions
+        may be helpful, but coming up with good seed information may be
+        tricky. Clients can use RAND_event, but servers will need to get
+        entropy from somewhere else.
+      - There will be other, as-yet-unexpected issues. :)
+  (Difficulty: moderate.  Invasiveness: moderate)
+
+- We need some distributed stress-test code.  Ideally, it should use SSH to
+  build Mixminion on a number of machines and start servers on those
+  machines, then run a bunch of clients to send messages through the system.
+  It should be possible to run the code with different network and server
+  configurations.  (Difficulty: moderate.  Invasiveness: none.)
+
+- The current implementations for the MBOX and SMTP modules open a new
+  connection to the local MTA for each outgoing message.  This is
+  inefficient; they should batch as much as possible.  (Difficulty: easy.
+  Invasiveness: slight.)
+
+- The current implementations for the MBOX and SMTP modules do not support
+  ESMTP (over TLS).  They should.  (Difficulty: easy.  Invasiveness: slight.)
+
+- In addition to the stress-test code above, we should have some automated
+  integration test code to start many servers with different configurations
+  on the same server, and test them with calls to the client code.  This can
+  probably share a good deal of logic with the stress-test code.  A nice
+  extra would be to allow testing multiple versions of Mixminion with
+  multiple versions of the client, and have multihost support. (Difficulty:
+  moderate. Invasiveness: none.)
+
+- It would be neat if all the boilerplate that servers spit out were
+  configurable via some kind of generic boilerplate mechanism, and stored in
+  separate files.  (Difficulty: easy.  Invasiveness: slight.)
+
+- We should throttle bandwidth under high load.  To implement this, every
+  connection needs to keep track of how many bytes it's sent or received,
+  and stop trying to read or write after a certain threshold is passed.
+  Every timeslice (say, 1 or 2 seconds), we reset the quota and restart the
+  connections.  A good implementation would support:
+      - Per-connection limits
+      - Total-bandwidth limits
+      - Noticing multiple connections from the same IP.
+      - Throttling delivery as well as MMTP
+      - Throttling CPU usage in the processing thread.
+  (Difficulty: moderate.  Invasiveness: moderate.)
+
+- TLS sessions might be beneficial, especially for server->server
+  connections.  The spec allows them, but only under specific circumstances.
+  If you're doing this, you should benchmark the code thoroughly to be sure
+  that you're actually improving performance. (Difficulty: easy/moderate,
+  depending on how much OpenSSL you know.  Invasiveness: moderate.)
+
+- Write us an init.d script.  (Difficulty: easy. Invasiveness: none.)
+
+- If you have access to a multiprocessor machine, it would be nice to make
+  good use of more than one CPU.  Right now, the network code runs in
+  parallel with the processing code, but the processing thread accounts (I
+  think) for most of the CPU use.  It would be nice to support multiple
+  processing threads and multiple network threads (round-robin, not
+  one-per-connection). (Difficulty: easy/moderate, depending on your 
+  knowledge of writing multithreaded code. Invasiveness: slight.)
+
+THINGS TO THINK ABOUT AND HACK: (Introductory projects that will take some
+  specification work.  Please, get your spec discussed on mixminion-dev and
+  checked into CVS *before* you submit an implementation for any of these!)
+
+- There should be a way for clients to set some (but not all) RFC822 email
+  headers.  Likely candidates are 'Subject' and 'Content-Type'.  Maybe there
+  should be *limited* support for setting 'From'.  [For example, if the user
+  wants to set 'From' to "xyzzy", the From line would appear as "From:
+  "Anonymous user claiming to be xyzzy" <nobody@your-remailer>".  Maybe.]
+  Whatever you choose, be sure to justify it.  (Spec difficulty: easy, but
+  you're likely to start a holy war and end up pleasing nobody.
+  Implementation difficulty: easy.  Invasiveness: slight.)
+
+- Maybe, there should be support for multiple exit addresses (cc, bcc, etc.)
+  This is prone to abuse... (Spec difficulty: easy, once you've figured out
+  how to limit abuse.  Implementation difficulty: easy.  Invasiveness:
+  slight.)
+
+- We should have an incoming email gateway for users to use reply blocks to
+  send messages anonymously without using Mixminion software.  (Spec
+  difficulty: easy. Implementation difficulty: moderate.  Invasiveness:
+  slight.)
+
+- Want a real challenge?  We have an allusive description for how to do
+  K-of-N fragmentation in our E2E-spec documentation.  Go flesh out the
+  description and implement it.  (Spec difficulty: moderate.  Implementation
+  difficulty: hard.  Invasiveness: some.)
+
+- Right now, we never generate link padding or dummy messages.  The code is
+  there, but it never gets triggered.  Specify when it gets triggered (and
+  justify why this improves anonymity).  (Spec difficulty: ????.
+  Implementation difficulty: easy once you know how.  Invasiveness: some.)
+
+- We could use IPv6 support.  The big specification problem here is routing:
+  an IPv4-only server simply cannot deliver to a server without an IPv4
+  address.  Any path-generation algorithm I can come up with has troublesome
+  anonymity implications.  If you can come up with an algorithm that
+  doesn't, the code should be pretty easy to do.  (Spec difficulty: ????. 
+  Implementation difficulty: easy.  Invasiveness: some.)
+
+NON-HACKING PROJECTS:
+
+- We need documentation.
+
+- We need a man page.
+
+- See the spec for open issues.
+
+- We need a better web page.
+
+- We need a logo.  (An alien with an eggbeater or an alien at a DJ's mixer
+  are two ideas.)
+
+- We need a bugtracker.
+
+HARD HACKING:
+
+- Do you want to dive into Python and OpenSSL internals?  Right now, we allow
+  all our data to get swapped out to disk.  That's no good!  Now, you _could_
+  install an OS that encrypts your swap, but that's not an option for
+  everybody.  Write code to use mlock to protect sensitive data structures
+  (keys, packets, etc) as necessary.  [If you're feeling unsubtle, use
+  mlockall to keep *all* data from getting swapped out... but watch out for
+  ticked-off admins!]  (Difficulty: easy/hard depending on whether you take
+  the easy way out with mlockall, or whether you actually do the right
+  thing.  Invasiveness: all over the place.)
+
+- Want a real challenge?  Right now, we store sensitive files right on the
+  file system, and do our best to overwrite them when they need to be
+  deleted... but go read the comment in Common.py to see why this doesn't
+  really work.  We *could* have people use encrypted filesystems.... but
+  that's not an option for everyone.  Here's what you do:  Write code for a
+  generic encrypted filestore, and have Mixminion use that as appropriate
+  instead of the filesystem directly.  This will be easier than writing a
+  real encrypted filesystem, since:
+      - All of the files are the same size (within a few K).
+      - There are no hard or soft links
+      - There are no attributes: no ownership, no modes, no special files, no
+        atime/mtime/ctime... just name and size.
+      - There is no arbitrary nesting of directories; the set of directories
+        is very small.
+      - No more than one process needs to be able to access the filestore at
+        a time.  (Though multiple threads might need to.)
+  Your code should probably be generic enough for other Python projects to
+  use. :)  (Difficulty: hard.  Invasiveness: moderate.)
+
+- Port the code to use NSS or libgcrypt/GNUTLS instead of/in addition to
+  OpenSSL.  (The OpenSSL license conflicts with the GPL, and makes us
+  unlinkable with GPL'd code under some circumstances.)  Note that NSS does
+  not, today, have any support for server-side DHE;  if you're going to go
+  down the NSS route, you should contribute an implementation for server-side
+  DHE.  (Difficulty: hard.  Invasiveness: moderate.)
+
+- We need automatic key rotation and key generation.  (Difficulty:
+  hard. Invasiveness: moderate.)
+
+- We could use a nymserver.
 
 DESIGN PRINCIPLES:
     - It's not done till it's documented.
     - It's not done till it's tested.
     - Don't build general-purpose functionality.  Only build the
       operations you need.
-    - "Premature optimization is the route of all evil." -Knuth
+    - "Premature optimization is the root of all evil." -Knuth
       Resist the temptation to optimize until it becomes a necessity.
 
 CODING STYLE:
@@ -90,3 +266,10 @@
       (nickm@freehaven.net).
 
 --Nick
+
+(for emacs)
+  Local Variables:
+  mode:text
+  indent-tabs-mode:nil
+  fill-column:77
+  End:
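
The bandwidth-throttling item in the patch above describes a per-timeslice
byte quota that is reset every second or two.  A minimal sketch of that
bookkeeping follows, covering only the per-connection and total-bandwidth
limits.  The names BandwidthQuota and ThrottledConnection are hypothetical and
do not come from the Mixminion source; the injectable clock is an assumption
made so the logic can be exercised without sleeping.

```python
import time


class BandwidthQuota:
    """Byte quota that refills at the start of each timeslice (hypothetical)."""

    def __init__(self, limit_bytes, timeslice=1.0, clock=time.time):
        self.limit = limit_bytes
        self.timeslice = timeslice
        self.clock = clock
        self.used = 0
        self.slice_start = clock()

    def _maybe_reset(self):
        # Start a fresh quota once the current timeslice has elapsed.
        now = self.clock()
        if now - self.slice_start >= self.timeslice:
            self.used = 0
            self.slice_start = now

    def allowance(self):
        """Return how many bytes may still be moved in this timeslice."""
        self._maybe_reset()
        return max(0, self.limit - self.used)

    def record(self, nbytes):
        """Account for bytes that were actually sent or received."""
        self._maybe_reset()
        self.used += nbytes


class ThrottledConnection:
    """Applies a per-connection quota and a shared total quota together."""

    def __init__(self, per_conn_quota, total_quota):
        self.per_conn = per_conn_quota
        self.total = total_quota

    def bytes_allowed(self, wanted):
        # A connection may move no more than the tighter of the two limits.
        return min(wanted, self.per_conn.allowance(), self.total.allowance())

    def record(self, nbytes):
        self.per_conn.record(nbytes)
        self.total.record(nbytes)
```

A connection would ask bytes_allowed() before each read or write, skip the
operation when the answer is 0, and call record() with the number of bytes
actually transferred.  Per-IP limits, delivery throttling, and CPU throttling
from the list are not covered here.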
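
The two MBOX/SMTP items in the patch above ask for batching of outgoing
messages and for ESMTP over TLS.  One way this might look with the standard
smtplib module is sketched below.  The function deliver_batch, its parameters,
and the (sender, recipients, body) tuple layout are illustrative assumptions,
not the interface of Mixminion's delivery modules.

```python
import smtplib


def deliver_batch(messages, host="localhost", port=25, starttls=False,
                  smtp_factory=smtplib.SMTP):
    """Send every (sender, recipients, body) tuple over one MTA connection.

    Returns a list of (index, exception) pairs for messages that failed.
    smtp_factory exists only so a stand-in can be supplied for testing.
    """
    failures = []
    if not messages:
        return failures  # Nothing queued: do not open a connection at all.
    conn = smtp_factory(host, port)
    try:
        if starttls:
            # Upgrade the session to TLS before any message is sent.
            conn.starttls()
        for index, (sender, recipients, body) in enumerate(messages):
            try:
                conn.sendmail(sender, recipients, body)
            except smtplib.SMTPException as exc:
                # Remember the failure and keep going with the rest.
                failures.append((index, exc))
    finally:
        try:
            conn.quit()
        except smtplib.SMTPServerDisconnected:
            pass  # The server already dropped the connection.
    return failures
```

A caller would queue outgoing messages for a short interval and hand the
whole list to deliver_batch(), retrying only the indices it reports as
failed.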
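
The header item in the patch above floats *limited* support for 'From', where
a user-chosen name is wrapped in a fixed phrase and paired with the
remailer's own address.  A small sketch of that formatting is given below.
The function name limited_from_header is hypothetical, and collapsing
whitespace in the claimed name is an assumption added here to keep a
user-supplied value from smuggling in extra header lines; the actual policy
is for the spec to decide.

```python
from email.utils import formataddr


def limited_from_header(claimed_name, remailer_address):
    """Build a From line that never presents the claimed name as verified."""
    # Collapse all whitespace, including CR and LF, to single spaces.
    name = " ".join(claimed_name.split())
    display = "Anonymous user claiming to be %s" % name
    # formataddr quotes and escapes the display name when it needs it.
    return "From: " + formataddr((display, remailer_address))


print(limited_from_header("xyzzy", "nobody@your-remailer"))
```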