[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[freehaven-dev] Freehaven Simulation





WHAT :
-----

This message outlines a model for the Free Haven protocol and a possible
simulation. It stems from a conversation Nathan, Roger, and I had at the
end of today's meeting. 

WHY :
----

There are many values in the Free Haven protocol which are set by
individual servnet nodes or by individual users on an arbitrary basis,
such as the expiration time for a particular share. We currently have no
clue what the effect of these settings are on our servnet's reliability.
This is a major problem, especially since we need to recommend a set of
"default" settings. One possible way to determine these "default" settings
is to run a bunch of simulations using a model and see what happens.

In addition, a few weeks back Rivest pointed out that we could use a model
for the Free Haven protocol to derive proofs of robustness for specific
adversary settings. So models are a good thing and we should work on them.
:-)


SIMPLIFYING ASSUMPTIONS FOR FIRST ROUND :
----------------------------------------

This is a set of simplifying assumptions for a first model. The
goal is to "isolate" the process of trading and the notion of a node
adversary. The resulting model will be used to create a simulation of the
servnet. 

* The servnet is represented as a fully connected graph.
 
* Each node in the graph is a servnet node. Each edge is a means for 
  two nodes to reach each other. Combined with the previous, this
  means everybody can reach everybody. 

* The edges are perfectly reliable. No packets are lost or dropped.

* Time is divided into discrete "steps." The model starts at step 0 and 
  runs for an arbitrary number of steps. 

* Each node has zero or more shares. No shares are created during the 
  simulation. 

* The initial placement of shares is arbitrary (set by the user of the
  sim). Maybe it's confined to a bunch of start nodes.

* Some arbitrary percentage P of the nodes are controlled by an adversary.

* This percentage is fixed at network construction time. That is, we have
  a static adversary. 

* The adversary's only behavior is to drop packets on the floor, always. 
  The adversary does *not* decide whether or not to drop something - it 
  just drops all incoming packets.

* All nodes trust all other nodes completely and forever. That is, we
  don't model the effect of the trust network in the first version. 
  Note that this means we will get wildly different answers for constants
  and robustness in the first version than we will in the final one. 
  This will allow us to quantitatively estimate the importance of the 
  trust network!!

* All files use the same number of shares. All files use a n/2 threshold. 
  (i.e. 15 out of 30 required to reconstruct). 

* All shares are the same size. 

* Nodes have unlimited storage space for shares.  

* No shares ever expire during the time of the simulation. 

* Nodes will trade for anything. 

* All trades are always successful. 

Now the update looks like this :

At each step, each node N does the following :

	1) N decides with probability p1 "Do I want to trade?" 
	If N does not want to trade, continue.

	2) If N wants to trade, then 

		2.1) N selects uniformly at random from the set of all other
		servnet nodes a destination D
	
		2.2) N selects uniformly at random from the set of all its
		shares a share S1

		2.3) D selects uniformly at random from the set of all its
		shares a share S2

		NOTE 1 : What if either N or D have no shares?? My
		inclination is to just trade for a "null share." It
		doesn't make sense to forbid trades, or else trading
		is quickly restricted to a very small set of nodes. 
		What do we do in the real protocol, again?

		2.4) The nodes N and D exchange their shares S1 and S2

	3) If D is an adversary node, then it destroys S1. If D is not
	an adversary node, then it keeps S1.

		NOTE 2 : What should an adversary node trade out??
		In the model as just described, it never has any shares
		to trade (by definition!). Should it always trade out
		null shares, or should it make up special 
		marked "adversary shares" so we can see how far they
		propagate through the network?

	4) Continue. 

The simulation parameters :

	* The percentage of adversary nodes.

	* The probability p1 that each node will trade. 

	* The initial distribution of shares. 


QUESTIONS TO ASK :
-----------------

* After T time steps with percentage P of evil nodes, how many shares
  have been destroyed?

(from now on, the "with percentage P" is implied)

* After T time steps, how many files are no longer recoverable (i.e. 
  more than half of their shares destroyed)?

* What is the first time step at which a share is destroyed?

* What is the first time step at which a file becomes unrecoverable?
  i.e. more than half of its shares are destroyed?

* What are the mean, median, mode, and range of these fun statistics?
  Do they follow any recognizable distribution?
  What do their graphs look like?
* What's the *variance* of these statistics? 



Questions for Future Models :

i.e. "what can we learn by relaxing the assumptions?"

* How does the expiration date affect the picture?

	* "Quit before you're fired" -- can you set an expiration date
	such that for percentage of adversaries P, the file will expire
	BEFORE the adversaries are able to destroy it, on average?
	
	The benefit to expiring naturally is that you always know when 
	a file expires natrually. So you reconstruct it and put it back
	in, and "start the clock" over again. In contrast, you don't know
	necessarily if someone is attacking you. 

* What happens if some nodes trade more often than others?

* What happens if the adversary doesn't _always_ drop on the floor, but
  only does so if it's a file he doesn't like?

* What happens if the adversary only drops on the floor with some 
  (arbitrary) probability?

* What does the trust network do to all these statistics? (note : putting
  the full trust network into the model sounds like way too much work to 
  do before end of term. still something to do in the future, since
  then we can see how it helps.)


WHERE TO GO FROM HERE :
----------------------

* Nathan -- you said doing this first simulation might look cool. This
would be really really really really really neat if you could do it. 
Please don't hesitate to e-mail with questions / musings / status. 

* Everybody -- identify stupid assumptions I've made. Let me know which
  are more important to relax than others. 

* Everybody -- Figure out cute ways to mathematically describe what we're
  currently putting into simulations. Statistics is our friend, so is 
  combinatorics...

* MORE STATISTICS and THINGS WE WANT TO KNOW -- we want concrete METRICS
  for ROBUSTNESS. What does it mean to "resist attack" ????????

* MORE QUESTIONS TO ASK FROM THE MODEL. 


Thanks, 
-David