
Re: [freehaven-dev] Some possible weaknesses?

On Tue, 1 Feb 2000, Michael J. Freedman wrote:

> originates.   In a very simple case, given 5 servers:  4 store 1 MB, 1
> stores 50 MB.  The very large files floating around the servnet very likely
> originated from the last server.

It seems to me that this points out a possible problem with introducing a
file to the servnet: what happens when the introducing node _can't_
trade away any of the newly formed shares? In this case,
the share size will be >1 MB for some files, and so all
shares of such files would have to stay on the 50 MB server.

Should the introduction fail, rather than allow a file to circulate which
can easily be traced? If the introduction does fail, do we care about the
DoS attack of killing trading requests until an introducing node kills a
new file? Could we prevent such an attack by making trading messages
indistinguishable from other messages, thereby making such a selective DoS
impossible? Does the introducing node announce that the introduction
failed, or does it wait for the original owner to notice the file is gone?

This also touches tangentially on another point: do we allow servers
to hold multiple shares of the same file? What happens if we allow
share sizes to vary?

As currently stated in 5.3, the protocol takes a file F and breaks
it into n shares f_1, f_2, ..., f_n. Any k of these shares is sufficient
to reconstruct F, so they are all of equal "size" or "importance".
If you hold two of these f_i shares, then in some sense you hold a single
share which has twice the "size" of the other shares. When I say "share
sizes vary", I mean that we can create shares whose importance to
a reconstruction of F can be expressed as something like a percentage.
So we might say "f_1 is 20% important, f_2 is 40% important," and so on
-- any combination of shares totaling 100% or more importance reconstructs F.
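One way to see why holding multiple shares acts like holding one bigger share: in a Shamir-style k-of-n threshold scheme over a prime field, a node holding two shares contributes two of the k points needed for reconstruction. The sketch below is illustrative only -- the prime, the polynomial construction, and the parameter names are my own assumptions, not the Free Haven 5.3 scheme.

```python
# Minimal sketch of k-of-n threshold sharing (Shamir-style) over a prime
# field, to illustrate "importance": a node holding w shares contributes
# w/k of a reconstruction quorum. Hypothetical parameters, not Free Haven's.
import random

P = 2**31 - 1  # a Mersenne prime, used as the field modulus

def make_shares(secret, k, n):
    """Split `secret` into n shares; any k of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    def poly(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 from k distinct shares."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(123456, k=5, n=11)
node_N = shares[:2]   # node N holds two shares: 40% of a quorum by itself
others = shares[2:5]  # three more nodes holding one share each
assert reconstruct(node_N + others) == 123456
```

So "f_1 is 40% important" falls out naturally from handing one node two of the k required points.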

I think we may be able to use varied share sizes

a) to take advantage of better-performing servnet nodes
b) to express trust or lack of trust in a particular servnet node
   by giving it more or less importance in the eventual reconstruction

Here's an example for a):

Suppose we have 10 servnet nodes. One of them, node N, has much better
latency and more disk space than the others. In the current system, we
might break up a file into 10 shares, such that 5 are needed to
reconstruct, and distribute one share to each node.

Now at reconstruct time, N reports quickly. Three other nodes report almost
as quickly. The next node, however, is much slower to respond, and holds
up the entire reconstruction process.

What if we broke the file into 11 shares, such that any 5 are needed to
reconstruct, and gave 2 shares to node N?

Now reconstruction happens as soon as N and 3 other servers report, and
so may be much quicker. At the same time, we can still tolerate the
failure of *any* 5 out of the 10 nodes. The cost is that we use more
of N's disk space...but N has a lot of that, so we're OK. 

Another situation might be when we have a few high-bandwidth,
high-disk-space nodes serving data in legally unfriendly countries. Smaller
"backup" shares could be kept on lots of bandwidth-starved, low-disk-space
nodes in more, uh, congenial legal climates. When the high-bandwidth
nodes disappear, reconstruction automatically uses these "backup" shares.

I haven't thought much about b) yet. Would it be useful?

-David Molnar