
[freehaven-dev] retrieval issues


I'm currently researching the information-retrieval methods of existing P2P 
systems for my diploma thesis. Most of the systems take the simplest 
approach, a string or substring match, and no system I've seen so far 
supports metadata. Freehaven, however, seems to have a potential design flaw 
that is related to its retrieval technique.

Freehaven does not forward search requests, in order to avoid Gnutella-style 
broadcast flooding of the network. It therefore depends on a fully connected 
network to be sure that your shares will be traded everywhere and, more 
importantly, can be retrieved immediately after injection.

The protocol has no clear way to enforce full connectivity. It suggests that 
servers become "Introducers", but these are free to decide whether to trade 
their knowledge of the network topology. If they don't, you are left alone in 
the dark until someone decides to trade with you. That is very unlikely, 
since you are unknown: the automatic traders will almost never choose you as 
a primary trade partner, so your career as a well-behaved Freehaven server 
stalls from the beginning.

While this might be OK (it will take much more time to attack the network 
through fake servers), it starts concentrating the network around the 
introducers. This concentration raises the chance for an introducer to mount 
successful buddy attacks, and it gives a potential attacker the chance that 
shutting down only a small number of introducers will hurt content, since 
many shares are traded between the introducers.

Another point that made me think is the potential resource issue: if the 
network needs to be fully connected, every machine needs to know every other. 
Since the network needs a large number of servers in the servnet to fulfill 
the anonymity claim, each server needs a lot of storage capacity to keep the 
whole servnet in memory.
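
To put a rough number on this, here is a small back-of-the-envelope sketch in 
Python. The function names and the per-entry size are my own assumptions for 
illustration; the Freehaven design does not fix how large a servnet entry is.

```python
def servnet_table_bytes(num_servers: int, entry_bytes: int) -> int:
    """Bytes one server needs in order to remember every other server.

    entry_bytes is a hypothetical size for one servnet entry
    (for example a public key plus a reply address).
    """
    return (num_servers - 1) * entry_bytes


def total_entries(num_servers: int) -> int:
    """Entries held across the whole network if everyone knows everyone."""
    return num_servers * (num_servers - 1)


# Assumed figures only: 10,000 servers, 1 KiB per entry.
per_server = servnet_table_bytes(10_000, 1024)
network_wide = total_entries(10_000)
```

The per-server table grows linearly with the size of the servnet, while the 
number of entries kept network-wide grows quadratically.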

The routing issue might be solved if the searching server itself decides whom 
to ask next. The only change to the protocol would be to return a message: 
"Sorry dude, what you need is not here, but you may ask these servers." The 
data part of the reply contains a number of servers that the queried server 
knows. The requester can then decide which servers on the list have already 
been queried and which are still missing.
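
A minimal sketch of this referral idea in Python, just to make it concrete. 
Everything here is hypothetical: query() stands in for whatever transport the 
real protocol would use, and the toy network is invented for the example. It 
is not part of the Freehaven specification.

```python
from collections import deque


def referral_search(key, start, query):
    """Ask servers one at a time, following referrals, never asking twice.

    query(server, key) is a hypothetical call that returns a pair
    (document_or_None, servers_known_to_that_server).
    """
    queried = set()
    pending = deque([start])
    while pending:
        server = pending.popleft()
        if server in queried:
            continue
        queried.add(server)
        document, referrals = query(server, key)
        if document is not None:
            return document, queried
        # "Not here, but you may ask these servers": keep only new ones.
        pending.extend(s for s in referrals if s not in queried)
    return None, queried


# Toy in-memory network: server -> (local shares, servers it knows).
network = {
    "A": ({}, ["B", "C"]),
    "B": ({}, ["A", "D"]),
    "C": ({}, ["A"]),
    "D": ({"share-1": b"payload"}, ["B"]),
}


def toy_query(server, key):
    store, known = network[server]
    return store.get(key), known


result, asked = referral_search("share-1", "A", toy_query)
```

Nothing is broadcast here: each server answers only the requester, and the 
requester alone keeps track of who has already been asked.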


PS: If you know of any papers on P2P information retrieval, please let me 
know. Thanks.

Thomas Strauß    | "He breathed in the chill kelp-and-salt scent of the beach;
LeipzigerStr 61  |  the intense familiarity of the scent triggered a million
66113 Saarbrücken|  memories at once, and he knew he was home."
+49-681-5892772  |                   (from "Green Mars", Kim Stanley Robinson)