[freehaven-dev] retrieval issues
Hi,
I'm currently researching the information retrieval methods of existing P2P
systems for my diploma thesis. Most systems take the simplest approach, a
string or substring match, and no system I've seen so far supports metadata.
Freehaven, however, seems to have a potential design flaw that is related to
its retrieval technique.
Freehaven does not forward search requests, in order to avoid
Gnutella-style broadcast flooding of the network. It therefore depends on a
fully connected network to be sure that your shares can be traded everywhere
and, more importantly, can be retrieved immediately after injection.
The protocol has no clear way to make sure that full connectivity is enforced.
It suggests that servers become "Introducers", but these are free to decide
whether to trade their knowledge of the network topology. If they don't, you
are left alone in the dark until someone decides to trade with you, which is
very unlikely since you are unknown. The automatic traders will almost never
choose you as a primary trade partner, so your career as a well-behaved
freehaven server stalls from the beginning.
While this might be OK, since it makes attacking the network through fake
servers take much more time, it starts to concentrate the network around the
introducers.
This concentration raises the chance that an introducer can mount a successful
buddy attack, and it means that an attacker who shuts down only a small number
of introducers can damage content, since many shares are traded between the
introducers.
Another point that made me think is the potential resource issue: if the
network needs to be fully connected, every machine needs to know every other.
Since the network needs a large number of servers in the servnet to fulfill
the anonymity claim, each server needs a lot of storage to keep the whole
servnet in memory.
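To make the growth concrete, here is a rough sketch of the arithmetic. The
function name and the bytes-per-entry figure are my own assumptions for
illustration; the protocol does not specify how large a servnet entry is.

```python
def servnet_table_size(n_servers, bytes_per_entry):
    """Estimate servnet storage under full connectivity.

    bytes_per_entry is a hypothetical figure (address, key, reputation data);
    it is not taken from the protocol description.
    """
    # Each server has to know every other server.
    entries_per_server = n_servers - 1
    # Storage one server needs for its view of the servnet.
    bytes_per_server = entries_per_server * bytes_per_entry
    # Entries summed over all servers grow quadratically with n_servers.
    total_entries = n_servers * entries_per_server
    return entries_per_server, bytes_per_server, total_entries


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, servnet_table_size(n, bytes_per_entry=512))
```

The per-server cost grows linearly with the number of servers, while the
number of entries kept across the whole network grows quadratically.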
The routing issue might be solved if the requesting server itself decides whom
to query next. The only change in the protocol would be a new reply message:
"Sorry dude, what you need is not here, but you may ask these servers." The
data part of the reply contains a list of servers that the queried server
knows. The requester can then decide which servers on the list have already
been queried and which are still missing.
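A minimal sketch of how the requester side of this could look. This is only
my illustration, not part of the freehaven protocol: the names
referral_search and network are made up, and the dictionary lookup stands in
for sending a real query and receiving either the share or the "ask these
servers" reply.

```python
from collections import deque


def referral_search(start, wanted, network):
    """Query servers one by one, following referrals instead of flooding.

    network is a hypothetical stand-in for the wire protocol: it maps a
    server name to (shares, known_servers), where known_servers plays the
    role of the data part of the "ask these servers" reply.
    """
    queried = set()            # servers we have already asked
    pending = deque([start])   # servers we still want to ask
    while pending:
        server = pending.popleft()
        if server in queried or server not in network:
            continue           # already asked, or unreachable
        queried.add(server)
        shares, known_servers = network[server]
        if wanted in shares:
            return server, queried
        # "Sorry, not here": keep only the referrals we have not asked yet.
        for peer in known_servers:
            if peer not in queried:
                pending.append(peer)
    return None, queried


if __name__ == "__main__":
    network = {
        "a": (set(), ["b", "c"]),
        "b": ({"x"}, ["a"]),
        "c": (set(), ["d"]),
        "d": ({"y"}, []),
    }
    print(referral_search("a", "y", network))
```

Each server is contacted at most once and only by the requester, so no server
has to forward or broadcast anything on behalf of someone else.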
Regards,
Thomas
PS: If you know of any papers about P2P information retrieval, please let me
know. Thanks.
--
Thomas Strauß | "He breathed in the chill kelp-and-salt scent of the beach;
LeipzigerStr 61 | the intense familiarity of the scent triggered a million
66113 Saarbrücken| memories at once, and he knew he was home."
+49-681-5892772 | (aus "Green Mars", Kim Stanley Robinson)