
Re: [freehaven-dev] Searching Free Haven



On Fri, Mar 30, 2001 at 02:12:41PM +0200, Stefan Karrmann wrote:
> I've read some of the documentation at freehaven.net, but haven't found
> a proposal how to SEARCH free haven.

You're right. The only answer we have for that currently is "keep a
directory somewhere outside the system." Since Free Haven servers can
learn about a PK_doc from each share they have, it's possible for them
to request those files and then manually figure out what they are. (Or
have a convention in the file that says what it is in a computer-readable
format, or however else you want to do it.)

We've mostly been ignoring the search issue, figuring that it's hard
enough to get the other goals we're aiming for.
 
> Optimal would be a distributed search architecture, i.e. every search
> node knows every free haven document (=> exhaustive search possible!)
> and may trade a search request (=> balance search load).
> 
> A client can look up a regular expression and the server answers with
> a list of URNs and samples, e.g. nothing, a thumbnail or 5 seconds of
> sound or video.

That's a good idea. But it's a bit more convenience than we can afford
to provide these days. Have you considered layering this onto, say, Mojo
Nation or Gnutella? It should work very very well in the Mojo Nation
environment, since you can pay a bit more mojo for a more informative
response to the query.
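
To make the shape of such a query concrete, here is a minimal sketch of
a directory kept outside the system that answers a regular expression
with a list of URNs and samples. Everything in it is an assumption for
illustration: the Entry fields, the urn:example: identifiers and the
search() function are hypothetical and not part of any Free Haven, Mojo
Nation or Gnutella interface.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    urn: str             # hypothetical document identifier
    description: str     # text the directory matches queries against
    sample: bytes = b""  # optional preview; empty bytes means "nothing"


def search(directory, pattern):
    """Return (urn, sample) pairs whose description matches the regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [(e.urn, e.sample) for e in directory if rx.search(e.description)]


# Made-up directory contents, for illustration only.
DIRECTORY = [
    Entry("urn:example:1", "Free Haven design paper"),
    Entry("urn:example:2", "Mix-net survey", b"thumb"),
]

if __name__ == "__main__":
    for urn, sample in search(DIRECTORY, r"free\s+haven"):
        print(urn, len(sample))
```

A richer (and presumably more expensive) answer would just fill in the
sample field; a cheap one leaves it empty.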
 
> Further, I want to suggest to incorporate caching into Free Haven. As
> long as a server has free Free Haven storage, it should use it for caching.
> This may be triggered by a second fetch request of the same document in
> a given time span. Maybe it is only requested for caching with some (low)
> probability. (e.g.: if (rand () > 0.8) cachit (); )

Agreed. Yet we have to weigh the space used by caching against the space
used by "backup shares" -- we already specify that when you trade away a
share, it would be nice to keep a copy of it for a while. Some compromise
needs to be considered there.
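
A rough sketch of the caching rule under discussion might look like
this: cache a document only when it is fetched a second time within a
time window, and even then only with some low probability, and only
while there is room left. The FetchCache class, its parameter names and
the default values are all assumptions made for illustration. The 0.2
probability mirrors the "rand () > 0.8" example above, and the capacity
parameter stands in for whatever space is left over after backup shares.

```python
import random
import time


class FetchCache:
    """Hypothetical cache policy: repeat fetch + coin flip + free space."""

    def __init__(self, window=3600.0, probability=0.2, capacity=100,
                 rng=None, clock=time.monotonic):
        self.window = window            # seconds in which a repeat counts
        self.probability = probability  # chance of caching on a repeat
        self.capacity = capacity        # slots not needed for other shares
        self.rng = rng or random.Random()
        self.clock = clock
        self.last_seen = {}             # doc_id -> time of last fetch
        self.cache = {}                 # doc_id -> cached data

    def on_fetch(self, doc_id, data):
        """Record a fetch; return True if the document is now cached."""
        now = self.clock()
        previous = self.last_seen.get(doc_id)
        self.last_seen[doc_id] = now
        repeat = previous is not None and now - previous <= self.window
        if (repeat
                and doc_id not in self.cache
                and len(self.cache) < self.capacity
                and self.rng.random() < self.probability):
            self.cache[doc_id] = data
        return doc_id in self.cache
```

Shrinking capacity as backup shares accumulate is one simple way to
express the compromise; the sketch does not handle eviction at all.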

In the big picture: the current Free Haven design is bad, for two reasons:
1) We broadcast queries. That just sucks. It's bad.
2) We rely on a secure mix-net, when there is no wide deployment of such a
mix-net now.

Problem 1 we can solve with a distributed lookup system such as Chord or
what the OceanStore system uses. Problem 2 we are working to solve here,
with the
Tarzan design. Once it becomes clearer how the solutions to 1 and 2 will
affect design decisions, we'll probably end up with a new Free Haven
design that is more realistic but still tries to achieve our goals
(dynamic network, robust network, anonymity/pseudonymity for all parties,
publisher-specified document lifetimes).
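
To illustrate why a distributed lookup removes the need to broadcast,
here is a small sketch of the successor rule that Chord is built on:
servers and keys are hashed onto the same ring, and a key belongs to the
first server at or after it. This is only the key-to-server mapping,
computed locally from a full server list. It leaves out Chord's finger
tables and routing entirely, and the Ring class, the 32-bit ring size
and the server names are assumptions made for illustration.

```python
import bisect
import hashlib


def ring_id(name, bits=32):
    """Hash a name onto a ring of size 2**bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Hypothetical ring that maps each key to its successor server."""

    def __init__(self, servers):
        self.points = sorted((ring_id(s), s) for s in servers)
        self.ids = [point for point, _ in self.points]

    def successor(self, key):
        """Return the first server at or after the key, wrapping around."""
        i = bisect.bisect_left(self.ids, ring_id(key))
        return self.points[i % len(self.points)][1]


if __name__ == "__main__":
    ring = Ring(["server-a", "server-b", "server-c"])
    # A query for this key goes to one server instead of to all of them.
    print(ring.successor("some-document-key"))
```

Since every participant computes the same mapping, a request can be
sent to a single responsible server rather than to the whole network.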
 
--Roger