
Re: [tor-talk] Funded search engine for onionspace?



Leeroy, to avoid being indexed by Googlebot et al., place the appropriate
/robots.txt at the root of your onionsite.  The FAQ describes it:

http://www.onion.city/faq.html

As a historical note, the reason Aaron and I chose Tor2web's URL design was
so search engines would automatically see any /robots.txt an onionsite
specifies.
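
For illustration, here is a minimal sketch of a /robots.txt that turns away
Googlebot while leaving other crawlers alone, checked with Python's standard
urllib.robotparser.  The host example.onion.city and the crawler name
SomeOtherBot are made up for the example; the exact rules you want may differ,
so treat the FAQ as the reference.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption, not copied from the FAQ):
# block Googlebot from the whole site, allow every other crawler.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical onionsite as seen through the proxy.
    url = "http://example.onion.city/index.html"
    print("Googlebot allowed:", is_allowed("Googlebot", url))
    print("SomeOtherBot allowed:", is_allowed("SomeOtherBot", url))
```

Since a crawler looks for /robots.txt at the root of each hostname, serving
the file from the onionsite's own root should be enough for it to be found
under the corresponding onion.city subdomain.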


-V

On Fri, Feb 13, 2015 at 3:30 PM, l.m <ter.one.leeboi@xxxxxxxx> wrote:

>
> >Alas no.  I'm aware this is suboptimal.  I see GOOG search engine as a
> >temporary-ladder just to get the ball rolling.  I am open to using any
> >other index.  For what it's worth I'm very pleased with GOOG's
> >performance---right now it's searching an index of 650k onion pages and the
> >number grows every day.
>
> If you instead use a google search appliance couldn't you use google
> engine for indexing without having to use google itself? Wouldn't that
> also avoid the problem of google queries being associated with the
> client making the request?
>
> >Although we technically could read provided passwords, we don't keep logs
> >of passed traffic.  However, I understand that many users don't understand
> >the tor2web threat model.  But this is the same as all Tor2web nodes, yes?
> >This is not at all unique to OnionCity.  As far as I know all Tor2web nodes
> >allow form submissions.
>
> What is unique to onion.city is that access to someonion.onion.city
> occurs using http and doesn't redirect to the .onion if Tor is in use.
> That the tor2web mirror might snoop is implicit--that the exit (if
> using tor) might also snoop is more of a concern.
>
> >You mentioned it'd be better to have it randomly pick among the available
> >Tor2web nodes instead of everything going through OnionCity.  This breaks
> >the GOOG search engine which only wants to return "canonical" URLs.  We
> >could talk about making OnionCity a DNS round-robin akin to how Tor2web.org
> >currently works, but then I'm just replicating Tor2web.
>
> The ability of tor2web to provide mirrors should be optional. If you
> only know one mirror and that mirror cannot service the request then
> how are you going to get any of the other mirrors? Google engine can
> return related addresses in an order based on the success of loading
> the mirror itself. If onion.city always works it will tend to precede
> tor2web.org. If onion.city goes down (having search front-end separate
> from tor2web mirror) the search engine can reorder the result to
> improve the success of the first click.
>
> >Right now I aggregate existing lists of onion sites and put them into the
> >site map.
> >* https://ahmia.fi/onions/
> >* http://skunksworkedp2cg.onion.city/sites.txt
> >* http://xlmvhk3rpdux26dz.onion.city/
> >* http://kkkkkku5juzqh33a.onion.city/
>
> If google is itself handling the indexing won't that cause a problem
> for sites in those lists, which are normally okay with being indexed,
> just not by googlebot? I for one couldn't care less about being
> indexed by ahmia.fi but it'll be a cold day in hell before I let
> googlebot. Precisely because of how easy it is to link the search to
> the requester.
> --leeroy
> --
> tor-talk mailing list - tor-talk@xxxxxxxxxxxxxxxxxxxx
> To unsubscribe or change other settings go to
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk
>