
Re: [tor-talk] tor2web - How can I get my hidden service indexed?



On 2/14/13 10:28 PM, tor@xxxxxxxxxxxxxxxxxx wrote:
> On 14/02/13 10:55, Fabio Pietrosanti (naif) wrote:
>
> > Tor2web software by default: - setup a robots.txt to prevent search
> > engine scraping - block a wide set of Crawler UA to further block
> > search activities - prevent hotlinking (from an internet resource
> > to do <img src="https://blahblah.tor2web.org/image.jpg">)
>
> This seems like a strange default to me.
Hotlink prevention is required to stop people from linking "highly
controversial material" on public Internet forums, using the few Tor2web
proxies as a sort of "Content Delivery Network".
> I can see why people would
> want to create hidden services that can be discovered using ordinary
> channels on the Internet such as search engines.
>
> If a hidden service operator actually wanted to block search engines,
> they'd know to create their own robots.txt file, or to add appropriate
> meta tags to their HTML, or to simply block based on the User-Agent
> header...
Tor2web 3.0 beta1 is not a final solution; it would require a lot of
additional code and features to make it really flexible (for example,
letting a TorHS operator configure this robots.txt behavior).
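
For illustration only, the three defaults listed above (robots.txt,
crawler User-Agent blocking, hotlink prevention) could be sketched
roughly as below. This is not Tor2web's actual code: the crawler list,
the image extensions, the domain constant and the handle() function are
all hypothetical names chosen for the example, and a real proxy would
need far more care.

```python
from urllib.parse import urlparse

# All values below are assumptions for illustration, not Tor2web's real config.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"
CRAWLER_UA_MARKERS = ("googlebot", "bingbot", "yandex", "baiduspider")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")
PROXY_DOMAIN = "tor2web.org"


def is_crawler(user_agent):
    """Return True if the User-Agent contains a known crawler marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in CRAWLER_UA_MARKERS)


def is_hotlink(path, referer):
    """Return True for an image request whose Referer is outside the proxy domain."""
    if not referer or not path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # no Referer, or not an image: not treated as hotlinking
    host = urlparse(referer).hostname or ""
    same_site = host == PROXY_DOMAIN or host.endswith("." + PROXY_DOMAIN)
    return not same_site


def handle(path, user_agent=None, referer=None):
    """Return (status, body) for a request, applying the three defaults in order."""
    if path == "/robots.txt":
        return 200, ROBOTS_TXT  # ask all well-behaved crawlers to stay away
    if is_crawler(user_agent):
        return 403, "crawlers are not allowed"
    if is_hotlink(path, referer):
        return 403, "hotlinking is not allowed"
    return 200, "proxied content"
```

With this sketch, an <img> tag on an external forum page pointing at
/image.jpg would get a 403, while the same image loaded from a page
under the proxy domain would be served normally.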

In the meantime, however, there is a simple reason to avoid general
indexing of Tor Hidden Services exposed through Tor2web: survival of
the service.

From experience, it is much harder to keep a Tor2web server running
than to keep a Tor Exit Node running with a completely open exit
policy.

If you enable "google indexing", the number of complaints you receive
will increase exponentially, quickly creating serious problems for
keeping the Tor2web proxy running. :\

Fabio

_______________________________________________
tor-talk mailing list
tor-talk@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk