Re: [tor-talk] tor2web - How can I get my hidden service indexed?

On 2/14/13 10:28 PM, tor@xxxxxxxxxxxxxxxxxx wrote:
> On 14/02/13 10:55, Fabio Pietrosanti (naif) wrote:
> > Tor2web software by default: - setup a robots.txt to prevent search
> > engine scraping - block a wide set of Crawler UA to further block
> > search activities - prevent hotlinking (from an internet resource
> > to do <img src="https://blahblah.tor2web.org/image.jpg">)
> This seems like a strange default to me.
Hotlink prevention is required to stop people from linking "highly
controversial material" on public Internet forums and using the few
Tor2web proxies as a sort of "Content Delivery Network".
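
As a rough illustration of those three defaults, here is a minimal
sketch. It is not the actual Tor2web code: the crawler list, the
PROXY_DOMAIN value, and the function names are all made up for the
example, and a real proxy would need a much longer User-Agent list.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration; not taken from the Tor2web source.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"  # ask all crawlers to stay away
CRAWLER_UA_MARKERS = ("googlebot", "bingbot", "yandex", "baiduspider", "slurp")
PROXY_DOMAIN = "tor2web.org"


def is_crawler(user_agent):
    """Return True if the User-Agent contains a known crawler marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in CRAWLER_UA_MARKERS)


def is_hotlink(referer, proxy_domain=PROXY_DOMAIN):
    """Return True if the Referer points to a site outside the proxy domain."""
    if not referer:
        return False  # no Referer header: treat as a direct visit
    host = (urlparse(referer).hostname or "").lower()
    return not (host == proxy_domain or host.endswith("." + proxy_domain))


def decide(path, user_agent, referer):
    """Return (status, body) for a request; "PROXY" means pass it through."""
    if path == "/robots.txt":
        return 200, ROBOTS_TXT  # always served, so crawlers can read it
    if is_crawler(user_agent):
        return 403, "Crawlers are not allowed"
    if is_hotlink(referer):
        return 403, "Hotlinking is not allowed"
    return 200, "PROXY"
```

With this sketch, an <img> embedded on an external forum page arrives
with that forum's URL as Referer and gets a 403, while the same image
requested from a page on the proxy's own domain is passed through.
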
> I can see why people would
> want to create hidden services that can be discovered using ordinary
> channels on the Internet such as search engines.
> If a hidden service operator actually wanted to block search engines,
> they'd know to create their own robots.txt file, or to add appropriate
> meta tags to their HTML, or to simply block based on the User-Agent
> header...
Tor2web 3.0 beta1 is not a final solution; making it really flexible
would require a lot of additional code and features (such as letting
a TorHS operator configure this robots.txt behavior).

In the meantime, however, there is a simple reason to avoid general
indexing of Tor2web-exposed Tor Hidden Services: survival of the service.

From experience, it is much more difficult to keep a Tor2web server
running than to keep a Tor Exit Node running with completely open

If you enable "Google indexing", the number of complaints you receive
will increase exponentially, quickly making it seriously difficult to
keep the Tor2web proxy running. :\


