
Re: [tor-talk] Craigslist now giving Tor the slows, lol



On Sat, Jun 14, 2014 at 12:38 AM, Mirimir <mirimir@xxxxxxxxxx> wrote:
>> No more variance = tor issue.
>> Still variance = IP <--> CL stack/path issue, or CL issue alone.
> As I understand it, you're just getting the HTML. I'm getting the entire

It was the first time I saw any site serving slowly to some tor exits.
So I removed all variables and went for a single URL fetch to confirm:
no recursion, redirects, embedded elements, robots.txt, or anything else.
I'm waiting for the operator of a slow, affected exit to get back to me
about a test to eliminate the unlikely possibility that the tor software
itself is the cause.
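
A single-URL timing test of this kind could look roughly like the sketch
below. It assumes curl is installed and a local tor client is listening on
the default SOCKS port (127.0.0.1:9050). The function names and the timeout
are illustrative only, not the actual test that was run.

```python
import subprocess

TOR_SOCKS = "127.0.0.1:9050"  # assumption: default tor SOCKS listener


def build_curl_cmd(url, socks=TOR_SOCKS, timeout=600):
    """Build a curl command that fetches exactly one URL through tor."""
    return [
        "curl",
        "--silent",
        "--socks5-hostname", socks,  # resolve DNS through tor as well
        "--max-time", str(timeout),  # give up after `timeout` seconds
        "--output", "/dev/null",     # discard the body, only timing matters
        "--write-out", "%{http_code} %{size_download} %{time_total}",
        url,                         # no --location: redirects are not followed
    ]


def parse_curl_stats(stdout):
    """Turn the '--write-out' line into status, byte count and seconds."""
    code, size, seconds = stdout.split()
    return {"status": int(code), "bytes": int(float(size)), "seconds": float(seconds)}


def time_fetch(url):
    """Run one fetch and return its stats (hypothetical helper)."""
    result = subprocess.run(build_curl_cmd(url), capture_output=True, text=True)
    return parse_curl_stats(result.stdout)
```

Calling time_fetch() repeatedly on the same URL while pinned to one exit,
then again on another exit, gives per-exit timings that can be compared
for variance.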

> page, or at least whatever Midori grabs while pretending to be Firefox.
> For example, I get http://xvideos.com/ with numerous (X-rated) images ;)
>
> Also, I was hitting sites at 1-2 minute intervals

This may actually be far less than the overall fetch rate from tor users
to the top 50, and it is certainly insignificant next to the sites' daily
hit counts. Someone needs to research overall exit traffic sometime too.

> craigslist, the greatest loading time was about 500 seconds. So perhaps

If other sites are loading similarly slowly, it may be possible to find
out why, or what is being used to do it. CL never replies to support queries.

> 30-60 minutes to 20-40 minutes. That may reduce page-size variance.

A lot of the top 50 serve dynamic 'content', so size variance is expected
on those unless you fetch single elements.

> There's also wkhtmltopdf. Maybe it does a better job, being lighter even
> than Midori. But I worry that it also may look less like a browser than
> command-line Midori.

I'm not too worried about emulation/hiding unless it affects the results
being studied, i.e. content/blocking differences depending on the supplied
User-agent.
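
One rough way to check whether the User-agent changes the result is to
fetch the same URL with two different strings and compare status code and
body size. The sketch below assumes curl and a local tor SOCKS port at
127.0.0.1:9050. The helper names and the 10% size tolerance are arbitrary
assumptions, not a measured threshold.

```python
import subprocess


def fetch_stats(url, user_agent, socks="127.0.0.1:9050"):
    """Fetch one URL through tor with a given User-agent; return (status, bytes)."""
    cmd = [
        "curl", "--silent",
        "--socks5-hostname", socks,
        "--user-agent", user_agent,
        "--output", "/dev/null",
        "--write-out", "%{http_code} %{size_download}",
        url,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    code, size = out.split()
    return int(code), int(float(size))


def ua_affects_result(a, b, size_tolerance=0.10):
    """True if two (status, bytes) results differ in status, or in size
    by more than `size_tolerance` (an assumed fraction of the larger body)."""
    (code_a, size_a), (code_b, size_b) = a, b
    if code_a != code_b:
        return True
    larger = max(size_a, size_b)
    return larger > 0 and abs(size_a - size_b) / larger > size_tolerance
```

Dynamic pages will differ slightly from fetch to fetch anyway, so the
tolerance would need tuning per site before reading much into a mismatch.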

> Once I work out kinks, and collect enough data, I'll write this up
> somewhere with results for all 50 top sites.

Good, it seems we are doing some generic things. And we should not
use this CL-specific thread subject for it anymore :)
-- 
tor-talk mailing list - tor-talk@xxxxxxxxxxxxxxxxxxxx
To unsubscribe or change other settings go to
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk