
Re: [tor-talk] Craigslist now giving Tor the slows, lol

On 06/13/2014 08:43 PM, grarpamp wrote:
> On Fri, Jun 13, 2014 at 12:54 AM, Mirimir <mirimir@xxxxxxxxxx> wrote:
>> http://bayimg.com/lAoiMAAfL
>>> For three of the ten exits used (default client choices) in the first
>>> test series, http://craigslist.org/ loaded in about eight seconds (30-50
>>> Kbps). For the other seven, it took several minutes (~1 Kbps).
>>> I don't see any obvious correlation with blacklist status.
> Since you seem to be letting tor pick its exits, sampling for
> days might give a wider representative spread of exits. And
> could plot things for each exit over time.

Yes, I'm letting Tor pick. I want to get results that are directly
relevant for actual users. There will still be some artifacts due to
console-based snapshotting, however. And yes, I'll be running this for a
while, and perhaps multiple VMs in parallel. So far, I'm collecting data
for the top 50 websites. I will look at behavior over time.

> I see widespread variation in your returned page lengths in bytes;
> almost every exit varied. Only about 1% of my single fetches
> across 1200+ exits varied from the exact normal byte count. It should
> be determined whether tor software somehow causes this when
> carrying 'slow' packet streams, by running the fetch in a loop,
> bound directly to the IP of an exit relay that shows both
> slowblocking and byte variance.

As I understand it, you're just getting the HTML. I'm getting the entire
page, or at least whatever Midori grabs while pretending to be Firefox.
For example, I get http://xvideos.com/ with numerous (X-rated) images ;)
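
To compare numbers like the "about 1%" figure above, something along these
lines could work. It is only a rough sketch: the function name, the dict of
per-exit byte counts, and the choice of the most common byte count as the
"normal" size are all assumptions for illustration, not part of either test
setup. For full-page loads (images, ads), an exact-match test like this will
naturally flag far more exits than it would for plain HTML fetches.

```python
from collections import Counter

def variance_fraction(byte_counts):
    """Return (normal_size, fraction_of_exits_that_deviate).

    byte_counts: dict mapping a (hypothetical) exit identifier to the
    number of bytes returned through that exit.
    Assumption: the most common byte count is treated as "normal".
    """
    if not byte_counts:
        return None, 0.0
    # The modal byte count stands in for the "exact normal byte count".
    normal, _ = Counter(byte_counts.values()).most_common(1)[0]
    deviating = sum(1 for n in byte_counts.values() if n != normal)
    return normal, deviating / len(byte_counts)

# Made-up sample data, for illustration only.
sample = {"exitA": 1000, "exitB": 1000, "exitC": 1000, "exitD": 990}
normal_size, fraction = variance_fraction(sample)
print(normal_size, fraction)
```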

Also, I was hitting sites at 1-2 minute intervals, and successive hits
sometimes used the same exit (and perhaps the same circuit). Excluding
craigslist, the greatest loading time was about 500 seconds. So perhaps
I was overlapping too much at times. I've increased sleep between site
loads to 8-12 minutes, and decreased sleep between 50-site runs from
30-60 minutes to 20-40 minutes. That may reduce page-size variance.
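
The new timing could be sketched roughly like this. It assumes the delays
are drawn uniformly at random within each range, which is only a guess at
what "8-12 minutes" means in practice; the function names and the example
site list are hypothetical, and the actual page load (Midori) is left out.

```python
import random

def jittered_delay_seconds(min_minutes, max_minutes, rng=random):
    """Pick a random delay, in seconds, between the two bounds (inclusive).

    Assumption: uniform distribution within the range.
    """
    return rng.uniform(min_minutes * 60, max_minutes * 60)

def plan_run(sites, site_gap=(8, 12), run_gap=(20, 40), rng=random):
    """Return ([(site, delay_after_load_seconds), ...], delay_before_next_run)."""
    per_site = [(site, jittered_delay_seconds(*site_gap, rng=rng)) for site in sites]
    next_run = jittered_delay_seconds(*run_gap, rng=rng)
    return per_site, next_run

# Example with two placeholder sites; a real loop would load each page
# and then call time.sleep(delay) before moving on.
schedule, pause = plan_run(["http://craigslist.org/", "http://xvideos.com/"])
for site, delay in schedule:
    print(site, round(delay / 60, 1), "min")
print("next run in", round(pause / 60, 1), "min")
```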

> No more variance = tor issue.
> Still variance = IP <--> CL stack/path issue, or CL issue alone.
> Will look at your Midori tool and maybe more of this type of project
> sometime later.

There's also wkhtmltopdf. Maybe it does a better job, being lighter even
than Midori. But I worry that it also may look less like a browser than
command-line Midori.

Once I've worked out the kinks and collected enough data, I'll write this
up somewhere, with results for all of the top 50 sites.
tor-talk mailing list - tor-talk@xxxxxxxxxxxxxxxxxxxx