
Re: [tor-dev] GSoC 2021 - Alexa Top Sites Captcha and Tor Block Monitoring #Update



On Mon, Jul 12, 2021 at 05:01:35PM +0530, Apratim Ranjan Chakrabarty wrote:
> ** Looking forward for suggestions and comments as to how to improve on it.
> Also materials like research paper in this domain would be helpful **

Section IV-C of the ICLab paper discusses block page detection.
The first pass is a regex match against known block pages, but there is
also clustering of pages by similar HTML structure and text.
https://censorbib.nymity.ch/#Niaki2020a
https://github.com/net4people/bbs/issues/52
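
As a rough illustration of that two-stage idea (this is not ICLab's code,
only a sketch under my own assumptions): a regex pass over known block
page text, followed by a greedy clustering of pages whose sequence of
HTML tags is similar. The patterns, the 0.8 threshold, and all function
names below are hypothetical, and only structure is compared here, not
text.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Hypothetical signatures; a real list would be curated from observed block pages.
KNOWN_BLOCK_PATTERNS = [
    re.compile(r"access denied", re.I),
    re.compile(r"attention required", re.I),
    re.compile(r"captcha", re.I),
]


class TagCollector(HTMLParser):
    """Record the sequence of opening tags as a crude structural fingerprint."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of opening tag names in document order."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


def matches_known_block_page(html):
    """First pass: does the page match any known block page pattern?"""
    return any(p.search(html) for p in KNOWN_BLOCK_PATTERNS)


def structure_similarity(html_a, html_b):
    """Similarity in [0, 1] between the tag sequences of two pages."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()


def cluster_by_structure(pages, threshold=0.8):
    """Second pass: greedy clustering. Each page joins the first cluster
    whose first member is at least `threshold` similar, else starts a new one."""
    clusters = []
    for page in pages:
        for cluster in clusters:
            if structure_similarity(cluster[0], page) >= threshold:
                cluster.append(page)
                break
        else:
            clusters.append([page])
    return clusters
```

The point of the second pass is that a page the regexes miss can still
land in the same cluster as a page they caught, which flags it for
manual review and possibly a new regex.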

The 2016 "Do You See What I See?" study seems to be in line with your
project. "The second-class treatment of anonymous users ranges from
outright rejection to ... imposing hurdles such as CAPTCHA-solving....
Our study draws upon ... scans of the home pages of top-1,000 Alexa
websites through every Tor exit..." Section V-A covers the scans of
top-ranked sites.
https://www.ndss-symposium.org/wp-content/uploads/2017/09/do-you-see-what-i-see-differential-treatment-anonymous-users.pdf
https://archive.org/details/ndss16doyousee